# nvgpu (NVIDIA GPUs)
## Introduction
nvgpu is a system module for analysing the activity of NVIDIA GPUs running
CUDA-based programs. At the moment, it measures how long CUDA runtime and
driver API calls take inside user-specified code regions, but the module
will be extended over time.

Your feedback is more than welcome! See [the contact page](/contact) to get
in touch with us.

## Installation
### Requirements
First of all, make sure you satisfy [the Adaptyst core requirements](/install/#requirements).

nvgpu requires CUDA with CUPTI (which should be included in CUDA by default). The
earliest tested version is 12.5, but this is a guideline only: the module may work
with older toolkits as well.
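
To check these requirements before building, a minimal sketch along the following lines may help. It assumes that `nvcc` is in your `PATH`, that `sort -V` (GNU coreutils) is available, and that the toolkit lives under `CUDA_HOME` or `/usr/local/cuda` with CUPTI in `extras/CUPTI/lib64`. These locations are assumptions and can differ between distributions. The helper names `version_ge` and `parse_nvcc_release` are made up for this example.

```bash
#!/usr/bin/env bash
# Sketch: report the CUDA toolkit version and look for the CUPTI library.

# Succeeds if version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Extracts "X.Y" from text such as
# "Cuda compilation tools, release 12.5, V12.5.40" read from stdin.
parse_nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

MIN_TESTED="12.5"                          # earliest tested version
CUDA_ROOT="${CUDA_HOME:-/usr/local/cuda}"  # assumed toolkit location

if command -v nvcc >/dev/null 2>&1; then
    found="$(nvcc --version | parse_nvcc_release)"
    if [ -z "$found" ]; then
        echo "Could not parse the CUDA version from nvcc"
    elif version_ge "$found" "$MIN_TESTED"; then
        echo "CUDA $found: at or above the earliest tested version ($MIN_TESTED)"
    else
        echo "CUDA $found: older than $MIN_TESTED, untested but may still work"
    fi
else
    echo "nvcc not found in PATH"
fi

# CUPTI usually ships under extras/CUPTI, but the layout can vary.
if ls "$CUDA_ROOT"/extras/CUPTI/lib64/libcupti.so* >/dev/null 2>&1; then
    echo "CUPTI found under $CUDA_ROOT/extras/CUPTI"
else
    echo "CUPTI not found under $CUDA_ROOT/extras/CUPTI (it may live elsewhere)"
fi
```

The script only reports what it finds and never aborts, since an older toolkit is untested rather than known to be unsupported.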

If you build from source, you **also** need:
* [nlohmann-json](https://github.com/nlohmann/json)
* CMake 3.20 or newer

### Setup from source
**Short version:**
```bash
git clone -b v0.1.0-dev.2026.03a https://github.com/adaptyst/adaptyst-nvgpu
cd adaptyst-nvgpu && mkdir build && cd build
cmake ..
cmake --build .
sudo cmake --install .
```

**Long version:**

Clone [the GitHub repository](https://github.com/adaptyst/adaptyst-nvgpu) at the tag of your choice
(usually the newest one from [the releases page](https://github.com/Adaptyst/adaptyst-nvgpu/releases)).
Then, **in a separate directory**, run ```cmake <path to your repository>``` followed by
```cmake --build .``` (both work as either non-root or root, non-root recommended). Finally, run
```cmake --install .``` as root, unless you are installing to a non-system module directory.

Here are the CMake options you can use/change for nvgpu:
* ```INSTALL_PATH```: the path where nvgpu should be installed (default: the value provided
by Adaptyst via ```ADAPTYST_MODULE_PATH``` in CMake, usually ```<user install prefix>/opt/adaptyst/modules```)
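
As an illustration of the ```INSTALL_PATH``` option, the sketch below installs nvgpu into a per-user directory without root. The directory name is hypothetical, and whether Adaptyst picks up modules from it depends on how your Adaptyst installation is configured.

```bash
# Hypothetical per-user module directory; adjust it to your setup.
MODULE_DIR="$HOME/.local/opt/adaptyst/modules"

# Run from an empty build directory inside the cloned repository.
cmake -DINSTALL_PATH="$MODULE_DIR" ..
cmake --build .
cmake --install .   # no sudo needed if MODULE_DIR is writable by you
```

```-D<name>=<value>``` is the standard CMake way of setting an option at configure time, so the same pattern should apply to any option added to nvgpu later.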

### Adaptyst Analyser module setup
**Short version:**
```bash
git clone -b v0.1.0-dev.2025.11a https://github.com/adaptyst/adaptyst-analyser-nvgpu
adaptyst-analyser adaptyst-analyser-nvgpu
```

**Long version:**

The module for Adaptyst Analyser can be found on [GitHub](https://github.com/adaptyst/adaptyst-analyser-nvgpu). As with all modules, the Adaptyst Analyser part is independent of the Adaptyst one. To install it, clone the repository at the tag of your choice (usually the newest one from [the releases page](https://github.com/adaptyst/adaptyst-analyser-nvgpu/releases)) and run ```adaptyst-analyser <path to the cloned repository>```.

## Usage
nvgpu uses [the code regionisation feature of Adaptyst](/docs/adaptyst/running-adaptyst/#code-regionisation)
to determine which parts of your program to analyse for NVIDIA GPU activity.
Once you define your code regions and add nvgpu to your system graph, the module should
collect data automatically and produce results that can be inspected, for example, in Adaptyst Analyser.

## Options
| Name                        | Type                                               | Default value         | Explanation            |
|-----------------------------|----------------------------------------------------|-----------------------|------------------------|
| cuda\_api\_type             | One of: ```runtime```, ```driver```, or ```both``` | ```both```            | CUDA API type to trace |

## Features for Adaptyst Analyser
{{< callout context="note" title="Video demonstration" icon="outline/info-circle" >}}
nvgpu is featured towards the end of [the video demo](/docs/adaptyst/checking-results-with-adaptyst-analyser#website-navigation).
{{< /callout >}}

When you open the module window in Adaptyst Analyser, you will see the timeline
of your code regions, where the time axis is synchronised with other modules so
that you can make direct time-based comparisons.

Each region block shows the percentage of its runtime spent on CUDA API calls.
When you right-click the block, you will see a breakdown of CUDA API function
runtimes. The breakdown is arranged in a stack-like way because some CUDA driver
API calls are made inside CUDA runtime API functions.
