Running Adaptyst
Prerequisites
Before running Adaptyst for the first time, you need to set the maximum number of stack entries to be collected by running sysctl kernel.perf_event_max_stack=<value>, where <value> is a number of your choice larger than or equal to 1024. Otherwise, the off-CPU profiling will fail.
Important
Max stack sizes larger than 1024 are currently not supported for off-CPU stacks! The maximum number of entries in off-CPU stacks is always set to 1024, regardless of the value of kernel.perf_event_max_stack.
If your machine has NUMA (non-uniform memory access), you should note that NUMA memory balancing in Linux limits the reliability of obtaining complete stacks across all CPUs / CPU cores. In this case, you must either disable NUMA balancing by running sysctl kernel.numa_balancing=0 or run Adaptyst on a single NUMA node.
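The two prerequisites above can be checked before profiling. The following is a minimal sketch, not part of Adaptyst: the function names are hypothetical, and it assumes that sysctl keys are readable under /proc/sys with dots replaced by slashes (the usual Linux layout).

```python
from pathlib import Path

MIN_MAX_STACK = 1024  # minimum kernel.perf_event_max_stack stated above


def read_sysctl(name, proc_sys=Path("/proc/sys")):
    """Return a sysctl value as a string, or None if it cannot be read."""
    try:
        return (proc_sys / name.replace(".", "/")).read_text().strip()
    except OSError:
        return None


def check_prerequisites(read=read_sysctl):
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    max_stack = read("kernel.perf_event_max_stack")
    if max_stack is None or int(max_stack) < MIN_MAX_STACK:
        problems.append(
            "kernel.perf_event_max_stack must be at least %d (found: %s)"
            % (MIN_MAX_STACK, max_stack)
        )

    # A missing key is treated as "no NUMA balancing available".
    numa = read("kernel.numa_balancing")
    if numa is not None and numa != "0":
        problems.append(
            "NUMA balancing is enabled: disable it or run Adaptyst "
            "on a single NUMA node"
        )

    return problems
```

Calling check_prerequisites() with no arguments reads the live values; passing a custom read callable makes the check easy to try without touching the system.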
Configuration files
If you haven’t provided a custom config file path when installing Adaptyst, the system-wide configuration file can be found in /etc/adaptyst.conf. Additionally, in case you want to provide your own local settings, any values set system-wide can be overridden and any other extra values can be added by a local config file in .adaptyst/adaptyst.conf inside your home directory. Both files follow exactly the same syntax.
If you want to use a different path for the system-wide config and/or local config at runtime, you can set the ADAPTYST_CONFIG and ADAPTYST_LOCAL_CONFIG environment variables respectively. Additionally, Adaptyst calls its Python scripts from inside /opt/adaptyst by default, but this path can also be changed at runtime by setting the ADAPTYST_SCRIPT_DIR environment variable. Most of these paths can also be changed permanently when compiling Adaptyst from source (see the installation guide).
The currently supported settings in the configuration files are shown below:
Config file format with supported settings
# The local path to an installation directory of the Adaptyst-patched "perf"
# (with bin etc. directories inside). In case of no changes to the installation
# options, this is /opt/adaptyst/perf by default.
perf_path=<path>
# The local path to the CARM Tool repository. See "Cache-aware
# roofline profiling".
carm_tool_path=<path>
# The local path to a CSV file with the roofline benchmarking results produced
# by the CARM Tool. See "Cache-aware roofline profiling".
roofline_benchmark_path=<path>
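To illustrate how the system-wide and local files combine, here is a minimal sketch. It is not Adaptyst's actual parser: the function names are hypothetical, and the syntax handling (key=value lines, full-line # comments) is an assumption inferred from the sample above.

```python
from pathlib import PurePosixPath


def config_paths(environ, home):
    """Resolve both config paths, honouring the environment variables.

    The defaults assume no custom config path was set at installation.
    """
    system = environ.get("ADAPTYST_CONFIG", "/etc/adaptyst.conf")
    local = environ.get(
        "ADAPTYST_LOCAL_CONFIG",
        str(PurePosixPath(home) / ".adaptyst" / "adaptyst.conf"),
    )
    return system, local


def parse_config(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def effective_settings(system_text, local_text=""):
    """Local values override system-wide ones; extra local keys are added."""
    merged = parse_config(system_text)
    merged.update(parse_config(local_text))
    return merged
```

For instance, a local file that sets only perf_path replaces the system-wide perf_path and leaves the other system-wide values in effect.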
Profiling
To profile your program, please run the following command as root:
adaptyst <command to be profiled>
Important
- If your command has whitespaces, you must run Adaptyst in one of these ways: adaptyst "<command to be profiled>" or adaptyst -- <command to be profiled>.
- If your command uses any shell-specific syntax (e.g. redirection using < or >), you need to pass it through your shell as Adaptyst uses direct system calls to run programs. For example, to profile program123 < test.txt run through bash, you need to execute adaptyst -- /bin/bash -c "program123 < test.txt".
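When Adaptyst is launched from a script, the two rules above can be applied while building the argument list. This is a minimal sketch under stated assumptions: adaptyst_argv is a hypothetical helper, and it assumes that Adaptyst options may be placed before the -- separator.

```python
import shlex


def adaptyst_argv(command, options=(), shell=None):
    """Build an argument list for running a command under adaptyst.

    command: a list of arguments, or a string.
    shell:   path to a shell (e.g. "/bin/bash") when the command uses
             shell-specific syntax such as redirection; the command must
             then be given as a single string.
    """
    if shell is not None:
        if not isinstance(command, str):
            raise TypeError("a shell command must be given as a string")
        profiled = [shell, "-c", command]
    elif isinstance(command, str):
        profiled = shlex.split(command)
    else:
        profiled = list(command)
    # "--" separates Adaptyst options from the profiled command.
    return ["adaptyst", *options, "--", *profiled]


# The redirection example from the text, routed through bash:
argv = adaptyst_argv("program123 < test.txt", shell="/bin/bash")
print(shlex.join(argv))
```

The resulting list can be passed to subprocess.run directly, which avoids a second layer of shell quoting.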
Running Adaptyst as non-root
Adaptyst can be run as non-root as long as all of the requirements below are met:
- The Adaptyst-patched “perf” executable has the CAP_PERFMON, CAP_BPF, and CAP_IPC_LOCK capabilities set as permitted and effective (you can do it by running setcap cap_perfmon,cap_bpf,cap_ipc_lock+ep <path to "perf">; the default path is /opt/adaptyst/perf/bin/perf). If you want to see kernel symbols in stack traces, the executable must also have the CAP_SYSLOG capability set as permitted and effective.
- You are part of the tracing group. If it doesn’t exist, you must create it first. The tracing name is arbitrary here; you can give the group any name you want.
- /sys/kernel/tracing is mounted as tracefs with permissions 750 or more lax and is owned by the tracing group.
  - Mount /sys/kernel/tracing in a standard way if not mounted yet (i.e. run mount -t tracefs nodev /sys/kernel/tracing).
  - Once /sys/kernel/tracing is mounted in a standard way, remount the directory by running mount -o remount,mode=0750,gid=<GID of the tracing group> /sys/kernel/tracing.
  - If the above doesn’t work, you need to change group ownership of all contents inside /sys/kernel/tracing by running for example chown -R root:tracing /sys/kernel/tracing. You may also need to change file permissions in a similar way by running for example chmod -R 750 /sys/kernel/tracing.
  - You can also opt for automating the above in any way you like.
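The tracefs requirement in the list above can be verified with a short check. This is only a sketch: tracefs_access_ok is a hypothetical helper, and it interprets "750 or more lax" as "every permission bit in 0750 is set", which is an assumption about the intended meaning.

```python
import stat


def tracefs_access_ok(mode, owner_gid, tracing_gid):
    """Check the group owner and the permission bits of the tracefs mount.

    mode:        st_mode of /sys/kernel/tracing
    owner_gid:   st_gid of /sys/kernel/tracing
    tracing_gid: GID of the group chosen for tracing access
    """
    perms = stat.S_IMODE(mode)  # strip the file type bits
    return owner_gid == tracing_gid and (perms & 0o750) == 0o750


# Possible usage on a live system (the group name "tracing" is arbitrary,
# as noted above):
#   import grp, os
#   st = os.stat("/sys/kernel/tracing")
#   gid = grp.getgrnam("tracing").gr_gid
#   print(tracefs_access_ok(st.st_mode, st.st_gid, gid))
```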
Roofline profiling
By default, cache-aware roofline profiling is not performed. If you want to do this as well, see Cache-aware roofline profiling.
If you want to see what extra options you can set (e.g. an on-CPU/off-CPU sampling frequency, the quiet mode), run adaptyst --help.
Help message printed by adaptyst --help
$ adaptyst --help
Adaptyst: a performance analysis tool
adaptyst [OPTIONS] [COMMAND...]
POSITIONALS:
COMMAND Command to be profiled (required)
OPTIONS:
-h, --help Print this help message and exit
-v, --version Print version and exit
-F, --freq UINT>0 Sampling frequency per second for on-CPU
time profiling (default: 10)
-B, --buffer UINT>0 Buffer up to this number of events before
sending data for processing (1 effectively
disables buffering) (default: 1)
-f, --off-cpu-freq UINT or -1
Sampling frequency per second for off-CPU
time profiling (0 disables off-CPU
profiling, -1 makes Adaptyst capture *all*
off-CPU events) (default: 1000)
-b, --off-cpu-buffer UINT Buffer up to this number of off-CPU events
before sending data for processing (0
leaves the default adaptive buffering, 1
effectively disables buffering) (default: 0)
-p, --post-process UINT Number of threads isolated from profiled
command to use for profilers and processing
(must not be greater than 13). Use 0 to not
isolate profiler and processing threads
from profiled command threads (NOT
RECOMMENDED). (default: 1)
-a, --address ADDRESS:PORT Delegate processing to another machine
running adaptyst-server. All results will
be stored on that machine.
-c, --codes TYPE[:ARG] Send the newline-separated list of detected
source code files to a specified
destination rather than pack the code files
on the same machine where a profiled
program is run. The value can be either
"srv" (i.e. the server receives the list,
looks for the files there, and creates a
source code archive there as well),
"file:<path>" (i.e. the list is saved to
<path> and can be then read e.g. by
adaptyst-code), or "fd:<number>" (i.e. the
list is written to a specified file
descriptor).
-s, --server-buffer UINT>0 Communication buffer size in bytes for
internal adaptyst-server. Not to be used
with -a. (default when no -a: 1024)
-w, --warmup UINT>0 Warmup time in seconds between
adaptyst-server signalling readiness for
receiving data and starting the profiled
program. Increase this value if you see
missing information after profiling (note
that adaptyst-server is also used
internally if no -a option is specified).
(default: 1)
-e, --event EVENT,PERIOD,TITLE
Extra perf event to be used for sampling
with a given period (i.e. do a sample on
every PERIOD occurrences of an event and
display the results under the title TITLE
in a website). Run "perf list" for the list
of possible events. You can specify
multiple events by specifying this option
more than once. Use quotes if you need to
use spaces.
-r, --roofline UINT>0 Run also cache-aware roofline profiling
with the specified sampling frequency per
second
-i, --filter TYPE:FILE Set stack trace filtering options.
deny:<FILE> cuts all stack elements
matching a set of conditions specified in a
given text file (use - for stdin).
allow:<FILE> accepts only stack elements
matching a set of conditions specified in a
given text file (use - for stdin).
python:<FILE> sends all stack trace
elements to a given Python script for
filtering. Unless -k is used, all filtered
out elements are deleted completely. See
the Adaptyst documentation to check in
detail how to use filtering.
-k, --mark Needs: --filter When -i is used, mark filtered out stack
trace elements as "(cut)" and squash any
consecutive "(cut)"'s into one rather than
deleting them completely
-m, --mode kernel OR user OR both
Capture only kernel, only user (i.e.
non-kernel), or both stack trace types
respectively (default: "user")
-q, --quiet Do not print anything (if set, check exit
code for any errors)
If you want to change the paths of the system-wide and local Adaptyst
configuration files, set the environment variables ADAPTYST_CONFIG and
ADAPTYST_LOCAL_CONFIG respectively to values of your choice. Similarly,
you can set the ADAPTYST_SCRIPT_DIR environment variable to change the path
where Adaptyst looks for its Python scripts.
After profiling is completed, you can check the results inside results.
You can run adaptyst multiple times: all profiling results will be saved inside the same results directory, provided that every adaptyst execution is done inside the same working directory.
Structure of results
The structure of results is as follows:
- (year)_(month)_(day)_(hour)_(minute)_(second)_(hostname)__(command): the directory for a given profiling session
  - out: the directory with output logs
    - perf_(record or script)_(event)_stdout.log, perf_(record or script)_(event)_stderr.log: stdout and stderr logs from perf-record/perf-script. (event) can be either “main” (on-CPU/off-CPU profiling), “syscall” (syscall profiling for tracing threads/processes), or a custom perf event specified by the user.
    - stdout.log, stderr.log: stdout and stderr logs from the profiled command.
  - processed: the directory with processed profiling information
    - metadata.json: metadata (such as the thread/process tree and thread/process spawning stack traces) stored in JSON.
    - (event)_callchains.json: mappings between compressed callchain names and uncompressed ones stored in JSON. (event) can be either “walltime” (on-CPU/off-CPU profiling), “syscall” (syscall profiling for tracing threads/processes, applicable to metadata.json), or a custom perf event specified by the user.
    - (PID)_(TID).json: all samples gathered by on-CPU/off-CPU profiling and custom perf event profiling (if any) stored in JSON, per thread/process.
    - event_dict.data: mappings between custom perf events and their website titles as specified by the user (it is not created when no custom events are provided).
    - sources.json: mappings between executable offsets and corresponding source code files and line numbers.
    - src.zip: the ZIP archive with all source code files detected by Adaptyst. This can be created manually by adaptyst-code if not present.
    - roofline.csv: if enabled, the CARM Tool roofline benchmarking results for a machine (they are not program-specific).
Please note that the overall output format will change as part of implementing our roadmap. Once it happens, the documentation will be updated accordingly.
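Scripts that post-process results can rely on the naming scheme listed above. The sketch below is illustrative only: the names Session, parse_session_name and sample_file_ids are hypothetical, the date and time fields are assumed to be plain decimal numbers, and a hostname containing a double underscore would not be parsed correctly.

```python
import re
from datetime import datetime
from typing import NamedTuple


class Session(NamedTuple):
    started: datetime
    hostname: str
    command: str


def parse_session_name(name):
    """Split a session directory name into timestamp, hostname and command."""
    head, sep, command = name.partition("__")
    parts = head.split("_", 6)  # six date/time fields, then the hostname
    if not sep or len(parts) != 7:
        raise ValueError("not a session directory name: %r" % name)
    year, month, day, hour, minute, second = (int(p) for p in parts[:6])
    started = datetime(year, month, day, hour, minute, second)
    return Session(started, parts[6], command)


SAMPLE_FILE = re.compile(r"(\d+)_(\d+)\.json")


def sample_file_ids(file_names):
    """Return sorted (PID, TID) pairs for the (PID)_(TID).json files."""
    ids = []
    for file_name in file_names:
        match = SAMPLE_FILE.fullmatch(file_name)
        if match:
            ids.append((int(match.group(1)), int(match.group(2))))
    return sorted(ids)
```

sample_file_ids can be fed with the output of os.listdir on a session's processed directory; files such as metadata.json are skipped because they do not match the pattern.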
Docker
Adaptyst can run in a Docker container. For a quick start, we provide a ready-to-use image (see Installation).
Please note the following:
- Your container must have the CAP_PERFMON, CAP_BPF, and CAP_IPC_LOCK capabilities (and optionally CAP_SYSLOG, see “Running Adaptyst as non-root” above).
- eBPF-based context switch tracing needed for off-CPU profiling and system call tracing needed for tracing threads/processes may not work out-of-the-box. If this happens, see if running your container with --pid=host helps.
- You may need to mount /sys/kernel/tracing manually, either when creating your container or inside your container. See “Running Adaptyst as non-root” above.
- A user inside your container must belong to the group which owns /sys/kernel/tracing.
Apptainer/Singularity
Adaptyst can also run in an Apptainer/Singularity container. For a quick start, we provide a ready-to-use image (see Installation).
Please note the following:
- Your container must have the CAP_PERFMON, CAP_BPF, and CAP_IPC_LOCK capabilities (and optionally CAP_SYSLOG, see “Running Adaptyst as non-root” above). Make sure that Apptainer/Singularity supports this in your case (e.g. on AlmaLinux 9, apptainer-suid must be installed in addition to Apptainer alone).
- /sys/kernel/tracing must be mounted in the container, e.g. by bind mounting in Apptainer/Singularity.
Profiled source code packaging
By default, Adaptyst looks for and packs source code files on the same machine where your profiled program is run. This behaviour can be changed by running adaptyst with the -c option.
You can either make adaptyst-server look for and pack source code files instead (if the server runs on a different machine), save the list of detected source code file paths to a text file, or print such a list to a given file descriptor. In the two latter cases, you can supply the list to adaptyst-code to create a ZIP archive with the code files. adaptyst-code is a Python script and has less strict setup requirements than the full Adaptyst suite.
The resulting ZIP archive should be placed as src.zip inside results/<your profiling session>/processed (see “Structure of results” above).
Here’s an example of running Adaptyst with a custom source code packing mode:
$ ls
<initial contents>
$ adaptyst -c file:code_paths.lst -- <command>
...
$ ls
<initial contents>
code_paths.lst
results
$ adaptyst-code code_paths.lst
...
$ ls
<initial contents>
code_paths.lst
results
src.zip
$ mv src.zip results/<your profiling session>/processed/
$
If you want to see how you can configure adaptyst-code to your needs, run adaptyst-code --help.
Help message printed by adaptyst-code --help
usage: adaptyst-code [-h] [-o FILE] [-v] PATHS_FILE
Profiled code packaging tool for Adaptyst
positional arguments:
PATHS_FILE path to a code paths file generated by Adaptyst
(usually as code_paths.lst) or written manually with
one source code file path per line (use "-" for stdin)
options:
-h, --help show this help message and exit
-o FILE path to an output ZIP archive to be produced (src.zip
in the current directory by default, use "-" for stdout)
-v print path-processing-related errors to stderr (these
are not printed by default)
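Since the help message above documents the -o option, the archive can be written straight into a session's processed directory instead of being moved afterwards. The following is a minimal sketch: adaptyst_code_argv is a hypothetical helper, and the session directory in the usage comment is a placeholder.

```python
from pathlib import PurePosixPath


def adaptyst_code_argv(session_dir, paths_file="code_paths.lst"):
    """Build an adaptyst-code command that writes src.zip in place.

    session_dir: path to one profiling session inside the results directory
    paths_file:  the code paths file produced with "adaptyst -c file:<path>"
    """
    archive = PurePosixPath(session_dir) / "processed" / "src.zip"
    return ["adaptyst-code", "-o", str(archive), paths_file]


# Possible usage:
#   import subprocess
#   subprocess.run(
#       adaptyst_code_argv("results/<your profiling session>"), check=True
#   )
```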