perf
From Gentoo Wiki
perf is a tool for profiling Linux with performance counters. It can instrument CPU performance counters, tracepoints, kprobes, and uprobes (dynamic tracing). It is also capable of lightweight profiling.
Installation
USE flags
USE flags for dev-util/perf Userland tools for Linux Performance Counters
+doc | Build documentation and man pages. With this USE flag disabled, the --help parameter for perf and its sub-tools will not be available. This is optional because it depends on a few documentation handling tools that are not always welcome on user systems.
+libtraceevent | Enable dev-libs/libtraceevent support
+libtracefs | Enable dev-libs/libtracefs support
+python | Add optional support/bindings for the Python language
+slang | Add support for the slang text display library (it's like ncurses, but different)
audit | Enable support for Linux audit subsystem using sys-process/audit
babeltrace | Enable dev-util/babeltrace support
big-endian | Big-endian toolchain support
bpf | Enable support for eBPF features with dev-libs/libbpf
caps | Use Linux capabilities library to control privilege
capstone | Use dev-libs/capstone for disassembly support
crypt | Add support for encryption -- using mcrypt or gpg where applicable
debug | Enable extra debug codepaths, like asserts and extra output. If you want to get meaningful backtraces see https://wiki.gentoo.org/wiki/Project:Quality_Assurance/Backtraces
gtk | Add support for x11-libs/gtk+ (The GIMP Toolkit)
java | Add support for Java
libpfm | Enable dev-libs/libpfm support
lzma | Support for LZMA compression algorithm
numa | Enable NUMA support using sys-process/numactl (NUMA kernel support is also required)
perl | Add support for Perl as a scripting language for perf tools.
systemtap | Add support to define SDT event in perf tools.
tcmalloc | Use the dev-util/google-perftools libraries to replace the malloc() implementation with a possibly faster one
unwind | Use sys-libs/libunwind for frame unwinding support.
zstd | Enable support for ZSTD compression
Emerge
root #
emerge --ask dev-util/perf
Usage
System-wide profile
To profile the entire system, run perf stat without a command. The following example profiles the system for 5 seconds.
Note
The argument -s 2 tells timeout to send the SIGINT signal (the same one that Ctrl+C sends). Using timeout without this signal will result in no output being displayed upon exit.
user $
timeout -s 2 5s perf stat
Performance counter stats for 'system wide':

     119,873.70 msec cpu-clock                 #   23.996 CPUs utilized
        608,052      context-switches          #    5.072 K/sec
         52,088      cpu-migrations            #  434.524 /sec
         54,966      page-faults               #  458.533 /sec
149,286,631,432      instructions              #    1.63  insn per cycle
                                               #    0.08  stalled cycles per insn
 91,639,578,301      cycles                    #    0.764 GHz
 12,251,888,547      stalled-cycles-frontend   #   13.37% frontend cycles idle
 19,086,150,804      branches                  #  159.219 M/sec
    493,116,174      branch-misses             #    2.58% of all branches

    4.995528985 seconds time elapsed
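A common alternative to timeout is to let perf stat run a short-lived command such as sleep while counting on all CPUs with -a. The measurement then ends when sleep exits, so no signal handling is needed. The following is a minimal sketch rather than the method used above: the helper name system_wide_stat and the PERF override variable are illustrative assumptions, not part of perf.

```shell
# PERF is a hypothetical override for previewing the command line (e.g. PERF=echo);
# it defaults to the real perf binary.
PERF="${PERF:-perf}"

# system_wide_stat is an illustrative helper name, not something perf provides.
system_wide_stat() {
    seconds="${1:-5}"
    # -a counts on all CPUs; perf prints its summary once sleep exits.
    "$PERF" stat -a -- sleep "$seconds"
}
```

Calling system_wide_stat 5 counts for five seconds. Depending on the kernel.perf_event_paranoid setting, system-wide counting may require root privileges.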
Profiling a command
To see the profile statistics for a specific command, append the command to perf stat. In this example, the command emerge -ep @world will be used.
user $
perf stat emerge -ep @world
Performance counter stats for 'emerge -ep @world':

      16,518.88 msec task-clock                #    0.970 CPUs utilized
         18,177      context-switches          #    1.100 K/sec
         15,651      cpu-migrations            #  947.462 /sec
        324,994      page-faults               #   19.674 K/sec
 94,669,490,050      instructions              #    1.37  insn per cycle
                                               #    0.14  stalled cycles per insn
 68,918,358,065      cycles                    #    4.172 GHz
 13,046,547,573      stalled-cycles-frontend   #   18.93% frontend cycles idle
 18,809,738,679      branches                  #    1.139 G/sec
    544,326,478      branch-misses             #    2.89% of all branches

   17.034427384 seconds time elapsed

   14.761382000 seconds user
    1.708188000 seconds sys
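The values after the # in this output appear to be simple ratios of the raw counters. As a rough sketch, the arithmetic can be reproduced with awk using the numbers from the emerge -ep @world run above. The function name derived_metrics is illustrative, and the formulas are an interpretation of the column labels rather than something taken from the perf source.

```shell
# derived_metrics is an illustrative helper name; the numbers are copied
# from the sample output of the emerge run.
derived_metrics() {
    awk 'BEGIN {
        task_clock_ms = 16518.88
        elapsed_s     = 17.034427384
        instructions  = 94669490050
        cycles        = 68918358065
        branches      = 18809738679
        branch_misses = 544326478

        # Instructions retired per CPU cycle.
        printf "insn per cycle: %.2f\n", instructions / cycles
        # CPU time consumed divided by wall-clock time.
        printf "CPUs utilized:  %.3f\n", (task_clock_ms / 1000) / elapsed_s
        # Cycles per second of CPU time, expressed in GHz.
        printf "clock (GHz):    %.3f\n", cycles / (task_clock_ms * 1e6)
        # Share of branches that were mispredicted.
        printf "branch misses:  %.2f%%\n", 100 * branch_misses / branches
    }'
}

derived_metrics
```

The results (1.37, 0.970, 4.172 and 2.89%) match the annotated columns in the sample output.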