===========================
- More Complicated Examples
- Differences From Linux
===========================
PERF
===========================
Akaros has limited support for perf_events. perf is a tool that uses CPU
performance counters for performance monitoring and troubleshooting.

Akaros has its own version of perf, similar in spirit to Linux's perf, that
produces PERFILE2 ABI compliant perf.data files (if not, file a bug!). The
kernel generates traces under the direction of perf. You then copy the traces
to a Linux host and process them using Linux's perf.

To build Akaros's perf directly:

(linux) $ cd tools/dev-libs/elfutils ; make install; cd -
(linux) $ cd tools/dev-util/perf ; make install; cd -

Or to build it along with all apps:

(linux) $ make apps-install

You will also need a suitably recent Linux perf for reporting the data
(something that understands the PERFILE2 format). Unpatched Linux 4.5 perf did
the trick. You'll also want libelf and maybe other libraries on your Linux
host.

First, install libelf according to your distro. On Ubuntu:

(linux) $ sudo apt-get install libelf-dev

Then try to install perf using your Linux distro's packages, along with any
needed dependencies. On Ubuntu, you can install linux-tools-common and whatever
else it asks for (something particular to your host kernel).

Linux perf changes a lot. Newer versions are usually nicer. I recommend
building one of them. Download the Linux source, then build perf from its
perf directory:

(linux) $ cd tools/perf/

Then use your new perf binary. All of this is just installing a recent perf;
it has little to do with Akaros at this point. If you run into
incompatibilities between our perf.data format and the latest Linux, file a
bug.

Perf on Akaros supports record, stat, and a few custom options.

You should be able to do the following:

Then scp perf.data to Linux:

(linux) $ scp AKAROS_MACHINE:perf.data .
(linux) $ perf report --kallsyms=obj/kern/ksyms.map --symfs=kern/kfs/

By default, perf looks on your host machine for the kernel symbol table and
for binaries. The --kallsyms and --symfs options override those settings so
that it uses the Akaros files instead.

It can be a hassle to type out the kallsyms and symfs options, so we have a
script that automates that. Use scripts/perf anywhere you'd normally use
perf. Set your $AKAROS_ROOT (default is ".") and optionally override $PERF_CMD
(default is "perf"). For most people, this will just be:

(linux) $ ./scripts/perf report

The perf.data file is implied, so the above command is equivalent to:

(linux) $ ./scripts/perf report -i perf.data
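
As a rough illustration of what such a wrapper has to do, here is a minimal
sketch that only prints the command line it would run. It is not the actual
scripts/perf: the function name akaros_perf_cmdline is hypothetical, and
passing the same two options to every subcommand is an assumption.

```shell
# Hypothetical helper, not the real scripts/perf. It prints the perf command
# line built from $AKAROS_ROOT and $PERF_CMD instead of executing it.
akaros_perf_cmdline() {
    root="${AKAROS_ROOT:-.}"        # default root is the current directory
    perf_cmd="${PERF_CMD:-perf}"    # default binary is "perf"
    subcmd="$1"
    shift
    echo "$perf_cmd $subcmd --kallsyms=$root/obj/kern/ksyms.map --symfs=$root/kern/kfs/ $*"
}

akaros_perf_cmdline report -i perf.data
```

Printing the command first makes it easy to check the paths before pointing
perf at a large perf.data file.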

MORE COMPLICATED EXAMPLES

First, try perf --help for usage. Then check out
https://perf.wiki.kernel.org/index.php/Tutorial. We strive to be mostly
compatible with the usage of Linux perf.

perf stat runs a command and reports the count of events during the run of the
command. perf record runs a command and outputs perf.data, which contains
backtrace samples from when the event counters overflowed. For those familiar
with other perfmon systems, perf stat is like PAPI and perf record is like

perf record and stat both track a set of events with the -e flag. -e takes a
comma-separated list of events. Events can be expressed in one of three forms:

- Generic events (called "pre-defined" events on Linux)
- Raw events
- Libpfm4 events

Linux's perf only takes Generic and Raw events, so libpfm4 support is an
addition on Akaros.

Generic events consist of strings like "cycles" or "cache-misses". Raw events
are simple strings of the form "rXXX", where the X's are hex nibbles. The hex
codes are passed directly to the PMU. You can actually have 2-4 X's on Akaros.

Libpfm events are strings that correspond to events specific to your machine.
Libpfm knows about the PMU events for a given machine. It figures out what
machine perf is running on and selects events that should be available. Check
out http://perfmon2.sourceforge.net/ for more info.

To see the list of events available, use `perf list [regex]`, supplying an
optional search regex. For example, on a Haswell:

/ $ perf list unhalted_reference_cycles
#-----------------------------
PMU name : ix86arch (Intel X86 architectural PMU)
Name : UNHALTED_REFERENCE_CYCLES
Desc : count reference clock cycles while the clock signal on the specific core is running. The reference clock operates at a fixed frequency, irrespective of core frequency changes due to performance state transitions
Modif-00 : 0x00 : PMU : [k] : monitor at priv level 0 (boolean)
Modif-01 : 0x01 : PMU : [u] : monitor at priv level 1, 2, 3 (boolean)
Modif-02 : 0x02 : PMU : [e] : edge level (may require counter-mask >= 1) (boolean)
Modif-03 : 0x03 : PMU : [i] : invert (boolean)
Modif-04 : 0x04 : PMU : [c] : counter-mask in range [0-255] (integer)
Modif-05 : 0x05 : PMU : [t] : measure any thread (boolean)
#-----------------------------
PMU name : hsw_ep (Intel Haswell EP)
Name : UNHALTED_REFERENCE_CYCLES
Desc : Unhalted reference cycles
Modif-00 : 0x00 : PMU : [k] : monitor at priv level 0 (boolean)
Modif-01 : 0x01 : PMU : [u] : monitor at priv level 1, 2, 3 (boolean)
Modif-02 : 0x05 : PMU : [t] : measure any thread (boolean)

There are two different events for UNHALTED_REFERENCE_CYCLES (event names are
case insensitive). libpfm will select the most appropriate one. You can
override this selection by specifying a PMU:

/ $ perf stat -e ix86arch::UNHALTED_REFERENCE_CYCLES ls

Here's how to specify multiple events:

/ $ perf record -e cycles,instructions ls

Events also take a set of modifiers. For instance, you can restrict counters
to kernel mode or user mode. Modifiers are separated by a ':'.

This will track only user cycles (the default is user and kernel):

/ $ perf record -e cycles:u ls
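
If you want the same modifier on several events, a small shell helper can
build the -e argument for you. This is only a sketch: event_list is a
hypothetical name, and it assumes each event in the comma-separated list
carries its own ':' modifier, as in Linux perf.

```shell
# Hypothetical helper: append one modifier to every event name and join the
# results with commas, producing a string suitable for perf's -e flag.
event_list() {
    mod="$1"
    shift
    list=""
    for ev in "$@"; do
        # ${list:+$list,} expands to "list," only when list is non-empty
        list="${list:+$list,}$ev:$mod"
    done
    echo "$list"
}

event_list u cycles instructions    # prints cycles:u,instructions:u
```

The result can then be passed along, e.g. perf record -e "$(event_list u cycles instructions)" ls.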

To use a raw event, you need to know the event number. You can either look in
your favorite copy of the Intel SDM, or you can ask libpfm. Though if you ask
libpfm, you might as well just use its string processing. For example:

#-----------------------------
PMU name : hsw_ep (Intel Haswell EP)
Umask-00 : 0x01 : PMU : [DTLB_THREAD] : None : Count number of DTLB flushes of thread-specific entries
Umask-01 : 0x20 : PMU : [STLB_ANY] : None : Count number of any STLB flushes
Modif-00 : 0x00 : PMU : [k] : monitor at priv level 0 (boolean)
Modif-01 : 0x01 : PMU : [u] : monitor at priv level 1, 2, 3 (boolean)
Modif-02 : 0x02 : PMU : [e] : edge level (may require counter-mask >= 1) (boolean)
Modif-03 : 0x03 : PMU : [i] : invert (boolean)
Modif-04 : 0x04 : PMU : [c] : counter-mask in range [0-255] (integer)
Modif-05 : 0x05 : PMU : [t] : measure any thread (boolean)
Modif-06 : 0x07 : PMU : [intx] : monitor only inside transactional memory region (boolean)
Modif-07 : 0x08 : PMU : [intxcp] : do not count occurrences inside aborted transactional memory region (boolean)

The raw code is 0xbd. So the following are equivalent (but slightly buggy!):

/ $ perf stat -e TLB_FLUSH ls
/ $ perf stat -e rbd ls

If you actually run those, rbd will have zero hits, and TLB_FLUSH will give you
the error "Failed to parse event string TLB_FLUSH".

Some events are rather particular about their Umasks, and TLB_FLUSH is one of
them. TLB_FLUSH wants a Umask. Umasks are selectors for specific sub-types of
events. In the case of TLB_FLUSH, we can choose between DTLB_THREAD and
STLB_ANY. Umasks are not always required - they just happen to be on my
Haswell for TLB_FLUSH. That being said, we can ask for the event like so:

/ $ perf stat -e TLB_FLUSH:STLB_ANY ls
/ $ perf stat -e r20bd ls

Note that the Umask is placed before the Code. These 16 bits are passed
directly to the PMU, and on Intel the format is "umask:event".
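
To make the "umask:event" layout concrete, here is a small sketch that builds
the raw event string from the two bytes. The function name raw_event is
hypothetical; the values in the example are the STLB_ANY Umask (0x20) and the
TLB_FLUSH code (0xbd) mentioned above.

```shell
# Hypothetical helper: format a umask byte and an event-code byte as the
# "rXXXX" raw event string, with the umask in the high byte.
raw_event() {
    umask_val="$1"
    code="$2"
    printf 'r%02x%02x\n' "$umask_val" "$code"
}

raw_event 0x20 0xbd    # prints r20bd
```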

perf record is based on recording samples when event counters overflow. The
number of events required to trigger a sample is referred to as the
sample_period. You can set it with -c, e.g.:

/ $ perf record -c 10000 ls

DIFFERENCES FROM LINUX

For the most part, Akaros perf is similar to Linux. A few things are
different, though.

The biggest difference is that our perf does not follow processes around. We
count events for cores, not processes. You can specify certain cores, but not
certain processes. Any options related to tracking specific processes are
not supported.

The -F option (frequency) is loosely supported. The kernel cannot adjust the
sampling count dynamically to meet a certain frequency. Instead, we guess
that -F is used with cycles, and pick a sample period that will generate
samples at the desired frequency if the core is unhalted. YMMV.
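
The arithmetic behind that guess can be sketched as follows. This assumes the
core runs at a known, fixed rate and that the period is simply that rate
divided by the requested frequency; the exact computation Akaros uses may
differ, and the name sample_period and the 3 GHz figure are only for
illustration.

```shell
# Hypothetical helper: estimate the cycle count between samples for a given
# core clock rate (Hz) and a desired sampling frequency (samples per second).
sample_period() {
    core_hz="$1"
    freq="$2"
    echo $(( core_hz / freq ))
}

sample_period 3000000000 1000    # prints 3000000
```

Under those assumptions, -F 1000 on a 3 GHz core behaves roughly like
-c 3000000 with the cycles event.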

Akaros currently supports only PMU events. In the future, we may add events
like context-switches.

===========================
MPSTAT
===========================
Akaros has basic support for mpstat. mpstat gives a high-level glance at where
each core is spending its time.

For starters, bind kprof somewhere. The basic ifconfig script binds it to
/prof.

To see the CPU usage, cat mpstat:

/ $ cat /prof/mpstat
CPU: irq kern user idle
0: 1.707136 ( 0%), 24.978659 ( 0%), 0.162845 ( 0%), 13856.233909 ( 99%)

To reset the counts:

/ $ echo reset > /prof/mpstat

To see the output for a particular command:

/ $ echo reset > /prof/mpstat ; COMMAND ; cat /prof/mpstat
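
If you copy the mpstat output to a machine with awk, a short filter can turn
it into one idle percentage per core. This is a sketch that assumes the output
format shown above (four time columns in the order irq, kern, user, idle, each
followed by a parenthesized percentage); mpstat_idle is a hypothetical name.

```shell
# Hypothetical filter: read mpstat output on stdin and print "<core> <idle%>"
# for each per-core line, recomputing the percentage from the raw times.
mpstat_idle() {
    awk '
        /^[ \t]*[0-9]+:/ {
            gsub(/\([^)]*\),?/, "")    # drop the "( NN%)" groups
            sub(/:/, "", $1)           # "0:" becomes "0"
            total = $2 + $3 + $4 + $5
            if (total > 0)
                printf "%s %.1f\n", $1, 100 * $5 / total
        }'
}

# Sample input taken from the output above; prints "0 99.8".
printf '%s\n' \
    'CPU: irq kern user idle' \
    '0: 1.707136 ( 0%), 24.978659 ( 0%), 0.162845 ( 0%), 13856.233909 ( 99%)' \
    | mpstat_idle
```

Recomputing the percentage from the times gives more precision than the
rounded figures in parentheses.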