A0-Class

The document outlines various tools for program profiling and analysis, including Pin, Valgrind, Perf, and Gprof. It discusses their functionalities, such as code instrumentation, memory error checking, performance counter statistics, and hotspot identification. An assignment is included that requires profiling and analyzing similar programs using the Intel PIN tool.

Uploaded by

Harshith Puram

CS701

pin, perf, valgrind, gprof

Assignment 0
Programmer Headaches

Why is the program slow?

How many page faults / context switches in the program?

Cache misses / Branch mispredictions

Where is the Hotspot in the program? Which function?

Memory leaks in the program?

gdb is too slow; working with breakpoints is cumbersome.
Outline

Pin

Valgrind

Perf

Gprof
Pin

Code instrumentation on x86 binaries
– insert arbitrary code (C or C++) in arbitrary places
in an executable

Pin

Code instrumentation on x86 binaries

Original binary:
    Line 0
    Line 1
    Line 2
    Line 3
    ...

pintool:
    [ pin init code ]
    Analysis code function() {
        Foreach Instruction:
            [ analysis code for all Instructions ]
        If ALU Instruction:
            [ analysis code for ALU Instruction ]
        If Branch Instruction:
            [ analysis code for Branch Instruction ]
        If Memory Access Instruction:
            [ analysis code for Memory Access Instruction ]
        ...
    }
    [ pin final code ]

Generated Executable:
    [ pin init code ]
    [ analysis code for Line 0 ]
    Line 0
    [ analysis code for Line 1 ]
    Line 1
    [ analysis code for Line 2 ]
    Line 2
    [ analysis code for Line 3 ]
    Line 3
    ...
    [ pin final code ]
Pintool – Example code
...
// Analysis routine: runs before every executed instruction
VOID docount() { icount++; }

// Pin calls this function every time a new instruction is encountered
VOID Instruction(INS ins, VOID* v) {
    // Insert a call to docount before every instruction, no arguments are passed
    INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)docount, IARG_END);
}
...
Pin

Instruction stats, Register access patterns,
Memory access patterns, Branch stats
– Application Trace
Pin – Execution

$ pin -t pintool -- binary


Pin Tools page
Assignment

Profile and analyze a family of 2 or more similar
programs using the Intel PIN tool.
– Collect instruction count, Instruction Address Trace,
Memory Reference Trace.
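
As a rough illustration of what a "family of similar programs" could look like, the sketch below gives two variants of the same computation. The function names and the matrix layout are hypothetical, not part of the assignment. Both variants sum the same N x N matrix, so their instruction counts should be close, while the memory reference traces should differ because the access order differs.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical workload family: both functions sum the same n x n matrix,
// stored row by row in one vector, but visit the elements in a different order.

// Variant A: row-major traversal, consecutive addresses.
long long sum_row_major(const std::vector<int>& m, std::size_t n) {
    long long total = 0;
    for (std::size_t r = 0; r < n; ++r)
        for (std::size_t c = 0; c < n; ++c)
            total += m[r * n + c];
    return total;
}

// Variant B: column-major traversal, a stride of n elements between accesses.
long long sum_col_major(const std::vector<int>& m, std::size_t n) {
    long long total = 0;
    for (std::size_t c = 0; c < n; ++c)
        for (std::size_t r = 0; r < n; ++r)
            total += m[r * n + c];
    return total;
}
```

Calling one variant from `main` in each build would give two binaries to run under the same pintool and compare.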
Valgrind

Valgrind is also an instrumentation tool.

memcheck - Check memory related errors

cachegrind - a cache and branch-prediction
profiler

callgrind - a call-graph generating cache and
branch prediction profiler
Valgrind

Valgrind is also an instrumentation tool

Can instrument most binaries – Intel, ARM,
PPC, Android on ARM

$ valgrind ./a.out

Valgrind Page, Tutorial, FAQ page.
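
To see memcheck report something, a small deliberately buggy function helps. The sketch below is a hypothetical example (the name `leaky_sum` is made up for illustration): it allocates a buffer and never frees it.

```cpp
#include <cstddef>

// Deliberately buggy, for demonstration only: the buffer is never freed.
long long leaky_sum(std::size_t n) {
    int* buf = new int[n];
    long long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        buf[i] = static_cast<int>(i);
        total += buf[i];
    }
    return total;  // missing: delete[] buf;
}
```

Assuming a `main` that calls `leaky_sum` and a build with `-g -O0` (an optimizer may remove the allocation), running `valgrind --leak-check=full ./a.out` should list the block as lost in the leak summary. The same binary can be run with `valgrind --tool=cachegrind ./a.out` for cache statistics.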
Perf tool

Profiler tool. Performance analysis tool.

Prints out performance counters.
– Hardware counters from PMU: number of cycles,
instructions retired, L1 cache misses and so on
– Software counters - context-switches

For the full list of events do
– $ perf list
perf Example
$ perf stat -B dd if=/dev/zero of=/dev/null count=1000000
1000000+0 records in
1000000+0 records out
512000000 bytes (512 MB) copied, 0.956217 s, 535 MB/s

Performance counter stats for 'dd if=/dev/zero of=/dev/null count=1000000':


            5,099  cache-misses        #    0.005 M/sec  (scaled from 66.58%)
          235,384  cache-references    #    0.246 M/sec  (scaled from 66.56%)
        9,281,660  branch-misses       #    3.858 %      (scaled from 33.50%)
      240,609,766  branches            #  251.559 M/sec  (scaled from 33.66%)
    1,403,561,257  instructions        #    0.679 IPC    (scaled from 50.23%)
    2,066,201,729  cycles              # 2160.227 M/sec  (scaled from 66.67%)
              217  page-faults         #    0.000 M/sec
                3  CPU-migrations      #    0.000 M/sec
               83  context-switches    #    0.000 M/sec
       956.474238  task-clock-msecs    #    0.999 CPUs

      0.957617512  seconds time elapsed
perf Assignment

generate stats with perf
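
One possible workload for generating perf stats is a loop with a data-dependent branch. The function below is a hypothetical sketch, not something the slides prescribe. Running it once on shuffled input and once on sorted input would be expected to change the branch-miss counter while the branch count stays about the same.

```cpp
#include <cstddef>
#include <vector>

// Counts elements >= threshold; the `if` is a data-dependent branch.
std::size_t count_at_least(const std::vector<int>& data, int threshold) {
    std::size_t hits = 0;
    for (int v : data) {
        if (v >= threshold) {
            ++hits;
        }
    }
    return hits;
}
```

Assuming a `main` that fills a large vector and calls this function repeatedly, `perf stat -e branches,branch-misses ./a.out` would report the two counters. At higher optimization levels the compiler may replace the branch with branch-free code, which would hide the effect.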
GNU Profiler – gprof

Profile the program – identify hotspots, create
call graphs, ...
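
A hotspot is easier to recognise in a program that has an obvious one. The sketch below is a hypothetical pair of functions (names chosen for illustration) that compute the same Fibonacci number, one far more expensively than the other.

```cpp
#include <cstdint>

// Deliberately slow: naive recursion, expected to dominate the profile.
std::uint64_t slow_fib(unsigned n) {
    return n < 2 ? n : slow_fib(n - 1) + slow_fib(n - 2);
}

// Cheap by comparison: a single iterative loop.
std::uint64_t fast_fib(unsigned n) {
    std::uint64_t a = 0, b = 1;
    for (unsigned i = 0; i < n; ++i) {
        std::uint64_t next = a + b;
        a = b;
        b = next;
    }
    return a;
}
```

Assuming a `main` that calls both with a moderately large argument, building with `g++ -pg prog.cpp -o prog`, running `./prog` (which writes `gmon.out`), and then running `gprof ./prog gmon.out` should show `slow_fib` at the top of the flat profile and in the call graph.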
