Tuesday, August 16, 2016

Profiling Multiprocess C Programs with ARM DS-5 Streamline

The ARM DS-5 Streamline Performance Analyzer is a powerful tool for debugging, profiling, and analyzing multithreaded and multiprocess C programs. Instructions can be traced between load and store operations. Per-process and per-thread function call paths can be broken down by percentage of system utilization. Branch mispredictions and the behavior of multi-level CPU caches can be analyzed. Disk I/O, stack and heap usage, and a number of other useful metrics can also be referenced quickly within the tool. These are just a few of its capabilities.

To capture meaningful information with the DS-5 Streamline Performance Analyzer, a multiprocess C program for Linux was modified to insert 1000 packets into a packet processing simulation buffer. The child processes were modified to sleep and then wake 1000 times in order to simulate process activity. A code excerpt from the program is below, followed by two screenshots of the program loaded into the DS-5 Streamline Performance Analyzer.

void *insertpackets(void *arg) {

    struct pktbuf *pkbuf;
    struct packet *pkt;
    int idx;

    if(arg != NULL) {

        pkbuf = (struct pktbuf *)arg;

        /* seed random number generator */
        ...

        /* insert 1000 packets into the packet buffer */
        for(idx = 0; idx < 1000; ++idx) {

            pkt = (struct packet *)malloc(sizeof(struct packet));

            if(pkt != NULL) {

                /* set the packet processing simulation multiplier, reduced modulo 3 */
                pkt->mlt=...()%3;

                /* insert packet in the packet buffer */
                if(pkt_queue(pkbuf,pkt) != 0) {

...
...
...
...
...
...
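
The insertion loop above depends on types and helpers that the excerpt does not show. As a rough, self-contained sketch of the same idea, the code below assumes a hypothetical singly linked FIFO for struct pktbuf, a hypothetical next field in struct packet, and a hypothetical pkt_queue that returns 0 on success. It also uses rand() as a stand-in for the elided random call. None of these details are taken from the original program, and the function signature is simplified from the thread-style one in the excerpt.

```c
#include <stdlib.h>

/* Hypothetical packet: only the mlt field appears in the excerpt. */
struct packet {
    int mlt;               /* simulation multiplier, 0..2 in this sketch */
    struct packet *next;   /* assumed link field for the queue */
};

/* Hypothetical buffer: a singly linked FIFO with a packet count. */
struct pktbuf {
    struct packet *head;
    struct packet *tail;
    size_t count;
};

/* Assumed behavior: append pkt at the tail; 0 on success, -1 on bad input. */
int pkt_queue(struct pktbuf *buf, struct packet *pkt)
{
    if (buf == NULL || pkt == NULL)
        return -1;
    pkt->next = NULL;
    if (buf->tail != NULL)
        buf->tail->next = pkt;
    else
        buf->head = pkt;
    buf->tail = pkt;
    buf->count++;
    return 0;
}

/* Insert n packets into buf; returns the number actually queued. */
size_t insert_packets(struct pktbuf *buf, size_t n, unsigned int seed)
{
    size_t queued = 0;

    srand(seed);
    for (size_t i = 0; i < n; ++i) {
        struct packet *pkt = malloc(sizeof *pkt);
        if (pkt == NULL)
            break;                  /* out of memory: stop inserting */
        pkt->mlt = rand() % 3;      /* rand() stands in for the elided call */
        if (pkt_queue(buf, pkt) != 0) {
            free(pkt);              /* do not leak a packet that was not queued */
            break;
        }
        queued++;
    }
    return queued;
}

/* Release every packet and reset the buffer to empty. */
void free_packets(struct pktbuf *buf)
{
    struct packet *p = buf->head;
    while (p != NULL) {
        struct packet *next = p->next;
        free(p);
        p = next;
    }
    buf->head = NULL;
    buf->tail = NULL;
    buf->count = 0;
}
```

Starting from a zero-initialized struct pktbuf, a call such as insert_packets(&buf, 1000, 1u) mirrors the 1000-packet loop in the excerpt, and free_packets(&buf) releases the allocations afterwards.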

int fcnb(time_t secs, long nsecs) {

    struct timespec rqtp;
    struct timespec rmtp;
    int ret;
    int idx;

    rqtp.tv_sec = secs;
    rqtp.tv_nsec = nsecs;

    for(idx = 0; idx < 1000; idx++) {

        ret = nanosleep(&rqtp, &rmtp);

...
...
... 
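
The sleep-and-wake loop above is elided after the nanosleep call. A minimal sketch of how such a loop might be completed and driven from several child processes follows. The helper names sleep_wake_loop and run_children, the retry on EINTR, and the use of fork and wait are assumptions made for illustration; the original program's error handling and process creation code are not shown in the excerpt.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Sleep for (secs, nsecs) `iterations` times.
 * If a signal interrupts a sleep, resume with the remaining time.
 * Returns 0 on success, -1 on any other nanosleep error. */
int sleep_wake_loop(time_t secs, long nsecs, int iterations)
{
    for (int i = 0; i < iterations; i++) {
        struct timespec rqtp = { .tv_sec = secs, .tv_nsec = nsecs };
        struct timespec rmtp;

        while (nanosleep(&rqtp, &rmtp) != 0) {
            if (errno != EINTR)
                return -1;      /* e.g. EINVAL for an invalid interval */
            rqtp = rmtp;        /* interrupted: sleep for what is left */
        }
    }
    return 0;
}

/* Fork up to nchildren processes that each run the sleep/wake loop.
 * Returns the number of children that exited with status 0. */
int run_children(int nchildren, time_t secs, long nsecs, int iterations)
{
    int started = 0;
    int ok = 0;

    for (int i = 0; i < nchildren; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;              /* fork failed: wait for those already started */
        if (pid == 0) {
            /* child: simulate activity, then exit without returning */
            _exit(sleep_wake_loop(secs, nsecs, iterations) == 0 ? 0 : 1);
        }
        started++;
    }

    /* parent: reap every child that was started */
    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }
    return ok;
}
```

For example, run_children(4, 0, 1000000L, 1000) would start four children that each sleep for one millisecond 1000 times, which should give a profiler several short-lived processes with regular wakeups to observe.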
 
ARM DS-5 Streamline - Profiling the process creation application

ARM DS-5 Streamline - Code View with C code in the top window and ARM assembly instructions in the bottom window

https://github.com/brhinton/de0-nano-soc/blob/main/run.c