Tuesday, August 16, 2016

Profiling Multithreaded / Multiprocess Applications on the DE0-Nano-SoC with ARM® DS-5 Streamline

The ARM® DS-5 Streamline Performance Analyzer, part of ARM® DS-5 Development Studio, is well suited to profiling and analyzing the performance of multithreaded / multiprocess applications. On the Terasic DE0-Nano-SoC board, no kernel modification is required: the gator daemon can be compiled with the Linaro 4.8 GCC ARM hard-float toolchain and then uploaded to the board, which runs the stock Terasic Yocto build from the uSD card.

The ARM® DS-5 Streamline Performance Analyzer is a powerful tool for examining CPU clock cycles; instruction execution broken down into load and store operations; memory usage; register usage; disk I/O (reads and writes); per-process and per-thread function call paths broken down by system utilization percentage; per-process and per-thread stack and heap usage; and many other useful metrics.

To capture meaningful information in DS-5 Streamline, the process_creation project has been modified in two ways: it now inserts 1000 packets into the packet processing simulation buffer, and the child processes now sleep and wake up 1000 times to simulate process activity.


void *insertpackets(void *arg) {
   
   struct pktbuf *pkbuf;
   struct packet *pkt;
   int idx;

   if(arg != NULL) {
   
      pkbuf = (struct pktbuf *)arg;

      /* seed random number generator */
      ...

      /* insert 1000 packets into the packet buffer */
      for(idx = 0; idx < 1000; idx++) {

         pkt = (struct packet *)malloc(sizeof(struct packet));

         if(pkt != NULL) {

            /* set the packet processing simulation multiplier to a value from 0 to 2 */
            pkt->mlt=...()%3;

            /* insert packet in the packet buffer */
            if(pkt_queue(pkbuf,pkt) != 0) {
            
               ...
            ... 
         ...
      ...
   ...
...
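
Because the listing above is abbreviated, here is a minimal, self-contained sketch of the same insertion loop that can be compiled on its own. It is not the project's actual code: the layouts of struct packet and struct pktbuf, the helper names pkt_queue_sketch, insert_packets_sketch and pktbuf_drain_sketch, and the use of rand() as the random source are all assumptions made for illustration. The sketch is single-threaded and does no locking.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed layouts: the original struct definitions are not shown. */
struct packet {
   int mlt;                /* processing simulation multiplier */
   struct packet *next;    /* next packet in the FIFO */
};

struct pktbuf {
   struct packet *head;
   struct packet *tail;
   int count;
};

/* Hypothetical stand-in for pkt_queue(): append to the FIFO, 0 on success. */
static int pkt_queue_sketch(struct pktbuf *pkbuf, struct packet *pkt) {

   if(pkbuf == NULL || pkt == NULL) {
      return -1;
   }

   pkt->next = NULL;

   if(pkbuf->tail == NULL) {
      pkbuf->head = pkt;
   } else {
      pkbuf->tail->next = pkt;
   }

   pkbuf->tail = pkt;
   pkbuf->count++;

   return 0;
}

/* Insert up to npkts packets; returns the number actually queued. */
int insert_packets_sketch(struct pktbuf *pkbuf, int npkts) {

   struct packet *pkt;
   int queued = 0;
   int idx;

   if(pkbuf == NULL) {
      return 0;
   }

   for(idx = 0; idx < npkts; idx++) {

      pkt = (struct packet *)malloc(sizeof(struct packet));

      if(pkt == NULL) {
         break;
      }

      /* assumption: rand() as the random source, multiplier in 0..2 */
      pkt->mlt = rand() % 3;

      if(pkt_queue_sketch(pkbuf, pkt) != 0) {
         free(pkt);
         break;
      }

      queued++;
   }

   return queued;
}

/* Free every queued packet and reset the buffer. */
void pktbuf_drain_sketch(struct pktbuf *pkbuf) {

   struct packet *pkt;

   while(pkbuf->head != NULL) {
      pkt = pkbuf->head;
      pkbuf->head = pkt->next;
      free(pkt);
      pkbuf->count--;
   }

   pkbuf->tail = NULL;
   assert(pkbuf->count == 0);
}
```

Calling insert_packets_sketch(&buf, 1000) on a zero-initialized struct pktbuf mirrors the 1000-packet load described above. Freeing the packet when queuing fails avoids a leak on the error path.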

int fcnb(time_t secs, long nsecs) {
 
   struct timespec rqtp;
   struct timespec rmtp;
   int ret;
   int idx;

   rqtp.tv_sec = secs;
   rqtp.tv_nsec = nsecs; 

   for(idx = 0; idx < 1000; idx++) {

      ret = nanosleep(&rqtp, &rmtp);

      ...
   ...
...
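
The body of the sleep loop is elided above, so the following is only a sketch of one way such a loop could handle the return value of nanosleep(). The function name sleep_repeatedly and its count parameter are hypothetical and not part of the original project. The sketch assumes that a sleep interrupted by a signal (EINTR) should be resumed with the remaining time, and that any other error should stop the loop.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <time.h>

/*
 * Hypothetical helper: sleep for the requested interval `count` times
 * and return the number of sleeps that completed.
 */
int sleep_repeatedly(time_t secs, long nsecs, int count) {

   struct timespec rqtp;
   struct timespec rmtp;
   int done = 0;
   int idx;

   /* nanosleep() rejects tv_nsec values outside 0..999999999 */
   assert(nsecs >= 0 && nsecs < 1000000000L);

   for(idx = 0; idx < count; idx++) {

      rqtp.tv_sec = secs;
      rqtp.tv_nsec = nsecs;

      /* on EINTR, rmtp holds the unslept time, so retry with it */
      while(nanosleep(&rqtp, &rmtp) == -1) {

         if(errno != EINTR) {
            return done;   /* unexpected error: stop early */
         }

         rqtp = rmtp;
      }

      done++;
   }

   return done;
}
```

With count set to 1000, this matches the sleep/wake pattern described above. Each wake-up is a scheduling event that Streamline can display on the process timeline.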


ARM® DS-5 Streamline - Profiling the process creation application.

ARM® DS-5 Streamline - Code View


