Static Analysis Reliable Software Tools

    Program Analysis of Threaded Programs

    Race conditions lead to non-deterministic behavior when two threads modify a common piece of data (typically a global variable) and whichever thread writes last determines the final value. To eliminate race conditions, it is important to control access to the critical sections, which are the parts of the code that act on global variables.

    Thread-safe routines A routine (function) is considered to be thread-safe (i.e. it cannot participate in a race condition) if it does not modify global state. The parist program will flag functions which do not modify global state. If you have to debug a race condition, those routines can be ruled out.
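
    As a rough illustration of the distinction, the sketch below contrasts a routine that touches only its parameters and locals with one that writes a global. The names scale_timeout, bump_timeout_count and timeout_count_G are hypothetical and are not taken from the controller code.

```c
/* Hypothetical global, named in the style of the controller code. */
static int timeout_count_G = 0;

/* Reads only its arguments and locals: no global state is touched,
   so concurrent callers cannot interfere with one another. */
int scale_timeout(int seconds, int factor)
{
    int scaled = seconds * factor;
    return scaled;
}

/* Writes a global: two threads calling this at the same time can
   race on timeout_count_G, so it is not thread-safe by the
   definition used here. */
int bump_timeout_count(void)
{
    timeout_count_G = timeout_count_G + 1;
    return timeout_count_G;
}
```

    By the definition above, only the first routine would be a candidate for the thread-safe list.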

    Exclusive global variables Just because a routine makes use of a global variable does not mean it can cause a problem. If no other routine uses that same global variable, the routine cannot be called thread-safe, but it is not a candidate for a race condition either. The parist program will identify the global variables used in each function so you can determine whether another routine also uses the same global variable. To integrate it into your development environment, the output from running parist on all your files can be combined to provide a big picture.

    Global L-values While global variables are the basic problem, it is their use as l-values (on the left-hand side of an assignment) that actually causes trouble. It is especially problematic when a variable is updated from its own value, as in X++ or X=X+1. Each such statement is a read, a modify, and a write; when two threads interleave those steps an update can be lost, so these statements go beyond a race condition and produce values that are just plain wrong.
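
    To see why a self-modifying statement such as X++ can produce a wrong value, the sketch below replays one unlucky interleaving by hand. It uses no real threads, so the result is repeatable; the helper names (thread_regs, step_load, step_add, step_store, lost_update_demo) are made up for this illustration.

```c
/* Shared global that both simulated threads increment. */
int X = 0;

/* One thread's private copy of X while it performs X++. */
struct thread_regs { int tmp; };

/* X++ broken into the three steps the machine actually performs. */
void step_load(struct thread_regs *t)  { t->tmp = X; }
void step_add(struct thread_regs *t)   { t->tmp = t->tmp + 1; }
void step_store(struct thread_regs *t) { X = t->tmp; }

/* Both simulated threads load X before either one stores it. */
int lost_update_demo(void)
{
    struct thread_regs a, b;

    X = 0;
    step_load(&a);    /* thread A reads 0 */
    step_load(&b);    /* thread B reads 0 */
    step_add(&a);
    step_add(&b);
    step_store(&a);   /* X becomes 1 */
    step_store(&b);   /* X becomes 1 again: A's increment is lost */
    return X;         /* 1, although two increments ran */
}
```

    With real threads the same interleaving happens only occasionally, which is what makes this kind of bug hard to reproduce.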

    Full-path critical section identification The final step to finding race conditions in code is to determine which critical sections (containing global l-value variables) are adequately guarded by monitors (semaphores).
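
    A guarded critical section might look like the following minimal sketch. It uses a POSIX mutex as the guard, which is an assumption; the controller code may use a different primitive such as a semaphore. The names msg_count_G, msg_count_lock and count_message are hypothetical.

```c
#include <pthread.h>

/* Hypothetical global l-value and the guard that protects it. */
static int msg_count_G = 0;
static pthread_mutex_t msg_count_lock = PTHREAD_MUTEX_INITIALIZER;

/* The read-modify-write of msg_count_G runs with the lock held, so no
   other thread that takes the same lock can interleave with it. */
int count_message(void)
{
    int seen;

    pthread_mutex_lock(&msg_count_lock);
    msg_count_G = msg_count_G + 1;
    seen = msg_count_G;
    pthread_mutex_unlock(&msg_count_lock);
    return seen;
}
```

    The guard only helps if every routine that writes msg_count_G takes the same lock, which is why the analysis has to follow the full path.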

    Common guards A priority inversion happens when a high priority task is prevented from running because of a guard it shares with a lower priority task. As a task spans many functions, it is necessary to trace all potential execution paths in order to determine which functions can be executed and what monitors are in place in each function. From that list, we can determine which tasks hold which guards, and which guards are shared between high priority and low priority tasks.
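
    Once each task has been reduced to its priority and the set of guards it can take, the comparison itself is simple. The sketch below assumes one bit per guard and that a larger number means a higher priority; the guard names, the task structure and shared_guards are all hypothetical.

```c
/* Hypothetical guards, one bit each. */
enum { GUARD_TCP = 1 << 0, GUARD_ALARM = 1 << 1, GUARD_LOG = 1 << 2 };

/* A task summarised as its priority and the set of guards taken
   anywhere along its possible execution paths. */
struct task {
    const char *name;
    int priority;      /* larger number = higher priority (assumption) */
    unsigned guards;   /* bitwise OR of GUARD_* values */
};

/* Returns the guards that a higher priority task shares with a lower
   priority one. A non-zero result marks a possible priority inversion
   between the two tasks. */
unsigned shared_guards(const struct task *high, const struct task *low)
{
    if (high->priority <= low->priority)
        return 0;                      /* not a high/low pair */
    return high->guards & low->guards;
}
```

    For example, a high priority task holding GUARD_TCP and GUARD_ALARM and a low priority task holding GUARD_ALARM and GUARD_LOG share GUARD_ALARM, so that guard deserves a closer look.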

    Using PARIST (T=thread analysis)

    Step 1. Download the Solaris parist program HERE. Right-click and save it. You may have to chmod the file to give it execute permission before you can run it.

    Step 2. Preprocess your source file. Most of the controller code is run through gcc. If your gcc build line is:
    gcc -c -I /usr2/mike/include ge_foo.c
    Add -E so that gcc runs ONLY the preprocessor, and add -o to name the output file (the same option you may already use to name the output of a compile and link). Your modified command line would look like this:
    gcc -c -I /usr2/mike/include -E -o ge_foo.i ge_foo.c
    The .i extension is the usual convention for files that have passed through the compiler preprocessor but have not actually been compiled. You can look at the ge_foo.i file and see that all of your #include and #define statements have been expanded.
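
    As a small illustration of what the preprocessor does, consider the hypothetical source file below (ge_demo.c and its contents are invented for this example). The comment at the end describes roughly what the matching lines of the .i file look like.

```c
/* ge_demo.c: a hypothetical source file, before preprocessing. */
#define MAX_UNITS 8
#define UNIT_BYTES(n) ((n) * 4)

int unit_table[MAX_UNITS];

int table_bytes(void)
{
    return UNIT_BYTES(MAX_UNITS);
}

/* After gcc -E, the #define lines are gone and every use is expanded,
   so the corresponding lines of the .i file read roughly:

       int unit_table[8];
       return ((8) * 4);

   The .i file also contains line markers that record which source
   file and line each piece of text came from. */
```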

    Step 3. Run parist and see which routines are considered to be thread-safe. The following example is the actual result of analysis of ge_send.c from the controller code as of Fall 2004.
    parist -s0 ge_send.i
    The result would look like:
    GEFudgeUnitTimeouts
    GESendKeepAlive
    buildAlarmAckMsg
    build_out_msg
    processOutputReply
    processOutputs
    GEHandleOutputAckNak
    shutDownGESend
    GEGetDropMode
    geSendSiguser1Handler*
    geSendSigAlrmHandler*
    A star after a routine name means that no variables at all are used in that routine.

    Step 4. Run parist and see which routines are not considered to be thread-safe. The following example is the actual result of analysis of ge_send.c from the controller code as of Fall 2004.
    parist -s1 ge_send.i
    The result would look like:
    geSend ge_tcp_ptr_G ge_version_G GETTPtr geRcv geSendFlg __GESigUsr1 __GESendSigAlrm __GECommError
    GEInitSendTask geSendFlg geSendSigAlrmHandler geSendSiguser1Handler WES_SEM_FULL GE_Di_Event_Debug GE_Alarm_Debug GE_Sw_Event_Debug
    geSendSiguser1Handler __GESigUsr1 __GECommError
    geSendSigAlrmHandler __GECommError __GESendSigAlrm

    Each line of output starts with the function name, followed by the list of global variables that are used in that routine. Long lines may wrap, so look for the function name at the start of each entry.

    When trying to track down a race condition, the first thing you should do is generate the -s1 output for all files. Then look at the result and identify the global variables that appear under more than one routine. For instance, geSendFlg is used by both geSend and GEInitSendTask. This does not mean there is a race condition, but the likelihood is greater.
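
    Looking for common globals by eye gets tedious once the output of many files is combined. The sketch below shows one way to count how many routines list a given global. It assumes each report line has the form "function global global ..." separated by blanks, as in the ge_send.c output above; count_users is a hypothetical helper and is not part of parist.

```c
#include <string.h>

/* Counts how many routines in a parist -s1 style report list the
   given global. The first token on each line is taken to be the
   function name and is never counted as a global. */
int count_users(const char *report, const char *global)
{
    int users = 0;
    size_t glen = strlen(global);
    const char *p = report;

    while (*p != '\0') {
        int token = 0;   /* index of the current token on this line */
        int found = 0;

        while (*p != '\0' && *p != '\n') {
            size_t len;

            p += strspn(p, " \t");        /* skip blanks */
            len = strcspn(p, " \t\n");    /* length of the next token */
            if (len == 0)
                break;                    /* end of line reached */
            if (token > 0 && len == glen && strncmp(p, global, len) == 0)
                found = 1;
            token++;
            p += len;
        }
        users += found;
        if (*p == '\n')
            p++;                          /* move to the next line */
    }
    return users;
}
```

    Any global for which the count is two or more is worth checking for a guard.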

    Program Analysis for Code Inspection

    Code inspection through code reviews is a great way to have a second set of eyes (or several) look over the code. One of the biggest problems with code reviews is that many reviewers are not as familiar with the code as its author is.

    Using PARISD (D=data analysis)

    Step 1. Download the Solaris parisd program HERE. Right-click and save it. You may have to chmod the file to give it execute permission before you can run it.

    Step 2. Preprocess your source file. If your gcc build line is:
    gcc -c -I /usr2/mike/include ge_foo.c
    Add -E so that gcc runs ONLY the preprocessor, and add -o to name the output file (the same option you may already use to name the output of a compile and link). Your modified command line would look like this:
    gcc -c -I /usr2/mike/include -E -o ge_foo.i ge_foo.c
    The .i extension is the usual convention for files that have passed through the compiler preprocessor but have not actually been compiled. You can look at the ge_foo.i file and see that all of your #include and #define statements have been expanded.

    Step 3. Run parisd with the various passes to gather information about the data variables used in your program. The various passes are:

  • -p1 Showing L-values
  • -p2 Showing constants (exactly 1 assign, which is an initial declaration)
  • -p3 Showing unassigned variables
  • -p4 Showing effective constants (assigned exactly once, but anywhere)
  • -p5 Showing variables with a path of unassignment
  • -p6 Showing effective constants assigned only simple values
  • -p7 Showing effective constants assigned only from other effective constants
  • -p8 Showing variables/functions passed to functions
  • -p9 Showing variables/functions whose address is taken
  • -p10 Showing variables/functions used to index arrays
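
    To make the categories more concrete, the hypothetical fragment below marks each variable with the pass that would presumably report it. The mapping is only a reading of the pass descriptions above and has not been checked against real parisd output; all names are invented.

```c
/* Hypothetical fragment; the pass beside each item is a guess based on
   the pass descriptions, not verified parisd output. */
static int max_units = 8;     /* assigned once, at its declaration (-p2) */
static int debug_level;       /* assigned once, in init_demo (-p4) */
static int spare;             /* never assigned anywhere (-p3) */
static int unit_state[8];
static int cursor;            /* used to index an array (-p10) */

static int read_slot(const int *slot) { return *slot; }

void init_demo(void)
{
    debug_level = 1;          /* single assignment of a simple value (-p6) */
}

int touch_units(void)
{
    cursor = max_units - 1;   /* cursor appears as an l-value (-p1) */
    unit_state[cursor] = debug_level + spare;
    /* unit_state has its address taken (-p9) and handed to a function (-p8) */
    return read_slot(&unit_state[cursor]);
}
```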


    bigrigg@cmu.edu
    Last updated 21 March, 2005