August 2008 - Posts - Asaf Shelly

Multicore Programming (Multiprocessing) Visual C++ Tips: Avoiding Exceptions
Thursday, August 28, 2008 9:11 AM

There are still too few online resources that help with multicore programming (multiprocessing), so I continue with another tip. This time it is about exceptions and exception handling. See also the previous article, "Prefer clear execution flow for member function".

The computer system as a whole is an extremely parallel system. Interrupts go into the CPU and the CPU handles them by priority. These interrupts can come from either hardware or software. When the keyboard has a new key to send to an application, the keyboard device raises an interrupt with the CPU, and the CPU in turn asks the keyboard for the key that was pressed. Software interrupts were most common in the old DOS days, when the system API (the counterpart of today's Win32 API) was called through the interrupt mechanism. There is also a third type of interrupt on the system: one that is generated by the CPU itself.

When the CPU has an interrupt it calls an Interrupt Handler, which is basically a function that has a specific signature and has to behave in a certain way, assuming that it can be called at any point in time, under any thread. There are many types of interrupts and each has its own ID. Every ID has an Interrupt Vector, which is a function pointer to the Interrupt Handler. Basically it is just a long table of function pointers where each index in the table is an interrupt ID. Some of the interrupts are hardware events such as "new network buffer is ready", so their destination is some device driver or system mechanism. Other interrupts indicate user input and are sent to higher levels. The bottom line is that interrupts are hardware events that are eventually handled by some software.
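The table-of-function-pointers idea can be pictured with a few lines of ordinary C++. This is only a user-mode sketch of the concept, not how an operating system installs real handlers; the names InterruptHandler, kVectorCount, InstallHandler and RaiseInterrupt are made up for the illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical handler signature: no arguments, no return value.
using InterruptHandler = void (*)();

// Hypothetical table size, chosen only for this sketch.
constexpr std::size_t kVectorCount = 8;

// Counters updated by the sample handlers so a dispatch can be observed.
int g_keyboardEvents = 0;
int g_networkEvents = 0;

void OnKeyboard() { ++g_keyboardEvents; }
void OnNetwork()  { ++g_networkEvents; }

// The "interrupt vector table": the index is the interrupt ID and the
// value is a pointer to the handler (nullptr when none is installed).
std::array<InterruptHandler, kVectorCount> g_vectorTable = {};

// Installs a handler for the given interrupt ID.
void InstallHandler(std::size_t id, InterruptHandler handler)
{
   assert(id < kVectorCount);
   g_vectorTable[id] = handler;
}

// Looks up the handler by ID and calls it. Returns false when the ID is
// out of range or no handler is installed for it.
bool RaiseInterrupt(std::size_t id)
{
   if (id >= kVectorCount || g_vectorTable[id] == nullptr) return false;
   g_vectorTable[id]();
   return true;
}
```

After InstallHandler(1, OnKeyboard), a call to RaiseInterrupt(1) runs OnKeyboard, while raising an ID that has no handler simply returns false.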

The majority of the hardware interrupts that interest us are sent to our application as hardware events. The application has to wait for these events in order to receive them. An example is a mouse click event, or a Plug & Play event. If the application does not wait on the event it will not receive it. This may sound bad but it is actually very good, because the application usually works with some resource while it is handling the event (some buffer, class members, a file handle, etc.). It is safe to assume that the application will not start handling a new event while it is working on another one, because it only waits for a new event when processing is complete.
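Here is a minimal single-threaded sketch of that guarantee. The Event and EventLoop types are invented for the illustration and are not the Win32 message loop: an event raised while another one is being handled is only queued, and is processed after the current handler returns.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Hypothetical event type; a real application would carry more data.
struct Event
{
   std::string name;
};

class EventLoop
{
   std::queue<Event> pending;          // events waiting to be handled
   std::vector<std::string> history;   // records the order of processing
   bool handling = false;              // true while one event is being processed

   void Handle(const Event& e)
   {
      assert(!handling);               // never re-entered: one event at a time
      handling = true;
      history.push_back("begin " + e.name);
      // A click makes the handler raise a follow-up event. It is only
      // queued here; it is not handled until this handler has finished.
      if (e.name == "click") Post(Event{"repaint"});
      history.push_back("end " + e.name);
      handling = false;
   }
public:
   void Post(const Event& e) { pending.push(e); }

   // Takes the next event only after the previous one is fully handled.
   void Run()
   {
      while (!pending.empty())
      {
         Event e = pending.front();
         pending.pop();
         Handle(e);
      }
   }

   const std::vector<std::string>& History() const { return history; }
};
```

Posting a single "click" and calling Run() records "begin click", "end click", "begin repaint", "end repaint" in that order, so the resources used by the click handler are never touched by a second handler halfway through.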

On the other hand there is a different type of interrupt: the CPU-generated interrupt. There are several situations in which the CPU has to generate an interrupt to itself, effectively breaking the execution flow. The simplest one to picture is an endless loop. Think of the CPU as using an internal loop to perform long division: a division by zero would keep it inside that loop forever, so the CPU detects the condition and interrupts the operation. This is the Divide By Zero interrupt. This interrupt and other CPU interrupts are presented to the application as an Exception. This effectively means that hardware interrupts are translated to soft events, but CPU interrupts keep their interrupting behavior in the software as well. An Exception breaks the execution flow unexpectedly, but instead of using an Interrupt Vector as a global handler the code uses a local handler. (Vectored Exception Handling is a completely different concept and may be very valid for parallel computing.)

Exceptions are massively used in C++ because developers did not verify return values the way the C libraries required them to. Now you can just throw an exception and make sure that the application stops when there is a bug, so bugs no longer remain hidden as before. The problem with exceptions is that you can throw just about anything. This means that you have to catch just about anything in order to make sure that a library does not crash the application when some meaningless operation fails (good design put aside). So now you catch all exceptions just to make sure that printing a document does not crash the application. On the other hand, this means that you have lost the main feature of an exception, the ability to stop the flow and break the application in critical cases, because you might now be hiding the problems.
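To make the trade-off concrete, here is a small sketch. PrintDocument and the Failure values are hypothetical stand-ins for a library call that can fail in a harmless way or in a critical way; they do not come from any real library.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical failure modes of a hypothetical library call.
enum class Failure { None, PrinterOffline, CorruptedState };

// "You can throw just about anything": a harmless failure is thrown as a
// std::string and a critical one as a std::logic_error.
void PrintDocument(Failure failure)
{
   if (failure == Failure::PrinterOffline) throw std::string("printer offline");
   if (failure == Failure::CorruptedState) throw std::logic_error("document state corrupted");
}

// Catch-all wrapper: keeps the application alive, but the critical
// failure looks exactly like the harmless one.
bool PrintWithCatchAll(Failure failure)
{
   try { PrintDocument(failure); return true; }
   catch (...) { return false; }
}

// Selective wrapper: swallows only the harmless failure and lets the
// critical one keep propagating.
bool PrintWithSpecificCatch(Failure failure)
{
   try { PrintDocument(failure); return true; }
   catch (const std::string&) { return false; }
}

// Returns true when the critical failure escapes the given wrapper.
bool CriticalFailureEscapes(bool (*print)(Failure))
{
   try { print(Failure::CorruptedState); }
   catch (const std::logic_error&) { return true; }
   return false;
}

void RunDemo()
{
   assert(PrintWithCatchAll(Failure::None));
   assert(!PrintWithCatchAll(Failure::PrinterOffline));
   assert(!CriticalFailureEscapes(PrintWithCatchAll));      // the bug is hidden
   assert(CriticalFailureEscapes(PrintWithSpecificCatch));  // the bug still surfaces
}
```

With the catch-all wrapper a corrupted document and an offline printer are indistinguishable to the caller, which is exactly the loss described above.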

Exceptions in general are extremely bad for parallel applications because they break the flow and the flow is the most important aspect in a parallel application.

What we should have is a way to define a simple return value of a single type and throw an exception only if the caller did not read the return value. An even better way is to also throw an exception if an error value was returned but the caller did not mark it as 'handled'. This way we only catch specific types of exceptions: CPU exceptions, constructor exceptions, operator new out of memory, and a few other types. We don't catch an exception of type "return value not verified" and we also don't catch everything by using catch(...) {;} .

See below a very simplified example of code that implements this methodology. The type RET is supposedly a global type that the entire solution uses. TestFunc is a function that is called by the user and whose return value we want to be verified; main() demonstrates a way to call this function. Run the code to see that the exception is thrown only if the return value is not verified.

This is favorable for parallel code because fewer exceptions mean fewer chances to accidentally break the code flow. Also, when the flow is broken it is because something is wrong in the application design, and we immediately want to know about it. An exception is not just a value returned from a function.

 

#include <conio.h>    // _getch(), Visual C++ specific
#include <iostream>

// Wraps a return value and remembers whether the caller ever read it.
template <class TplT>
class RET
{
   TplT obj;
   bool tested;
public:
   RET(const TplT val): obj(val), tested(false)
   {
   }
   void operator = (const TplT val)
   {
      obj = val;
   }
   // Reading the value through this conversion marks it as verified.
   operator TplT()
   {
      tested = true;
      return(obj);
   }
   // Since C++11 destructors are noexcept by default, so a throwing
   // destructor has to be declared noexcept(false). Older compilers such
   // as Visual C++ 2008 do not know this keyword; remove it there.
   ~RET() noexcept(false)
   {
      if (!tested) throw("Return value was ignored");
   }
};

RET<int> TestFunc()
{
   int ret = 0;

   return(ret);
}

int main()
{
   // Verified: the value is read, so nothing is thrown. Prints "error".
   if (!TestFunc()) std::cout<<"error\n";
   // Ignored: the destructor throws. Nothing catches the exception,
   // so the program stops here and the lines below never run.
   TestFunc();
   if (12 == TestFunc()) std::cout<<"returned 12\n";
   if (TestFunc() == 15) std::cout<<"returned 15\n";
   _getch();
   return 0;
}
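The stricter variant mentioned earlier, where an error value must also be marked as 'handled', could look roughly like the sketch below. CheckedResult, UnhandledError, OpenDevice and ThrowsWhen are names invented for this illustration, and the convention that 0 means success is an assumption. Unlike the example above it throws a dedicated exception type, and the helper catches it only so that the behavior can be observed.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical exception type for this sketch.
struct UnhandledError : std::logic_error
{
   UnhandledError() : std::logic_error("return value ignored or error not handled") {}
};

class CheckedResult
{
   int code;               // assumed convention: 0 = success, anything else = error
   bool read = false;      // set once the caller looks at the code
   bool handled = false;   // set once the caller declares the error handled
public:
   explicit CheckedResult(int c) : code(c) {}
   // Moving transfers the obligation to check; the source is released.
   CheckedResult(CheckedResult&& other) noexcept
      : code(other.code), read(other.read), handled(other.handled)
   {
      other.read = true;
      other.handled = true;
   }
   CheckedResult(const CheckedResult&) = delete;
   CheckedResult& operator=(const CheckedResult&) = delete;

   int Code() { read = true; return code; }
   void MarkHandled() { handled = true; }

   // Throws when the value was never read, or when it is an error that
   // the caller did not mark as handled.
   ~CheckedResult() noexcept(false)
   {
      if (!read || (code != 0 && !handled)) throw UnhandledError();
   }
};

// Hypothetical function whose result must be verified by the caller.
CheckedResult OpenDevice(bool fail)
{
   return CheckedResult(fail ? 5 : 0);
}

// Returns true when destroying the result threw UnhandledError.
bool ThrowsWhen(bool fail, bool readCode, bool markHandled)
{
   try
   {
      CheckedResult r = OpenDevice(fail);
      if (readCode) r.Code();
      if (markHandled) r.MarkHandled();
   }
   catch (const UnhandledError&)
   {
      return true;
   }
   return false;
}
```

For example, ThrowsWhen(true, true, false) returns true because the error code was read but never marked as handled, while ThrowsWhen(true, true, true) returns false.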

 

Multicore Programming (Multiprocessing) Visual C++ Tips: member function execution flow
Wednesday, August 27, 2008 12:58 AM

Hey all,

Long time no read... Well, I've been very busy. I thought that I'd start with a technical post (maybe because there is nothing that exciting in my life, or maybe because I pretend that this is the case :)

As you all know, I have been busy in the area of parallel computing, what people call multi-core programming.

I have learned that Object Oriented programming is very problematic with parallel systems, and that OO helps us at design time but has almost no benefit at run-time and many times even damages our ability to understand the execution flow. See my post on my Intel blog here: http://softwareblogs.intel.com/2008/08/22/flaws-of-object-oriented-modeling/ 

It is very clear to me that people are used to OOP and will still be using it for many years to come so I have compiled a few basic guidelines for OO Programming to help us get the best of both worlds: manageable code, and clear execution flow. Here is the first:

Prefer clear execution flow for member function

The execution flow is the most important aspect in a parallel application. When the source code has a clear and simple flow, the application is simple to debug and the code is easy to follow. One of the greatest problems of OOD is multiple short functions that call each other. Code review becomes almost meaningless, and it is extremely difficult to follow the execution flow using a debugger or by going over the execution logs.
Using a few long functions instead would thus greatly improve code usability. Sometimes it is easier to read a short function that looks like this:
 
void SampleFunction()
{
   int ch = ReadByte();
   if ( TestValue( ch ) )
   {
      Logger.WriteLine( "OK" );
   }
   else
   {
      throw ( "Error" );
   }
}
 
The code above is simple to read and very clean; however, there is a problem with it. It is hard to tell whether any of these functions block, use locks, or throw exceptions, and worst of all, an application that uses such functions will usually have many such short functions. Here is a possible execution flow for such an application:
 
- SampleFunction
-   ReadByte
-     TestFileOpen
-       GetInputFileHandle
-     ReadCharFromFile
-   TestValue
-     GetHashFromInt
-     FindHash
 
The same function could be implemented in the following illustrative way:
 
void SampleFunction()
{
   bool ok = false;
   FILE* file = GetInputFileHandle();   // assumed here to return a FILE*, or NULL on failure
   if ( file != NULL )
   {
      int ch = fgetc(file);             // read one character from the file
      if ( ch != EOF )
      {
         ch = (ch * 0x123 + 57 ) / 33;  // get HASH
         if ( lpHashTable[ ch ] != NULL )
         {
            ok = true;
         }
      }
   }
   if (ok)
   {
      Logger.WriteLine( "OK" );
   }
   else
   {
      throw ( "Error" );
   }
}
 
This may be a longer function and it is not "pure" object oriented programming, but now you can read the code and understand what it does. You can also find bugs in the execution flow by means of a simple code review, and if you debug it step by step you can understand where you are in the flow of the code.
Coding this way helps in finding idiotic deadlocks and reduces the number of exceptions thrown between functions, which also helps in keeping a clear execution flow.


 
Your opinion is most welcome.

 Asaf