Asaf Shelly

Parallel Computing & Cloud

World Cup

I don't know if you are following the World Cup these days, but the final is today.

It looks like there were high expectations but sometimes medium to low results...

Here is a list of bugs I found with the 2010 FIFA World Cup:

* Design bug: The referee can't see everything. There are actually only 3 of them, and two are just standing on the lines. People can see replays but the referee has to ignore them. Because of this bug, teams find out that they are losing by mistake a few seconds after the decision was made. They see it on the big screen from multiple angles. The referee cannot use the video to fix his decision, and this bug causes teams to lose momentum and then lose the game. There is a patch, however: the referee can go hard on the other team and make an arguable decision in favor of the losing team to balance things out.

* Architectural bug: The architecture assumes that if you win then you are good and if you lose you are not. The problem is that you can win after playing against a bad opponent, while a good team can lose after playing against a very good opponent. This is how Argentina lost to Germany while other teams advanced on a completely parallel path. Germany played Spain, so it was not possible for one to take first place and the other to take second. This is by system design.

* Architectural bug: The least important games come first and the 2010 FIFA World Cup Final is the last game in the event. This means that by design the best game is played on the field with the worst conditions, after the grass got full of holes during the other less important games.

* Implementation Bug: It is hard for the goalkeepers to catch the ball because it is too light and slippery. At first they tried to catch it and lost games. Now they just push the ball back. Sometimes they try to stop a good kick, but even though they touch the ball it keeps going into the goal.

* Random Bug: All the big stars started the event with a big bang and then one by one went away like dust in the wind.

All these bugs are actually there on purpose. We want to have a good game with many goals and we want the game to remain simple, just as we play it back home, without too much technology. Pretending to fall is part of the game...

The only problem I see here is that Spain didn't give us the best show, Germany did, but Germany lost to Spain because Spain was playing 'bunker' instead of attacking like we would like to see. We didn't see a good performance from the big stars, and Germany, which had good team play, has already lost. And now we have to watch some average game as the 2010 FIFA World Cup Final.

I'll watch it anyway and even enjoy it, but for me the best performance was the German team, playing as a team, with many players whose families immigrated to Germany. The world has changed and we need a ball game to show us the true colors.

With respect to Germany, here is one for the true hero of the World Cup, Schweinspider:

(need speakers)

 

All in good atmosphere.

Enjoy the game.

My Conclusions on Parallel Computing

After practicing parallel computing for a long while I have decided that it is time to summarize things as I see them so far. See the post in the link below:

My Conclusions on Parallel Computing

Asaf

Exception by target of an invocation

Parallel Programming means that we use invocations more often than we did before. Visual Studio 2008 debugger will bring you to the correct line if you get an exception, but if you invoke a method within your own code the debugger will show you the location in your own thread, which means the line that invoked the code that eventually threw an exception. The exception would say "Exception has been thrown by the target of an invocation.".

To solve this and really find the location of your exception you can drill down into the inner-exception member of the exception or use the following coding style:

void MyInvokedFunction( )
{
   try
   {
      // ... some code ...
   }
   catch (Exception exp)
   {
      if (Debugger.IsAttached) Debugger.Break();
      else throw;
   }
}

This will cause the debugger to stop at the source of the exception even if it was thrown by another thread.

 

C# Activate Previous Application Instance

Continuing the following post and as an answer to Jasper:

http://blogs.microsoft.co.il/blogs/asafshelly/archive/2010/03/02/c-close-previous-application-instance.aspx

 

The following code will make sure that only one instance is running by exiting if an instance is already running, and activating that instance.

// Requires: using System; using System.Diagnostics; using System.Runtime.InteropServices;

[DllImport("user32.dll")]
static extern bool SetForegroundWindow(IntPtr hWnd);

static bool ActivateApplicationAlreadyRunning()
{
   string proc = Process.GetCurrentProcess().ProcessName;
   Process[] processes = Process.GetProcessesByName(proc);
   // Only this instance is running.
   if (processes.Length < 2) return (false);
   foreach (Process process in processes)
   {
      if (process.Id != Process.GetCurrentProcess().Id)
      {
         // Bring the other instance to the front.
         SetForegroundWindow(process.MainWindowHandle);
         return (true);
      }
   }
   return false;
}

 

[STAThread]

static void Main()
{
   if (ActivateApplicationAlreadyRunning()) return;

.....

Posted: Mar 09 2010, 05:50 PM by AsafShelly | with 2 comment(s)

Verify Installation of a custom device using C#

Hi all,

Using my WinUSB C# component, I also needed to automatically install the driver if it is not already installed. The INF file defines a new device class by its GUID. This means that the class does not exist on the machine if the device is not installed.

Here is the code:

[DllImport("setupapi.dll", SetLastError = true, CharSet = CharSet.Auto)]
internal static extern bool SetupDiGetClassDescription(ref Guid ClassGuid,
StringBuilder classDescription, Int32 ClassDescriptionSize, ref UInt32 RequiredSize);


public static bool IsDeviceClassInstalled(string deviceClassGuid)
{
   return (IsDeviceClassInstalled(new Guid(deviceClassGuid)));
}

public static bool IsDeviceClassInstalled(Guid deviceClassGuid)
{
   StringBuilder deviceClassDescription = new StringBuilder(256);
   UInt32 retLength = 0;
   SetupDiGetClassDescription(ref deviceClassGuid, deviceClassDescription, deviceClassDescription.Capacity, ref retLength);
   return (deviceClassDescription.Length > 0);
}

 

http://blogs.microsoft.co.il/blogs/asafshelly/archive/2009/12/29/winusb-net-component.aspx

C# Close previous application instance

Continuing this post:

http://kseesharp.blogspot.com/2008/11/c-check-if-application-is-already.html

It is possible to close the currently running instance of the application:

 

      static bool CloseApplicationAlreadyRunning()
      {
         string proc = Process.GetCurrentProcess().ProcessName;
         Process[] processes = Process.GetProcessesByName(proc);
         if (processes.Length < 2) return (false);
         foreach (Process process in processes)
         {
            if (process.Id != Process.GetCurrentProcess().Id)
            {
               process.CloseMainWindow();
               if (process.WaitForExit(6000)) return (false);
               else return (true);
            }
         }
         return false;
      }

 

Modify the timeout appropriately.

 

Asaf

http://asyncop.com/

Posted: Mar 02 2010, 05:44 PM by AsafShelly | with 2 comment(s)

OS Support for Locks

I'll start by saying that locks are bad and should mostly be used by infrastructure.

The operating system support for locks is mostly by allowing the system to clean up a MUTEX that was locked by a thread and the thread terminated without unlocking.

I would expect some more support for example detecting that two threads are using the same lock object over and over again, for example when going over a linked list. The operating system can force the threads to run on the same core. This will actually improve performance. Here is a simple scenario:

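Record Func()
{
   Record Item;
   while (true)
   {
      // Only the list access is protected by the lock.
      lock (list)
      {
         Item = list.GetNext();
      }
      // End of list reached without a match.
      if (null == Item) return (null);
      if (MY_TEST_DATA == Item.Data) return (Item);
   }
}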

It is possible to increase performance by using two threads if list.GetNext() is full of computation, but if it is not then most of the time is spent on mutual locking. This is because the two threads are locking each other out of the list. It would be faster to have both threads on the same core to reduce lock overhead. A failed lock attempt will probably cause a context switch, which will dramatically slow down performance. The best way to implement this function is using a single thread. The problem is that sometimes we can't predict the performance on a target machine. For example, an application that uses the CPU heavily today could mean no effort at all for the CPU 5 years from now. The OS should be aware of that.
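The scenario above can be sketched in portable C++ (the snippet above is C#-style; std::mutex stands in for the lock statement). This is only an illustration under assumptions: Record, FindRecord and the shared cursor are hypothetical names, and the list is a plain vector so that getting the next item is deliberately cheap. With thread_count set to 1 the calling thread does all the work; with 2 the threads mostly take turns on the lock, which is the contention described above.

```cpp
#include <atomic>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical record type: one integer of payload.
struct Record { int data; };

// Scans `items` with `thread_count` threads that share one cursor.
// Returns the index of the record whose data equals `wanted`, or -1.
// Assumes the data values are unique, so the answer does not depend
// on which thread happens to find the match.
long FindRecord(const std::vector<Record>& items, int wanted, unsigned thread_count)
{
    std::mutex list_lock;          // plays the role of lock (list)
    std::size_t next = 0;          // shared cursor, protected by list_lock
    std::atomic<long> result{-1};

    auto worker = [&]() {
        while (result.load() < 0) {
            std::size_t index;
            {
                // The "GetNext" step: trivial work done under the lock.
                std::lock_guard<std::mutex> guard(list_lock);
                if (next >= items.size()) return;   // list exhausted
                index = next++;
            }
            // The comparison happens outside the lock.
            if (items[index].data == wanted) {
                result.store(static_cast<long>(index));
            }
        }
    };

    std::vector<std::thread> pool;
    for (unsigned i = 1; i < thread_count; ++i) pool.emplace_back(worker);
    worker();                      // the calling thread scans as well
    for (auto& t : pool) t.join();
    return result.load();
}
```

Timing FindRecord(items, wanted, 1) against FindRecord(items, wanted, 2) on a large vector is one way to see how much of the run time goes to the lock rather than to the comparison; the numbers will depend on the machine.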

The best implementation would be for the OS to use NUMA internally and make sure that two threads working heavily with the same RAM module are moved to the same core. A simple scenario is scanning a list in memory, for example getting a record from a database or sorting. Most of the actual work is reading the memory and writing back to it. Two cores will only slow the work down because the RAM hardware is slower than the CPU. A 3GHz CPU can have an 800MHz memory module on the board. If all cores are working with the same RAM module then the CPU will face a bottleneck to the memory and the entire system will take a dramatic performance hit.

http://AsyncOp.com

FTP Configuration for Server 2008

I have just upgraded my server from Server 2003 to Server 2008. Along with all the other porting problems, I found out that you cannot just take the FTP configuration on the disk and use it as is.

First of all you need to download the recent version of FTP Services because Server 2008 is released with the older version 6.0. http://www.iis.net/expand/FTP

I had a problem with user isolation on server 2008 FTP 7. On IIS summary I kept getting the event "User XXXXXX failed to log on, could not access the home directory /.". There is also another event in the system event log saying that the user name / password is incorrect.

The resolution to this is to move all directories of isolated users to the folder "LocalUser" under the ftp root. So for example if the FTP site is directed to C:\FTP_ROOT then the user MyUser should be located under: C:\FTP_ROOT\LocalUser\MyUser.

 

Posted: Feb 11 2010, 01:07 PM by AsafShelly | with 1 comment(s)

When to use the IsBadXxxPtr Win32 API

This post is a follow-up to two previous posts. The first is my own, called Verifying C++ Buffers, which was answered by a second post by Sasha titled IsBadXxxPtr Is Really Harmful – Please Don’t Use These Functions. The first post mentioned the advantages of using the IsBadXxxPtr API over verifying pointers by using ASSERT (or comparing to zero). The second post mentioned the problems with this set of APIs in a multi-threaded application. Just to clarify, my comment here is not personal; Sasha and I are having an academic discussion which will benefit us all.

Let me start by saying that the information provided in the first post is valid and works well with a single-threaded application. As Sasha explained, the API does not consider parallel operations. This is noted on the MSDN page for the API and at the end of my first post. The reason I did not elaborate more on this is that it is not so clear-cut. Sasha has explained reasons not to use the API in a parallel environment, and I will try to argue the reasons for using this API in a parallel environment.

Here are the scenarios I will be examining:

Scenario A: The pointer value is zero / NULL. This is usually because we initialized the pointer correctly or because a structure was never initialized in debug compilation.

Scenario B: The pointer value is above zero but is very close to zero, for example address 1, 23, or 32K. This is usually because we are accessing a member of a struct whose address is NULL. The address of the struct is zero so the address of a member is its offset from zero. Another possible reason for this scenario is an item in an array when the array is NULL.

Scenario C: A random number is used for the pointer. This is because we are using uninitialized memory (on the stack for example), or because of a buffer over-run / buffer under-run which caused memory corruption in the storage of the pointer.

Scenario D: The pointer is initialized correctly but the buffer was already deleted and therefore the pointer is pointing at an invalid buffer.

ASSERT or IsBadXxxPtr?

When we use ASSERT or test the buffer whether it is NULL or not we are actually testing to see whether or not we are using a buffer that was initialized to zero. This only works if the pointer was correctly initialized. Some functions would not expect NULL, such as printf( ) or puts( ), while other functions may consider this as a valid value for example the UserContext parameter when creating a thread - if it is NULL then there is no context.

Going over the scenarios comparing the pointer to zero or using ASSERT: Scenario A works fine. Scenario B will not find the bug and your function will try to access the buffer. Scenario C: same as B, and it is also the same for Scenario D.

Testing the same scenarios with IsBadXxxPtr: Scenario A works fine. Same for Scenario B. These two scenarios happen even if pointers are correctly initialized! Scenario C will work if the random value happens to fall on a Heap page out of the 4GB virtual space (pages are usually 4K for 32 bit systems). Scenario D will work if the buffer was the last on that page. When the last buffer on the page is deleted the page is marked as inaccessible.

When do we see problems with multi-threading?

Unmapping a page in a second thread: This means that one thread deleted an object while another thread was working on it. This is because you shared the resource (buffer) but forgot to lock or used the locks incorrectly. ASSERT will never detect that because the pointer has an address that is more than zero and ASSERT has no way of telling if the page is valid.

IsBadWritePtr corruption of memory: This happens because a resource was shared but the thread took the lock only after calling IsBadWritePtr, because the programmer did not know that IsBadWritePtr accesses the memory. This is a problem, and you should know that calling IsBadWritePtr will modify memory even though the name suggests that it is only testing a pointer.

Guard Pages are used in an internal implementation which does not expect IsBadXxxPtr API to access these pages. This may happen for example when a thread is sharing a buffer which is located on its own stack, which is a terrible bug. Such a bug will randomly damage Guard Pages, or corrupt the other thread's stack, or cause the other thread to access bad memory (by overwriting pointers located in the other thread's stack). I can't tell you which is better, but if your thread found an invalid pointer of an internal buffer then something is very wrong and you can't find this using ASSERT.

There is a long and painful argument whether exceptions are good or bad. In a multithreading environment I consider exceptions to be always bad because an exception is an abrupt event that can fire in the middle of any execution flow. Multi-threading environments are all about synchronizing execution and flow control. The possibility of a flow breaking because of an event is really bad. It is a system-wide problem if a thread is using a Semaphore object and then stops in the middle of its work because of an exception caused by something another thread did wrong 5 minutes ago. It is simpler to verify pointers than to check all possible exceptions. An exception is something special, not a return value. When exceptions became return values, functions started swallowing them and hiding the real problems.

In conclusion, with all the problems that come with using the IsBadXxxPtr Win32 API, I would still prefer using this API set over using ASSERT, which allows bugs to happen. If the code has no bugs then IsBadXxxPtr causes no damage. If your code has bugs then using IsBadXxxPtr will solve more problems than it causes, unless you are going to use it to hide the problems.

If my application supports plug-ins then I would prefer to verify interface buffers rather than let buffers from an external module damage my application.

 

 

Visual C++ Compiler ERROR C2016

Compiling a C file with Visual Studio 2008 compiler I got this error:

"error C2016: C requires that a struct or union has at least one member"

I am writing this post because it is not documented, and C2016 on MSDN is defined as "A newline character was found before the closing single quotation mark of a character constant". Either it is not well documented or I don't know where to look for it. Whatever the reason I can only guess that someone else will find the same problem.

This is the cause:

struct
{
   int    u8Type;
   int    u8LenExtended;
   int   u16Length;
} Base_Request;

struct Request_Ex
{
   Base_Request Request;
   int Address;
   int Length;
};

This is the fix:

typedef struct
{
   int    u8Type;
   int    u8LenExtended;
   int   u16Length;
} Base_Request;

struct Request_Ex
{
   Base_Request Request;
   int Address;
   int Length;
};

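Without the typedef, Base_Request is a variable of an unnamed struct type rather than a type name, so it cannot be used to declare the member Request. This is intentionally oversimplified.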

Asaf

WinUSB .Net Component

I have uploaded a WinUSB Component for .Net with full source code here: WinUSB .Net Component in C#.

The component comes with full source code and has the following features:

 

Pacific Software & HPC

For several years now, Pacific Software has been leading the field of parallel development in Israel. Microsoft will soon release Server 2008 HPC R2, and Pacific is looking for IT experts who are interested in setting up and maintaining HPC systems.

Asaf

Verifying C++ Buffers

This post follows up on this post: Verifying Pointers in C++, which demonstrated the problem in testing a pointer for NULL after allocation like this:

char* ptr=new char[1];
if (NULL == ptr) // fail

There is another issue with verifying pointers for NULL. The value NULL is #defined as 0 (zero) which is agreed to be an invalid memory address. Testing a pointer to see if it is NULL means testing to see if the pointer is pointing to address zero or not. As we all know, when we allocate a buffer and then delete it we must zero the pointer. Verifying a pointer is relying on the fact that every pointer that does not point to a legal buffer is manually reset. Like this:

void MyFunc(char* ptr);

int main()
{
   char* ptr = NULL;
   MyFunc(ptr);
   ptr = new char[20];
   MyFunc(ptr);
   delete[] ptr;
   ptr = NULL;   // reset manually so that MyFunc can detect it
   MyFunc(ptr);
   return 0;
}

void MyFunc(char* ptr)
{
   if (NULL == ptr) return;
   // ...do something...
}
 

You may assume that verifying the pointer in MyFunc is fine because the code is verified to make sure that every pointer is correctly initialized to zero and so if a NULL pointer is used the function is protected from it. This is not really the case. To demonstrate the problem we can look at the following scenario.

We have a class called Label. This class has several members including the text string to display. Here is the class:

class Label
{
protected:
   int X;
   int Y;
   char* text;
public:
   char* Text();
};

void UpdateDisplay()

{
   // ... more
   Label* label1 = GetSomeLabel();

   char* ptr = label1->Text();
   if (NULL == ptr) // ERROR
   // ... more
}

Now, here is the problem:

class Label
{
protected:
   int X;
   int Y;
   char text[256];
public:
   char* Text();
};

In the first version of the class the string was a pointer to a memory buffer. In this version the string is stored as part of the object. This means that in the original version the call to GetSomeLabel() in function UpdateDisplay() can return NULL if there is no label and calling the member label1->Text() will also return NULL. When we use the second version of the class we find a problem. If the call to GetSomeLabel() in function UpdateDisplay() returns NULL then calling the member label1->Text() will return 8. Verifying the pointer for NULL will find no problem and using the macro ASSERT will also not find the bug!

void UpdateDisplay()
{
   // ... more
   Label* label1 = GetSomeLabel(); // Got label1 = NULL // label1 = 0 

   char* ptr = label1->Text();     // ptr = label1 + 8  // ptr = 8
   if (NULL == ptr) // ERROR       // can't see the problem, ignoring
   // ... more
}

When label1 is NULL the object is treated as if it starts at address zero: X is at address 0, Y is at address 4 (right after X, which is sizeof(X) bytes long), and text starts at address 8. Checking only that the text pointer is not zero therefore misses the bug.
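The arithmetic can be checked with offsetof. The sketch below is a simplified stand-in for the second version of Label, not the class itself: the members are public and fixed at 4 bytes so that offsetof is allowed and the layout matches the numbers in the text. TextAddress is a hypothetical helper that only does the address calculation and never dereferences anything.

```cpp
#include <cstddef>
#include <cstdint>

// Simplified stand-in for the second version of Label:
// public members, 4-byte integers, text stored inside the object.
struct Label {
    std::int32_t X;
    std::int32_t Y;
    char text[256];
};

// text sits right after X and Y, 8 bytes from the start of the object.
static_assert(offsetof(Label, text) == 8, "text is expected at offset 8");

// Hypothetical helper: the address that label->text evaluates to when the
// Label object is located at label_address. Pure arithmetic, no memory access.
std::uintptr_t TextAddress(std::uintptr_t label_address)
{
    return label_address + offsetof(Label, text);
}
```

For a Label located at address 0 this gives 8, which is why a comparison with NULL passes even though no valid buffer exists.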

When we try to access the buffer text the system will raise an exception. This is a CPU exception for accessing the invalid page starting at address zero. If the page size is 4KB then accessing anywhere within that range is invalid and not just address zero, so address 8 is also invalid because it is in the same page. Actually the entire block of memory ranging from address 0 up to 64KB is marked as inaccessible for this exact reason.

The way to do this correctly is not to verify that the address is under 64K. Instead you can use the Win32 API IsBadXxxPtr. Just as checking the pointer has to be thread-safe, so does calling these API functions. It is possible that between the verification and the usage some other thread deleted the buffer, unless you make sure that this cannot happen. Also, this does not replace the need to reset pointers to zero after deleting them. If there is another valid buffer on the same page then the entire page is valid.

* Note that this API is not recommended by Microsoft for a parallel environment.

For code example see the wrapper code as it is used internally by the cpp files in this sample code.

 

WinUSB Read Problem

Hi All,

I have decided to move forward from my implementation of BulkUSB.sys driver to using the formal implementation called WinUSB. It is a package of a kernel driver and a user-mode API documented as part of the MSDN library.

When I tried to do asynchronous reads I discovered two undocumented / confusing failures:

The first is GetLastError returns 997 - ERROR_IO_PENDING. This is fine for overlapped IO. The function returns false and the error code means that the overlapped operation has been queued and will complete later.

The second is somewhat confusing: GetLastError returns 121 - ERROR_SEM_TIMEOUT "The semaphore timeout period has expired". This means that the timeout has expired before the data was ready. The confusing part is that it looks like the call to WinUsb_ReadPipe will also fail with error code 121 if there is data in the buffer.

The read function will wait for the buffer to be full or timeout to expire. This means that the call to read will fail with error code 121 even if some data was read from the device and you must check the number of bytes read to know if read really succeeded or not. The logic behind this is that you don't call GetLastError if the function returns true.
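The decision logic described above can be written down as a small pure function. This is a sketch under assumptions: ReadOutcome and ClassifyRead are hypothetical names, the caller is assumed to pass in the return value of the read call, the value of GetLastError and the byte count it received, and the two constants simply repeat the numeric error codes quoted in this post. Nothing here calls WinUSB itself.

```cpp
// Hypothetical outcome categories for one read attempt.
enum class ReadOutcome {
    Complete,        // the call returned true
    Pending,         // overlapped read queued (error 997)
    PartialTimeout,  // timeout expired but some bytes arrived (error 121)
    EmptyTimeout,    // timeout expired with no data (error 121)
    Failed           // any other error
};

// Numeric values as quoted in the text above.
constexpr unsigned long kErrorIoPending  = 997;  // ERROR_IO_PENDING
constexpr unsigned long kErrorSemTimeout = 121;  // ERROR_SEM_TIMEOUT

// succeeded:  return value of the read call
// last_error: value of GetLastError(), only meaningful when succeeded is false
// bytes_read: number of bytes reported as transferred
ReadOutcome ClassifyRead(bool succeeded, unsigned long last_error, unsigned long bytes_read)
{
    if (succeeded) return ReadOutcome::Complete;   // do not look at last_error
    if (last_error == kErrorIoPending) return ReadOutcome::Pending;
    if (last_error == kErrorSemTimeout) {
        // The timeout alone does not say whether data arrived.
        return bytes_read > 0 ? ReadOutcome::PartialTimeout
                              : ReadOutcome::EmptyTimeout;
    }
    return ReadOutcome::Failed;
}
```

Keeping the classification separate from the I/O call makes it possible to exercise the error handling without a device attached.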

* To set the endpoint timeout use WinUsb_SetPipePolicy

 

 

Parallel Computing Tutorial

With some delay I am publishing a video about parallel programming. This tutorial is loaded with advanced concepts mainly dealing with parallel software design and architecture.

As some of you may know, I presented "Parallel Programming for Embedded" at TechEd 2009 Europe. This is the video. There are no code samples in this presentation because there are so many languages and operating systems that it is irrelevant to show any code. Instead this presentation shows how the hardware has always been parallel and therefore is a good source for parallel design. The Windows kernel has always been working in parallel as well, but the Windows NT kernel is an embedded operating system and has its own compiler.

The concepts in this video are covered in over 20 days of lectures that I have so it is packed with many advanced concepts.

The video is currently at the home page of http://AsyncOp.com. If you can't see it there then look for it under the Video section, or under Events.

Asaf
