Super Smart Pointer - Asaf Shelly

Super Smart Pointer

Published Sunday, September 07, 2008 4:17 PM
Memory management is one of the key elements in the world of software. So many bugs come from misusing memory pointers, overwriting buffers, racing over buffers (several threads using the same buffer), deleting a buffer while it is still in use, running out of memory, and many more. To solve these problems we have seen many methods and technologies, starting with Locks, which are flags that help manage access to a buffer, a Garbage Collector that helps manage buffer deletion from the Heap, the use of a Stack, Smart Pointers, Reference Counting, Page Protection flags, and many others. The focus of this article is memory management for the parallel world.

When a buffer is allocated it has a single owner, and only one thread has a pointer to it. This thread, as the only owner and the only one that knows the buffer, is also the only one that deallocates it. We say that the pointer to the buffer is also the buffer's ID. If you know the pointer then you can use the buffer.

When multiple threads are used, a synchronization method is usually required. Locks are used as flags that manage access to buffers. Only a single thread owns the lock, and only the owner of the lock should access the buffer. However, any thread that knows the buffer can, at any time, access the buffer and potentially damage the operation of the thread that acquired the lock and assumed the buffer was protected.

Memory deallocation is a big problem in both the parallel and the serial world. One of the preferred models for improving manageability is Auto-Pointers or Smart-Pointers. Such an object holds on to the buffer for as long as it has users and deallocates it when it is no longer in use. Such an object can count the users and also manage locks. A user is a part of the code that is holding on to the buffer and needs to know the buffer. There can be multiple users in a single thread, or several threads in a pool can execute several parts of the code that use the same buffer.

Buffer users are not related to the number of threads in the system; they depend more on the number of objects and functions (mechanisms) that need to know the buffer. Among all the users there can be only a single owner. This owner is the only one that is allowed to work with the buffer. Most implementations of Smart Pointers are designed to help manage the users and allow appropriate buffer cleanup, and do not really consider multiple threads using the objects.

Around the time multicore CPUs were introduced, 64-bit CPUs arrived as well. These lift the 32-bit limit of 2GB of application memory (3GB with global system configuration). The problem today is that most applications were not designed for 64-bit CPUs, and software companies hesitate to move to 64 bit. We also know that the drivers available for 64-bit systems do not cover as wide a variety of devices as those available for the 32-bit version of the operating system.

The problem that many software companies are facing today is that the 3GB limit is almost reached. An even bigger problem is that during normal system operation buffers are constantly allocated and deallocated. This means that eventually the memory becomes highly fragmented, and an application that needs 1.2GB for its normal operation ends up consuming 2GB just because it has been running for long enough.

The solution to these problems is a Super Smart Pointer that supports:
  1. Internal reference count of buffer users
  2. Locking support associated with the buffer
  3. Automatic memory management that will move buffers in memory to allow memory defragmentation and allow more than 4GB of memory address space in a 32 bit application.
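
Why the third item matters can be seen in a toy simulation. The sketch below is only an illustration under stated assumptions: `ToyHeap`, its first-fit policy, and the sizes used in `demo` are hypothetical and do not model any real allocator. It shows a heap where half of the space is free yet a larger request still fails, until the live buffers are moved together.

```python
class ToyHeap:
    """Hypothetical first-fit allocator over a fixed address range."""

    def __init__(self, size):
        self.size = size
        self.blocks = {}  # start offset -> length of each live buffer

    def _gaps(self):
        """Yield (start, length) for every free gap, in address order."""
        cursor = 0
        for start in sorted(self.blocks):
            if start > cursor:
                yield cursor, start - cursor
            cursor = start + self.blocks[start]
        if cursor < self.size:
            yield cursor, self.size - cursor

    def alloc(self, length):
        """Return the start of the first gap that fits, or None."""
        for start, gap in self._gaps():
            if gap >= length:
                self.blocks[start] = length
                return start
        return None

    def free(self, start):
        del self.blocks[start]

    def free_total(self):
        return sum(gap for _, gap in self._gaps())

    def largest_gap(self):
        return max((gap for _, gap in self._gaps()), default=0)

    def compact(self):
        """Slide every live buffer down; return old -> new offsets."""
        moved, cursor = {}, 0
        for start in sorted(self.blocks):
            moved[start] = cursor
            cursor += self.blocks[start]
        self.blocks = {moved[s]: n for s, n in self.blocks.items()}
        return moved


def demo():
    heap = ToyHeap(100)
    starts = [heap.alloc(10) for _ in range(10)]  # fill the heap
    for s in starts[::2]:                         # free every other buffer
        heap.free(s)
    before = (heap.free_total(), heap.largest_gap(), heap.alloc(20))
    heap.compact()
    after = (heap.free_total(), heap.largest_gap(), heap.alloc(20))
    return before, after
```

In this sketch `compact` changes the offset of every live buffer, which is only safe if no user is holding a raw address while the buffer moves.
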
 The basic object interface (to be wrapped with operators) should support:
 User Interface:
  1. Allocate Buffer
  2. Duplicate Use (Add Reference)
  3. Stop Use (Release, and last release will delete)
 Owner Interface:
  1. Lock Buffer
  2. Unlock Buffer
  3. Test Lock
 Heap Operations:
  1. Create Heap
  2. Destroy Heap
  3. Lock / Unlock Heap
  4. Crash Dump
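
The User and Owner interfaces listed above can be sketched as a small in-memory class. This is a minimal sketch, not the article's implementation: the name `BufferHandle`, its method names, and the use of `threading.Lock` are assumptions made for illustration, and the `with` support is just one possible Python reading of "wrapped with operators". The heap operations are left out here.

```python
import threading


class BufferHandle:
    """Hypothetical in-memory stand-in for one Super Smart Pointer buffer."""

    def __init__(self, size):
        # User Interface 1: Allocate Buffer (the allocator is the first user).
        self._data = bytearray(size)
        self._refs = 1
        self._refs_guard = threading.Lock()  # protects the counter itself
        self._owner_lock = threading.Lock()  # decides who the owner is

    # User Interface
    def add_ref(self):
        """Duplicate Use: one more user knows the buffer."""
        with self._refs_guard:
            if self._refs == 0:
                raise ValueError("buffer already deleted")
            self._refs += 1
        return self

    def release(self):
        """Stop Use: the last release deletes the buffer."""
        with self._refs_guard:
            if self._refs == 0:
                raise ValueError("buffer already deleted")
            self._refs -= 1
            if self._refs == 0:
                self._data = None
            return self._refs

    # Owner Interface
    def lock(self, blocking=True):
        """Lock Buffer: return the data only if ownership was acquired."""
        if not self._owner_lock.acquire(blocking):
            return None
        if self._data is None:
            self._owner_lock.release()
            raise ValueError("buffer already deleted")
        return self._data

    def unlock(self):
        """Unlock Buffer: give up ownership."""
        self._owner_lock.release()

    def test_lock(self):
        """Test Lock: report whether some owner currently holds the buffer."""
        return self._owner_lock.locked()

    # Assumed "operator" wrapping: with handle as data: ...
    def __enter__(self):
        return self.lock()

    def __exit__(self, *exc_info):
        self.unlock()
```

A user that only needs to keep the buffer alive calls `add_ref` and `release`; only the code that currently holds the lock ever sees the data.
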
 The technology that we will use for our Heap is not a memory heap. Instead we will use a file. Files are not limited to 4GB in size and are relatively easy to use. The system also supports advanced access management for files and file sections through the File System as an Object Store.

The Heap operation Create Heap will therefore be CreateFile. The file is of course created with the temporary flag (FILE_ATTRIBUTE_TEMPORARY) so that the system tries not to write to the physical file. This way the system will only write to the physical file when we run out of available RAM. Correspondingly, we use CloseHandle in the Heap operation Destroy Heap. We can also use the creation flag of CreateFile (FILE_FLAG_DELETE_ON_CLOSE) that tells the system to destroy the file automatically when the last handle is closed. We could use a file for every allocated buffer, but we won't, because it increases complexity and hurts runtime performance.

The Heap operations of Lock and Unlock are used to block access for all users and completely prevent the Heap from being modified (or read). This is useful when a single thread has detected a serious bug: we want to freeze the system as soon as possible to save the memory state. Here comes the last Heap operation, called Crash Dump. This is fairly simple: close the file without deleting it. If the file was marked for automatic deletion then copy it entirely to a crash-dump file that can be investigated later.

User operation of Buffer Allocation is as simple as reserving a place in the file. It is enough to find the next available location and return this location as the pointer to the buffer (as the buffer ID). Buffer deletion is as simple as marking the location as empty. Every buffer starts with the size of the buffer followed by a reference count. The pointer returned points to the data past these two values, so for example a pointer with the value of 0x1208 has its reference count at address 0x1204 and its size at 0x1200.

User operation of Duplicate Use is simply incrementing the reference counter.
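
The buffer layout described above can be sketched with an ordinary file. This is a rough sketch under several assumptions: `FileHeap` and its methods are hypothetical names, `tempfile.TemporaryFile` plays the role of a temporary, delete-on-close heap file, both header fields are taken to be 4 bytes wide (as the 0x1200 / 0x1204 / 0x1208 example suggests), and allocation simply advances to the next location without reusing freed space or guarding against concurrent callers.

```python
import struct
import tempfile

HEADER = struct.Struct("<II")  # 4-byte size, then 4-byte reference count


class FileHeap:
    """Hypothetical bump allocator whose 'memory' is a temporary file."""

    def __init__(self, first_offset=0):
        # Create Heap: an anonymous temp file that is removed when closed.
        self._file = tempfile.TemporaryFile()
        self._next = first_offset  # next free location in the file

    def close(self):
        # Destroy Heap
        self._file.close()

    def allocate(self, size):
        """Write the header and return the buffer ID (offset of the data)."""
        header_at = self._next
        self._file.seek(header_at)
        self._file.write(HEADER.pack(size, 1))
        self._file.write(bytes(size))  # reserve zero-filled space in the file
        self._next = header_at + HEADER.size + size
        return header_at + HEADER.size

    def _read_header(self, buffer_id):
        self._file.seek(buffer_id - HEADER.size)
        return HEADER.unpack(self._file.read(HEADER.size))

    def size_of(self, buffer_id):
        return self._read_header(buffer_id)[0]

    def ref_count(self, buffer_id):
        return self._read_header(buffer_id)[1]

    def add_ref(self, buffer_id):
        """Duplicate Use: increment the counter stored just before the data."""
        size, refs = self._read_header(buffer_id)
        self._file.seek(buffer_id - HEADER.size)
        self._file.write(HEADER.pack(size, refs + 1))
        return refs + 1
```

With `FileHeap(first_offset=0x1200)`, the first `allocate` call returns 0x1208, matching the example addresses: the size sits at 0x1200 and the reference count at 0x1204.
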

Operation of Stop Use decrements the counter, and when the counter reaches zero the object is marked as deleted (see existing memory managers for real-world implementations).

Last but not least in this implementation is the use of Memory Mapped Files. The Owner operation of Lock Buffer locks an internal object, and a pointer is returned only if the lock is successful. If the lock fails then the user is not the owner and is not allowed to write to the buffer or read from it.

Because the pointer to the buffer is returned only when we perform the Lock operation, until that point we don't really care where the buffer is in real memory, or even whether it is in real memory at all. We use the Memory Mapped Files API MapViewOfFile to map the buffer's data into the process's address space. We lock the buffer ownership and we lock the buffer into memory. At this point it is safe to use the buffer. When we are done we use the Unlock Buffer operation, which unlocks the buffer from memory and releases the buffer ownership. Unlocking the buffer from memory also allows virtual memory defragmentation.

By using memory mapped files we can also use the same application design with multiple processes instead of multiple threads. This would increase application stability, because threads can damage each other's memory but processes cannot touch local memory that belongs to another process. A thread can accidentally overwrite the stack that belongs to another thread and thus damage local scope variables. A process that did not map a memory buffer, because it is not the owner, cannot damage this buffer. Basically this means that, unless there are really exceptional reasons, any buffer is mapped into a single process at a time.

Files are not limited to 4GB, we get an automatic buffer swapping mechanism that defragments the virtual memory for us, and it is very easy to get a crash-dump file that can be analyzed after the application has crashed.
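
The Lock Buffer and Unlock Buffer operations described above can be sketched with Python's `mmap` module standing in for MapViewOfFile. This is an illustration under assumptions, not the article's code: `MappedBuffer` and `make_heap_file` are hypothetical names, and `threading.Lock` only arbitrates ownership between threads of one process, so a multi-process design would need a different lock.

```python
import mmap
import tempfile
import threading


def make_heap_file(size):
    """Hypothetical helper: a temporary file pre-sized to hold the heap."""
    f = tempfile.TemporaryFile()
    f.truncate(size)
    return f


class MappedBuffer:
    """Hypothetical owner interface for one buffer stored in a heap file."""

    def __init__(self, heap_file, offset, size):
        self._file, self._offset, self._size = heap_file, offset, size
        self._owner = threading.Lock()
        self._map = None
        self._view = None

    def lock(self, blocking=True):
        """Take ownership, map the buffer, and return a writable view."""
        if not self._owner.acquire(blocking):
            return None  # not the owner: no pointer is handed out
        # Mapping offsets must be multiples of the allocation granularity,
        # so map from the aligned offset below the buffer and slice.
        base = self._offset - self._offset % mmap.ALLOCATIONGRANULARITY
        skip = self._offset - base
        self._map = mmap.mmap(self._file.fileno(), skip + self._size,
                              offset=base)
        full = memoryview(self._map)
        self._view = full[skip:skip + self._size]
        full.release()
        return self._view

    def unlock(self):
        """Unmap the buffer, then give up ownership."""
        self._view.release()  # the caller's view becomes unusable here
        self._map.close()
        self._view = self._map = None
        self._owner.release()
```

Between `unlock` and the next `lock` the buffer has no address in the process at all, so the view returned by one `lock` call must not be kept past the matching `unlock`.
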

 

Comments

# Asaf Shelly said on Thursday, September 11, 2008 10:09 AM

In continuance to my last post about memory allocations from mapped files http://blogs.microsoft.co.il
