Synchronized access to shared memory for i86

Description
Convenient lock mechanism
Synchronization primitives
Based pointers
Application template
Advantages and disadvantages
How to use
Distribution

Description

Class shared_memory provides an efficient, operating-system-independent interface to shared memory. It lets several applications access objects in shared memory, and it provides methods for allocating and deallocating objects in shared memory and for setting exclusive or shared locks. Locks can be nested and can be used to synchronize access to shared objects by different threads within a process as well as by different processes.

The primary idea of this class is to provide an efficient mechanism for interprocess data exchange and data sharing. It assumes that most of the time there are no data access conflicts between concurrent processes, so class shared_memory is optimized for that case: a process locking the storage without having to wait. The atomic instructions XADD and CMPXCHG of the i486 (and higher) processors make it possible to lock a resource that is not locked by another process without a context switch. The shared_memory class supports only storage-level locks, so individual objects cannot be locked separately.
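
The uncontended fast path described above can be illustrated with standard C++ atomics. This is a minimal sketch and not the SHMEM implementation: spin_word, try_lock_fast and unlock_fast are hypothetical names, and std::atomic is used instead of inline XADD/CMPXCHG (on x86, compilers typically lower a compare-exchange to a LOCK CMPXCHG instruction).

```cpp
#include <atomic>

// Hypothetical lock word: 0 = free, 1 = held exclusively.
struct spin_word {
    std::atomic<int> state{0};
};

// Fast path: one compare-and-swap acquires a free lock without entering
// the kernel. Returns false if the lock is already held; a real
// implementation would then fall back to waiting on a kernel object.
inline bool try_lock_fast(spin_word& w) {
    int expected = 0;
    return w.state.compare_exchange_strong(expected, 1,
                                           std::memory_order_acquire,
                                           std::memory_order_relaxed);
}

// Release the lock so that the next compare-and-swap can succeed.
inline void unlock_fast(spin_word& w) {
    w.state.store(0, std::memory_order_release);
}
```

Only the failure branch needs a system call, which is why the no-conflict case stays cheap.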

It is possible to map a shared memory section onto a file and use it as persistent storage for objects. Transactions are not supported by this class. If your application requires atomicity and fault tolerance, please look at Generic Object Oriented Database System or Persistent Object Storage for C++.

Because objects allocated in shared memory can contain references to other objects in shared memory, the shared memory section must be mapped to the same virtual address range in every application that uses it. This address can be specified in the open() method; if it is not specified, the system chooses it. When the section is mapped into another process's address space, open() tries to map it to the same virtual address. If that is not possible (for example, because some other section is already mapped to this address range), open() fails. (POST++ can move a section to a different location and adjust references, but only when the section is used by a single application.)

Objects in shared memory are allocated by a first-fit algorithm that uses a list of free holes with a roving pointer to the current position in the list. Two extra words are allocated before and after the object, which makes it possible to merge adjacent holes efficiently during the free() operation (constant-time object deallocation). Once created, a shared memory section cannot be extended. The maximal size of the section should be specified in the open() parameters.
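
The following toy heap sketches how boundary tags allow constant-time merging of neighbouring holes. It is an illustration under simplifying assumptions, not the shmem.cpp code: toy_heap is a hypothetical class, sizes are counted in words, one tag word is placed at each end of a block, and the scan walks all blocks from a roving position instead of following an explicit list of holes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Every block starts and ends with a tag word holding the block size in
// words (tags included): positive = free hole, negative = in use.
class toy_heap {
    std::vector<long> w;
    std::size_t rover = 0;              // block where the next scan starts

    void mark(std::size_t pos, long size, bool used) {
        w[pos] = w[pos + size - 1] = used ? -size : size;
    }
  public:
    explicit toy_heap(std::size_t words) : w(words) {
        assert(words >= 3);
        mark(0, (long)words, false);    // one big hole
    }

    // Returns the index of the first payload word, or -1 if nothing fits.
    long allocate(std::size_t payload) {
        long need = (long)payload + 2;  // payload plus two tags
        std::size_t pos = rover;
        do {
            long size = w[pos];
            if (size >= need) {
                if (size - need >= 3) { // split off the remainder as a hole
                    mark(pos + need, size - need, false);
                    size = need;
                }
                mark(pos, size, true);
                rover = pos + size;
                if (rover == w.size()) rover = 0;
                return (long)pos + 1;
            }
            pos += (std::size_t)std::labs(size);
            if (pos == w.size()) pos = 0;
        } while (pos != rover);
        return -1;
    }

    // Constant time: both neighbours are found through adjacent tag words.
    void free(long payload_index) {
        std::size_t pos = (std::size_t)payload_index - 1;
        assert(w[pos] < 0);
        long size = -w[pos];
        std::size_t next = pos + size;
        if (next < w.size() && w[next] > 0) size += w[next];
        if (pos > 0 && w[pos - 1] > 0) {
            long prev = w[pos - 1];
            pos -= (std::size_t)prev;
            size += prev;
        }
        mark(pos, size, false);
        if (rover > pos && rover < pos + size) rover = pos;
    }

    // Number of free holes, counted by walking the blocks.
    std::size_t holes() const {
        std::size_t n = 0;
        for (std::size_t p = 0; p < w.size(); p += (std::size_t)std::labs(w[p]))
            if (w[p] > 0) ++n;
        return n;
    }
};
```

Freeing a block between two holes leaves a single merged hole, without scanning the heap.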

Interface

static shared_memory* find_storage(void* obj);

Given an object pointer, this static method returns a pointer to the shared memory section to which the object belongs. If the object was not allocated by the shared_memory class or the section is already closed, this method returns NULL.

status lock(lock_descriptor& lck, unsigned msec = INFINITE);

Lock the storage in either shared or exclusive mode. The lock mode is specified by a lock_descriptor object provided by the application. This object must not be deleted until unlock() has been called for this lock descriptor. Locks can be nested, so calling lock() twice requires two calls of unlock() to release the storage. It is an error to use one lock descriptor in more than one lock request.

The optional parameter msec specifies the timeout for waiting until the lock request is granted. If the timeout expires before the lock can be granted, lock() returns the shared_memory::timeout_expired error code. If msec is 0, lock() returns immediately if the lock cannot be granted. See section Convenient lock mechanism for an alternative way of setting locks.

Using locks can cause deadlock. Consider the following situation: applications A and B lock the storage in shared mode, then both try to upgrade their locks to exclusive. Neither lock can be granted because of the shared lock held by the other application, so the two applications cannot continue and will wait for each other forever. To avoid deadlocks, use lock upgrades with care: if you know that a method of a shared object that does not change the object can call another method that modifies it, it is better to set an exclusive lock in the first method.
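
The upgrade deadlock can be reproduced with a small state model. This is only a sketch of the situation described above: lock_model and its methods are hypothetical, the model is single-threaded, and it assumes that an upgrade is granted only to the sole shared holder.

```cpp
// Toy model of one storage lock: counts shared holders, tracks one writer.
struct lock_model {
    int  shared_holders = 0;
    bool exclusive_held = false;

    bool try_shared() {
        if (exclusive_held) return false;
        ++shared_holders;
        return true;
    }
    // Assumption: an upgrade needs the caller to be the only shared holder.
    bool try_upgrade() {
        if (exclusive_held || shared_holders != 1) return false;
        shared_holders = 0;
        exclusive_held = true;
        return true;
    }
    void release_shared() { --shared_holders; }
};

// Returns true when two shared holders that both ask for an upgrade are
// both refused; with an infinite timeout each would wait forever.
inline bool both_upgrades_blocked() {
    lock_model m;
    m.try_shared();                 // application A
    m.try_shared();                 // application B
    bool a = m.try_upgrade();
    bool b = m.try_upgrade();
    return !a && !b;
}
```

If one holder releases its shared lock first, the other upgrade succeeds, which is why taking the exclusive lock up front avoids the problem.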

status unlock(lock_descriptor& lck);

Release a lock previously set by the lock() method. Locks can be nested, so calling unlock() will not necessarily unlock the storage. The lock descriptor passed to unlock() should be the same object that was passed to lock().

void* allocate(size_t size, bool initialize_by_zero = true);

Allocate an object in the shared memory section. If there is not enough space in the storage to allocate the new object, NULL is returned. If the second parameter, initialize_by_zero, is true, the object is filled with zeros.

void free(void* ptr);

Deallocate the specified object. The storage file can be truncated as a result of deallocating objects at the end of the storage.

static void deallocate(void* obj);

This method does the same as the previous method but can be called without a pointer to the shared_memory object. It calls find_storage() to find the storage to which the object belongs and then deallocates the object's memory with free().
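
One way to picture the lookup behind find_storage() is an address-range search over the sections opened by the process. The sketch below is an assumption about how such a lookup could work, not a description of shmem.cpp; section_entry and section_registry are hypothetical, and an integer id stands in for the shared_memory object.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical bookkeeping: one entry per section open in this process.
struct section_entry {
    const char* base;
    std::size_t size;
    int         id;     // stands in for a shared_memory object
};

class section_registry {
    std::vector<section_entry> sections;
  public:
    void add(const char* base, std::size_t size, int id) {
        sections.push_back({base, size, id});
    }
    // Returns the id of the section whose range contains p, or -1 if none.
    int find(const void* p) const {
        const char* addr = static_cast<const char*>(p);
        std::less<const char*> before;  // total order for unrelated pointers
        for (const section_entry& s : sections)
            if (!before(addr, s.base) && before(addr, s.base + s.size))
                return s.id;
        return -1;
    }
};
```

A deallocate()-style helper would call find() first and forward the pointer to the matching section's free(); a result of -1 corresponds to the NULL case of find_storage().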

status open(const char* file_name, const char* shared_name, size_t max_size, open_mode mode = read_write, void* desired_address = NULL);

Create or open a shared memory section. Parameter shared_name specifies the system name of the object and should not conflict with the names of other objects in the system (events, semaphores, mutexes, ...). More precisely, shared_name is used to generate a collection of identifiers, which are assigned to the synchronization objects used by class shared_memory and to the file mapping object itself. These identifiers are produced by appending a decimal digit (1, 2, ...) to shared_name.
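
The naming rule means that an unrelated system object called, for example, "test1" would collide with a section opened under the shared name "test". A minimal sketch of the rule, where derived_name is a hypothetical helper and not part of the SHMEM interface:

```cpp
#include <string>

// Hypothetical helper: the n-th system object name derived from shared_name.
inline std::string derived_name(const std::string& shared_name, unsigned n) {
    return shared_name + std::to_string(n);
}
```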

If the file_name parameter is not NULL, the memory section is mapped onto a file, which allows data to be saved between sessions. If file_name is NULL, an anonymous memory mapping object is created with storage allocated from the swap file.

Parameter max_size specifies the size of the created memory section. This parameter is used only when open() is called for the first time to create the shared memory object. All subsequent calls to open() by other processes attach to the existing memory object and cannot change its size. If the size of the mapped file is greater than max_size, the size of the created section is set to the size of the file (it cannot be extended in this case). If max_size is greater than the size of the file, Windows first extends the file to the size of the memory mapping object. The file is truncated to the actually used size by the close() method (but there must be enough free disk space to hold all max_size bytes of the mapped file).

Parameter mode selects read-only or read-write access to the memory section. In both cases the file is opened for read/write and the memory object is created with all access rights. But if read_only mode is used, the section is mapped into virtual memory with read-only access permission, so any attempt to modify an object in a section opened in read_only mode causes an access violation.

If parameter desired_address is not NULL, the section is mapped to the specified virtual address. Otherwise the system finds a suitable address itself. To keep references between shared objects valid, the memory section must be mapped to the same virtual address in all applications. Because an application can map other memory mapping objects, this address range may already be in use. To avoid such a conflict, you can explicitly specify the address at which the section should be mapped. If only one memory mapping object is used in all applications, or the sizes of these objects and the order of their creation are the same in all applications, let the system choose the address for you.

status flush();

Flush modified pages to disk. This method is meaningful only if the section is mapped onto a file.

void close();

This method closes the storage. If no other processes are using this shared memory object, it is deallocated. If the shared memory section is mapped onto a file, then before deallocation all modified pages are saved to disk and the file is truncated to its actual size (the space used by allocated objects).

char* get_error_text(status code, char* buf, size_t buf_size) const;

Given a status code returned by a shared_memory method, this method copies the text of the message for this code into the supplied buffer.

void set_root_object(void* root);

This method stores a pointer to the root object in the storage header. The reference to the root object can then be retrieved with the get_root_object() method in subsequent sessions (provided the shared section is mapped onto a file). All other objects in the storage can be reached through normal pointers from the root object.

The root object is usually created when the storage is created (but the root can be changed at any time). Do not forget to lock the storage exclusively while performing storage initialization if several processes can try to open the storage concurrently.

void* get_root_object() const;

Retrieve the reference to the root object in the storage. This reference must have been saved previously by set_root_object(void* root).

void check_heap();

Check the consistency of the shared object heap. This function iterates through all objects and holes in the shared memory section and checks the offset fields before and after each segment. An assert statement checks that the values of these fields are consistent. Heap inconsistency can be caused by a stray pointer, by writing to an array element with an out-of-range index, or by a system (program) fault. This method should be called with the storage locked in shared or exclusive mode.

Convenient lock mechanism

When using the lock() and unlock() methods, you have to keep lock/unlock calls balanced and also create lock descriptor objects. But locks are usually used in a structured way, protecting blocks of program code. Two classes can save you from writing extra code and reduce the probability of making an error: exclusive_lock and shared_lock. These classes set the lock in the constructor and release it in the destructor. So if you want to protect a program block, the only thing you have to do is create a local (automatic) object of class exclusive_lock or shared_lock on the stack. The compiler does all the other work for you, calling the constructor on entry to the block and the destructor on exit from the block. See the example in section Application template.
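
The constructor/destructor pairing can be shown with a stand-in for the storage. This is a sketch of the general scoped-lock idiom, not the exclusive_lock or shared_lock source: mock_storage and storage_guard are hypothetical, and the mock only counts nested requests.

```cpp
#include <cassert>

// Stand-in for shared_memory: it only counts nested lock requests.
struct mock_storage {
    int depth = 0;
    void lock()   { ++depth; }
    void unlock() { assert(depth > 0); --depth; }
};

// Scoped guard: the constructor locks, the destructor unlocks.
class storage_guard {
    mock_storage& s;
  public:
    explicit storage_guard(mock_storage& storage) : s(storage) { s.lock(); }
    ~storage_guard() { s.unlock(); }
    storage_guard(const storage_guard&) = delete;
    storage_guard& operator=(const storage_guard&) = delete;
};

// The lock is released on every exit path, including the early return.
inline int guarded_depth(mock_storage& s, bool leave_early) {
    storage_guard outer(s);
    if (leave_early) return s.depth;  // one lock held here
    storage_guard inner(s);           // nested request
    return s.depth;                   // two locks held here
}
```

After guarded_depth() returns, the depth is back to zero on both paths, so lock and unlock calls cannot get out of balance.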

Synchronization primitives

In addition to the storage locking mechanism, there are two other synchronization primitives: semaphore and event. These classes are defined in shmem.h and provide a system-independent interface for synchronization operations. The tables below describe the methods of these classes:

semaphore: classical Dijkstra semaphore
Method	Description
bool open(const char* name, unsigned init_value = 0);	Create a semaphore with the specified global name and set its value to init_value. If name is NULL, the semaphore is local to the process. Returns false if the semaphore cannot be created.
bool wait(unsigned msec = INFINITE);	Wait for the specified period of time until the semaphore is signaled (its value becomes non-zero). Returns false if the timeout expires before the semaphore is signaled, and true otherwise.
void signal(unsigned inc = 1);	Increment the semaphore value by inc.
void close();	Close the semaphore (this method does nothing on Unix).
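
The wait/signal semantics in the semaphore table can be sketched with standard C++ primitives. This is an illustrative, process-local model only: local_semaphore is a hypothetical class, global names, open() and close() are omitted, and a std::condition_variable replaces the system semaphore.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Process-local counting semaphore with a timed wait.
class local_semaphore {
    std::mutex m;
    std::condition_variable cv;
    unsigned value;
  public:
    explicit local_semaphore(unsigned init_value = 0) : value(init_value) {}

    // Returns false if the timeout expires before the value is non-zero;
    // otherwise decrements the value and returns true.
    bool wait(unsigned msec) {
        std::unique_lock<std::mutex> guard(m);
        if (!cv.wait_for(guard, std::chrono::milliseconds(msec),
                         [this] { return value != 0; }))
            return false;
        --value;
        return true;
    }

    // Increment the value by inc and wake the waiting threads.
    void signal(unsigned inc = 1) {
        {
            std::lock_guard<std::mutex> guard(m);
            value += inc;
        }
        cv.notify_all();
    }
};
```

Each successful wait() consumes one unit, so signal(2) lets exactly two waits succeed.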

event: event with manual reset
Method	Description
bool open(const char* name, bool signaled = false);	Create an event with the specified global name and set its state to the value of the signaled parameter. If name is NULL, the event is local to the process. Returns false if the event cannot be created.
bool wait(unsigned msec = INFINITE);	Wait for the specified period of time until the event is signaled. Returns false if the timeout expires before the event is switched to the signaled state, and true otherwise.
void signal();	Set the state of the event to signaled.
void reset();	Reset the state of the event to non-signaled.
void close();	Close the event (this method does nothing on Unix).
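
The difference between a manual-reset event and a semaphore is that a successful wait does not consume the signal. A process-local sketch under the same assumptions as any standard-library model (local_event is a hypothetical class; global names, open() and close() are omitted):

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Process-local manual-reset event: stays signaled until reset() is called.
class local_event {
    std::mutex m;
    std::condition_variable cv;
    bool signaled;
  public:
    explicit local_event(bool initially_signaled = false)
        : signaled(initially_signaled) {}

    // Returns false if the timeout expires before the event is signaled.
    // The state is left unchanged, so later waits also succeed.
    bool wait(unsigned msec) {
        std::unique_lock<std::mutex> guard(m);
        return cv.wait_for(guard, std::chrono::milliseconds(msec),
                           [this] { return signaled; });
    }

    void signal() {
        {
            std::lock_guard<std::mutex> guard(m);
            signaled = true;
        }
        cv.notify_all();            // wake every waiting thread
    }

    void reset() {
        std::lock_guard<std::mutex> guard(m);
        signaled = false;
    }
};
```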

There is also an implementation of POSIX semaphores for Unix based on IPC semaphores. The interface of the POSIX semaphores, as well as of semaphores placed in shared memory, is in the file posixsem.h, and the implementation is in the file posixsem.c. There is only one difference between the POSIX specification of sem_init() and the one defined in posixsem.h: instead of the int pshared parameter, a char* name parameter with the global name of the semaphore is used. If name == NULL, the semaphore is private. Semaphores in shared memory are implemented using the GCC inline assembler facility and the i486 XADD and CMPXCHG instructions. Non-blocking operations on such semaphores are very fast and require no context switching. Their interface is similar to the one in Digital Unix.

Based pointers

The __based() qualifier, supported by the Microsoft Visual C++ compiler, makes it possible to map a shared memory section to different virtual addresses in different applications. A static pointer is used to point to the beginning of the mapped section (so only one section can be mapped at any moment). To use this scheme, declare all reference fields of shared objects with the REF(type) macro instead of type* and compile your application with the -DUSE_BASED_POINTERS option.
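
The idea behind based pointers can be shown in portable C++ by storing offsets from the section base instead of absolute addresses. This sketch does not use __based() or the REF macro: offset_ref, node and read_through_copy are hypothetical, and copying a buffer stands in for mapping the same section at a different address.

```cpp
#include <cstddef>
#include <cstring>
#include <new>

// Stores a byte offset from the section base; offset 0 means "null".
template <class T>
struct offset_ref {
    std::ptrdiff_t off = 0;
    void set(const char* base, const T* target) {
        off = target ? reinterpret_cast<const char*>(target) - base : 0;
    }
    T* get(char* base) const {
        return off ? reinterpret_cast<T*>(base + off) : nullptr;
    }
};

struct node {
    int val;
    offset_ref<node> next;
};

// Builds two linked nodes in one buffer, copies the bytes to a second
// buffer at a different address, and follows the link inside the copy.
inline int read_through_copy() {
    alignas(node) char first[4 * sizeof(node)] = {};
    alignas(node) char second[4 * sizeof(node)] = {};
    // Slot 0 is left unused so that offset 0 can mean "null".
    node* a = new (first + sizeof(node)) node{1, {}};
    node* b = new (first + 2 * sizeof(node)) node{2, {}};
    a->next.set(first, b);
    std::memcpy(second, first, sizeof first);
    node* a2 = reinterpret_cast<node*>(second + sizeof(node));
    return a2->next.get(second)->val;
}
```

The link still resolves in the copy because it never recorded the original address, which is the property that removes the same-virtual-address requirement.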

Application template

#include <stdio.h>
#include <stdlib.h>
#include "shmem.h"

// Illustrative section size; choose a value that suits your application.
const size_t max_size = 1024*1024;

shared_memory shmem;

class tree {
  public:
    tree* left;
    tree* right;
    int   val;

    // Objects of this class are placed in the shared memory section.
    void* operator new(size_t size) {
        return shmem.allocate(size);
    }
    void operator delete(void* p) {
        shmem.free(p);
    }
    tree(int key) { val = key; left = right = NULL; }
};

class root_object {
  public:
    tree* root;

    // Modifying methods take an exclusive lock for the whole method body.
    void  insert(int key) {
        exclusive_lock x_lock(shmem);
        ...
    }
    // Read-only methods take a shared lock.
    tree* search(int key) {
        shared_lock s_lock(shmem);
        ...
    }
    void  remove(int key) {
        exclusive_lock x_lock(shmem);
        ...
    }
    void* operator new(size_t size) {
        return shmem.allocate(size);
    }
    void operator delete(void* p) {
        shmem.free(p);
    }
    root_object() { root = NULL; }
};

int main()
{
    shared_memory::status rc;
    root_object* root;
    char buf[256];  // buffer for the error message text
    rc = shmem.open("test.odb", "test", max_size);
    if (rc != shared_memory::ok) {
        shmem.get_error_text(rc, buf, sizeof buf);
        fprintf(stderr, "Failed to open file: %s\n", buf);
        return EXIT_FAILURE;
    } else {
        // Initialize the storage under an exclusive lock, because several
        // processes may open it concurrently.
        exclusive_lock x_lock(shmem);
        root = (root_object*)shmem.get_root_object();
        if (root == NULL) {
            root = new root_object;
            shmem.set_root_object(root);
        }
    }
    root->insert(0);
    ...
    shmem.close();
    return EXIT_SUCCESS;
}

Advantages and disadvantages

Advantages	Disadvantages
Is very simple and can be used for various purposes	Requires mapping of memory segments to the same virtual address in all applications.
Provides very fast access to shared object with almost no runtime overhead.	It is not possible to place objects with virtual functions in shared storage.
Supports different lock modes: shared, exclusive, nested, with timeout.	Has no fine-grained locking facilities.
Provides persistence for shared objects.	Doesn't support transactions and fault tolerance.

How to use

File shmem.h contains the interface and shmem.cpp the implementation of the shared memory class.

Class shared_memory is implemented for MS-Windows-95/NT and for Unix systems on the i86 platform with the GCC compiler (only Linux has been tested so far). You can compile the test application tstshmem.cpp using GCC under Unix with makefile. Under Windows you should use makefile.mvc for Microsoft Visual C++ or makefile.mvc for Borland C++. This application starts N threads, each of which inserts 10,000 elements in the tree, then searches 10 times for each element, and then removes all inserted elements. Each insert and remove operation is protected by an exclusive lock, while each search in the tree is protected by a shared lock. Parameter N should be specified on the command line and should not exceed 32. It is possible to run several tstshmem programs simultaneously. The tables below contain results for some values of N:

Pentium-II 233 Windows-NT
Use locking	Processes	Threads per process	Time (sec)
no	1	1	4.406
yes	1	1	6.609
yes	1	2	14.591
yes	1	4	36.011
yes	2	1	12.387
yes	2	2	50.172
yes	4	1	59.435

Pentium-II 233 Linux
Use locking	Processes	Threads per process	Time (sec)
no	1	1	3.720
yes	1	1	5.707
yes	2	1	14.210
yes	4	1	45.007

Test program fifo.cpp illustrates how the shared_memory class can be used for interprocess communication. The semaphore and event classes are also tested in this program. It contains a classical example of the producer-consumer task, based on a FIFO queue. Any number of producers can insert elements into the queue, and any number of consumers can extract them from the queue in First In First Out order. To run a producer, specify on the command line the number of messages the producer should insert into the queue. After inserting the specified number of messages, the producer terminates. A consumer is started without any arguments. It tries to extract a message from the queue for some period of time. If no messages arrive during this period (10 seconds), the consumer terminates.
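
A queue that is meant to live in shared memory is easiest to reason about when it holds only indices and plain data. The sketch below shows such a fixed-capacity FIFO; it is not taken from fifo.cpp, fifo_queue is a hypothetical type, and all locking and signaling (which the real program would do with the storage lock, semaphores and events) is left out.

```cpp
#include <cstddef>

// Fixed-capacity FIFO with no pointers inside, so the same bytes are
// meaningful to every process that maps them.
template <class T, std::size_t N>
struct fifo_queue {
    T items[N];
    std::size_t head = 0;    // next slot to read
    std::size_t count = 0;   // number of stored items

    // Producer side: returns false when the queue is full.
    bool put(const T& v) {
        if (count == N) return false;
        items[(head + count) % N] = v;
        ++count;
        return true;
    }

    // Consumer side: returns false when the queue is empty.
    bool get(T& out) {
        if (count == 0) return false;
        out = items[head];
        head = (head + 1) % N;
        --count;
        return true;
    }
};
```

In a multi-process setting, put() and get() would each run under an exclusive lock, and a false return would be replaced by waiting on a synchronization primitive.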

There is also a utility program for Unix, semstat, which extends the functionality of the ipcs program and allows you to dump semaphore values. The source code of this utility is in the file semstat.c.

Distribution

SHMEM is freeware and is distributed in the hope that it will be useful. You are free to use this class in your applications and to modify the sources. Also feel free to ask me any questions about SHMEM. Freeware status doesn't mean lack of support. I will do my best to fix all reported bugs and add desired functionality. E-mail support is always guaranteed.

Look for new version at my homepage | E-mail me about bugs and problems