Introduction

GOODS is a general-purpose, distributed, multilingual object-oriented database management system. It supports a client-server model with active clients and passive servers. The database consists of a set of storages, each managed by a separate server. The distribution of objects between storages is mostly transparent to the client, but the client can attach objects to a particular storage. All object methods are executed by clients. Objects are stored in a format that is independent of language and operating system, so applications written in different programming languages can work with the same database simultaneously. GOODS provides a distributed garbage collector to guarantee the integrity of all references. GOODS provides all the features of a classical DBMS: ACID transactions, concurrency control, online backup, crash recovery and server monitoring. GOODS also supports online schema modification without stopping client applications. Moreover, several clients with different class definitions can work with the same database simultaneously. Object instances are converted automatically in lazy mode, when the object is loaded and modified by some client.

Currently interfaces for the C++ and the Java language are provided. The C++ interface is based on using smart pointers. It is discussed in the "readme.htm" document. The interface for the Java language is based on using a special Java byte-code preprocessor. It is described in the following sections.

Metaobject Protocol for Java

Transparent persistency and concurrency control

There are three main requirements for the GOODS language interface: transparency, flexibility and efficiency. Depending on the language, different approaches should be used to meet them. But the main idea is the same: the main advantage of object-oriented databases is the elimination of the gap between application and database data models, so that the programmer can use a single paradigm and approach to design applications and database objects. To achieve this goal we need transparent persistency: there should not be any difference between code working with transient and persistent objects. But database applications have some aspects which are not present in other applications: they have to control concurrency, security and data consistency. So it is not possible to completely exclude this database-specific code from the application. The idea is to separate the application code itself from the code responsible for controlling concurrency and consistency. These two separate aspects can be combined at compile time by using a Metaobject Protocol (MOP). Such an approach makes the development of application code much easier, as well as the development and debugging of synchronization code. Using a MOP also provides high flexibility, which is not possible in systems supporting only a few predefined concurrency control disciplines.

One of the desired features of an OODBMS programming language interface is the support of orthogonal persistency. This means that whether an object is persistent or transient is independent of its class and of the constructor used to create it. This can be achieved through persistency by reachability: an object becomes persistent when it is reachable (when there exist references to the object) from persistent roots or from other persistent objects. But complete support for orthogonal persistency causes significant runtime overhead. That is why the GOODS language interface provides a semi-orthogonal approach: only objects of some selected classes (they should be derived from special base classes) can become persistent. We will call them persistent capable classes. The principle of persistency by reachability is still used here: not all objects of a persistent capable class become persistent, but only those which are reachable from persistent roots. Such an approach avoids a performance penalty for normal (transient) objects and allows an efficient implementation of the OODBMS language interface.

JavaMOP utility

Since the Java language supports only structural reflection and doesn't provide any facilities for behavioral reflection, we need some preprocessing tool to implement metaobject protocols for Java. A possible alternative is the development of a special Java Virtual Machine. But this is not a simple task and, moreover, it would restrict the range of applications able to use the interface. I prefer to develop an interface compatible with the JVM implementations of different vendors.

It was decided to use a byte-code preprocessor. The development of such a preprocessor is much simpler than the development of a source-level preprocessor (no language parser has to be constructed). Also, the absence in the Java language of constructions like the C "#line" directive makes compiling and debugging applications with a source-level preprocessor very inconvenient. That is why the byte-code preprocessor JavaMOP was developed. It is implemented in C++ to increase the speed of processing and to make the development more comfortable.

JavaMOP receives as a parameter a list of Java classes and/or directories or archives of Java classes (.JAR or .ZIP without compression). If an archive file is specified, JavaMOP is able only to read the definition of classes in this archive, but is unable to preprocess these classes. To preprocess some class library, you should first extract all files from the archive and pass the root directory of the extracted tree to JavaMOP. If the directory name is passed, JavaMOP will handle all .class files in the directory and recurse into all subdirectories.

JavaMOP looks for classes containing a "metaobject" field as an instance variable. All such classes and classes derived from them are considered to be controlled by metaobjects. JavaMOP wraps bodies of methods of all such classes with calls of the metaobject methods preDaemon() and postDaemon(). As it is clear from their names, the first is invoked before execution of the first method statement; the last is invoked after returning - normally or as a result of exception - from the method. JavaMOP also detects accesses to non-self instance variables of such classes. JavaMOP wraps such accesses in all methods (not only in instance methods of objects controlled by metaobjects) with invocations of preDaemon() and postDaemon() methods.

To implement transparent access to the database, it is necessary to detect the moment when an object is modified, and to distinguish methods which access the object in read-only mode from methods which can modify the object. Instance methods which can (but do not necessarily) modify self objects are called mutators in GOODS. Knowing whether a method is a mutator is essential for selecting the proper object locking mode. The naive approach is to lock an object in shared mode when it is first accessed, and to upgrade the lock to exclusive mode when the object is about to be modified. But such an approach is deadlock prone. Consider two applications accessing the same object. Both set a shared lock and then try to modify the object. But neither of them can be granted an exclusive lock, since the other application holds a shared lock, which prevents the object from being locked in exclusive mode.
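The upgrade deadlock can be reproduced with a toy lock table. The following is a minimal sketch, not GOODS code: the class LockTable and its methods are hypothetical names, and a return value of false stands for "the client would have to wait".

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the lock state of one object; not part of the GOODS API.
class LockTable {
    private final Set<String> sharedHolders = new HashSet<>();
    private String exclusiveHolder = null;

    // Grants a shared lock unless another client holds the exclusive lock.
    boolean lockShared(String client) {
        if (exclusiveHolder != null && !exclusiveHolder.equals(client)) {
            return false;
        }
        sharedHolders.add(client);
        return true;
    }

    // Grants the exclusive lock only if no other client holds any lock.
    boolean lockExclusive(String client) {
        if (exclusiveHolder != null && !exclusiveHolder.equals(client)) {
            return false;
        }
        for (String holder : sharedHolders) {
            if (!holder.equals(client)) {
                return false;   // somebody else still reads the object
            }
        }
        exclusiveHolder = client;
        return true;
    }
}
```

If clients "A" and "B" both call lockShared() and then lockExclusive(), both upgrade requests are refused, and since neither client releases its shared lock, they wait for each other forever. If "A" is known to run a mutator and asks for the exclusive lock up front, "B" simply waits until "A" finishes.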

JavaMOP uses a two-pass algorithm to correctly calculate the set of all mutator methods. All methods modifying their object instance variables are considered to be mutators. If some method f() invokes method g() for the same object, and method g() was marked as a mutator, then method f() should be also marked as a mutator. The situation becomes more complicated because of polymorphic calls (virtual methods). We should mark method f() as a mutator if any of the implementations of method g() in the self class or a derived class was marked as a mutator.

During the first pass, JavaMOP generates a methods call graph and marks as mutators all methods with instructions assigning some values to self instance components. Then JavaMOP propagates mutator attributes, iteratively scanning the methods call graph and marking as a mutator each method, which invokes some mutator method. After that, code is generated for all methods with known status of each method.
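The propagation step can be sketched as a fixed-point iteration over the call graph. This is only an illustration under simplifying assumptions: MutatorAnalysis and findMutators are hypothetical names, methods are identified by plain strings, and a polymorphic call is assumed to be modelled by adding one edge per possible implementation of the callee.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper illustrating mutator propagation; not JavaMOP source code.
class MutatorAnalysis {
    // calls:  method name -> names of methods it invokes on the same object
    // direct: methods containing a store to a self instance field
    static Set<String> findMutators(Map<String, Set<String>> calls, Set<String> direct) {
        Set<String> mutators = new HashSet<>(direct);
        boolean changed = true;
        while (changed) {                       // repeat until nothing new is marked
            changed = false;
            for (Map.Entry<String, Set<String>> e : calls.entrySet()) {
                if (mutators.contains(e.getKey())) {
                    continue;                   // already marked
                }
                for (String callee : e.getValue()) {
                    if (mutators.contains(callee)) {
                        mutators.add(e.getKey());
                        changed = true;
                        break;
                    }
                }
            }
        }
        return mutators;
    }
}
```

With the edges f -> g and h -> f, and g being the only method that stores to a self field, the result contains g, f and h, while a method without calls or stores stays a non-mutator.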

To detect the moment of object modification, JavaMOP allocates an extra local variable and assigns a non-zero value to it before each store instruction to a self instance field. When the postDaemon() metaobject method is called, the value of this variable is passed to it to determine whether the object was modified by the method. JavaMOP reduces the size of generated code by doing basic block optimization: if there are several store instructions within one basic block, the modification marker variable is assigned only once.

Constructors should be handled in a special way since it is not possible to wrap a constructor body with invocations of metaobject methods (because the first statement of a constructor should be the invocation of the constructor of the superclass). So JavaMOP inserts a call of the postDaemon() method only after the top-level construction invocation statement, following the object creation by the new operator. The method preDaemon() can be called by applications explicitly from the constructor of some base class.

Although JavaMOP will do its best to set the mutator and the modified attributes, it is possible for an application to explicitly mark some method as a mutator or an object as modified. This can be done by invoking two special static methods: mutator() and modify(). JavaMOP removes the code invoking these methods and changes the corresponding attributes instead. So the metaobject protocol provided by JavaMOP is very simple and consists of only two primary methods:

public abstract class Metaobject { 
    //
    // Invocation of this method is inserted by MOP generator before each
    // method invocation or object component access.
    //
    abstract public void preDaemon(Object obj, int attr);
    //
    // Invocation of this method is inserted by MOP generator after each
    // method invocation or object component access.
    //
    abstract public void postDaemon(Object obj, int attr, boolean modified);


    //
    // Attributes for daemons 
    //
    final static int MUTATOR    = 1; // object can be changed
    final static int VARIABLE   = 2; // access to non-self instance variable 
    final static int CONSTRUCTOR= 4; // wrapped method is constructor
    final static int EXCEPTION  = 8; // method is terminated by exception 

    //
    // This method is only a hint to the MOP preprocessor to consider a method
    // invoking Metaobject.mutator() as being a mutator. No actual code
    // will appear in the preprocessed class file.
    //
    public static void mutator() {} 

    // 
    // This method provides the MOP preprocessor with the information that objects
    // were (or will be) modified. No method invocation will appear in the
    // preprocessed code.
    //
    public static void modify() {} 
};
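To make the protocol more concrete, here is a minimal sketch of a metaobject that only records the daemon calls, together with a hand-written class that imitates what a preprocessed mutator might look like. Everything in it is an assumption made for illustration: SketchMetaobject is a local stand-in for goodsjpi.Metaobject so that the example compiles without GOODS, TracingMetaobject and Counter are hypothetical, and the attribute values and exception handling actually generated by JavaMOP may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-in for goodsjpi.Metaobject (assumption, not the real class).
abstract class SketchMetaobject {
    static final int MUTATOR = 1;
    static final int EXCEPTION = 8;
    abstract void preDaemon(Object obj, int attr);
    abstract void postDaemon(Object obj, int attr, boolean modified);
}

// Hypothetical metaobject that only records the calls it receives.
class TracingMetaobject extends SketchMetaobject {
    final List<String> log = new ArrayList<>();

    void preDaemon(Object obj, int attr) {
        log.add("pre attr=" + attr);
    }

    void postDaemon(Object obj, int attr, boolean modified) {
        log.add("post attr=" + attr + " modified=" + modified);
    }
}

// Hand-written imitation of a preprocessed class with one mutator method.
class Counter {
    SketchMetaobject metaobject;
    int value;

    Counter(SketchMetaobject mo) {
        metaobject = mo;
    }

    void increment() {
        boolean modified = false;       // marker variable added by the preprocessor
        metaobject.preDaemon(this, SketchMetaobject.MUTATOR);
        try {
            modified = true;            // set before the store to a self field
            value++;
        } catch (RuntimeException x) {
            metaobject.postDaemon(this,
                SketchMetaobject.MUTATOR | SketchMetaobject.EXCEPTION, modified);
            throw x;
        }
        metaobject.postDaemon(this, SketchMetaobject.MUTATOR, modified);
    }
}
```

Calling increment() once on a Counter bound to a TracingMetaobject leaves two log entries: one for preDaemon() and one for postDaemon() with the modified flag set.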

JavaMOP provides some other useful facilities. For each class, controlled by a metaobject, it generates a special constructor with a single Metaobject parameter, which is passed to the constructor of the base class. Such constructor is used in GOODS to create an object instance when it is loaded from the database. But using this generated constructor makes it impossible to declare final instance components in classes controlled by metaobjects (JavaMOP is not able to insert correct initialization code for such components in the generated constructor, causing a verification error). The attribute final can not be used even with instance fields declared as transient.

The second JavaMOP feature can be used to overcome the limitations of the reflection facilities of JDK 1.1, which forbid access to non-public components of non-public objects (in JDK 1.2 it is possible to set a bypass flag and access all components of all objects). Since only a beta release of JDK 1.2 is available at this moment (and even after JDK 1.2 becomes available, a lot of users will continue to use JDK 1.1), JavaMOP has a special option -public, which tells JavaMOP to set the public attribute on all MOP classes and their fields. This option is not needed with JDK 1.2.

The JavaMOP option "-classes" allows you to specify a list of classes, which should be controlled by metaobjects. In this case exactly one class with a "metaobject" instance field should be provided to JavaMOP. We will call this class the MOP root class. JavaMOP will transform the definition of the classes, specified in the list, to make them be derived from the MOP root class. The option "-classes" should be followed by a file name. This file should contain a list of full class names, separated by space characters (new line, space, tabulation...). The class name should be written in the same format as in the Java import statement. For example: "goodslib.ArrayOfChar" or "goodslib.*". If the '*' symbol is placed at the end of a class name, then all classes from the specified package and nested packages are implicitly derived from the MOP root class. Classes specified in the "-classes" option list should not be archived, otherwise JavaMOP will not be able to preprocess them.
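The matching rules for entries of the class list can be sketched as follows. This is an illustration of the format described above, not JavaMOP source code; ClassListMatcher and its methods are hypothetical names.

```java
// Hypothetical matcher for entries of a "-classes" list file.
class ClassListMatcher {
    // Returns true if className is selected by a single list entry.
    static boolean matches(String entry, String className) {
        if (entry.endsWith(".*")) {
            // "pkg.*" selects classes of pkg and of packages nested in it;
            // the prefix keeps the trailing '.' so "pkgx.A" is not selected
            String prefix = entry.substring(0, entry.length() - 1);
            return className.startsWith(prefix);
        }
        return entry.equals(className);
    }

    // Splits the file content on spaces, tabs and new lines and tests every entry.
    static boolean isListed(String fileContent, String className) {
        for (String entry : fileContent.trim().split("\\s+")) {
            if (!entry.isEmpty() && matches(entry, className)) {
                return true;
            }
        }
        return false;
    }
}
```

For a file containing the two entries goodslib.ArrayOfChar and goodslib.*, the second entry alone already selects every class in goodslib and its nested packages.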

When JavaMOP is used for the GOODS client interface, the class goodsjpi.Persistent, containing a metaobject instance field, is available. All persistent capable classes should be derived from this Persistent class (it can be done implicitly by JavaMOP, see paragraph below). So the set of persistent capable classes is equal to the set of MOP classes (i.e. the classes controlled by metaobjects).

It is possible to use the existing Java class library without changing sources by putting the list of persistent capable classes in some file and using JavaMOP with the "-classes" option. The following rules should be observed carefully when using this option:

  1. A class should not have more than one instance array component.
  2. If the instance array component is modified (by storing array elements), the method Metaobject.modify() should be called explicitly to mark the object as modified.
  3. All classes whose objects can be referenced from some other persistent capable class should also be persistent capable (so persistent capable classes should form closed sets in the sense of interclass references).

It is worth noting that although JavaMOP was designed for the GOODS Java interface, it is not GOODS specific and can be used in any application requiring non-standard manipulation of object instances.

JavaMOP performs the following byte code transformations:

  1. Add prologue/epilogue to each method of a persistent capable object. For example, the method
    void foo() { 
            // do something
    } 
    
    will be transformed to
    void foo() { 
            metaobject.preDaemon(this, 0);
            try { 
                    // do something
                    metaobject.postDaemon(this, 0);
            } catch (Throwable x) { 
                    metaobject.postDaemon(this, Metaobject.EXCEPTION);
            }
    }
    
  2. All accesses to non-self instance variables of persistent capable classes are wrapped with metaobject method calls:
    void func(PObject po) { 
            po.x = 0;
    }
    
    will be transformed to
    void func(PObject po) { 
            po.metaobject.preDaemon(po, Metaobject.MUTATOR|Metaobject.VARIABLE);
            po.x = 0;
            po.metaobject.postDaemon(po, 0);
    }
    
  3. The invocation of the constructor of a persistent capable object is followed by a postDaemon call:
    PObject po = new PObject();
    
    will be transformed to
    PObject po = new PObject();
    po.metaobject.postDaemon(po, Metaobject.CONSTRUCTOR);
    
So while patch 1 affects only the class files of persistent capable classes, patches 2 and 3 are made in any method which deals with persistent capable objects. So not only the persistent capable classes themselves should be preprocessed by JavaMOP, but also all classes which use these persistent capable classes.

When is it possible to avoid preprocessing class files with JavaMOP? Let's say we have two classes: PObject, which is a persistent capable class, and TObject, which is not persistent capable (a transient class):

class PObject extends Persistent { 
        PObject next;
        int     x, y;

        void f(TObject to) { 
                to.g(this);
        }
}

class TObject { 
        void  g(PObject po) {
                // do something with po object but do not modify it  
        }
}
If the g() method of class TObject can be invoked only from instance methods of class PObject, and it doesn't modify PObject, then it is possible to skip preprocessing of TObject.class with JavaMOP.

But if the program contains code like this

class X {
        void foo() { 
                PObject po = new PObject();
                TObject to = new TObject();
                to.g(po);
        }
}
or if the g() method creates instances of the PObject class:
class TObject { 
        void  g(PObject po) {
                PObject anotherPO = new PObject();
                ...
        }
}
or accesses components of other instances of persistent capable classes:
class TObject { 
        void  g(PObject po) {
                while (po != null) { 
                        po.f(this);
                        po = po.next;
                }
                ...
        }
}
then you will have to preprocess the TObject.class file with JavaMOP.

You can specify all files which should be preprocessed by JavaMOP on the command line (possibly using wildcards, for example *.class), or you can specify only the name of the root directory (or the names of several directories), which will be recursively traversed by JavaMOP. Only files with the *.class or *.CLASS extension will be preprocessed by JavaMOP while traversing the tree.

If you want to create archives of persistent capable classes (ZIP or JAR), they should be built without compression (the -0 flag for zip or jar) so that JavaMOP can read these files. JavaMOP is currently not able to unpack compressed ZIP files, and it will not patch classes that it reads from an archive. JavaMOP also adds a special string to each preprocessed class file so that it will not patch the file again (in other words, JavaMOP can be applied several times to the same file without changing the result). JavaMOP doesn't use CLASSPATH and processes only the classes specified on the command line. Moreover, since JavaMOP has to build the persistent closure (i.e. detect the set of all persistent capable classes) to perform correct bytecode transformation, it is necessary to specify ALL class files which are persistent capable OR access persistent capable objects (so if you apply JavaMOP individually to each class file, the produced code will not be correct).

Basic metaobjects

The GOODS interface for the Java language provides a set of predefined metaobjects, implementing the most popular disciplines of concurrency control. By deriving from these classes, programmers can develop their own metaobjects, implementing sophisticated synchronization disciplines that fit the requirements of particular applications.

Most of the metaobject functionality is implemented in the BasicMetaobject class. This class is responsible for managing the object cache, controlling the transaction processing and handling the object invalidation messages from the servers. GOODS provides an implicit model for delimiting transactions: each invocation of a method of an object controlled by the BasicMetaobject is considered as a nested subtransaction. The transaction is automatically committed when all nested subtransactions are terminated (in other words, when there are no more active methods of classes which are controlled by a metaobject). The BasicMetaobject metaobject invokes the beginReadAccess(), beginWriteAccess() and endAccess() methods, which implement the concrete synchronization policy by setting shared or exclusive locks. Three policies are currently supported, so three metaobjects derived from BasicMetaobject are defined:

OptimisticMetaobject
Accessed objects are not locked at all. When the transaction is committed, the server checks if the client has the most recent version of the object. If some other client modified this object before this client, then the transaction is aborted by the server. It depends on the application how to handle this situation. Objects, which are only read and not modified by the application during the transaction, are not checked by the server to be up-to-date. This strategy is efficient, when conflicts are rare and it is possible to restart the transaction. It also can be used when the object is always accessed through some other object (container), which is responsible for synchronization.
PessimisticMetaobject
This metaobject sets an exclusive lock on the object, when it is accessed by the mutator method, to prevent loss of modifications as a result of a transaction abort. Locks are released at the end of the transaction. This metaobject doesn't lock objects accessed in read-only mode; such objects are not checked by the server to be up-to-date.
PessimisticRepeatableReadMetaobject
This metaobject is derived from the PessimisticMetaobject. In addition to setting exclusive locks for modified objects, it also locks read-only objects in shared mode to prevent the object from modification by other clients until the end of the transaction. This metaobject is used by default for all persistent capable classes. It provides the highest level of consistency, but at the expense of extra message passing, thus reducing concurrency. Also it can cause deadlocks by upgrading shared locks to exclusive locks.
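The validation performed for the optimistic policy can be illustrated with a toy version check. This is a sketch under stated assumptions, not the GOODS server protocol: VersionedStore is a hypothetical class, objects are identified by strings, and each object carries a simple integer version.

```java
import java.util.HashMap;
import java.util.Map;

// Toy server that validates optimistic commits; not GOODS code.
class VersionedStore {
    private final Map<String, Integer> versions = new HashMap<>();

    // Current version of the object (0 if it was never written).
    int load(String oid) {
        return versions.getOrDefault(oid, 0);
    }

    // modified: object id -> version the client loaded before modifying it.
    // The commit succeeds only if every modified object is still at that version.
    // Objects that were only read are not passed in and therefore not checked.
    boolean commit(Map<String, Integer> modified) {
        for (Map.Entry<String, Integer> e : modified.entrySet()) {
            if (load(e.getKey()) != e.getValue()) {
                return false;           // stale copy: the transaction is aborted
            }
        }
        for (String oid : modified.keySet()) {
            versions.put(oid, load(oid) + 1);
        }
        return true;
    }
}
```

If two clients load the same version of an object and both modify it, the first commit succeeds and the second one is refused, which corresponds to the transaction abort described for OptimisticMetaobject. The pessimistic policies avoid this by locking the object before it is modified.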

The basic metaobject controls the work of the object cache. Since loading objects is the most frequent operation in an OODBMS, the performance of a database application mostly depends on an effective cache mechanism. Therefore special attention was paid to the implementation of the cache object replacement discipline in the GOODS client library. The most popular discipline for cache management is LRU (replacing the least recently used object). But it is not smart enough for database applications. For example, a sequential search through a large number of objects can completely flush the cache by replacing all objects in it, although most of the newly loaded objects will not be used by the application in the future.

To solve this problem, an extension of the standard LRU scheme was proposed in GOODS. The object cache is divided into two parts: a primary cache and a cache for frequently used objects. The first time an object is accessed, it is placed in the primary cache. The object is moved to the "frequently used objects" cache only when it has been accessed several times. Both parts of the cache are managed separately by a standard LRU discipline: when the total size of objects in one part exceeds some limit, the least recently used objects from this part are thrown away from the client's memory, not affecting objects in the other part of the cache. It is possible to set limiting values for both parts of the cache by the BasicMetaobject.setCacheLimits(int l0, int l1) method. The parameter l0 specifies the limit for the primary cache, and l1 the limit for the cache of frequently used objects.
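The two-part scheme can be sketched with two access-ordered LinkedHashMap instances. This is a simplified illustration, not the GOODS client cache: TwoLevelCache is a hypothetical class, the limits count objects rather than their total size, and the number of accesses needed for promotion (promoteAfter) is an assumed parameter.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified two-part LRU cache; limits count objects, not bytes (assumption).
class TwoLevelCache<K, V> {
    // accessOrder = true: iteration starts with the least recently used entry
    private final LinkedHashMap<K, V> primary = new LinkedHashMap<>(16, 0.75f, true);
    private final LinkedHashMap<K, V> frequent = new LinkedHashMap<>(16, 0.75f, true);
    private final Map<K, Integer> hits = new HashMap<>();
    private final int primaryLimit, frequentLimit, promoteAfter;

    TwoLevelCache(int primaryLimit, int frequentLimit, int promoteAfter) {
        this.primaryLimit = primaryLimit;
        this.frequentLimit = frequentLimit;
        this.promoteAfter = promoteAfter;
    }

    // A newly loaded object always enters the primary part.
    void put(K key, V value) {
        if (frequent.containsKey(key)) {
            frequent.put(key, value);
            return;
        }
        primary.put(key, value);
        trim(primary, primaryLimit);
    }

    // Repeated accesses move an object to the "frequently used" part.
    V get(K key) {
        V v = frequent.get(key);
        if (v != null) return v;
        v = primary.get(key);
        if (v == null) return null;
        int n = hits.merge(key, 1, Integer::sum);
        if (n >= promoteAfter) {
            primary.remove(key);
            hits.remove(key);
            frequent.put(key, v);
            trim(frequent, frequentLimit);
        }
        return v;
    }

    boolean contains(K key) {
        return primary.containsKey(key) || frequent.containsKey(key);
    }

    // Evicts least recently used entries of one part only.
    private void trim(LinkedHashMap<K, V> part, int limit) {
        Iterator<K> it = part.keySet().iterator();
        while (part.size() > limit && it.hasNext()) {
            K victim = it.next();
            it.remove();
            hits.remove(victim);
        }
    }
}
```

With this structure a long sequential scan only churns the primary part, so an object that was promoted earlier survives the scan.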

How to develop a Java application for GOODS

Developing Java applications for GOODS is very simple. First of all, you should import the package goodsjpi and derive all classes, objects of which you are going to store in the GOODS database, from the goodsjpi.Persistent class (you can use the "-classes" option of JavaMOP to let JavaMOP do this work for you). We call such classes persistent capable classes. All objects referenced from persistent capable classes should also be persistent capable. The only exception exists for fields declared as transient or static. GOODS Java interface doesn't store values of transient fields in the database. After loading a persistent object from the database, all its transient fields are set to default values. Also values of static fields are not saved and will be lost when the application terminates. In addition to reference fields, it is also possible to use all Java primitive types for declaring components of persistent objects. But you can't use the final modifier for instance fields, because of the presence of a special constructor generated by JavaMOP.

Special conventions apply to array types as components of persistent capable objects. Since Java arrays are derived only from the Object class, they are not persistent capable. Instead, GOODS interprets a persistent object with a single array component as an object of varying length (corresponding to a C++ type whose last component has varying length: struct string { int length; char body[1]; }; ). So it is not possible to have more than one array component in a persistent capable class. Furthermore, you have to explicitly use the Metaobject.modify() and/or Metaobject.mutator() methods, because JavaMOP is not able to detect modifications of array components. It is better to avoid using Java arrays as components of persistent objects. Instead, use the dynamic array classes from the GOODS persistent class library for Java.
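The array conventions can be illustrated by a small class with exactly one array component. This is a sketch only: the classes Metaobject and Persistent declared here are empty local stand-ins so that the example compiles without the goodsjpi package, and PersistentChars is a hypothetical class name.

```java
// Local stand-ins for the goodsjpi classes (assumption, not the real API).
class Metaobject {
    static void modify() {}     // in GOODS this call is only a hint removed by JavaMOP
}

class Persistent {}

// Persistent capable class with a single array component.
class PersistentChars extends Persistent {
    char[] body;

    PersistentChars(int length) {
        body = new char[length];
    }

    void setCharAt(int index, char ch) {
        body[index] = ch;
        Metaobject.modify();    // stores to array elements are not detected automatically
    }

    char charAt(int index) {
        return body[index];
    }
}
```

Without the explicit modify() call, setCharAt() would change the array in memory, but the object would not be marked as modified.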

You should choose a synchronization strategy to control concurrent access to the objects of your classes. By default, the pessimistic repeatable read scheme is used in GOODS. This scheme provides the highest level of isolation, but causes significant communication overhead and is deadlock prone. You can use a pessimistic or even an optimistic scheme, or develop your own metaobject class. It is possible to assign a metaobject to an object instance explicitly (but this binding remains only while the object is in the client's cache), or to associate a default metaobject with the object class (so that it will be used for all instances of the class). If the class definition has a static final component of Metaobject type, then the GOODS client library treats the value of this field as the default metaobject for this class and all derived classes (unless they specify their own default metaobjects).

Database servers are accessed by client applications through the Database class. A database is opened by the Database.open(String config_File) method, which takes the name of the database configuration file as its parameter. This file can be accessed via NFS from a central server or can be distributed to client computers through some other mechanism. The first line of this file contains the number of storages in the database. All successive lines specify locations of storage servers. Each line consists of three parts, separated by a colon: storage identifier, host name and port number. The storage identifier can range from 0 to number-of-storages -1.
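The file format described above is simple enough to parse in a few lines. The following sketch is only an illustration of the format; StorageConfig is a hypothetical class and is not part of the goodsjpi package.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for the database configuration file format.
class StorageConfig {
    final int sid;
    final String host;
    final int port;

    StorageConfig(int sid, String host, int port) {
        this.sid = sid;
        this.host = host;
        this.port = port;
    }

    static List<StorageConfig> parse(String text) {
        String[] lines = text.trim().split("\\R");
        int count = Integer.parseInt(lines[0].trim());   // number of storages
        List<StorageConfig> result = new ArrayList<>();
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].trim().isEmpty()) {
                continue;
            }
            // storage identifier : host name : port number
            String[] parts = lines[i].split(":");
            if (parts.length != 3) {
                throw new IllegalArgumentException("bad line: " + lines[i]);
            }
            int sid = Integer.parseInt(parts[0].trim());
            if (sid < 0 || sid >= count) {
                throw new IllegalArgumentException("bad storage identifier: " + sid);
            }
            result.add(new StorageConfig(sid, parts[1].trim(),
                                         Integer.parseInt(parts[2].trim())));
        }
        return result;
    }
}
```

For a file whose first line is 1 and whose second line is "0: localhost:6100", parse() returns a single entry describing storage 0 on localhost, port 6100.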

After the successful database opening, it is necessary to extract the root object. Each GOODS storage has a single root object. It can be retrieved by the Database.getRoot(int sid) or Database.getRoot() method. The last one gets the root of the storage 0. If the storage was not yet initialized, the getRoot() method returns null. In this case it is necessary to create a root object and assign it to the storage by means of the Database.setRoot() method. To prevent concurrent initialization of storage by several clients, the root object is locked by the Database.getRoot() method (only if null is returned) and the lock is released only by the Database.setRoot() method. The method Database.setRoot() can be invoked only once and should not be used for already initialized storage.

Having a reference to the root object, you can access any other object in the database using standard facilities of the Java language. Newly created objects of persistent capable classes will become persistent when you store references to them in some other persistent object. By default the new object will be placed in the same storage as the persistent object, which references it. But it is possible to explicitly assign the storage to the object by means of the Persistent.attachToStorage(Database db, int sid) method. This method can not be applied to an already persistent object in order to reallocate it in another storage. The object will become persistent after commit of transaction.

GOODS interface for Java uses a model of implicit transaction commit: each invocation of a method of a persistent capable class is considered as a nested subtransaction (without the possibility to undo changes). The transaction is automatically committed when all nested subtransactions are terminated (in other words, when there are no more active methods of persistent capable classes). It is possible to explicitly start and finish a nested transaction by means of the BasicMetaobject.beginNestedTransaction() and BasicMetaobject.endNestedTransaction() methods.
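The implicit commit rule can be pictured as a nesting counter. The sketch below is a toy model, not the BasicMetaobject implementation; NestingTracker and its members are hypothetical names.

```java
// Toy model of implicit commit by nesting depth; not GOODS code.
class NestingTracker {
    private int depth = 0;
    int commits = 0;

    // Called when a method of a persistent capable class starts.
    void enter() {
        depth++;
    }

    // Called when such a method returns.
    void leave() {
        if (depth == 0) {
            throw new IllegalStateException("leave() without enter()");
        }
        if (--depth == 0) {
            commits++;      // the outermost method has finished: commit
        }
    }
}
```

In this model an explicitly started nested transaction behaves like one more enter() call: it keeps the depth above zero, so several method invocations are grouped into one commit.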

GOODS can notify the application when an object instance becomes out of date (i.e. when it has been modified by some other client). Notification is done through an object of the goodsjpi.CondEvent class. Clients wishing to receive notification about the object modification should invoke the Database.notifyOnModification(Persistent obj, CondEvent event) method and create a separate thread, which waits, using the CondEvent.waitSignal method, until the event object is signaled. To cancel the delivery of notifications, pass a null value as the event parameter.

If the optimistic synchronization strategy is used, it is possible to receive a notification about transaction abortion. This can be done by the Database.notifyOnTransactionAbort(CondEvent event) method.

After finishing work with the database, you should call the Database.close() method to finish the database session. If a client session was aborted by a server (due to server shutdown or server/communication failure), then the Database.disconnected(int sid) method is called, which throws the goodsjpi.SessionDisconnectedError exception. Other errors detected by the GOODS client interface are handled by the Database.handleError(String text) or the Database.handleException(IOException x) method. By default, these methods just raise an exception signaling a server communication failure. The application programmer can redefine these methods by creating her/his own class derived from the Database class.

So the template scheme of a Java GOODS application is the following:

import goodsjpi.*;

class MyRoot extends Persistent { 
    ...
}

class Application { 
    static public void main(String args[]) { 
        Database db = new Database();
        if (db.open("app.cfg")) { 
            MyRoot root = (MyRoot)db.getRoot();
            if (root == null) { 
                root = new MyRoot();
                db.setRoot(root);
            }
            ... // do something with the database
            db.close();
        } else { 
            System.err.println("Failed to open database");
        }
    }
}

After compiling your application, you should run the JavaMOP preprocessing utility. You should specify a complete list of all Java class files used in your application, together with goodsjpi.jar (and goodslib.jar, if used). You can see examples of running the JavaMOP tool in the GOODS makefiles. JavaMOP also has two optional parameters. The option "-package package-name" specifies the name of the package containing the definition of the Metaobject. You should always use "-package goodsjpi" for GOODS applications. The second option is -public and should be used only with JDK 1.1. With this option present, JavaMOP will change the access attributes of classes and fields to public. The reflection package in JDK 1.1 doesn't allow access to non-public components, so without this option loading objects from the database is impossible. You can certainly declare all classes and their fields as public in the Java sources yourself, but it is better to preserve the class encapsulation at source level and use the -public option to solve problems with the JDK 1.1 reflection restrictions. The following command line illustrates the usage of JavaMOP:

javamop -public -package goodsjpi *.class \goods\goodsjpi.jar \goods\goodslib.jar

or, with Unix-style path separators:

javamop -public -package goodsjpi *.class /goods/goodsjpi.jar /goods/goodslib.jar

To run your application you should first prepare the configuration file, for example "app.cfg":

1
0: localhost:6100

Then you should start the database server(s) at the specified net node(s). The server is started by the command "goodsrv database-name [storage-identifier]". If the storage identifier is not specified, storage 0 is used by default. For example, the GOODS server for this template application can be started by the command "goodsrv app". More information about the goodsrv program can be found in the "readme.htm" document. Having started the server(s), you can run the application(s).

For some applications a client-server architecture is not needed: they use GOODS as an embedded database engine. For such applications GOODS provides the goodsjpi.Server class, which makes it possible to start and stop the GOODS server from the application. Create an instance of this class, passing the name of the database configuration file to the constructor, and then use its start/stop methods. You can also send any command to the server process using the BufferedReader Server.command(String cmd) method. When launching your Java application, do not forget to add GOODSRV to the PATH environment variable. See the Guess.java example.

It is possible to change the definition of classes without losing instances of these classes already stored in the database. GOODS supports automatic online schema evolution. You can add new fields to the class, delete some fields, and change primitive types of fields (any conversions between primitive types are allowed). Reformatting the object instances according to the new format is done in lazy mode. When the client loads the object, the descriptor of the application class is compared with the descriptor of the class in the database. If the descriptors are different, the GOODS client library performs automatic conversion of the object to the new format. After being modified, the object will be stored in the storage in the new format. It is possible for several clients to have different class definitions and work simultaneously with the same database. Unfortunately, GOODS' automatic update mechanism is not able to rename fields, to change the types of references or to change the inheritance graph.

Using Interface types and the PersistenceFactory

As of 2.51, to increase the flexibility of the GOODS implementation you can use Interface types in your objects, such as

  Set foo;
instead of
  PersistentHashSet foo;
The consequences of this are that you no longer need to import packages goodsjpi and goodslib into each class. To assign objects to these fields it is however necessary to use PersistenceFactory.get();
     import java.util.Set;
     import persistence.PersistenceFactory;

     public class Bar
       {
       Set foo;

       public Bar()
         {
         foo=PersistenceFactory.get(Set.class);
         }
       }
The PersistenceFactory class is configured via a properties file and may use different concrete implementations based on the contents of that file, relaxing the dependency on GOODS and allowing other persistence mechanisms such as JDO to be switched in more easily. (Note that JDO needs explicit transaction boundaries.)

Example properties file persistence.properties

persistence.dbname=goodstest
persistence.roottype=test.GoodsTest
persistence.java.util.Set=goodslib.PersistentHashSet
persistence.persistence.PersistenceFactory=goodsjpi.GoodsFactory
# Optional GOODS specific parameter
goodsjpi.meta=goodsjpi.PessimisticMetaobject

Example implementations of PersistenceFactory and GoodsFactory are supplied; you may also implement your own for additional flexibility. Note that all persistence capable objects still need to be processed with JavaMOP.

Sometimes using the default constructor is not sufficient so the PersistenceFactory supplies a method which allows arguments to be supplied. This method looks like

     Set foo=PersistenceFactory.get(Set.class,new Class[] {Integer.TYPE, Float.TYPE}, new Object[] { new Integer(100), new Float(0.6f) });
For more information on this method, read the documentation for java.lang.reflect.Constructor.newInstance(...);
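
The argument form above maps onto plain Java reflection. The following is a minimal sketch of that idea, not the supplied PersistenceFactory: the class name FactorySketch, its create method and the hard-wired mapping from Set to java.util.HashSet are assumptions made for illustration (a properties file could fill the mapping instead).

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a configurable factory; not the GOODS class.
class FactorySketch {
    // Interface type -> concrete implementation class.
    private static final Map<Class<?>, Class<?>> IMPLEMENTATIONS = new HashMap<>();
    static {
        IMPLEMENTATIONS.put(Set.class, HashSet.class);
    }

    // Looks up the implementation and invokes the constructor matching argTypes.
    static <T> T create(Class<T> iface, Class<?>[] argTypes, Object[] args) {
        Class<?> impl = IMPLEMENTATIONS.get(iface);
        if (impl == null) {
            throw new IllegalArgumentException("No implementation for " + iface.getName());
        }
        try {
            Constructor<?> ctor = impl.getConstructor(argTypes);
            return iface.cast(ctor.newInstance(args));
        } catch (ReflectiveOperationException x) {
            throw new IllegalStateException("Cannot instantiate " + impl.getName(), x);
        }
    }
}
```

With this sketch, FactorySketch.create(Set.class, new Class<?>[] {Integer.TYPE, Float.TYPE}, new Object[] {100, 0.6f}) ends up calling the HashSet(int, float) constructor.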

Per-thread transaction

Starting from version 2.36 of GOODS, per-thread transactions are supported. In prior versions of GOODS all threads share the same transaction. The transaction is committed only when there are no more active methods of persistent capable objects in any thread. The counter of nested transaction invocations was a static variable of the BasicMetaobject class, so all persistent capable objects (even if they belong to different databases) shared a single counter of nested transactions.

In version 2.36 of GOODS, a special class CacheManager was introduced to contain the static variables from the BasicMetaobject class. By default the behavior is consistent with old versions of GOODS: all threads share the same transaction. But it is now possible to assign a particular instance of CacheManager to a thread or group of threads. The Database class now has two new methods: attach() and detach(). Each Database instance can have its own CacheManager, but unless the attach() method is executed by the thread, the default cache manager is used. The method Database.attach associates the cache manager of the particular database connection with the current thread. The method detach destroys this association. If the detach method was not called and the thread has terminated, the association will be destroyed automatically after some time.
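
The rule "commit when the last active method returns" can be pictured with a small nesting counter. This is only a sketch under assumptions: the class TransactionCounterSketch and its methods are hypothetical and do not reproduce BasicMetaobject or CacheManager. The SHARED instance mimics one counter for all threads, and PER_THREAD mimics a separate counter for each thread.

```java
// Hypothetical illustration of nested-invocation counting; not GOODS code.
class TransactionCounterSketch {
    private int nesting;   // number of currently active methods
    private int commits;   // how many times the transaction was committed

    // Called when a method of a persistent capable object is entered.
    synchronized void enter() {
        nesting++;
    }

    // Called when the method returns; only the outermost return commits.
    synchronized void exit() {
        if (nesting == 0) {
            throw new IllegalStateException("exit without enter");
        }
        if (--nesting == 0) {
            commits++; // a real client would send the transaction to the server here
        }
    }

    synchronized int commits() {
        return commits;
    }

    // One counter shared by every thread ...
    static final TransactionCounterSketch SHARED = new TransactionCounterSketch();

    // ... versus one counter per thread.
    static final ThreadLocal<TransactionCounterSketch> PER_THREAD =
        ThreadLocal.withInitial(TransactionCounterSketch::new);
}
```

With the shared counter, a long-running method in one thread delays the commit for every other thread; with the per-thread variant each thread commits independently.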

The per-thread transaction model is especially useful for servers, where one process handles remote connections with multiple clients. The actions performed by each client should be treated as a separate transaction. To implement this model in GOODS version 2.36 and higher, it is necessary to create a separate database connection for each thread (create an instance of the goodsjpi.Database class and open the database). Then the thread should be attached to the database by the goodsjpi.Database.attach() method. Each thread will have its own object cache, so there can be several instances of the object with the same OID in the application. The standard GOODS synchronization mechanism, based on locks set by metaobjects, is used to avoid conflicts when several concurrent threads access the same object. The proposed model has the following drawbacks:

  1. Inefficient use of client memory due to duplication of information - several copies of one object can be present in the application.
  2. Redundant load operations. An object will be fetched from the server even if an instance of the object is available in the cache of another thread.
  3. Instead of a single connection with the server, multiple connections (one per thread) are maintained.
  4. Necessity of explicit attach and detach method invocations - it is the responsibility of the programmer to associate the database connection with the thread.

This model of per-thread transactions is experimental and may be changed in the future. An example illustrating per-thread transactions:

class MyThread extends PThread { 
    public void run() { 
	Database db = new Database();
	db.attach();
	try { 
	    if (db.open("myserver.cfg")) {
	        ...
                db.close();
	    }
	} finally { 
	    db.detach();
	}
    }
}

class MyServer { 
    static public void main(String args[]) throws Exception { 
	int nThreads = nThreadsByDefault;
	Thread[] threads = new Thread[nThreads];
	for (int i = 0; i < nThreads; i++) { 
	    threads[i] = new MyThread();
	    threads[i].start();
	}
	for (int i = 0; i < nThreads; i++) { 
	    threads[i].join();
	}
    }
}
For real servers, it is better to maintain a pool of threads, so that the server does not spawn a new thread and open a database connection each time it receives a request from a client (and destroy the thread and connection after the request has been processed). Instead, threads for handling client requests are taken from the pool and returned to it when request processing ends. Each thread has a permanently opened database connection. Such a scheme can significantly increase total server throughput by eliminating the overhead of spawning a new thread and, especially, of establishing a new database connection.
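
A thread pool of this kind can be sketched with the standard java.util.concurrent classes. In the sketch below, FakeConnection is a hypothetical stand-in for an opened goodsjpi.Database (a real worker would open the database and call attach() instead), and PooledWorkersSketch and serve are illustrative names, not part of GOODS.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: each pooled thread keeps one connection for all its requests.
class PooledWorkersSketch {
    static final AtomicInteger OPENED = new AtomicInteger();

    // Stand-in for a database connection; creating it is the costly step.
    static class FakeConnection {
        FakeConnection() { OPENED.incrementAndGet(); }
        void handle(int request) { /* work with the database here */ }
    }

    // Lazily creates one connection per pool thread on first use.
    static final ThreadLocal<FakeConnection> CONNECTION =
        ThreadLocal.withInitial(FakeConnection::new);

    // Runs nRequests on nThreads pooled workers; returns the number of connections opened.
    static int serve(int nThreads, int nRequests) {
        OPENED.set(0);
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        for (int i = 0; i < nRequests; i++) {
            final int request = i;
            pool.execute(() -> CONNECTION.get().handle(request));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException x) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(x);
        }
        return OPENED.get();
    }
}
```

However many requests are served, the number of connections opened never exceeds the number of pool threads.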

Many database applications do not create threads themselves; instead, threads are spawned by some other server. Web server servlets are an example. A servlet is initialized only once, but its service method can be invoked concurrently from different threads by the Web server. In this case the approach described above doesn't work: we don't know when to create, open and close the database, because threads are created by the Web server on client requests. For this case the following scheme is proposed:

class MyServlet extends HttpServlet { 
    public void init() throws ServletException {
        // no database initialization is done at this stage
    }
   
    public void service(HttpServletRequest req, HttpServletResponse resp) 
      throws ServletException, IOException 
    {
         Database db = Database.getDatabase("mydb.cfg");
         try { 
	      // work with database db
         } finally {  
 	      db.close();
         }
    } 

    public void destroy() {
         Database.closeAll(); // close all opened databases
    }
}
In this example the static methods Database.getDatabase and Database.closeAll are used. The Database.getDatabase method checks whether the current thread already has an associated database (i.e. Database.getDatabase was already invoked by this thread). If the thread has no associated database, then a database is created, opened and attached to the current thread. The method Database.closeAll closes all databases opened by the application. There is also a static method Database.closeAllDead, which closes only those databases whose threads have already terminated. It is not necessary to invoke Database.closeAllDead explicitly, because it is called by Database.getDatabase to limit the number of opened connections.

If you want per-thread transactions but prefer to share a single connection to the database among all threads, GOODS has a solution for this situation as well. The method Database.setIsolationLevel(int level) can be used to set the isolation level for the cache manager. By default it is Database.PER_PROCESS_TRANSACTION and the behavior is the same as described above. But if the isolation level is Database.PER_THREAD_TRANSACTION, then the cache manager will block any attempt by another thread to access the database until the current thread finishes its transaction. The method Database.setIsolationLevel(int level) sets the isolation level of the cache manager associated with the connection (if you invoked the Database.attach method); alternatively, you can call the CacheManager.getCacheMananager method to get the cache manager associated with the current thread, or the default cache manager. The method Database.setIsolationLevel should be called before any access to persistent capable objects.

Raw binary types

Starting from version 2.61, GOODS supports raw binary types. Components of such types are not stored as separate GOODS objects, but are instead packed inside the persistent object to which they belong. Raw binary types are handled by RawBinaryFactory. This interface provides methods for packing/unpacking raw binary objects and calculating their sizes. To enable support of raw binary classes, the programmer should create her/his own implementation of this interface and assign it to the static component Database.rawBinaryFactory. Consequently, only one raw binary factory can be used in an application.

The RawBinaryFactory interface contains a supports method, which GOODS uses to check whether the static type of a class component is supported by the factory. So persistent capable objects can include components of the following types:

GOODS provides one implementation of the RawBinaryFactory interface: SerializableObjectFactory. This factory supports all classes implementing the java.io.Serializable interface. It uses java.io.ObjectOutputStream to pack any serializable object and java.io.ObjectInputStream to unpack it. Please note that the Java serialization mechanism causes about 20-30 bytes of overhead (for example, an empty array of char is packed into 27 bytes).
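
The packing and unpacking steps can be tried with the standard stream classes alone. The sketch below is not the SerializableObjectFactory source; SerializationPackSketch, pack and unpack are hypothetical names used only to show the mechanism and to let you measure the overhead yourself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper showing the pack/unpack idea; not the GOODS factory itself.
class SerializationPackSketch {
    // Packs a serializable object into a byte array.
    static byte[] pack(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            return bytes.toByteArray();
        } catch (IOException x) {
            throw new UncheckedIOException(x);
        }
    }

    // Restores the object from the bytes produced by pack().
    static Object unpack(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException x) {
            throw new UncheckedIOException(x);
        } catch (ClassNotFoundException x) {
            throw new IllegalStateException(x);
        }
    }
}
```

Calling pack(new char[0]).length lets you compare the size of the packed form with the figure quoted above.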

Persistent class library

The GOODS client interface provides a library of persistent capable classes implementing efficient algorithms for searching/retrieving objects from a database. Using these container classes will simplify program development of database applications for GOODS. These classes are available from the goodslib package.

Dynamic arrays

Since Java's builtin array types can't become persistent objects in the GOODS Java interface model, special dynamic array classes, providing direct access to their elements by index, were developed. The size of such array can be changed at runtime and new elements can be inserted or removed from the array. Dynamic array classes for all Java primitive types are defined, as well as a dynamic array for object references:

ArrayOfByte.java
ArrayOfChar.java
ArrayOfShort.java
ArrayOfInt.java
ArrayOfLong.java
ArrayOfFloat.java
ArrayOfDouble.java
ArrayOfObject.java
ArrayOfBoolean.java

Class ArrayOfChar is the dynamic analogue of Java String and provides basic methods for string manipulation. The implementation of ArrayOfBoolean provides a space efficient way to store bitmaps. The definition of dynamic array classes in Java's persistent class library is compatible with the corresponding array definition in the C++ class library (file "dbscls.h"). In C++, dynamic arrays are implemented as a template class, and only the following instantiations of this template are defined in C++: ArrayOfByte, ArrayOfInt, ArrayOfDouble and ArrayOfObject.

All dynamic arrays can be constructed from the corresponding Java array type (ArrayOfChar and ArrayOfByte can be also constructed from objects of String type). The method asArray() constructs the Java array from the dynamic array class. The method copy(int dstIndex, byte[] src, int srcIndex, int count), which is the analogue of the System.arraycopy method, is available for all dynamic arrays. Dynamic arrays also implement a stack protocol by providing such methods as Push(T value), Pop(), Top(). The methods insert(int index, int count, T value) and remove(int index, int count) can be used for insertion/removal of dynamic array components.

Starting from version 2.02, GOODS also supports persistency for the Java String type. String bodies, preceded by counters, are stored together with other class fields. So no objects are created in the database storage for strings (this is possible because String is an immutable class in Java). Using strings is more space and time efficient than using dynamic arrays (i.e. ArrayOfChar). Storing an object header and fetching the object from the storage can be done without overhead. There is a single disadvantage with using String components in persistent objects - they are Java specific and have no analogue in C++: so no C++ application can access objects with String components. There is one restriction on using String components: a class with components of String type should not have Java array components (it can certainly contain references to objects of dynamic array classes defined in the persistent class library).

Ordered and unordered sets

A set container is implemented as an L2 list. The owner of the set is represented by the SetOwner class: it contains an L2-list header and a counter for elements in the set. Object-members of the set should be derived from the SetMember class and implement the Ordered protocol. The Ordered protocol contains methods for comparison between set members and the key object.

The class SetOwner provides methods for inserting new members at the beginning/at the end of the set, before/after specified set members and for removing members from the set. Furthermore, a simple sequential search method is implemented in this class.

The class OrderedSetOwner extends the SetOwner class and keeps members of its set in increasing order (using comparison methods from the Ordered interface). It redefines the insertion methods of the SetOwner class to preserve the order of the set members.

B-Tree

The B-tree is a classical data structure for DBMS. It minimizes the number of disk read operations needed to locate an object by key, and it preserves the order of elements (range requests are possible). Also the maintenance of a B-tree can be done efficiently (insert/remove operations have log(N) complexity).

In a classical implementation of the B-tree, each B-tree page contains a set of pairs <key, page-pointer>. The items on the page are ordered by key, so a binary search can be used to locate an item with greater or equal key. In a B*-tree, pointers to members are stored only in leaf pages of the B-tree. All other pages contain pointers to child pages. But in the Java language there is no structural type with value semantics, so objects containing sets of pairs <key, page-pointer> cannot be defined in this language (certainly it is possible to represent it by M+1 objects, where M is the number of items on the page, but this is definitely not what we want). So we decided to store the reference to the key in the B-tree page itself. Now the B-tree page contains only arrays of pointers and one extra pointer to the largest key on this page. Leaf pages contain pointers to the objects inserted into the B-tree, which should be inherited from the SetMember class. All other pages contain pointers to the child pages. This decision significantly simplifies the algorithm, but it increases the number of objects loaded by the client from the server during search/update operations. The Btree class is derived from the SetOwner class, thus allowing sequential access to set members.
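
The in-page binary search for "an item with greater or equal key" mentioned for the classical layout can be sketched as follows. Plain int keys in a sorted array are an assumption made for brevity; the GOODS pages hold object references and compare them through the Ordered protocol, and PageSearchSketch is a hypothetical name.

```java
// Hypothetical sketch of the in-page search; keys are plain ints for brevity.
class PageSearchSketch {
    // Returns the index of the first key >= target, or keys.length if there is none.
    // The keys array must be sorted in increasing order.
    static int firstGreaterOrEqual(int[] keys, int target) {
        int low = 0;
        int high = keys.length;
        while (low < high) {
            int mid = (low + high) >>> 1;   // midpoint without int overflow
            if (keys[mid] < target) {
                low = mid + 1;              // answer lies to the right of mid
            } else {
                high = mid;                 // keys[mid] is a candidate
            }
        }
        return low;
    }
}
```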

Blob

Most modern database applications have to deal with large objects, used to store multimedia and text data. The GOODS class library has a special class Blob to provide efficient mechanisms for storing/extracting large objects. Since loading large objects can consume significant time and memory, the Blob object allows subdividing large objects into parts (segments), which can be accessed sequentially. Moreover, a Blob object takes advantage of Java threads and makes it possible to load the next part of the Blob object in parallel while handling (playing, visualizing,...) the current part of the Blob. Such approach minimizes delays caused by loading the object from the storage.
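
The idea of loading the next segment while the current one is being handled can be sketched with a single background thread. This sketch does not use the Blob class: SegmentPrefetchSketch, play and the loader callback are hypothetical, and "handling" a segment is reduced to adding up its length.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntFunction;

// Hypothetical sketch of segment prefetching; it does not use the Blob class.
class SegmentPrefetchSketch {
    // Schedules loading of one segment on the background thread.
    private static Future<byte[]> fetch(ExecutorService executor,
                                        IntFunction<byte[]> loader, int index) {
        Callable<byte[]> task = () -> loader.apply(index);
        return executor.submit(task);
    }

    // Handles nSegments segments, loading segment i+1 while segment i is handled.
    // Returns the total number of bytes handled.
    static long play(int nSegments, IntFunction<byte[]> loader) {
        ExecutorService prefetcher = Executors.newSingleThreadExecutor();
        try {
            long total = 0;
            Future<byte[]> next = nSegments > 0 ? fetch(prefetcher, loader, 0) : null;
            for (int i = 0; i < nSegments; i++) {
                byte[] current = next.get();                  // wait for the prefetched segment
                if (i + 1 < nSegments) {
                    next = fetch(prefetcher, loader, i + 1);  // start loading the following one
                }
                total += current.length;                      // "handle" the current segment
            }
            return total;
        } catch (InterruptedException | ExecutionException x) {
            throw new IllegalStateException(x);
        } finally {
            prefetcher.shutdown();
        }
    }
}
```

The gain depends on how long handling a segment takes compared with loading it; if handling is very short, there is little to overlap.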

Closure

The class ObjectClosure provides a way to store transient objects in GOODS storage using Java serialization mechanisms. Objects implementing the java.io.Serializable interface can be packed into an array of bytes using java.io.ObjectOutputStream. This array can be stored with an ObjectClosure object in a GOODS database and later retrieved and unpacked by java.io.ObjectInputStream. So this mechanism can be used to store objects of classes which can't be derived from a Persistent base (i.e. are not persistent capable). But be careful with this approach. The semantics of object serialization are quite different from persistency by reachability: if the same object A is accessed from two different objects B and C and closures with roots B and C are stored in the database, then after restoring objects from these closures, objects B and C will refer to two different instances of the original object A!
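
The loss of sharing can be reproduced with standard Java serialization, without GOODS. In this sketch, ClosureSharingSketch and Node are hypothetical names; two separate serialization round trips stand in for two separately stored closures.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demonstration of how separate closures duplicate a shared object.
class ClosureSharingSketch {
    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        Node shared; // reference to the common object A
    }

    // Serializes obj to bytes and reads it back, like storing and restoring one closure.
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException x) {
            throw new IllegalStateException(x);
        }
    }

    // Returns true if B and C still share one instance of A after separate round trips.
    static boolean sharedAfterSeparateClosures() {
        Node a = new Node();
        Node b = new Node();
        b.shared = a;
        Node c = new Node();
        c.shared = a;
        Node restoredB = (Node) roundTrip(b);
        Node restoredC = (Node) roundTrip(c);
        return restoredB.shared == restoredC.shared;
    }
}
```

Here sharedAfterSeparateClosures() returns false: each round trip creates its own copy of A.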

Hash table

A hash table provides fast random access to the object by key. The implementation of the hash table for GOODS is an almost one-to-one copy of the java.util.Hashtable class. This class provides automatic resizing of the hash table when the number of elements in the hash table becomes too large. Objects placed in a hash table and their keys should belong to persistent capable classes. It is possible however to pass a key of type java.lang.String, which is automatically converted to an ArrayOfChar type. The methods get(Object key) and remove(Object key) accept a key of any type.

An object of a persistent capable class used as a hash table key should define its own hashCode() method, if no persistent identifier is already assigned to this object. The method hashCode() is redefined in the goodsjpi.Persistent class to return the same value for all database sessions. But this is possible only if the object already has a persistent identifier (i.e. was made persistent in one of the previous transactions). Invoking the hashCode() method for a transient object of a persistent capable class, which does NOT redefine the hashCode() method, will cause an assertion failure.

H-Tree

The class Htree is a combination of a hash table and an index tree. It can be used when the size of the hash table is too large to represent the hash table as one single object (as an array of pointers). The H-Tree object first calculates a normal hash key and then divides it into several groups of bits. The first group of bits is used as an index in the root page of the H-Tree, the second group of bits as an index in the page referred from the root page, and so on... If e.g. the size of the hash table is 1000003, then the H-tree with pages, containing 128 pointers, requires access to three pages to locate any object. Since a reference in GOODS is 6 bytes long, the total size of the loaded objects is 2304 bytes (128*6*3) instead of 6Mb when using the hash_table class.
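
The splitting of a hash key into page indices can be sketched as below. This illustrates the arithmetic of the example (128 pointers per page, i.e. 7 bits per level) and is not the Htree source: HtreeIndexSketch and its methods are hypothetical, and which bits GOODS assigns to which level is an assumption.

```java
// Hypothetical sketch of dividing a hash key into per-level page indices.
class HtreeIndexSketch {
    static final int BITS_PER_PAGE = 7;               // 2^7 = 128 pointers per page
    static final int PAGE_SIZE = 1 << BITS_PER_PAGE;

    // Number of page levels needed so that PAGE_SIZE^levels >= tableSize.
    static int levels(long tableSize) {
        int levels = 1;
        long capacity = PAGE_SIZE;
        while (capacity < tableSize) {
            capacity *= PAGE_SIZE;
            levels++;
        }
        return levels;
    }

    // Splits a hash value into one page index per level, root page first.
    static int[] pageIndices(int hash, long tableSize) {
        int n = levels(tableSize);
        long slot = Integer.toUnsignedLong(hash) % tableSize; // normal hash table slot
        int[] indices = new int[n];
        for (int level = n - 1; level >= 0; level--) {
            indices[level] = (int) (slot & (PAGE_SIZE - 1));  // lowest remaining 7 bits
            slot >>>= BITS_PER_PAGE;
        }
        return indices;
    }
}
```

For a table of 1000003 slots, levels() yields 3, matching the three pages of 128 references (128*6*3 = 2304 bytes) in the example above.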

Class library

The class ClassLibrary can be used to store Java classes in a GOODS database. This class uses a Java class loader mechanism to provide loading of class files not from a file system, but from GOODS persistent objects.

The method ClassLibrary.storeClass(String className) stores a class with the specified name in a GOODS storage. The parameter className should be a fully qualified Java class name separated by the period character. The file name of the class file is produced by replacing the period character with an OS file separator symbol and appending the suffix ".class". Then the contents of the file are placed in the GOODS storage and a reference to it is inserted in the hash table. This method can throw IOException if it fails to read the class file data.
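
The file name rule can be written in one line of Java. The helper below is a hypothetical sketch of that rule (ClassFileNameSketch is not part of GOODS); the separator is passed explicitly so that the result does not depend on the platform, and a second method uses the OS default.

```java
import java.io.File;

// Hypothetical sketch of deriving a class file name from a class name.
class ClassFileNameSketch {
    // Replaces periods with the given separator and appends ".class".
    static String classFileName(String className, char separator) {
        return className.replace('.', separator) + ".class";
    }

    // Same rule with the file separator of the current operating system.
    static String classFileName(String className) {
        return classFileName(className, File.separatorChar);
    }
}
```

For example, classFileName("java.lang.String", '/') gives "java/lang/String.class".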

The method ClassLibrary.loadClass(String name) searches in the hash table for an ArrayOfByte object with class file data. The name parameter should be the same as in the corresponding storeClass method. The class is loaded and resolved using a special persistent class loader and a pointer to the created Class object is returned. If no class definition was found in the hash table, then null is returned.

Building GOODS Java client library

The GOODS Java client interface is implemented in pure Java and that is why it is portable and can be used with JDK 1.1 or JDK 1.2 and JVM of different vendors. Only the JAVAMOP utility is implemented in C++ and compiled and installed like other GOODS utility programs. The GOODS Java client interface consists of two packages:

goodsjpi.jar
Java programming interface for GOODS
goodslib.jar
Java persistent class library for GOODS

To build these packages, you should specify a target "java" for the "make" utility. "make" should be invoked from the GOODS root directory. For example, the following command will build all the GOODS stuff, including the server library, the utilities, the client library for C++, the client library for Java, and all C++ and Java application examples:

make all java

There is one unportable item in the implementation of the GOODS client interface for Java. To support the cache of loaded persistent objects we need weak reference functionality (a weak reference provides a way to access an object without preventing the garbage collector from deallocating the object). Weak references are included in JDK 1.2, but they are absent in JDK 1.1. In JDK 1.1, an undocumented class sun.misc.Cache exists: it is actually a hash table which allows the GC to collect unreferenced elements. Unfortunately, this class doesn't work correctly with the JVMs of all vendors. For example, the JDK shipped with Microsoft IE 4.0 contains this class, but it has lost the "weak" property: objects placed in the Cache cannot be collected by the GC. Instead, Microsoft provides a class com.ms.vm.WeakReference (which is not present in the JDK for IE 3.0). So GOODS tries to do its best to guess which implementation of weak references should be used. It contains a special package goodsjpi.weak, which encapsulates system dependent details of the weak reference implementation. But since JDK 1.1 has no standard support for weak references, not all JVMs using JDK 1.1 can be used for running the GOODS Java client interface. At least the following environments have been tested to be compatible with GOODS: Sun JDK 1.1.1 - 1.1.6, Sun JDK 1.2, Microsoft Jview from IE 4.0, Borland Jbuilder 2.0.
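
With the standard java.lang.ref.WeakReference class (JDK 1.2 and later), an object cache of this kind can be sketched as follows. This is an illustration only, not the goodsjpi.weak package: the class WeakCacheSketch and the use of a long value as the object identifier are assumptions.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical object cache keyed by object identifier; not the goodsjpi.weak package.
class WeakCacheSketch {
    private final Map<Long, WeakReference<Object>> cache = new HashMap<>();

    // Remembers the object without preventing the GC from collecting it.
    synchronized void put(long oid, Object obj) {
        cache.put(oid, new WeakReference<>(obj));
    }

    // Returns the cached object, or null if it is absent or was already collected.
    synchronized Object get(long oid) {
        WeakReference<Object> ref = cache.get(oid);
        if (ref == null) {
            return null;
        }
        Object obj = ref.get();
        if (obj == null) {
            cache.remove(oid); // drop the stale entry
        }
        return obj;
    }
}
```

As long as the application holds a normal reference to an object, get() returns that same instance; once the object becomes unreachable, the collector is free to reclaim it and the object has to be loaded again.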

Compatibility with C++

Storing objects in the GOODS storage is done in a language independent format. So applications written in different languages can access the same database. The GOODS object model was chosen to be flexible enough to support object models of many existing object oriented languages. Currently only interfaces for C++ and Java are supported. Unfortunately, the object models in these languages contain a lot of differences, so there are some restrictions on the object formats, if you want to make them accessible from C++ and Java applications.

First of all, the Java language has no structured types with value semantics. In other words, a Java object can't contain instances of other objects, it can only have references to them. The GOODS object model allows structure components of objects, but if you want to access objects of this class from a Java application, you should avoid using such structures.

The second restriction is caused by different representations of an object component with varying length. In C++, this object is defined in the following way:

      class string { 
         protected:
	   int  size;
	   char body[1]; // component with varying size 
      };
So space for a varying component is allocated after the end of fixed part of the object and it is not possible to derive some class with instance variables from a class containing a varying component. In the Java language, a varying component of an object is represented by an array, and it is possible to derive any class from it:

     class string {
         protected int  size;
         protected byte body[];
     } 

     class substring extends string { 
         protected int  offs;
     }
You should avoid such usage of arrays in Java if you want to access the objects from a C++ application.

One more issue is caused by different representations of strings in Java and C++. In Java, strings use Unicode characters and counters, while in C++, strings are zero terminated and consist of 8-bit characters (which can certainly be represented by the Java byte type). If you prefer to store ASCII zero terminated strings in a database instead of Java Unicode strings, then you can use methods of the goodsjpi.Conversion class to convert an array of bytes into a zero terminated ASCII string and vice versa.
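
The conversion itself is simple and can be sketched with the standard library. The helper below is a hypothetical illustration, not the GOODS conversion class: AsciizSketch, toAsciiz and fromAsciiz are assumed names. Note that String.getBytes replaces characters outside ASCII with '?'.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch of converting between Java strings and zero terminated ASCII.
class AsciizSketch {
    // Encodes the string as 8-bit ASCII followed by a terminating zero byte.
    static byte[] toAsciiz(String s) {
        byte[] chars = s.getBytes(StandardCharsets.US_ASCII);
        return Arrays.copyOf(chars, chars.length + 1); // the extra byte is 0
    }

    // Decodes bytes up to (not including) the first zero byte.
    static String fromAsciiz(byte[] data) {
        int length = 0;
        while (length < data.length && data[length] != 0) {
            length++;
        }
        return new String(data, 0, length, StandardCharsets.US_ASCII);
    }
}
```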

The GOODS Java API makes it possible to store an object with components of class String. String bodies will be packed together with other object fields. But such object can not be accessed from C++ applications, because string types are not supported by the C++ API.

Due to historical reasons, only dynamic arrays and the Blob class from the GOODS persistent class library have compatible representation in Java and C++.

Native socket library

The Java client is connected to the GOODS server by sockets. Unfortunately, the Java socket implementation (at least under Windows) is very bad. I wrote a special test which sends large objects (64kb) through IP sockets between Java and C. Throughput was about 10 objects per second, with CPU usage at almost 0%. When I ran the same test between two C applications, the speed was almost 100 times higher and CPU usage was 100%. I tried to tune some socket parameters (size of send and receive buffers) but had no luck. It looks like some timeout is used in the Java socket implementation (it takes place only for large objects; for small ones everything works ok). It is certainly not the standard TCP/IP delay for sending messages through a socket, because that was explicitly disabled.

So I have implemented my own native socket library (it is actually a wrapper around the GOODS C++ socket library). The performance of these sockets on Windows is ten times better compared with wsock32.lib. So the total increase in performance (compared with Java sockets) is about 1000 times! I think this compensates for the inconvenience of building and using an extra DLL.

To build the native socket dynamic library, run the buildlocalsocket.bat file in the goods\java directory. It will compile jnilocalsock.dll with the Microsoft Visual C++ compiler and copy it to the goods\bin directory. You should append goods\bin to the PATH environment variable. That is all: GOODS will automatically try to load and use the jnilocalsock.dll library for local connections (when the destination address is "localhost").

Examples

It seems to me that the easiest way to become familiar with some product is to look at the examples. That is why I prepared a set of example Java applications for GOODS.

Guess an animal

This is a very small and simple program, showing all benefits of using an OODBMS for storing objects in a database. In spite of its simplicity, the program behaves as if it has elements of "artificial intelligence" (really it is just collecting information obtained from the user and the more information the user enters the cleverer the program becomes).

This application is an example for using an optimistic model of synchronization. If two or more users simultaneously answer the same question, then only one of them (who finishes first) will succeed. The other will see the message "Lets try again..." and information, she/he has entered, will be lost.

This test example also illustrates how C++ and Java applications can work with the same database. The GOODS root directory contains an implementation of the same game for C++, "guess.cxx", which is compatible with this Java application. You can run both of them simultaneously.

Start this application by starting the server goodsrv guess in the GOODS root directory and run the application itself by calling java Guess in the java subdirectory.

Graphic editor

This example is a very simple graphic editor, which nevertheless provides cooperative team work with the same document. You can start several GraphEditor applications, open the drawing with the same name and edit the same document concurrently. This application uses the classical ACID transaction model based on a repeatable read pessimistic metaobject.

Start this application by starting the server goodsrv graphedt in the GOODS root directory and run the application itself by calling java GraphEditor in the java subdirectory.

Test of Blob

This example is a test for Blob objects. It was also used to measure the effect of using a parallel load algorithm. This test creates a very large object (100 Mb) and then retrieves it from the database. The first time, the object is loaded using the Blob.play() method, which starts a separate thread to load the next part of the object in parallel while handling the current part. The second time, the object is loaded sequentially part-by-part without any parallelism. Results of running this test on my computers show that the time for parallel loading (134 seconds) is about 10 seconds less than the time for sequential loading (145 seconds). Handling of the object in this example consists only of one sleep(10) call, which delays the handling thread for 10 milliseconds.

Test of B-Tree

This is also mostly a performance measuring test. The program inserts a specified number of records into the B-tree; then does several search passes to make sure that they are still in the B-tree; and then removes all inserted records. It is possible to run several programs in parallel, to check how synchronization works. Or you can split the data between two or more storages to test the work of distributed algorithms.

