NYI: Object-Oriented Database Tools for Supporting Evolutionary Software Systems
Elke A. Rundensteiner
Computer Science Department, Worcester Polytechnic Institute
100 Institute Rd, Worcester, MA 01609
Phone: (508) 831-5815, Fax: (508) 831-5776
E-mail: rundenst@cs.wpi.edu

Project Award Information
Award Number: IIS-9796264; Duration: 9/01/1994 - 8/31/2000 (5 years with one year extension), in fourth year; Title: NYI: Object-Oriented Database Tools for Supporting Evolutionary Software Systems

Keywords
Object-Oriented Databases, Object-Oriented Views, Schema Evolution, Interoperability, Transparent Changes.

Project Summary
This project investigates transparent schema change technology that allows for on-line modification of databases without disturbing existing applications. Our methodology is to integrate schema evolution and view support into one system. The resulting tool applies schema changes to a view schema rather than to the global schema, and preserves existing views through such schema changes. Within this framework of transparent schema change, the project explores the following research issues: (1) develop object-oriented view technology; (2) integrate view and schema evolution concepts into one mechanism; (3) develop algorithms for complex schema transformations as well as for transparent evolution; (4) develop and compare different OODB implementation architectures that provide modeling features useful for view and schema evolution support (such as multiple type instantiations, dynamic type changes, multiple classification, etc.); (5) develop optimization strategies for view evolution; (6) perform experimental studies to evaluate the relative effectiveness of the proposed techniques; (8) build tools integrating the techniques into one system; and (9) lastly, extend the concepts and tools developed above to federated environments.

Goals, Objectives, and Targeted Activities
Selected Accomplishments and Project Activities Thus Far
Our general goal is to increase interoperability between evolving software applications, ease migration of software over time, as well as increase the possibility of interfacing to legacy databases more effectively by means of OODB technology. Highlights of our accomplishments achieved this past year include:
SERF Framework for Extensible Schema Evolution Support. While current OODB systems offer only a fixed set (taxonomy) of primitive schema evolution operations, there has been work in recent years on defining more complex evolution operations, such as merge-classes and transform-object-to-value. We have now succeeded in developing the first solution approach for supporting arbitrarily complex schema evolution customized to users' needs. Our solution framework, called SERF (Schema Evolution using Extensible, Re-usable, and Flexible Framework), is based on the general strategy of integrating a fixed set of primitive change operations with ODMG's object query language (OQL) as a vehicle for flexible object migration. The SERF framework has been designed, and we are now implementing it in order to test its feasibility and limitations.
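The general strategy can be illustrated with a minimal sketch. This is not the SERF implementation: the toy in-memory `Database` class, the two primitives, and the `inline_template` function are all hypothetical names introduced here, and a plain Python loop stands in for the OQL query that would drive object migration.

```python
from dataclasses import dataclass, field


@dataclass
class Database:
    """Toy in-memory OODB (assumption): schema plus object extents per class."""
    schema: dict = field(default_factory=dict)   # class name -> list of attributes
    extents: dict = field(default_factory=dict)  # class name -> list of objects

    # A fixed, small set of primitive schema evolution operations.
    def add_attribute(self, cls, attr, default=None):
        self.schema[cls].append(attr)
        for obj in self.extents[cls]:
            obj[attr] = default

    def delete_attribute(self, cls, attr):
        self.schema[cls].remove(attr)
        for obj in self.extents[cls]:
            del obj[attr]


def inline_template(db, cls, ref_attr, target_cls, value_attr):
    """Complex change composed of primitives plus a query-driven migration.

    Replaces the reference cls.ref_attr (an oid into target_cls) with a copy
    of the referenced object's value_attr stored directly in cls.
    """
    db.add_attribute(cls, value_attr)                  # primitive
    targets = {t["oid"]: t for t in db.extents[target_cls]}
    for obj in db.extents[cls]:                        # stands in for an OQL query
        obj[value_attr] = targets[obj[ref_attr]][value_attr]
    db.delete_attribute(cls, ref_attr)                 # primitive


db = Database(
    schema={"Person": ["oid", "name", "address"], "Address": ["oid", "city"]},
    extents={
        "Person": [{"oid": 1, "name": "Ann", "address": 10}],
        "Address": [{"oid": 10, "city": "Worcester"}],
    },
)
inline_template(db, "Person", "address", "Address", "city")
```

The point of the sketch is that the set of primitives stays fixed, while new complex transformations are obtained by writing new templates rather than by extending the system.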
CHOP: Batched Optimization of Schema Evolution Sequences. A schema evolution operation has thus far been executed as one single atomic task, possibly requiring access to all objects in the to-be-modified class or even class hierarchy. An optimization has been proposed in the context of commercial DBMSs, namely the O2 system, in the form of deferred object evolution: objects are modified only if and when they are accessed anyway for retrieval purposes. Within our SERF system, complex schema evolution templates can correspond to a sequence of such basic schema evolution operations. We have developed a simple technique, called CHOP, for optimizing the performance of such sequences of pure schema evolution operations by first reordering them subject to data dependency constraints, and then merging, canceling, or eliminating operations in the sequence whenever possible. We have proven that our optimization strategy is guaranteed to terminate after a finite number of steps (polynomial complexity) and to find the minimal sequence of evolution operators. We have so far run only very preliminary studies to assess the potential gains of the CHOP optimization strategy, and we are now incorporating CHOP into our experimental OODB transformation system developed on top of PSE (Persistent Storage Engine) by Object Design Inc.
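The flavor of such sequence optimization can be sketched as follows. This is a simplified illustration under stated assumptions, not the CHOP algorithm itself: operations are hypothetical tuples over attributes only (`add`, `delete`, `rename`), the input sequence is assumed to be valid, two operations are treated as dependent exactly when they name the same attribute of the same class, and the sketch makes no claim of minimality.

```python
def touched(op):
    """Set of (class, attribute) names an operation refers to."""
    _kind, cls, *names = op
    return {(cls, n) for n in names}


def merge(prev, cur):
    """Return the operations replacing the adjacent pair [prev, cur],
    or None if no merge/cancel/eliminate rule applies."""
    pk, pc, *pn = prev
    ck, cc, *cn = cur
    if pc != cc:
        return None
    if pk == "add" and ck == "delete" and pn == cn:
        return []                                   # cancel: add then delete
    if pk == "add" and ck == "rename" and pn[0] == cn[0]:
        return [("add", pc, cn[1])]                 # merge: add under new name
    if pk == "rename" and ck == "rename" and pn[1] == cn[0]:
        if pn[0] == cn[1]:
            return []                               # eliminate: a->b then b->a
        return [("rename", pc, pn[0], cn[1])]       # merge: a->b, b->c gives a->c
    if pk == "rename" and ck == "delete" and pn[1] == cn[0]:
        return [("delete", pc, pn[0])]              # merge: rename then delete
    return None


def optimize(ops):
    out = []
    for op in ops:
        # Every operation after the most recent dependent one is independent
        # of op, so op can be moved back next to it (reordering step).
        for i in range(len(out) - 1, -1, -1):
            if touched(out[i]) & touched(op):
                replacement = merge(out[i], op)
                if replacement is None:
                    out.append(op)
                else:
                    out[i:i + 1] = replacement
                break
        else:
            out.append(op)
    return out


ops = [
    ("add", "Person", "nick"),
    ("rename", "Person", "addr", "address"),
    ("rename", "Person", "nick", "alias"),
    ("add", "Course", "credits"),
    ("delete", "Person", "alias"),
    ("rename", "Person", "address", "home"),
]
optimized = optimize(ops)
# optimized == [("rename", "Person", "addr", "home"), ("add", "Course", "credits")]
```

In the example, six operations shrink to two, so the extent of Person is visited once instead of five times.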

Targeted Project Activities
Implementation and Evaluation of the SERF Framework. Our immediate goal is to complete the development of the SERF framework prototype, as it will serve both as a testbed for additional enhancements and as a proof of concept. We are building SERF on top of the Persistent Storage Engine (PSE) written by Object Design Inc. We have developed both an OQL query engine and a schema evolution manager for PSE (CASCON'98). The SERF system has been accepted for demonstration at ACM SIGMOD 1999.
Extensions and Refinements of the SERF Framework. We plan to explore several extensions of the SERF framework. These include coverage of a larger part of the ODMG object model, e.g., relationship support, keys, or other types of constraints. We are also exploring the concept of "contracts" from the software engineering literature as a means to clearly characterize the intended behavior as well as the final expected outcome of all evolution operators, both primitive and complex ones. This would allow our SERF system to reason about the correct behavior of a single evolution operator as well as about the composition of several evolution operators into more complex ones. Increased consistency of user-defined evolution operators could thus be guaranteed, and roll-backs could be avoided.
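One way to picture the contract idea is the following sketch, which is an illustration under assumptions and not a design taken from SERF. Each hypothetical `Operator` carries a precondition (`requires`), a schema-level effect (`apply`), and a postcondition (`ensures`); a composition is checked by a dry run on a copy of the schema, so a violation is detected before any real data is touched.

```python
import copy
from dataclasses import dataclass
from typing import Callable

Schema = dict  # class name -> set of attribute names (toy model, an assumption)


@dataclass
class Operator:
    name: str
    requires: Callable[[Schema], bool]  # precondition on the schema
    apply: Callable[[Schema], None]     # schema-level effect
    ensures: Callable[[Schema], bool]   # postcondition on the schema


def add_attribute(cls, attr):
    return Operator(
        name=f"add_attribute({cls}, {attr})",
        requires=lambda s: cls in s and attr not in s[cls],
        apply=lambda s: s[cls].add(attr),
        ensures=lambda s: attr in s[cls],
    )


def delete_attribute(cls, attr):
    return Operator(
        name=f"delete_attribute({cls}, {attr})",
        requires=lambda s: cls in s and attr in s[cls],
        apply=lambda s: s[cls].discard(attr),
        ensures=lambda s: attr not in s[cls],
    )


def verify(schema, operators):
    """Dry-run a composition on a copy of the schema.

    Returns the name of the first operator whose contract fails,
    or None if the whole composition is consistent.
    """
    trial = copy.deepcopy(schema)
    for op in operators:
        if not op.requires(trial):
            return op.name
        op.apply(trial)
        if not op.ensures(trial):
            return op.name
    return None


schema = {"Person": {"name"}}
ok = verify(schema, [add_attribute("Person", "age"),
                     delete_attribute("Person", "age")])
bad = verify(schema, [delete_attribute("Person", "age")])
```

Here `ok` is `None`, while `bad` names the offending operator; in both cases the original schema is left unchanged, which is what makes a roll-back unnecessary.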
Optimization of SERF Templates. While our optimization work for SERF thus far has focused on pure sequences of schema evolution primitives, we also intend to study the optimization of complete SERF templates. A SERF template is composed of a mixture of OQL retrieval statements with embedded object manipulations and transformations, as well as schema-level retrieval and modification operations. Given that schema evolution operators, which ultimately correspond to function calls in a template, can be very expensive to execute, optimizing multiple queries that contain such expensive functions could yield major performance gains.
Towards Transparent Evolution Support for Complex Transformations. Now that the SERF system supports complex evolution operators in addition to a set of simple primitive ones, our goal is to apply to the SERF framework the concepts developed during the first two years of this NSF project for performing evolution transparently, that is, by operating upon views instead of the base schema shared by multiple users.

Indication of Success
Increased interoperability, ease of software migration over time, and more effective interfacing to legacy databases are critical problems faced by the software industry. Our project promises to provide practical solutions toward these important goals. In particular, the invention of the SERF framework this past year is a significant result, as it is the first approach to providing extensible yet consistent schema evolution support. We are developing the SERF framework using freeware so that we can release our code to the general public.

Project Impact
Impact on Human Resources
This project has partially funded four Ph.D. students in my database research groups: Harumi A. Kuno (female), Summer 1996, is working at HP Research Labs, Palo Alto in Cal. Young-Gook Ra (male), 1997, is working at Samsung Data Systems, Korea. Data Sun. Jyh-Liang Amy Lee (female) has accepted a tenure-track faculty position at the Hong Kong University of Science and Technology this year. Currently one new Ph.D. student at WPI, Kajal Claypool, as well as several Master's students and undergraduate students are involved.
Impact on education and curriculum development at all levels.
This project has strengthened education at the undergraduate level by providing small projects in which we can actively involve undergraduate students via REUs and directed study projects. It has also enhanced our graduate courses, e.g., the Object-Oriented Database course at the Univ. of Michigan and the Advanced Database course at WPI.
Collaboration with Industry.
Finally, we have had several interactions, including an on-site visit, with members of the PSE team at Object Design Inc., located in Burlington, Mass. They have provided us with a copy of the PSE software, and have developed a customized patch of the software with support features to enable on-line evolution of the database.

GPRA Outcome Goals
This project is resulting in the development of techniques, realized in the first system of its kind, for the support of complex transformations that can be customized by users and yet are guaranteed to be consistently executed within the context of our SERF framework. SERF has multiple important applications, ranging from schema evolution of one central system, to mapping between two heterogeneous databases or even data models (migration), to functioning as middle-layer technology by integrating multiple data sources and producing views.

Project References
The source code, documentation, and sample data sets for the SERF system will be made available at the SERF project web page, once the system is stable. The following is a selective subset of more recent publications:

  • E.A. Rundensteiner, K. Claypool, M. Li, L. Chen, X. Zhang, C. Natarajan, J. Jing, S. De Lima, S. Weiner, "SERF: ODMG-Based Generic Re-structuring Facility," ACM SIGMOD'99 Conf., Demo Session, May 1999.
  • Crestana, V. M., Lee, A. and Rundensteiner, E. A., "Consistent Schema Version Removal: An Optimization Technique for Object-Oriented Views", accepted for IEEE Trans. on Data and Knowledge Eng., subject to minor revisions, 1999.
  • Jones, M. and Rundensteiner, E. A., "Database Support for the Implicit Unfolding of Hierarchical Structures", accepted for IEEE Trans. on Data and Knowledge Eng., subject to minor revisions, 1998.
  • Kuno, H. A. and Rundensteiner, E. A., "Incremental Maintenance of Materialized Object-Oriented Views in MultiView: Strategies and Performance Evaluation", IEEE Trans. on Data and Knowledge Eng., Vol. 10, No. 5, Sept/Oct. 1998, pp. 768-792.
  • Lee, A. J., Koeller, A., Nica, A., and Rundensteiner, E. A., "Data Warehouse Evolution: Trade-Offs between Quality and Cost of Query Rewritings", Poster paper, Int. Conf. on Data Eng., Sydney, Australia, March 23-26, 1999.
  • Claypool, K. T., Rundensteiner, E. A., Chen, L., and Kothari, B., "Re-usable ODMG-based Templates for Web View Generation and Restructuring", CIKM'98 Workshop on Web Information and Data Management (WIDM'98), Washington, D.C., Nov. 6, 1998.
  • Claypool, K. T. and Rundensteiner, E. A., "OQL-SERF: An ODMG Implementation of the Template-Based Schema Evolution Framework", CASCON'98 Conference, Nov. 30 - Dec. 3, Mississauga, Ontario, Canada.
  • Claypool, K. T., Jin, J., and Rundensteiner, E. A., "SERF: Schema Evolution through an Extensible, Re-usable and Flexible Framework", Seventh Int. Conf. on Info. and Knowledge Management (CIKM'98), Washington, D.C., Nov. 1998.
  • Rundensteiner, E. A., Lee, A. J. and Ra, Y.-G., "Capacity-Augmenting Schema Changes on Object-Oriented Databases: Towards Increased Interoperability", Object-Oriented Information Systems (OOIS'98), Paris, Sept. 1998.
  • Rundensteiner, E. A., Kuno, H. A. and Zhou, J., "Incremental Maintenance of Materialized Path Query Views", Object-Oriented Information Systems (OOIS'98), Paris, Sept. 1998.
  • Lee, A. J. and Rundensteiner, E. A., "Data Warehouse Evolution: Consistent Metadata Management", Session: "Data Mining and Data Warehousing", 1998 IEEE International Conference on Systems, Man, and Cybernetics, San Diego, California, October 10-14, 1998.

Area Background
The general area of this project is object-oriented database systems. OODBs offer powerful and flexible modeling support to handle the needs of diverse application domains, including modeling constructs such as classes, abstractions, inheritance, reuse, and behavioral modeling. Thus, OODBs are typically used as a foundation when developing (re-usable) database support for application-specific features, such as temporal data, spatial data, and history management. One important issue for databases in general and for OODBs in particular is schema evolution, i.e., the modification of the schema and the associated application data during the lifetime of a database system. Most work in the literature has focused on the realization of schema evolution operations in the context of a particular data model or even a particular OODB, such as ObjectStore, GemStone, Itasca, or O2. Support is generally restricted to primitive evolution operations. Our project extends this previous work by providing complex and extensible evolution support, as well as transparent execution of such schema changes on a shared OODB while minimally impacting existing applications.

Area References

  • R. G. G. Cattell et al., The Object Database Standard: ODMG 2.0, Morgan Kaufmann Pub., 1st ed., 1997.
  • J. Banerjee, W. Kim, H.-J. Kim, and H. F. Korth, "Semantics and Implementation of Schema Evolution in Object-Oriented Databases," ACM SIGMOD 1987, pp. 311-322.
  • F. Bancilhon, C. Delobel, and P. Kanellakis, Building an Object-Oriented Database System: The Story of O2, Morgan Kaufmann Pub., 1992.

Potential Related Projects

  • Object-Oriented Database Systems.