Partner Organizations: Excelon Inc.: In-kind Support
Excelon provided software, and even developed software patches, to enable the development of the SERF system. We also had several interactions in the form of on-site meetings to establish collaborative efforts.
IBM has provided a corporate IBM fellowship to support one of my graduate students working on issues related to the NSF project. One graduate student has typically gone to IBM for the summer as an intern to work on these research efforts.
Mainly, we had collaborations with Dr. Gail Mitchell and Dr. Wang-chien Lee, both at Verizon Labs. They funded one of the Ph.D. students in our research group, Xin Zhang, via a year-long internship, allowing him to make contributions in the area of XML-to-relational mapping and view management. Several joint publications have resulted from this collaboration.
We had collaborations with Dr. Harumi Kuno of the E-services group at the HP Labs site in California. HP supported one of my Ph.D. students here, Hong Su, through a grass-roots program, allowing us to collaborate on issues related to DTD mappings and transformation generation, and allowing Hong to get some exposure to the HP research lab as well. A joint publication has resulted from this work.
Other collaborators:
We have had several interactions, including an on-site visit, with members of the PSE team at Excelon Inc. (formerly called Object Design Inc.), located in Burlington, Mass. They have provided us with a copy of the PSE software, and have developed a customized patch of the software with support for features that enable on-line evolution of both schema and persistent objects. They have continued to interact with us, updating their software as we discussed additional required fixes.
Activities and findings:
Research and Education Activities: Project Overview. This project investigates transparent schema change technology that allows for on-line modification of databases without disturbing existing applications. Our methodology is to integrate schema evolution and view support into one system. The resulting tool supports schema changes through a view rather than on the global schema, and preserves existing views through such changes. Within this framework of transparent schema change, this project explores the following issues: (1) develop object-oriented view technology; (2) integrate view and schema evolution concepts into one mechanism supporting complex transformations; (3) develop algorithms for complex schema transformations as well as for transparent evolution; (4) develop and compare different OODB-based frameworks suitable for view and schema evolution support; (5) develop optimization strategies for view evolution; (6) perform experimental studies to evaluate the relative effectiveness of the proposed techniques; and (7) extend the concepts and tools developed above as middle-layer services to distributed environments.
Recent specific Project Tasks:
* Implementation and testing of SERF (Schema Evolution through an Extensible, Re-usable and Flexible Framework) as a working prototype system. Demonstration of SERF at ACM SIGMOD'1999.
* Application of SERF to the problem of flexible web site management and restructuring to assess its limitations and its strengths. For this, we have developed XML to OO mapping and loading strategies. This activity resulted in a system called ReWeb. Demonstration of ReWeb at ACM SIGMOD'2000.
* Research into making SERF transformations consistent without limiting either their expressive power or the user's flexibility in expressing any transformation semantics they wish. For this, we have explored the notion of software contracts for assuring the correctness of a transformation before its application. Application of this idea to the transformation support of the OO relationship modeling constructs, such as unary and binary relationships, has been completed.
* We have designed and developed an extended framework for cross-model integration, called SANGAM. SANGAM has been applied to the integration of XML and relational data sources, and then to the evolution of their transformation to allow the two systems to stay in sync. The first prototype of the SANGAM framework has been demonstrated at ACM SIGMOD'01.
Findings: Our project goal is to increase interoperability between evolving software applications, to ease migration of software over time, and to make interfacing to legacy databases more effective by means of OODB technology. Highlights of our accomplishments towards that end include:
SERF Framework for Extensible Schema Evolution Support. While current OODB systems only offer a fixed set (taxonomy) of primitive schema evolution operations, there has been work in recent years to define more complex evolution operations, such as merge-classes and transform-object-to-value. We have instead developed the first solution approach for supporting arbitrarily complex schema evolution customized to the user's needs. Our solution framework, called SERF (Schema Evolution using Extensible, Re-usable, and Flexible Framework), is based on the general strategy of integrating a fixed set of primitive change operations with ODMG's object query language (OQL) as a vehicle for flexible object migration. This is the first framework in the database community focusing on extensibility of evolution support. The SERF framework has been designed and implemented to test its feasibility and limitations. SERF has been formally demonstrated at ACM SIGMOD 1999, and the XML-based web management system built using SERF technology has been accepted for demonstration at ACM SIGMOD 2000.
Initial research into making SERF transformations consistent, without limiting either their expressive power or the user's flexibility in expressing any transformation semantics, has resulted in a software-contract based solution. This approach has been applied to designing the first transformation methodology for correctly handling the evolution of semantic relationship constructs, such as unary and binary relationships. This represents a solid treatment of relationship evolution.
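The strategy described above, a complex transformation assembled from a fixed set of primitive operations plus a query-driven object migration step, can be sketched roughly as follows. This is only an illustration under stated assumptions: ToyDB, inline_class and the Person/Address schema are hypothetical names invented for the sketch, a plain Python loop stands in for an OQL query, and none of it reflects the actual SERF or PSE interfaces. The two assertions inside inline_class are meant in the spirit of the software-contract idea (a precondition checked before the transformation and a postcondition checked after it).

```python
# Toy in-memory stand-in for an OODB. Hypothetical; not the SERF or PSE API.
class ToyDB:
    def __init__(self):
        self.schema = {}   # class name -> set of attribute names
        self.extent = {}   # class name -> list of objects (plain dicts)

    # --- fixed set of primitive schema evolution operations ---
    def add_class(self, cls, attrs=()):
        self.schema[cls] = set(attrs)
        self.extent[cls] = []

    def add_attribute(self, cls, attr, default=None):
        self.schema[cls].add(attr)
        for obj in self.extent[cls]:
            obj[attr] = default

    def delete_attribute(self, cls, attr):
        self.schema[cls].discard(attr)
        for obj in self.extent[cls]:
            obj.pop(attr, None)

    def delete_class(self, cls):
        del self.schema[cls]
        del self.extent[cls]

    def create(self, cls, **values):
        obj = dict(values)
        self.extent[cls].append(obj)
        return obj


def inline_class(db, owner, ref_attr, target):
    """Complex transformation: fold the attributes of `target` into `owner`.

    Built only from the primitives above plus a migration step
    (a Python loop standing in for an OQL query).
    """
    # Precondition (contract): the owner class must reference the target.
    assert ref_attr in db.schema[owner], "owner must reference target"
    moved = sorted(db.schema[target])
    # 1. Primitives: extend the owner class with the target's attributes.
    for attr in moved:
        db.add_attribute(owner, attr)
    # 2. Object migration: copy values through the reference.
    for obj in db.extent[owner]:
        ref = obj[ref_attr]
        for attr in moved:
            obj[attr] = ref[attr]
    # 3. Primitives: drop the reference and the now-unused class.
    db.delete_attribute(owner, ref_attr)
    db.delete_class(target)
    # Postcondition (contract): the owner now carries all moved attributes.
    assert set(moved) <= db.schema[owner]


db = ToyDB()
db.add_class("Address", ["street", "city"])
db.add_class("Person", ["name", "address"])
home = db.create("Address", street="1 Main St", city="Springfield")
db.create("Person", name="Ann", address=home)
inline_class(db, "Person", "address", "Address")
print(sorted(db.schema["Person"]), db.extent["Person"])
```

The point of the sketch is that inline_class needs no new primitive: a user could write other transformations in the same way by recombining the same primitives with a different migration query.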
Beyond flexible XML-based web site management, applications that can benefit from our SERF technology include, for example, e-business applications and heterogeneous systems integration.
Sangam: A Flexible Meta Model Integration System. Another contribution is the development of our generic cross-model framework, called SANGAM. Two novel features currently under development are: 1. a cross-data-model algebra that can be utilized to algebraically describe the integration of two heterogeneous data sources, and 2. a meta model not only of the two data models to be integrated (the XML and relational data models) but also of the actual mapping strategies supported between two such data models. This allows for composing mapping strategies at the meta level, as well as for proving certain properties about the mappings and about the output they generate (such as type conformance). Both concepts remain to be implemented, integrated into our system, and tested.
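To make the notion of a mapping described at the meta level, and of type conformance of its output, slightly more concrete, here is a minimal sketch. It is not the SANGAM algebra or meta model, which are described above as still under development; RELATIONAL_SCHEMA, MAPPING and the three helper functions are hypothetical names, and the one-element-per-row mapping is an assumed, deliberately simple strategy.

```python
import xml.etree.ElementTree as ET

# Hypothetical target relational schema: table -> {column: Python type}.
RELATIONAL_SCHEMA = {"book": {"title": str, "year": int}}

# Hypothetical mapping, itself described as data (a tiny "meta model"):
# one XML element per row, child elements mapped to typed columns.
MAPPING = {
    "row_element": "book",
    "table": "book",
    "columns": {"title": "title", "year": "year"},  # child tag -> column
}


def mapping_conforms(mapping, schema):
    """Check, before any data is touched, that the mapping targets
    an existing table and only columns that exist in that table."""
    table = schema.get(mapping["table"])
    return table is not None and set(mapping["columns"].values()) <= set(table)


def apply_mapping(xml_text, mapping, schema):
    """Generate relational rows from XML, converting to the column types."""
    types = schema[mapping["table"]]
    rows = []
    for elem in ET.fromstring(xml_text).iter(mapping["row_element"]):
        row = {}
        for tag, column in mapping["columns"].items():
            row[column] = types[column](elem.findtext(tag))
        rows.append(row)
    return rows


def rows_conform(rows, mapping, schema):
    """Type conformance of the generated output against the schema."""
    types = schema[mapping["table"]]
    return all(
        set(row) == set(types)
        and all(isinstance(row[c], t) for c, t in types.items())
        for row in rows
    )


DOC = "<library><book><title>OODB Views</title><year>1999</year></book></library>"
assert mapping_conforms(MAPPING, RELATIONAL_SCHEMA)
ROWS = apply_mapping(DOC, MAPPING, RELATIONAL_SCHEMA)
assert rows_conform(ROWS, MAPPING, RELATIONAL_SCHEMA)
```

Because the mapping is ordinary data, mapping_conforms can reject an ill-formed mapping without reading a single XML document, which is the kind of property check the meta-level description is intended to permit.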
Training and Development: This project has partially funded several Ph.D. students in my database research group: Harumi A. Kuno (female), graduated Summer 1996 from the University of Michigan, is now working at HP Research Labs, Palo Alto, California. Young-Gook Ra (male), graduated 1997 from the University of Michigan, is now working at a university in Korea (after two years in industry at Samsung Data Systems, Korea). Jyh-Liang Amy Lee (female) has been employed as an Assistant Professor at the Hong Kong University of Science and Technology starting 1999. She now has plans to relocate back to the United States. Anisoara Nica (female), May 1999, has started a position at Sybase Corporation. Kajal Claypool (female) is completing her Ph.D. in 2001 on this project. Kajal has accepted a tenure-track faculty position for 2001/2002 at the University of Massachusetts, Lowell. Several other Ph.D. students at WPI have been involved in this NSF project in one form or another. This includes Andreas Koeller (male), who will be completing his Ph.D. in Fall 2001. Andreas is also considering an academic career. Others are Xin Zhang (male), Hong Su (female), and Songting Chen (male), all of whom are likely to go either into academia or into an advanced industrial lab once graduated. In addition, numerous Master's students and undergraduate students have been involved in our project activities. This project has provided all of them with an array of training, including design of larger software, development of software, teamwork, testing of software, experimental evaluation and performance studies, as well as development of reports and presentations on the project. In fact, through weekly project meetings, several of the students had a continued opportunity to practice their presentation and discussion skills, and hence also their teaching skills. Impact on education and curriculum development has been at all levels.
This project has enriched education at the undergraduate level by providing small projects in which we can actively involve undergraduate students through directed study projects, REUs, and the CRA summer program. It has also enhanced our graduate courses, e.g., the Advanced Database course at WPI.
Outreach Activities: This project has involved several female undergraduate students via the CRA summer mentorship program over several summers, providing them with a fruitful environment in which to learn about graduate school in general, about working in a group, and about conducting research and development. We hope that this will help attract women to the Computer Science discipline and encourage them to aim for higher education.
Journal Publications:
Other Specific Products:
SERF (Schema Evolution using an Extensible, Reusable and Flexible Framework), a generic restructuring facility, has been designed and implemented by several students at WPI over the past several years. It is an ODMG-compliant system that is 100% Java-based and implemented on top of the commercial object server PSE (Persistent Storage Engine) by Object Design Inc. (now called Excelon Corp.) as its basic platform. It is composed of an OQL query engine, an ODMG-compliant meta repository, a library manager, the restructuring layer, and a Graphical User Interface. This system was also selected for demonstration and publication at ACM SIGMOD'99, one of the premier conferences in the field, in May 1999. [ The corresponding publication is E.A. Rundensteiner, K. Claypool, M. Li, L. Chen, X. Zhang, C. Natarajan, J. Jing, S. De Lima, S. Weiner, "SERF: ODMG-Based Generic Re-structuring Facility," ACM SIGMOD'99 Conf, Software system demonstration and paper, Philadelphia, USA, May 1999. ]
We have made the system available to anyone who has requested it. In particular, we are developing an on-line client-server application of the system so that visitors to our site can try the demo without having to download and install the system themselves.
EVE, the Evolvable View Environment system, is a distributed relational data warehousing system that has been developed at WPI over the past two years. EVE maintains materialized data warehouses for high-performance analysis under both schema and data changes of its underlying data sources. EVE is a middle-layer service in Java that interfaces to different relational database engines, such as Oracle 7.0 and Microsoft Access, via JDBC and a JDBC-ODBC bridge, respectively. Different versions of this system have been presented at several conferences over the past two years. Most notably, the EVE prototype system has recently been selected for demonstration and publication at ACM SIGMOD'99, one of the premier conferences in the field. [ The corresponding publication is Elke A. Rundensteiner, Andreas Koeller, Xin Zhang, Amy J. Lee, Anisoara Nica, Amber VanWyk, and Yong Li, "Evolvable View Environment (EVE): Non-Equivalent View Maintenance under Schema Changes," ACM SIGMOD'99 Conf, Software system demonstration and paper, Philadelphia, USA, May 1999. ]
We have made both a live demonstration of the EVE system and the actual source code available on our project page.
MultiView was the first object-oriented view system that dealt with integrated class and view hierarchies and that had extensive optimization strategies for incremental view maintenance. MultiView was built in Smalltalk on top of the Gemstone OODB system by Servio Logic.
We have made the system available as downloadable source code at the University of Michigan.
TSE (Transparent Schema Evolution Technology). TSE was built on top of the MultiView OODB system. TSE combined schema evolution with view support in order to enable transparent evolution of schemata by diverse user groups that were cooperating on related OODB data sources.
TSE was released as freeware together with the MultiView system at the University of Michigan in Ann Arbor.
DyDa: Data Warehouse Maintenance under Fully Concurrent Environments, Songting Chen, Jun Chen, Xin Zhang, Andreas Koeller and Elke A. Rundensteiner, Software system demonstration, Proceedings of SIGMOD'01, Santa Barbara, CA, USA, May 2001. DyDa features one strategy for maintaining a data warehouse under both concurrent data updates and concurrent schema changes in distributed data sources.
The DyDa system was demonstrated at ACM SIGMOD'01. A release on our webpage is planned.
SANGAM Model Management - A Solution to Support Multiple Data Models, Their Mappings and Maintenance, K. Claypool, E. A. Rundensteiner, X. Zhang, H. Su, H. Kuno, G. Mitchell, and W.C. Lee, software demonstration at ACM SIGMOD, May 2001.
The SANGAM system was demonstrated at ACM SIGMOD'01.
http://davis.wpi.edu/dsrg/OOSE
Please see also: http://davis.wpi.edu/dsrg
Contributions:
Contributions within Discipline:
The area of this project is object-oriented database systems (OODBs). OODBs offer powerful and flexible modeling support to handle the needs of diverse application domains, including modeling constructs such as classes, abstractions, inheritance, reuse, and behavioral modeling. Thus, OODBs are typically used as the foundation when developing (re-usable) database support for application-specific features, such as spatial data or history management. One important issue for databases in general and for OODBs in particular is schema evolution, the modification of the schema and the associated application data during the lifetime of a database system. Most work in the literature has focused on the realization of schema evolution operations in the context of a particular data model or even a particular OODB, such as ObjectStore, GemStone, Itasca, or O2. Such support is restricted to primitive evolution operations only. Our project extends this previous work by providing complex, extensible evolution support, as well as transparency in executing such schema changes on a shared OODB while minimally impacting existing applications. This project thus contributes to the base of knowledge in the database field by providing a novel approach to extensible yet flexible evolution support. This includes the development of a well-founded theory of evolution, as well as the design and implementation of an actual prototype that verifies the feasibility of the proposed ideas. Lastly, we have also conducted a number of experimental studies that demonstrate the performance advantages and limitations of our proposed optimization strategies. Such a testbed and the associated experimental findings are beneficial for anyone wanting to proceed along this line of research and/or to build a commercial software system for evolution support.
In addition, we have begun to address the related problem of modeling, and then evolving, the mappings across heterogeneous data sources, rather than just OODBs. In particular, we have developed a general framework for cross-model mappings that includes a cross-model algebra, update propagation operators, as well as a meta-model of the cross-maps themselves.
While our background and major emphasis is on database and information technologies, our ideas for flexibly handling the evolution of systems should also have an impact on the software engineering industry. Easing the migration of software over time and interfacing to legacy databases more effectively are critical problems that are also being explored by the academic software community and by the software industry in general. Our techniques can be applied to address evolution, migration and integration problems in a wide array of disciplines, such as engineering, medicine and other application areas.
This project has partially funded several Ph.D. students who have now gone either into academia or into research labs. This is a contribution of the project to the skilled workforce so desperately needed by our nation, and more so to providing the future teachers for our universities. For example, Harumi A. Kuno (female), graduated Summer 1996 from the University of Michigan, is now working at HP Research Labs, Palo Alto, California. Young-Gook Ra (male), graduated 1997 from the University of Michigan, first joined Samsung Data Systems, Korea, but is now a Professor in the Computer Engineering Department of Hankyong National University, Korea. Jyh-Liang Amy Lee (female), graduated 1998 from the University of Michigan, is a tenure-track faculty member at the Hong Kong University of Science and Technology this year. She is currently in the process of relocating back to the United States. Kajal Claypool (female) is completing her Ph.D. in the fall of 2001 on this project. Kajal has accepted a tenure-track faculty position for 2001/2002 at the University of Massachusetts, Lowell. Several other Ph.D. students at WPI have been involved in this NSF project in one form or another. This includes Andreas Koeller, Xin Zhang, Hong Su, and Songting Chen, all of whom are likely to go either into academia or into an advanced industrial lab once graduated. In addition, several Master's students have conducted their Master's thesis research under this project, as listed under the publications list. Furthermore, a number of undergraduate students have been involved in activities of this project through the years. Lastly, this NSF project has led to course projects for students in both our graduate and our undergraduate database courses at WPI, such as the Database Systems course (CS542) and the Advanced Database Systems course (CS561). These projects thus contribute to the education of our students.
Several software systems have been released as part of this project, which can now be used freely by others in their research as well as in educational projects. Such software can be utilized as a starting point for course projects by others, or as a testbed for conducting other types of experimental studies beyond what we are attempting in our current project.
Increased interoperability, ease in migration of software over time, and the possibility of interfacing to legacy databases more effectively are critical problems faced by the software industry. Our project promises to provide practical solutions towards these important goals. The SERF framework is the first approach to providing extensible yet consistent evolution support. Our solutions are based on standard technologies whenever possible and appropriate, such as the ODMG object model and the OQL query language, to allow quick dissemination and use of our ideas by industry. SERF has multiple important applications, ranging from schema evolution of one central system, to mapping between two heterogeneous databases or even data models (migration), to functioning as middle-layer technology by integrating multiple data sources and producing views. As another example, the development of SANGAM, a framework for supporting cross-model mapping at an algebra level, has the potential to offer a much more disciplined approach towards integrating databases with distinct data models, such as XML data sources and relational data engines.
Special Requirements for Annual Project Report:
Unobligated funds: less than 20 percent of current funds