Data Warehouse Maintenance Under
Concurrent Schema and Data Updates
- Project Overview
- Data warehouses are built by
gathering information from several information sources
and integrating it into one repository customized to
users' needs. Two kinds of updates to
information sources (ISs) can affect a
materialized data warehouse: data updates (DU)
and schema evolution changes (SE). Recently, several
algorithms have been proposed in the literature for
incrementally maintaining materialized data warehouses under
concurrent data changes. To the best of our knowledge,
the EVE system being built at the Worcester Polytechnic
Institute is the only system that has addressed
maintenance of a data warehouse after schema changes of
multiple ISs. The possible concurrency of SE changes
performed by different ISs has thus far not been
explored. This thesis is the first work to focus on the
problem of handling concurrent data updates and schema
changes of distributed ISs. We propose an overall
approach for concurrency control, named the Schema
evolution and Data update Concurrency Control (SDCC)
system. We present algorithms to handle some sub-problems
of this general concurrency problem. In the future, we
will also implement the algorithms in the context of the
EVE system and evaluate their performance and
correctness.
- Project Members
- Advisor
- Elke A. Rundensteiner
- Graduate Students:
- Xin Zhang
- Andreas Koeller
- Publications
- Xin Zhang and Elke A. Rundensteiner,
Data Warehouse Maintenance Under Concurrent Schema and Data Updates,
Thesis Proposal, Computer Science Department,
Worcester Polytechnic Institute, April 28, 1998.
Copyright - Xin Zhang (xinz@cs.wpi.edu)