AREA 1: Web Data Management
XML is an emerging standard for flexible information sharing on the
web, and several query languages have recently been proposed for
declaratively extracting information from XML documents.
1.A.
SQL combines the designation of the search space, the search criteria
and the presentation of the query result in a single query expression.
Hence, these three aspects are likely to be what is currently expected
of a declarative query language.
How do the different query languages proposed for XML documents,
notably XSL, XQL and XML_QL, deal with these three aspects? What are
the implications for procedural languages used to implement the queries?
1.B.
Propose a general algorithm for translating XML_QL queries into
SQL-92, assuming a 'semantic' mapping from XML to the relational model
(e.g., like the one proposed in the VLDB'99 paper or the one you, Gail
Mitchell and W.C. Lee have proposed recently) and not a 'syntactic'
mapping such as those studied by Kossmann et al. First, briefly describe
the data model mapping you assume as basis for your query
translation strategy. Then, present your strategy or algorithm. Discuss
whether there are query classes in XML_QL that cannot be mapped to SQL.
Indicate also the expected efficiency or inefficiency of the resulting
SQL queries.
1.C.
Briefly indicate whether mapping to SQL-3 (SQL with OO extensions)
instead would be more or less promising, and give your intuition why.
There is no need to actually propose a full algorithm for this question.
AREA 2: Data Warehousing
Data warehousing is a critical technology currently being employed by
businesses for effective decision support. Data warehousing in
industry to date is based on relational database technology. With the
emergence of the internet, we may now want to consider how 1)
semi-structured data (say XML) and 2) the increasing scale of widely
distributed sites influence warehousing technology as well as increase
its potential.
2.A.
Provide a classification of the architectural design choices possible
for tackling data warehousing in this web context, i.e., for building
data warehouses over web sources. What are the research issues and
problems that must be addressed in each of the different architectural
solutions for web warehousing? You should characterize the problems but
you do not need to provide solutions to them. You should also conduct a
literature search and categorize ongoing project efforts according to
your classification above.
2.B.
The dynamic data warehousing system (DYDA) has been developed for
incrementally yet consistently maintaining a relational data
warehouse under a mixture of concurrent schema and data changes of
multiple autonomous sources. Sketch how you would modify the DYDA
solution and its algorithms to be able to deal with XML data sources
and with XML_QL as the query language used for data warehouse
generation. If no modifications are needed, justify why.
AREA 3: Software Engineering, in particular, software repositories,
UML modeling, metadata
3.A.
XML's limitations as a systems integration solution. XML is often
cited as the solution for integrating systems. Why? What are the issues
that must be addressed when integrating systems? Which of these
integration problems does XML actually solve, if any? Which of these
integration problems does XML fail to solve?
3.B.
Metadata repositories, such as Rochade, have been proposed as a
technology for integrating the different phases of the software
development lifecycle. Analyze and discuss the pros and cons of
this solution approach, i.e., what does it solve and what
remains unsolved? How would XML fit into that picture, if at all?