
Related work

A related project on Web Source wrappers is described in [3], which presents a language for specifying extractors for semi-structured OEM objects. The extraction language is based on the syntactic recognition of HTML structures and is therefore sensitive to syntactic changes in the HTML document. The approach in [4] supports semantic recognition of HTML documents using built-in operators and therefore seems to be more robust. Its authors provide a toolkit for specifying the capabilities of Web Sources and defining the wrapper functionality. The problem with all of these approaches is that they try to build exact-result mapping wrappers, which have to be rebuilt if the source changes even slightly.
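The fragility of purely syntactic extraction can be illustrated with a minimal Python sketch. This is not the extraction language of [3]; the page fragments, the rule, and the function name extract_title are hypothetical and chosen only for illustration. The rule is tied to the exact tag layout, so a cosmetic change in the markup makes it fail even though the data is unchanged.

```python
import re

# Hypothetical page fragments; markup and field names are illustrative only.
OLD_PAGE = "<tr><td><b>Title</b></td><td>Database Systems</td></tr>"
# Same data, but the label is now marked up with <strong> instead of <b>.
NEW_PAGE = "<tr><td><strong>Title</strong></td><td>Database Systems</td></tr>"

# A purely syntactic rule: the value is the table cell that follows
# a cell containing the bold label "Title".
TITLE_RULE = re.compile(r"<td><b>Title</b></td><td>(.*?)</td>")


def extract_title(html):
    """Return the text matched by the syntactic rule, or None if it no longer fits."""
    match = TITLE_RULE.search(html)
    return match.group(1) if match else None


print(extract_title(OLD_PAGE))  # the rule matches the original layout
print(extract_title(NEW_PAGE))  # the rule fails after the markup change
```

Under these assumptions, the wrapper has to be rewritten after the change although the source still publishes the same title.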

There has been much research on the automatic generation of extractors (e.g. [5]). The work in [5] proposes a tool that generates extractors for Web Sources whose data follows a hierarchical structure. Syntactic recognition of HTML structures, together with heuristics about those structures, is used to automatically generate a program that extracts the data. Extractors are not a problem in our project because we use HTML forms.

A project that uses the same case study is presented in [6]. The system its authors designed (ARANEUS) is elegant and relies mainly on existing technology. However, because it uses an object model for the ADMs, the view definition process is assumed to depend on the navigation. Navigation, in turn, is determined by the structure of the Web pages and must be modified whenever the Web pages are modified. When the navigation changes, the ADM and the corresponding Editor programs must be changed as well. As both ADMs and Editor programs are special-purpose constructs, re-coding them frequently might not be convenient.

Andreas Koeller
Mon May 10 13:40:38 EDT 1999