Mesodata: Engineering Domains for Attribute Evolution and Data Integration

Author: Denise Bernadette Angela de Vries

de Vries, Denise Bernadette Angela, 2006 Mesodata: Engineering Domains for Attribute Evolution and Data Integration, Flinders University, School of Informatics and Engineering

This electronic version is made publicly available by Flinders University in accordance with its open access policy for student theses. Copyright in this thesis remains with the author. This thesis may incorporate third party material which has been used by the author pursuant to Fair Dealing exceptions. If you are the owner of any included third party copyright material and/or you believe that any material has been made available without permission of the copyright owner please contact copyright@flinders.edu.au with the details.

Abstract

The introduction of databases for data storage and handling revolutionised the way we dealt with records and enabled simple and fast information processing, aggregation and summarisation. Database and information technology systems have evolved from simple file processing systems to powerful database systems. Data management technology has progressed from hierarchical and network systems to relational databases, data modelling tools, and indexing and organisational techniques. The development of Relational Database Management Systems and automated systems put the layout and form into the unchanging metadata and gave us record-once systems. Unfortunately, the 'real world' upon which databases are modelled constantly changes. These changes may affect the schema for a variety of reasons, including:
- Unanticipated requirements
- A change in the universe of discourse
- A change to the interpretation of facts about the universe of discourse
- Changes in the form of updates to effect upgrades to the functionality or scope of a system
- Changes in the form of updates to effect efficiency improvements
- Changes caused by system operation
- Error correction
Different formalisms have been developed to deal with schema changes, with the aim of preserving information capacity and semantic correctness. Schematic changes may be the result of evolving one system or may arise from the need to merge two or more systems. Schematic conflicts occur that must be resolved, and the schemata unified, to produce a new version. To reach this goal there are graph-based schema integration architectures as well as semiautomatic systems applying schema matching and schema translation techniques. These systems also utilise ontologies, thesauri, and so forth to integrate data from heterogeneous sources in order to process queries and views. Data integration or conversion remains a partially resolved issue.
Some metadata changes are managed by changes to application code and system downtime for conversion procedures. However, an attribute change may result in data loss, changed accuracy, and altered semantics. Whilst the use of ontologies, concept graphs and other knowledge interchange techniques is alleviating the problems of data integration, these structures are not yet an integral part of the database architecture. This thesis argues for a three-level architecture for relational databases, with an interface positioned between data and metadata for complex domains. This intermediary level is the mesodata layer. The mesodata layer, separate from the metadata and data, provides complex structures, such as graphs, queues, and circular lists, in which to store domain values and their inter-relationships, as well as supplying the 'intelligence' required to operate on and manipulate them. The domain structures enable different orderings that form the bases of filters for enhanced querying and information retrieval. DBMS-supplied mesodata types would allow for the re-usable inclusion of domain information, such as that found in ontologies, taxonomies and concept graphs, which to date have been only application-specific.
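
As a rough illustration of the idea of a domain held in a graph structure and used as a query filter, the following Python sketch may help. It is not taken from the thesis: the class name `GraphDomain`, the method `within`, the helper `select_near` and the colour data are all hypothetical, and a plain in-memory list of dictionaries stands in for a relation.

```python
from collections import deque


class GraphDomain:
    """Hypothetical mesodata type: domain values held as nodes of an undirected graph."""

    def __init__(self, edges):
        # Adjacency sets record which domain values are directly related.
        self.adj = {}
        for a, b in edges:
            self.adj.setdefault(a, set()).add(b)
            self.adj.setdefault(b, set()).add(a)

    def within(self, value, max_hops):
        """Return the domain values reachable from `value` in at most `max_hops` edges."""
        seen = {value: 0}
        queue = deque([value])
        while queue:
            current = queue.popleft()
            if seen[current] == max_hops:
                continue  # do not expand beyond the hop limit
            for neighbour in self.adj.get(current, ()):
                if neighbour not in seen:
                    seen[neighbour] = seen[current] + 1
                    queue.append(neighbour)
        return set(seen)


def select_near(rows, attribute, domain, value, max_hops):
    """Keep the rows whose attribute value lies within `max_hops` of `value` in the domain."""
    allowed = domain.within(value, max_hops)
    return [row for row in rows if row[attribute] in allowed]


# Illustrative domain (assumed data): colour names and their neighbouring colours.
colour = GraphDomain([("red", "orange"), ("orange", "yellow"), ("yellow", "green")])

# Illustrative relation whose `colour` attribute draws its values from that domain.
rows = [
    {"id": 1, "colour": "red"},
    {"id": 2, "colour": "yellow"},
    {"id": 3, "colour": "green"},
]

# Rows whose colour is orange or directly adjacent to orange in the domain graph.
near_orange = select_near(rows, "colour", colour, "orange", 1)
```

Here the relationships between values live in the domain structure rather than in application code, so the same `GraphDomain` could in principle back any attribute that shares the domain.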

Keywords: Mesodata, metadata, data, data management technology, Relational database management systems, schema changes, schematic changes, schema integration architectures
Subject: Computer Science thesis

Thesis type: Doctor of Philosophy
Completed: 2006
School: School of Computer Science, Engineering and Mathematics
Supervisor: John Roddick