Computer Science Department
School of Computer Science, Carnegie Mellon University


Mediating Among Diverse Data Formats

John Ockerbloom

January 1998

Ph.D. Thesis

Keywords: Information system applications, mediators, distributed systems, abstract data types, data formats, conversion, object-oriented methodology

The growth of the Internet and other global networks has made large quantities of data available in a wide variety of formats. Unfortunately, most programs are only able to interpret a small number of formats, and cannot take advantage of data in unfamiliar formats. As the Internet grows, new applications arise, and legacy data persists, the diversity of formats will continue to increase, worsening the problem. Current approaches to data diversity fail to scale up gracefully, or fail to handle the full heterogeneity of data and data sources founds on the Internet.

I have developed a data model and a system of mediator agents that support the widespread use of diverse data formats much more effectively than current approaches do. In this thesis, I describe and evaluate the design and implementation of this data model, known as the Typed Object Model (or TOM), and the system of mediators that supports it. TOM is a read-only object-oriented data model that describes the abstract structure of data formats, their concrete representations, and relations between formats. TOM is supported by a distributed network of mediator agents (known as type brokers) that maintain information about data formats, and provide uniform access to conversions and other operations on those formats. Type brokers plan complex conversion strategies that can involve multiple servers, and ensure that conversions preserve information encoded by clients. Data providers can also register new formats, operations, and conversions with type brokers in a manner, and make them usable anywhere on the Internet. TOM type brokers now work with hundreds of data formats, often through integration of off-the-shelf programs. TOM also supports a wide variety of applications and interfaces, such as the Web-based TOM Conversion Service, that have users worldwide.

166 pages

Return to: SCS Technical Report Collection
School of Computer Science homepage

This page maintained by