Semantic Technology Conference | May 20-24, 2007

An Analysis of Publicly Available OWL Ontologies and Their Properties

Baden Hughes
Program Manager - Information Management
The University of Melbourne


 

Tuesday, 5/22/2007
2:00 PM - 3:00 PM
Level: Technical - Intermediate

A significant portion of the mechanics of the Semantic Web is based on a common understanding of the semantics of concepts and their relations. As such, many parties are actively developing ontologies which describe their domains of interest, using technologies such as the W3C's Web Ontology Language (OWL). However, given the strong interoperability concerns in the Semantic Web space, it is arguable that leveraging existing investments in ontologies which cover significant concept spaces is in fact the most efficient approach. Hence our motivation for this work is to determine the availability and coverage of publicly available OWL ontologies, as a first step towards determining whether there is a justification for enterprises to implement new OWL ontologies from the ground up, or whether extant OWL ontologies can be used to bootstrap the development process and provide the foundation for greater semantic interoperability.

In conducting this survey, we use a metasearch approach across the Google, Yahoo and MSN search engines to find ontologies expressed in OWL which are generally available on the web. In essence, this is based on the query: owl +filetype:owl. Google reports 61.2K OWL documents, Yahoo reports 53.4K and MSN reports 58.7K. It is safe to assume there is some overlap in the result sets between these engines, and hence the resultant URIs are sorted for uniqueness. We programmatically retrieve all of these URIs, resulting in a collection of 48.2K OWL documents. Having obtained this collection, we proceed to analyze the OWL documents in various ways including:

  • determining the upper level ontologies to which these OWL ontologies are connected;
  • counting the number of concepts available in each OWL ontology to gauge the relative size of coverage;
  • classifying ontologies by their OWL sublanguage: OWL Lite, OWL DL and OWL Full.
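
As a rough illustration of two of the steps described above (merging the per-engine result sets into unique URIs, and counting the concepts in a retrieved document), a minimal Python sketch might look like the following. It is not the tooling used in the survey. The function names, the example URIs, and the choice to treat named owl:Class elements in RDF/XML as "concepts" are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

OWL_NS = "http://www.w3.org/2002/07/owl#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def merge_unique_uris(*result_sets):
    """Merge per-engine result lists into one sorted list of unique URIs.

    Hypothetical helper: fragments and surrounding whitespace are dropped so
    that trivially different spellings of one document collapse together.
    """
    unique = set()
    for results in result_sets:
        for uri in results:
            cleaned, _fragment = urldefrag(uri.strip())
            if cleaned:
                unique.add(cleaned)
    return sorted(unique)


def count_named_classes(rdfxml_text):
    """Count distinct named owl:Class elements in an RDF/XML document.

    Assumption: a "concept" is an owl:Class with rdf:about or rdf:ID.
    Anonymous classes (restrictions, unions and so on) are ignored.
    """
    root = ET.fromstring(rdfxml_text)
    names = set()
    for element in root.iter(f"{{{OWL_NS}}}Class"):
        name = element.get(f"{{{RDF_NS}}}about") or element.get(f"{{{RDF_NS}}}ID")
        if name:
            # Treat rdf:ID="Dog" and rdf:about="#Dog" as the same class.
            names.add(name.lstrip("#"))
    return len(names)


if __name__ == "__main__":
    google = ["http://example.org/a.owl", "http://example.org/b.owl#Top"]
    yahoo = ["http://example.org/b.owl", " http://example.org/c.owl "]
    print(merge_unique_uris(google, yahoo))
```

A real pipeline would also need to handle retrieval failures, non-RDF/XML serializations, and documents that match the query without being OWL ontologies; none of that is covered here.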

We embark upon a deeper analysis of the OWL ontologies to evaluate their domain and concept coverage. To do this, we use a graph-based approach called OWL Lite Alignment to overlay ontologies and compare their similarities and differences. We find that there is a high density of available ontologies in a range of domains, and from this draw inferences about the relative benefit of building an OWL ontology from the ground up vs. connecting to existing instances.
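
The abstract does not describe how OWL Lite Alignment works, so the sketch below does not attempt to reproduce it. As a much simpler stand-in for the idea of overlaying two ontologies and comparing them, it computes the Jaccard overlap of their concept names. The function names, the example URIs, and the decision to match concepts by lowercased local name are assumptions made purely for illustration.

```python
def local_name(uri):
    """Return the part of a URI after the last '#' or '/', lowercased."""
    tail = uri.rsplit("#", 1)[-1]
    tail = tail.rstrip("/").rsplit("/", 1)[-1]
    return tail.lower()


def concept_overlap(concepts_a, concepts_b):
    """Return (shared local names, Jaccard similarity) for two concept lists.

    Assumption: two concepts are "the same" if their local names match
    case-insensitively. A real alignment method would use far more evidence.
    """
    names_a = {local_name(c) for c in concepts_a}
    names_b = {local_name(c) for c in concepts_b}
    shared = names_a & names_b
    union = names_a | names_b
    jaccard = len(shared) / len(union) if union else 0.0
    return shared, jaccard


if __name__ == "__main__":
    zoo = [
        "http://a.example/zoo#Animal",
        "http://a.example/zoo#Dog",
        "http://a.example/zoo#Cat",
    ]
    pets = [
        "http://b.example/pets/Dog",
        "http://b.example/pets/Cat",
        "http://b.example/pets/Fish",
    ]
    shared, score = concept_overlap(zoo, pets)
    print(sorted(shared), score)
```

A high score between an existing ontology and the concepts an enterprise needs would argue for connecting to that ontology rather than building a new one, which is the kind of comparison the abstract has in mind.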


Baden Hughes is Program Manager, Information Management in Planning and Project Services, Information Services at The University of Melbourne, Australia. His responsibilities cover a variety of domains including enterprise information architecture, web services and applications, business intelligence, information management strategy and project management. Prior to this position he was Senior Research Fellow in the Department of Computer Science and Software Engineering at The University of Melbourne.


   