Thursday, 5/24/2007, 8:30 AM - 9:30 AM
Level: Technical - Intermediate/Case Study
Sean Boisen is leading an effort at Logos Research Systems to build a semantic knowledgebase encompassing general background information about entities and relationships from the Bible (one of the world's most popular collections of information). The scope includes people, places, belief systems, ethnic attributes, social roles, as well as family and other inter-personal relationships, places visited, etc. This Bible Knowledgebase (BK) will be used to support knowledge discovery and visualization in both desktop and web-server configurations for Logos' products. It will also provide an integration framework for Logos' substantial digital library (more than 7000 titles from over 100 different publishers). The project is a good example of what it takes to move a real-world, knowledge-intensive application into a Semantic Web framework.
Some interesting technical aspects of the work:
- Building BK involved combining a substantial collection of legacy data (several thousand entities representing all the named people from the Bible, their family relationships, and references to their occurrence in the Bible), converted to RDF using XSLT, with a separately developed OWL ontology (New Testament Names, http://semanticbible.asamasa.net/ntn/ntn-overview.html). Successfully merging these two data sets required resolving several technical issues related to ontology modeling and URI construction.
- Incorporating provenance data to indicate the source of information has been a key design goal. Consequently, the BK ontology includes a substantial hierarchy of reified relationships, which has some interesting consequences for development. I've found relatively little available information on this subject (http://www.w3.org/TR/swbp-n-aryRelations/#RDFReification and http://composing-the-semantic-web.blogspot.com/2006/07/reifying-reified-relationships.html are two brief relevant technical notes).
- We are building tools and processes to enable non-specialist domain experts to extend and expand the knowledgebase, a key issue for smaller companies like Logos that can't afford to hire a staff of ontology experts.
- BK will provide an indexing and integration framework for our large collection of digital texts (which are currently indexed by simple natural language terms). This is a key problem for semantically-oriented digital publishing.
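As a rough illustration of the reified-relationship and URI-construction points above, here is a minimal Python sketch. It is not the BK implementation: the `http://example.org/bk/` namespace, the `entity_uri` and `reify_relationship` helpers, the property names, and the sample data are all assumptions made for this example. The idea is that each relationship becomes its own node, so a source can be attached to the relationship itself rather than to either person.

```python
from typing import List, NamedTuple
from urllib.parse import quote

# Hypothetical namespace; the real BK URIs are not given in this abstract.
BK = "http://example.org/bk/"


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def entity_uri(kind: str, name: str) -> str:
    """Build a URI for a named entity: spaces become underscores,
    and any remaining unsafe characters are percent-encoded."""
    return f"{BK}{kind}/{quote(name.strip().replace(' ', '_'))}"


def reify_relationship(rel_id: str, rel_type: str, subject: str,
                       obj: str, source: str) -> List[Triple]:
    """Represent one relationship as its own node so that provenance
    (the source of the assertion) can be attached to it."""
    node = f"{BK}relationship/{rel_id}"
    return [
        Triple(node, "rdf:type", f"{BK}{rel_type}"),
        Triple(node, f"{BK}subject", subject),
        Triple(node, f"{BK}object", obj),
        Triple(node, f"{BK}source", source),  # provenance of this relationship
    ]


# Illustrative data only.
abraham = entity_uri("person", "Abraham")
isaac = entity_uri("person", "Isaac")
triples = reify_relationship("r1", "FatherOf", abraham, isaac, "Genesis 21:3")

for t in triples:
    print(t.subject, t.predicate, t.obj)
```

One consequence of this design is that a single family relationship costs four triples instead of one, and queries must traverse the intermediate node, which is presumably part of what makes reified hierarchies interesting for development.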
Sean spent 19 years in advanced R&D and technology development at BBN Technologies, primarily in human language technology. He worked on several projects integrating natural language processing and Semantic Web technologies. Since January he has been at Logos Research Systems, where he is leading the design and development of the semantic knowledgebase as well as other activities in natural language processing. Logos is the largest developer of Bible software and a worldwide leader in multilingual electronic publishing, with more than 7000 titles, supporting users in more than 180 countries and a dozen different languages. Sean has an MA in computational linguistics from UCLA.