Semantic Technology Conference | May 20-24, 2007

Self-Learning Model for Semantic Recognition: Scalable Automation for Product Data Quality

Ed Green
CTO
Silver Creek Systems


 

Tuesday, 5/22/2007
2:00 PM - 3:00 PM
Level: Technical - Intermediate

Product data often has little structure and is highly variable. It is rife with abbreviations and truncations, and it contains special codes and other non-standard tokens that are often semantically ambiguous, which makes it difficult to apply traditional statistical learning techniques for semantic recognition. Silver Creek Systems has pioneered highly efficient ways to build reusable semantic models that recognize and restructure complex product information. This is of high value in automating, for the first time, data quality processes in the e-commerce information supply chain, particularly in association with PIM/MDM strategies. These models encompass thousands of product categories and tens of thousands of product attributes. Using this base of information supplied by subject-matter experts, Silver Creek Systems is creating semantic recognition that is automatically extended, or ‘learned’, without the direct involvement of a human subject-matter expert. Bayesian inference and other estimation procedures are used in conjunction with the preexisting semantic information to extract additional recognition from unparsed and unseen product data. The benefits of this approach are:

  • Rapid extension of semantic recognition within a particular domain
  • Even greater flexibility and reusability with new data sources
  • Simpler and faster maintenance of the semantic models
  • Rapid automation of Product Data Quality processes that are currently done by hand
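
As a rough illustration of how Bayesian inference can extend an expert-built model to an unseen token, the sketch below labels an unknown token from the attributes of its recognized neighbors. It is a minimal naive Bayes example under stated assumptions, not a description of Silver Creek Systems' actual method: the seed lexicon, the sample descriptions, the attribute names, and the functions train, posterior, and guess_label are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical seed lexicon from a subject-matter expert: token -> attribute.
SEED_LEXICON = {
    "res": "product_type", "cap": "product_type",
    "10": "value", "47": "value", "100": "value",
    "ohm": "unit", "uf": "unit",
    "5%": "tolerance", "10%": "tolerance",
}

# Hypothetical product descriptions that the seed lexicon parses completely.
PARSED = [
    "res 10 ohm 5%",
    "res 47 ohm 10%",
    "cap 100 uf 10%",
    "cap 10 uf 5%",
]

START, END = "<s>", "</s>"  # boundary markers for the first and last token


def train(descriptions, lexicon):
    """Count how often each attribute occurs and which attributes surround it."""
    prior = Counter()
    left = defaultdict(Counter)   # left[label][label of the token before it]
    right = defaultdict(Counter)  # right[label][label of the token after it]
    for text in descriptions:
        labels = [START] + [lexicon[t] for t in text.split()] + [END]
        for i in range(1, len(labels) - 1):
            label = labels[i]
            prior[label] += 1
            left[label][labels[i - 1]] += 1
            right[label][labels[i + 1]] += 1
    return prior, left, right


def posterior(prior, left, right, left_label, right_label, alpha=1.0):
    """Naive Bayes posterior over attributes given the two neighboring labels.

    Uses add-alpha (Laplace) smoothing so unseen contexts keep nonzero mass.
    """
    n_contexts = len(set(prior) | {START, END})
    total = sum(prior.values())
    scores = {}
    for label, count in prior.items():
        p = count / total
        p *= (left[label][left_label] + alpha) / (count + alpha * n_contexts)
        p *= (right[label][right_label] + alpha) / (count + alpha * n_contexts)
        scores[label] = p
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


def guess_label(text, unknown, lexicon, model):
    """Estimate the attribute of an unrecognized token from its neighbors."""
    tokens = text.split()
    i = tokens.index(unknown)
    # Neighbors missing from the lexicon get the label None (an unseen context).
    labels = [START] + [lexicon.get(t) for t in tokens] + [END]
    prior, left, right = model
    return posterior(prior, left, right, labels[i], labels[i + 2])


model = train(PARSED, SEED_LEXICON)
post = guess_label("res 10 kohm 5%", "kohm", SEED_LEXICON, model)
best = max(post, key=post.get)
print(best, round(post[best], 3))
```

Here the unseen token "kohm" sits between a recognized value and a recognized tolerance, so the posterior favors the unit attribute. A candidate like this could then be added to the lexicon once its probability clears a chosen threshold, which is one way a model might be extended without an expert reviewing every token.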

Ed Green is Chief Technology Officer at Silver Creek Systems, where he leads a multi-disciplinary team developing new data quality and integration techniques to solve problems in the information supply chain, drawing on everything from semantic modeling to expert systems and artificial intelligence. His 30 years of experience include service as VP of development and COO of Cadis Inc., a parts and products parametric search engine company, and GrafTek, a solids modeling CAD/CAM company. Ed holds a bachelor's degree in physics and biology from the University of Colorado, and a joint Ph.D. in physics from the University of Pittsburgh and Carnegie Mellon.


   