Wednesday, 5/23/2007 11:00 AM - 11:30 AM Level: Technical - Introductory
Search engines perform well on frequent queries that use the same words as the target text, but they fail on unusual, complex, or deep queries. Can linguistic semantics help to improve search? The classical model of linguistic semantics maps linguistic expressions to truth values, which appears too theoretical to be helpful. Combined with a commonsense lexicon, however, linguistic semantics increases search precision and recall. CognitionSearch introduces a model of natural language semantics with a complex relation between entities, linguistic expressions, and meanings. Unlike the classical model, it posits a world-society-mind-concept-language relation. Its lexicon adds commonsense or "naive" knowledge, in which word meanings (concepts) are naive beliefs, as in "lemons are typically yellow, but some lemons are brown". CognitionSearch indexes very large textual databases and scales to a document base of indefinite size. It has several linguistic components that analyze text at many levels, from tokenization to sense disambiguation. CognitionSearch's linguist-built semantic databases are:
- A lexicon containing word stems with different senses. Each sense has a definition, an ontological attachment, and naïve semantic information. The lexicon has 376,000 word senses, over 17,000 ambiguous words, 100,000 phrases, and numerous domains (or fields of interest).
- A linguistically and psychologically justified ontology with about 7,000 nodes.
- A concept thesaurus of over 50,000 thesaural groups of word senses.
- A database of over 4 million semantic contexts.
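
To make the lexicon description above more concrete, here is a minimal sketch of how a stem with several senses might be represented, each sense carrying a definition, an ontological attachment, and naive beliefs with exceptions. All names (`Sense`, `Lexicon`, the `FRUIT` and `VEHICLE` nodes, the sample entries) are hypothetical, and the word-overlap scoring is a toy heuristic chosen for illustration, not CognitionSearch's actual disambiguation method.

```python
from dataclasses import dataclass, field


@dataclass
class Sense:
    """One sense of a word stem (hypothetical structure)."""
    definition: str
    ontology_node: str                               # attachment point in the ontology
    typical: dict = field(default_factory=dict)      # naive beliefs: feature -> typical value
    exceptions: dict = field(default_factory=dict)   # feature -> set of known atypical values

    def plausible(self, feature, value):
        # A value is plausible if it is the typical one or a known exception.
        return (self.typical.get(feature) == value
                or value in self.exceptions.get(feature, set()))


@dataclass
class Lexicon:
    senses: dict = field(default_factory=dict)       # stem -> list of Sense

    def add(self, stem, sense):
        self.senses.setdefault(stem, []).append(sense)

    def is_ambiguous(self, stem):
        return len(self.senses.get(stem, [])) > 1

    def disambiguate(self, stem, context):
        """Pick the sense sharing the most cue words with the context (toy heuristic)."""
        candidates = self.senses.get(stem, [])
        if not candidates:
            return None

        def score(sense):
            cues = set(sense.definition.lower().split())
            cues |= {str(v).lower() for v in sense.typical.values()}
            return len(cues & context)

        return max(candidates, key=score)


# Illustrative entries only; not taken from the real lexicon.
lexicon = Lexicon()
lexicon.add("lemon", Sense("sour citrus fruit", "FRUIT",
                           typical={"color": "yellow"},
                           exceptions={"color": {"brown"}}))
lexicon.add("lemon", Sense("defective car", "VEHICLE",
                           typical={"condition": "unreliable"}))

fruit_sense = lexicon.disambiguate("lemon", {"yellow", "juice"})
car_sense = lexicon.disambiguate("lemon", {"used", "car"})
```

The `typical` and `exceptions` fields mirror the "lemons are typically yellow, but some lemons are brown" example: a belief holds by default without ruling out the atypical case.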
Dr. Dahlgren began her career as a professor of computational linguistics at Pitzer College of the Claremont Colleges and then joined IBM's Los Angeles Scientific Center, where she focused on building a "natural language understanding system". Dr. Dahlgren has a PhD in Linguistics and a post-doctorate in Computer Science from UCLA. She has published many scholarly articles on linguistics and computer science, and is the author of Naïve Semantics for Natural Language Understanding (1988). She is the co-author of Cognition's seminal patent (1998), and she received the Small Business Innovation Award from the U.S. Army in 1995. Dr. Dahlgren is an Adjunct Professor of Linguistics at UCLA and the CTO of Cognition Technologies, Inc. in Santa Monica, California.