"Odysseus students should gain strength from their numbers both prior, during and after this internship program. We hope these students will form connections with their peers and mentors that will last well beyond the 12 weeks with us"
WordNet is a broad-coverage electronic database of English words and their meanings that has found myriad applications in the field of natural language processing. But WordNet is a semi-structured resource: while the most useful parts of WordNet are explicitly codified as taxonomic or meronomic linkages between word-concepts, much of the truly meaningful content is stored in textual annotations, or glosses, that are intended for human rather than machine consumption. Our studies with WordNet in recent years suggest to us that these annotations, while not regular enough to be trivially machine-understandable, are sufficiently systematic to support a non-trivial parsing effort. This effort will unlock the relational information implicit in these glosses, to reveal, for example, that surgeons perform surgery, that knights follow a chivalric code, and that cobblers make and repair shoes. Work at DCU by Dr. Josef Van Genabith and his team, on the annotation of functional structural roles in text, will help greatly in this task. We intend to engage a research student to apply these parsing techniques to the glosses of WordNet entries; the application will not be trivial, as we hope to guide the process via the addition of external resources (such as the content of Wikipedia, an on-line open-source encyclopaedia, and HowNet, a bilingual Chinese and English database that follows radically different design principles to WordNet). At UCD we have several research students engaged in the exploitation of WordNet for NLP ends; this research student will thus work as part of this team.
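To make the idea of gloss parsing concrete, the following is a minimal sketch of how one very regular gloss shape ("<genus> who <verb> <object>") might be turned into relational triples. It is not the parsing method proposed for the project, which is expected to rely on much richer functional-structure annotation. The sample glosses, the pattern `AGENT_PATTERN`, and the function `extract_relations` are all hypothetical names and data invented for illustration; the glosses are not quoted from WordNet.

```python
import re

# Hypothetical gloss-style definitions, written for illustration only;
# they are NOT quoted from WordNet.
SAMPLE_GLOSSES = {
    "surgeon": "a physician who performs surgery",
    "cobbler": "a person who makes or repairs shoes",
    "chisel": "an edge tool with a flat steel blade",
}

# Assumed pattern for glosses shaped like
# "<genus> who <verb> [or|and <verb>] <object>".
AGENT_PATTERN = re.compile(
    r"^(?:an?\s+)?(?P<genus>[\w\s]+?)\s+who\s+"
    r"(?P<verbs>\w+(?:\s+(?:or|and)\s+\w+)?)\s+"
    r"(?P<object>.+)$"
)


def extract_relations(word, gloss):
    """Return (word, verb, object) triples found in a gloss, or [] if none."""
    match = AGENT_PATTERN.match(gloss.strip())
    if match is None:
        # The gloss does not follow the simple agentive shape.
        return []
    # Split coordinated verbs such as "makes or repairs".
    verbs = re.split(r"\s+(?:or|and)\s+", match.group("verbs"))
    obj = match.group("object")
    return [(word, verb, obj) for verb in verbs]


if __name__ == "__main__":
    for word, gloss in SAMPLE_GLOSSES.items():
        print(word, extract_relations(word, gloss))
```

A single hand-written pattern like this only covers the most regular glosses, which is why the text above describes the real task as a non-trivial parsing effort guided by external resources.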
Relevance of Project to the Host Laboratories:
In the UCD Creative Languages Systems Group we are currently exploiting the structure of lexical ontologies like WordNet and HowNet (as well as additional resources like Wikipedia) in a variety of different ways, and for a variety of different goals, ranging from thesaurus construction to educational game design. One such application is the analogical thesaurus, a term-finder utility that allows a user to employ analogical constructions to specify the words and concepts that are sought (e.g., "what is the Muslim bible?").
Supervisors:
Dr. Tony Veale (Computer Science and Informatics, UCD)
Keywords:
WordNet; HowNet; lexical ontologies; Wikipedia; analogy; text parsing