Max-Planck-Institut für Informatik


Towards a Universal Wordnet by learning from combined evidence

de Melo, Gerard and Weikum, Gerhard

MPI-I-2009-5-005. December 2009, 32 pages. | Status: available - back from printing

Abstract:
Lexical databases are invaluable sources of knowledge about words and their
meanings, with numerous applications in areas like NLP, IR, and AI. We propose
a methodology for the automatic construction of a large-scale lexical database
where words of many languages are hierarchically organized in terms of their
meanings and their semantic relations to other words. This resource is
bootstrapped from WordNet, a well-known English-language resource. Our approach
extends WordNet with around 1.5 million meaning links for 800,000 words in over
200 languages, drawing on evidence extracted from a variety of resources
including existing (monolingual) wordnets, (mostly bilingual) translation
dictionaries, and parallel corpora. Graph-based scoring functions and
statistical learning techniques are used to iteratively integrate this
information and build an output graph. Experiments show that this wordnet has a
high level of precision and coverage, and that it can be useful in applied
tasks such as cross-lingual text classification.

Attachment: mpi-i-2009-5-005.pdf (717 KBytes)
BibTeX:
  AUTHOR = {de Melo, Gerard and Weikum, Gerhard},
  TITLE = {Towards a Universal Wordnet by learning from combined evidence},
  TYPE = {Research Report},
  INSTITUTION = {Max-Planck-Institut f{\"u}r Informatik},
  ADDRESS = {Stuhlsatzenhausweg 85, 66123 Saarbr{\"u}cken, Germany},
  NUMBER = {MPI-I-2009-5-005},
  MONTH = {December},
  YEAR = {2009},
  ISSN = {0946-011X},