MPI-INF/SWS Research Reports 1991-2021

Yago - a core of semantic knowledge

Kasneci, Gjergji and Suchanek, Fabian M. and Weikum, Gerhard

November 2006, 39 pages.

Status: available - back from printing

We present YAGO, a light-weight and extensible ontology with high coverage and quality. YAGO builds on entities and relations and currently contains roughly 900,000 entities and 5,000,000 facts. This includes the Is-A hierarchy as well as non-taxonomic relations between entities (such as hasWonPrize). The facts have been automatically extracted from the unification of Wikipedia and WordNet, using a carefully designed combination of rule-based and heuristic methods described in this paper. The resulting knowledge base is a major step beyond WordNet: in quality by adding knowledge about individuals like persons, organizations, products, etc. with their semantic relationships -- and in quantity by increasing the number of facts by more than an order of magnitude. Our empirical evaluation of fact correctness shows an accuracy of about 95%. YAGO is based on a logically clean model, which is decidable, extensible, and compatible with RDFS. Finally, we show how YAGO can be further extended by state-of-the-art information extraction techniques.

  • Attachment: MPI-I-2006-5-006.pdf (251 KBytes)

BibTeX
  AUTHOR = {Kasneci, Gjergji and Suchanek, Fabian M. and Weikum, Gerhard},
  TITLE = {Yago - a core of semantic knowledge},
  TYPE = {Research Report},
  INSTITUTION = {Max-Planck-Institut f{\"u}r Informatik},
  ADDRESS = {Stuhlsatzenhausweg 85, 66123 Saarbr{\"u}cken, Germany},
  NUMBER = {MPI-I-2006-5-006},
  MONTH = {November},
  YEAR = {2006},
  ISSN = {0946-011X},