Max-Planck-Institut für Informatik


SOFIE: a self-organizing framework for information extraction

Suchanek, Fabian and Sozio, Mauro and Weikum, Gerhard

MPI-I-2008-5-004. November 2008, 49 pages. | Status: available - back from printing

Abstract:
This paper presents SOFIE, a system for automated ontology extension.
SOFIE can parse natural language documents, extract ontological facts
from them and link the facts into an ontology. SOFIE uses logical
reasoning on the existing knowledge and on the new knowledge in order
to disambiguate words to their most probable meaning, to reason on the
meaning of text patterns and to take into account world knowledge
axioms. This allows SOFIE to check the plausibility of hypotheses and
to avoid inconsistencies with the ontology. The SOFIE framework
unites the paradigms of pattern matching, word sense disambiguation
and ontological reasoning in a single model. Our experiments show
that SOFIE delivers near-perfect output, even from unstructured
Internet documents.

To download this research report, please select the type of document that best fits your needs.
Attachment size(s):
MPI-I-2008-5-004.pdf, 410 KBytes
BibTeX:
  AUTHOR = {Suchanek, Fabian and Sozio, Mauro and Weikum, Gerhard},
  TITLE = {{SOFIE}: a self-organizing framework for information extraction},
  TYPE = {Research Report},
  INSTITUTION = {Max-Planck-Institut f{\"u}r Informatik},
  ADDRESS = {Stuhlsatzenhausweg 85, 66123 Saarbr{\"u}cken, Germany},
  NUMBER = {MPI-I-2008-5-004},
  MONTH = {November},
  YEAR = {2008},
  ISSN = {0946-011X},