The abundance of textual data has led to an increased need for explicit
knowledge about words and the entities they represent. This talk
presents three methods to obtain such knowledge. The first involves
learning models to disambiguate word meanings. The second reconciles
equivalence and distinctness information about entities from multiple
sources. The third adds a comprehensive taxonomic hierarchy
reflecting how different entities relate to each other. Together, these
methods can be used to produce a large-scale multilingual knowledge base
semantically describing over 5 million entities and over 16 million
natural language words and names in more than 200 different languages.