Campus Event Calendar: Amy Siu (09/04/2017 in E1 4/024)

Event Entry

What and Who

Knowledge-driven Entity Recognition and Disambiguation in Biomedical Text

Amy Siu

MMCI

Promotionskolloquium

AG 1, AG 2, AG 3, AG 4, AG 5, SWS, RG1, MMCI

Public Audience

English

Date, Time and Location

Monday, 4 September 2017

16:00

60 Minutes

E1 4

024

Saarbrücken

Abstract

Entity recognition and disambiguation (ERD) for the biomedical domain
are notoriously difficult problems due to the variety of entities and
their often long names in many variations. Existing works focus heavily
on the molecular level in two ways. First, they target scientific
literature as the input text genre. Second, they target single, highly
specialized entity types such as chemicals, genes, and proteins.
However, a wealth of biomedical information is also buried in the vast
universe of Web content. In order to fully utilize all the information
available, there is a need to tap into Web content as an additional
input. Moreover, there is a need to cater for other entity types such as
symptoms and risk factors since Web content focuses on consumer health.
The goal of this thesis is to investigate ERD methods that are
applicable to all entity types in scientific literature as well as Web
content. In addition, we focus on under-explored aspects of the
biomedical ERD problems: scalability, long noun phrases, and
out-of-knowledge-base (OOKB) entities.

This thesis makes four main contributions, all of which leverage
knowledge in UMLS (Unified Medical Language System), the largest and
most authoritative knowledge base (KB) of the biomedical domain. The
first contribution is a fast dictionary lookup method for entity
recognition that maximizes throughput while balancing the loss of
precision and recall. The second contribution is a semantic type
classification method targeting common words in long noun phrases. We
develop a custom set of semantic types to capture word usages; besides
biomedical usage, these types also cope with non-biomedical usage and
the case of generic, non-informative usage. The third contribution is a
fast heuristics method for entity disambiguation in MEDLINE abstracts,
again maximizing throughput but this time maintaining accuracy. The
fourth contribution is a corpus-driven entity disambiguation method that
addresses OOKB entities. The method first captures the entities
expressed in a corpus as latent representations that comprise in-KB and
OOKB entities alike before performing entity disambiguation.

Contact

Daniela Alessi

5000

Daniela Alessi, 08/25/2017 10:06 -- Created document.
