Campus Event Calendar: Albina Asadulina (07/12/2010 in E1 4/024)


Event Entry

New for: D2, D3

What and Who

PhD Application Talk: A dictionary of chemical names and synonyms merged from different resources based on 2d graph representation for the purpose of recognition of chemical names in the text

Albina Asadulina

University of Bonn

Talk

AG 1, AG 3, AG 5, SWS, AG 2, AG 4, RG1, MMCI

MPI Audience

English

Date, Time and Location

Monday, 12 July 2010

10:30

120 Minutes

E1 4

024

Saarbrücken

Abstract

Extracting and storing chemical information is essential in medicine, for instance when creating or improving a drug. To acquire such information from the literature, the problem of finding chemical names in text must be solved. Simple string search is not powerful enough here, because the same chemical can appear in text under different names.

One existing approach to chemical name detection is the look-up approach, which uses a dictionary comprising term variations and synonyms. It was chosen for analysis in the current work. The challenge for this method is that the available chemical databases are incomplete or sometimes focus on certain types of chemical compounds, such as metabolites or approved drugs. Therefore, several resources should be merged to generate a comprehensive dictionary. When merging the data sources, the criteria for the identity of compounds must be defined, i.e. how to deal with structures that differ only in stereochemistry, charges, isotopes, etc.
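As an illustration of the look-up approach, here is a minimal sketch in Python. The dictionary contents, the identifiers such as `CMPD_ASPIRIN`, and the function names `build_matcher` and `find_chemicals` are hypothetical and do not come from the talk; a real dictionary would be far larger and would need additional handling of spelling variants.

```python
import re

# Hypothetical toy dictionary: compound identifier -> known names and synonyms.
SYNONYMS = {
    "CMPD_ASPIRIN": ["aspirin", "acetylsalicylic acid", "2-acetoxybenzoic acid"],
    "CMPD_ETHANOL": ["ethanol", "ethyl alcohol"],
}


def build_matcher(synonyms):
    """Return one compiled regex over all names and a lowercase name -> id map."""
    name_to_id = {
        name.lower(): cid for cid, names in synonyms.items() for name in names
    }
    # Longest names first, so a long synonym wins over a shorter one it contains.
    ordered = sorted(name_to_id, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in ordered)
    # The look-arounds reject matches inside a larger word, e.g. "methanol".
    pattern = re.compile(r"(?<![\w-])(?:" + alternatives + r")(?![\w-])", re.IGNORECASE)
    return pattern, name_to_id


def find_chemicals(text, synonyms=SYNONYMS):
    """Return (matched text, compound id) pairs in order of appearance."""
    pattern, name_to_id = build_matcher(synonyms)
    return [(m.group(0), name_to_id[m.group(0).lower()]) for m in pattern.finditer(text)]
```

Because every synonym maps to the same identifier, a mention of "ethyl alcohol" and a mention of "ethanol" resolve to one compound, which plain string search for a single name would miss.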
Compounds can be merged based on CAS numbers, InChI identifiers, or synonym overlap. The method proposed here merges databases by analyzing the 2D graph representation of the compounds. Direct comparison of structures is a more flexible approach in which structure information is not lost.
To create the dictionary, a workflow is developed that merges databases by comparing the 2D graph representations of the compounds. The user can set the criteria for structure identity according to the research needs.
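The idea of merging by 2D graph under user-defined identity criteria can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the workflow compares graphs, so the sketch substitutes a Weisfeiler-Lehman-style label refinement to derive a structure key. That is a heuristic, not an exact graph-isomorphism test, and it can in rare cases assign the same key to different graphs. All names (`Atom`, `Compound`, `Criteria`, `structure_key`, `merge_sources`) are hypothetical, and hydrogens are treated as implicit.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Atom:
    element: str
    charge: int = 0
    isotope: int = 0   # 0 stands for "unspecified / natural abundance"
    stereo: str = ""   # e.g. "R" or "S"; empty if unspecified


@dataclass
class Compound:
    atoms: list        # list of Atom
    bonds: list        # list of (atom index, atom index, bond order)
    names: set = field(default_factory=set)


@dataclass(frozen=True)
class Criteria:
    """Which structural differences to ignore when deciding identity."""
    ignore_stereo: bool = True
    ignore_charge: bool = False
    ignore_isotope: bool = False


def _digest(text):
    return hashlib.sha1(text.encode()).hexdigest()


def structure_key(compound, criteria):
    """Atom-order-independent key for the 2D graph (heuristic, see lead-in)."""
    labels = []
    for atom in compound.atoms:
        parts = [atom.element]
        if not criteria.ignore_charge:
            parts.append(f"q{atom.charge}")
        if not criteria.ignore_isotope:
            parts.append(f"i{atom.isotope}")
        if not criteria.ignore_stereo:
            parts.append(f"s{atom.stereo}")
        labels.append(_digest("|".join(parts)))

    neighbours = [[] for _ in compound.atoms]
    for i, j, order in compound.bonds:
        neighbours[i].append((order, j))
        neighbours[j].append((order, i))

    # Refine each atom label with the sorted labels of its bonded neighbours.
    for _ in range(len(compound.atoms)):
        labels = [
            _digest(labels[i] + ":" + ",".join(
                sorted(f"{order}-{labels[j]}" for order, j in neighbours[i])))
            for i in range(len(labels))
        ]
    return _digest(",".join(sorted(labels)))


def merge_sources(sources, criteria):
    """Map each structure key to the union of names from all sources."""
    merged = {}
    for source in sources:
        for compound in source:
            key = structure_key(compound, criteria)
            merged.setdefault(key, set()).update(compound.names)
    return merged
```

With `Criteria(ignore_charge=True)`, an acetic acid entry from one source and an acetate entry from another fall under one key and pool their names; with the default criteria they remain separate entries.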
In the course of the work, the criteria for structure identity that best serve text mining purposes should be defined: which structural features should be considered or ignored when comparing compounds. The performance of the created dictionary should be compared with that of existing ones.

Contact

IMPRS-CS

Tags, Category, Keywords and additional notes

Note:

Please note: The talks will take place in random order!

Heike Przybyl, 07/01/2010 15:32 -- Created document.