New for: D1, D2, D3, D4
This talk presents the syntactically annotated corpus for French
developed at Paris 7. The corpus comprises 1 million words fully
annotated and disambiguated for parts of speech, inflectional
morphology, compounds and lemmas, and syntactic constituents. It is
representative of contemporary normalized written French, and covers a
variety of authors and subjects (economy, literature, politics, etc.),
with extracts from newspapers ranging from 1989 to 1993. Our goal is to
provide a theory-neutral, surface-oriented, error-free treebank for
French. So far, we have used the corpus to study lexical or syntactic
preferences, and we explain why we think some of these results are
relevant to both theoretical linguistics and psycholinguistics.
If you would like to meet with the speaker, please contact:
Valia Kordoni
This seminar series is jointly organized by the Department of
Computational Linguistics and Phonetics and the European Post-Graduate
College in Language Technology and Cognitive Systems.
A current version of the program for this term can be found at:
http://www.coli.uni-sb.de/colloquium/