Oberseminar: Heterogene formale Methoden

Date: January 20, 2016
Author: Engel, Christoph
Title: Domain Specific Corpus Creation for Natural Language Processing


To train and validate information extraction systems, annotated text corpora are needed. Existing corpora contain grammatical annotations for a specific language, but semantically annotated corpora for a specialized domain such as traffic management are not available. This talk discusses a statistically supported, cooperative approach to creating training corpora for extracting traffic information from microblogs.
