Dialogue Corpus Annotation

The Natural Language Dialogue group maintains a large set of corpora used in our research. Most of these corpora were collected in connection with a specific virtual human effort; they include dialogues between humans and virtual humans as well as dialogues between human participants (such as role plays). Many of the corpora contain speech, which is transcribed; where Automatic Speech Recognition was used, we retain that output as well. Further annotations depend on the project for which the corpus was collected, and are used to train the machine-learning components that drive the systems. Typical annotations include dialogue acts; semantic representations of utterances, using system-specific semantic languages; appropriate responses to questions, for direct question-answering systems; segmentation of utterances into individual meaningful units; and syntactic annotation for understanding and generation (typically using external parsers, chunkers and taggers). We also annotate some corpora for evaluation; these annotations typically rate the correctness or appropriateness of various system outputs.
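
As a rough illustration of how these annotation layers could sit together on a single utterance, here is a minimal Python sketch. The class names, field names, and label values are all hypothetical and chosen only for illustration; they do not reflect the group's actual file formats or annotation schemes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Segment:
    """One meaningful unit of an utterance (hypothetical layout)."""
    text: str
    dialogue_act: Optional[str] = None   # label from a project-specific scheme
    semantics: Optional[dict] = None     # system-specific semantic representation


@dataclass
class Utterance:
    """An utterance with its annotation layers (hypothetical layout)."""
    speaker: str
    transcript: str                      # manual transcription of the speech
    asr_output: Optional[str] = None     # retained only when ASR was used
    segments: List[Segment] = field(default_factory=list)
    appropriateness: Optional[int] = None  # evaluation rating, if annotated


def dialogue_acts(utt: Utterance) -> List[str]:
    """Collect the dialogue-act labels of all annotated segments, in order."""
    return [s.dialogue_act for s in utt.segments if s.dialogue_act is not None]


# Invented example data, not taken from any real corpus.
utt = Utterance(
    speaker="user",
    transcript="hello what is your name",
    asr_output="hello what is you name",
    segments=[
        Segment("hello", dialogue_act="greeting"),
        Segment(
            "what is your name",
            dialogue_act="question",
            semantics={"type": "wh-question", "topic": "name"},
        ),
    ],
)

print(dialogue_acts(utt))
```

Keeping the transcript and the recognizer output as separate fields mirrors the practice described above of retaining both, while the optional fields reflect that the further layers vary from project to project.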

NLD Group Leaders

Ron Artstein

People

Ron Artstein

Publications

Georgila K, Artstein R, Nazarian A, Rushforth M, Traum DR, Sycara K. An annotation scheme for cross-cultural argumentation and persuasion dialogues. In: 12th Annual SIGdial Meeting on Discourse and Dialogue. Portland, Oregon, USA; 2011.
Robinson S, Roque A, Traum DR. Dialogues in Context: An Objective User-Oriented Evaluation Approach for Virtual Human Dialogue. In: 7th International Conference on Language Resources and Evaluation (LREC). Valletta, Malta; 2010.
Robinson S, Traum DR, Ittycheriah M, Henderer J. What would you ask a conversational agent? Observations of Human-Agent dialogues in a museum setting. In: Language Resources and Evaluation Conference (LREC). Marrakech, Morocco; 2008.
Traum DR, Robinson S, Stephan J. Evaluation of multi-party virtual reality dialogue interaction. In: Proceedings of Fourth International Conference on Language Resources and Evaluation (LREC 2004). Lisbon, Portugal; 2004. p. 1699-1702.
Robinson S, Martinovski B, Garg S, Stephan J, Traum DR. Issues in corpus development for multi-party multi-modal task-oriented dialogue. In: Fourth International Conference on Language Resources and Evaluation (LREC 2004). Lisbon, Portugal; 2004.
Martinovski B, Traum DR, Robinson S, Garg S. Functions and Patterns of Speaker and Addressee Identifications in Distributed Complex Organizational Tasks Over Radio. In: Diabruck: Seventh Workshop on Semantics and Pragmatics of Dialogue; 2003.
