The ability of virtual humans and other computer interfaces to understand and use unstructured natural language is important for a variety of purposes, especially naturalistic simulation and training applications, in which people learn to interact with other people in a task by working with simulators. The main bottleneck in rapidly developing language capabilities for such a training simulation lies in establishing an adequate mapping from natural language to application-specific semantic representations (natural language understanding, NLU) and from these representations back to surface language (natural language generation, NLG). While a lot is known about the structure of natural languages, comparatively little is known about how natural language should be translated into the specific semantic structures that an application needs in order to understand it. In practice, many of the semantic structures in use are domain-, scenario-, and application-specific, with little similarity from one application to another. Specifying how a wide range of linguistic forms translate into application-specific semantic structures is a labor-intensive process. It requires detailed knowledge of language processing, which most application designers lack, as well as detailed knowledge of the application, which may not be available to language processing experts.
This seedling project aimed to ease this bottleneck by creating prototype tools to allow an application designer to build a natural language interface without detailed knowledge of language processing. As long as the application designer can provide a paired set of structured application meanings and intuitive language to describe those meanings, the tools will learn appropriate mappings and be able to generalize to similar descriptions.
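As an illustration of what learning from paired examples can look like, the following is a minimal sketch. It is not the project's actual learner: it swaps in a plain nearest-neighbour match over bag-of-words vectors, and the training pairs, meaning labels, and function names are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical designer-authored pairs: (example utterance, application meaning).
PAIRS = [
    ("hello there", "greeting"),
    ("good morning", "greeting"),
    ("where is the clinic", "ask-location(clinic)"),
    ("can you tell me where the clinic is", "ask-location(clinic)"),
    ("thank you very much", "thanks"),
]


def bag(text):
    """Lower-case, whitespace-tokenised bag of words."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def understand(utterance, pairs=PAIRS):
    """Return the meaning paired with the most similar example utterance."""
    query = bag(utterance)
    _, meaning = max(pairs, key=lambda pair: cosine(query, bag(pair[0])))
    return meaning


if __name__ == "__main__":
    # An unseen wording is mapped to the meaning of its closest example.
    print(understand("where is the nearest clinic"))
```

The designer supplies only the pairs; unseen wordings are handled by similarity to the authored examples rather than by hand-written grammar rules.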
NLU extensions were made to the NPCEditor to allow additional APIs and a broader set of applicable dialogue models. A prototype connection was built with the BiLAT project (http://ict.usc.edu/projects/elect_bilat1/), which has since been used in the initial version of the INOTS project (http://projects.ict.usc.edu/nld/group/projects/inots).
NLG tools were also developed to allow the example-based generation techniques first developed in the SASO project (http://projects.ict.usc.edu/sandbox/projects/saso-en) to be used, by simplifying the data annotation demands and using new machine learning approaches to reduce the authoring burden. A GUI was also developed to support interactive testing and exploration of results. These tools were tested in an ICT terrain creation project, in which users authored target noun phrases to describe buildings and the tools learned how to produce referring expressions for new buildings with similar features.
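The referring-expression setting can be sketched in the same spirit. The snippet below is only an assumption-laden illustration, not the project's NLG method: the feature names, the authored noun phrases, and the feature-overlap scoring are all hypothetical, and the "learning" is reduced to reusing the phrase of the most similar authored building.

```python
# Hypothetical authored examples: (building features, target noun phrase).
EXAMPLES = [
    ({"color": "red", "height": "tall", "type": "tower"}, "the tall red tower"),
    ({"color": "white", "height": "low", "type": "house"}, "the small white house"),
    ({"color": "grey", "height": "tall", "type": "office"}, "the grey office block"),
]


def overlap(features, other):
    """Count the feature/value pairs that two buildings share."""
    return sum(1 for key, value in features.items() if other.get(key) == value)


def describe(building, examples=EXAMPLES):
    """Reuse the noun phrase authored for the most similar known building."""
    _, phrase = max(examples, key=lambda ex: overlap(building, ex[0]))
    return phrase


if __name__ == "__main__":
    # A new building that shares three features with the second example.
    new_building = {"color": "white", "height": "low", "type": "house", "roof": "flat"}
    print(describe(new_building))
```

A real system would need to adapt the retrieved phrase when features differ; this sketch only shows how authored examples can stand in for hand-written generation rules.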