In many ways the central problem faced by a Virtual Human dialogue system is what to say next. In Virtual Human systems designed for training purposes, the character's next response at each point in the interaction is motivated by specific goals. For example, if a training goal is to make an Army captain more sensitive to the risk a local Iraqi might face by cooperating with the U.S. Army, a virtual Iraqi character can be designed not to reveal certain information unless he has been promised secrecy. Rules such as this one define a dialogue policy, which determines what the character will say at each point so that the training goal is achieved.
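To make the notion of a dialogue policy concrete, the following is a minimal Python sketch of one hand-authored rule of the kind just described. The state dictionary, intent labels, and response names are illustrative assumptions, not the project's actual representations.

```python
# A minimal sketch of one hand-authored dialogue-policy rule.
# All names here (state keys, intents, responses) are illustrative.

def respond(state, user_intent):
    """Pick the character's next move from the current dialogue state."""
    if user_intent == "ask_informant_location":
        # The rule from the text: withhold the information until
        # the captain has promised secrecy.
        if state.get("secrecy_promised"):
            return "reveal_location"
        return "demand_secrecy"
    return "off_topic_deflection"

# Example: the character's behavior before and after the promise.
state = {"secrecy_promised": False}
print(respond(state, "ask_informant_location"))  # -> demand_secrecy
state["secrecy_promised"] = True
print(respond(state, "ask_informant_location"))  # -> reveal_location
```

Even this toy rule hints at the authoring burden: every such contingency must be anticipated, encoded against the system's state representation, and kept consistent with all the other rules.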
Our increasing understanding of the authoring process for virtual characters, and of the representation of dialogue policies, has enabled the creation of characters that display complex dialogue behavior. It has even allowed characters with a more straightforward conversational style to be authored by users with no expertise in dialogue systems or AI: non-programmers can perform a simple authoring task in which they link input questions directly to the answers the character should give. To date, however, creating a Virtual Human that supports more flexible, varied, human-like behavior remains challenging even for expert programmers, because authoring a dialogue policy is a complex, technically detailed, error-prone, and time-consuming process. Even a simple training need, such as offering secrecy in exchange for specific information, or other kinds of negotiation that play out over time, has required significant time from a highly skilled programmer or researcher who is intimate with the details of the system and can author an effective dialogue policy tightly coupled to the underlying representations and data structures.
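The simple question-linking style of authoring mentioned above can be sketched as a direct table from input questions to output answers. The question strings, answers, and exact-match lookup below are illustrative assumptions; a deployed system would typically match paraphrases rather than exact strings.

```python
# Illustrative question-to-answer links, as a non-programmer might author them.
qa_links = {
    "what is your name?": "My name is Hassan.",
    "where is the shopkeeper?": "I cannot talk about that here.",
}

def answer(question):
    # Exact-match lookup on the normalized question; a real system would
    # instead score paraphrases of the authored questions.
    return qa_links.get(question.strip().lower(),
                        "I do not understand your question.")

print(answer("What is your name?"))  # -> My name is Hassan.
```

The limitation is clear from the sketch: a static question-answer table cannot express behavior that depends on dialogue history, such as withholding an answer until secrecy has been promised.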
This project aims to eliminate this bottleneck by developing authoring tools that follow a new paradigm of intuitive, example-driven authoring, in which a process carried out by non-programmers yields dialogue policies sophisticated and robust enough to meet straightforward training goals such as Tactical Questioning strategies. The novelty of our approach lies in how we will exploit machine learning algorithms to keep the authoring task intuitive – the author needs only to specify what the character should say next in a specific situation, or whether it would be acceptable for the character to say a specific sentence – while the system automatically learns a robust dialogue policy, in terms of its existing representations, from these interactions. The use of online learning will allow the system to update and improve its policy at every step of the authoring task, making the process efficient by targeting the authoring of new examples to the specific needs of the system, and using the author's intuitive judgment to converge on behavior whose specification through manually crafted rules would be too complex for non-programmers.
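As a sketch of how such an online, example-driven loop might look, the following uses a simple perceptron-style update over binary features. The feature scheme and the learner are assumptions standing in for whatever policy representation the system actually learns; the point is only that each author judgment updates the policy immediately.

```python
# Sketch of the online authoring loop: each author judgment ("acceptable" /
# "not acceptable" for a candidate response in a given situation) updates
# the policy before the next example is requested.

def features(situation, response):
    # Illustrative sparse binary features pairing situation facts
    # with the candidate response.
    return {f"{fact}&{response}" for fact in situation}

weights = {}

def score(feats):
    return sum(weights.get(f, 0.0) for f in feats)

def author_feedback(situation, response, acceptable):
    """One online-learning step: nudge weights toward the author's judgment."""
    feats = features(situation, response)
    predicted_ok = score(feats) > 0
    if predicted_ok != acceptable:  # perceptron-style mistake-driven update
        delta = 1.0 if acceptable else -1.0
        for f in feats:
            weights[f] = weights.get(f, 0.0) + delta

# Two judgments mirroring the secrecy example from earlier in the section.
author_feedback({"secrecy_promised"}, "reveal_location", acceptable=True)
author_feedback({"no_promise"}, "reveal_location", acceptable=False)
print(score(features({"secrecy_promised"}, "reveal_location")) > 0)  # True
```

Because the model changes after every judgment, the system can also choose which situation to present next, directing the author's effort to the cases where the current policy is least certain.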