Introduction
An XML format and a Dialogue Management Tool (DMT) are needed for the production of
dialogues. In a dialogue, some stimulus from the user, typically a question, produces some
output from the Dialogue Manager (DM) with a weighting or confidence level. Having provided
that response, the Dialogue Manager may also move into a different state.
User: Can you tell me about Talking Heads? <STIMULUS>
DM : Yes, <smile>what would you like to know?</smile> <RESPONSE>
(The DM now moves into a state concerned with knowledge about Talking Heads.)
Comments Please : Is this an accurate model or only my concept?
The Dialogue Manager is a state machine with different states
matching different stimuli to provide responses.
The DTD for this language will also need to be built:
<!--
#########################################################################
# Dialogue Markup Language (DML) DTD, version 1.0.
#
# Usage:
# <!DOCTYPE dialogue SYSTEM "./dialogue-v01.dtd">
#
# Authors:
# Date: Wed May 16 10:46:23 WST 2001
#
#########################################################################
-->
The XML language
The root tag should be <dialogue>.
Comments Please : Any other suggestions for root tag?
The DTD for this is:
<!ELEMENT dialogue (topic)*>
The root tag can contain zero or more <topic name="xxx"> tags.
The topics represent different areas of knowledge: there may be topics about the weather, the time, Talking Head Web pages, etc.
A Dialogue Management Tool document is therefore a <dialogue> element wrapping a sequence of <topic> elements.
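As a minimal sketch, a document with three empty topics could look like this. The topic names are hypothetical, chosen from the areas of knowledge mentioned above; the DOCTYPE line is the one given in the usage note of the DTD header.

```xml
<?xml version="1.0"?>
<!DOCTYPE dialogue SYSTEM "./dialogue-v01.dtd">
<!-- Topic names are illustrative assumptions, not part of the DML definition. -->
<dialogue>
  <topic name="weather"/>
  <topic name="time"/>
  <topic name="webpages"/>
</dialogue>
```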
Comments Please : Any other attributes for topic tag?
In a similar way, the topic tag can contain zero or more <subtopic name="xxx"> tags.
Comments Please : Any other attributes for subtopic tag?
The DTD for a topic element is:
<!ELEMENT topic (subtopic)*>
<!ATTLIST topic
name CDATA #REQUIRED>
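A sketch of one topic holding two subtopics. All names here are hypothetical, and the subtopics are left empty for brevity.

```xml
<!-- Topic and subtopic names are illustrative assumptions. -->
<topic name="talkingheads">
  <subtopic name="storytelling"/>
  <subtopic name="networknavigation"/>
</topic>
```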
A subtopic may cover one aspect of the topic: for example, Talking Heads for storytelling versus Talking Heads for
network navigation.
Each state in the subtopic is represented by a <state> tag. A subtopic can contain zero or more states.
The DTD for the subtopic element is:
<!ELEMENT subtopic (state)*>
<!ATTLIST subtopic
name CDATA #REQUIRED>
<state name="xxx" type="xxx">
This state represents one of the nodes in the above graph.
It has a name so that it can be referenced (the name is any
alphanumeric identifier).
It has a type to cater for the different kinds of node that may need to be specified. For example,
some nodes may be "active": the Dialogue Manager which uses this file may use an "active" node to ask the user questions or make
observations, not just respond to stimuli.
Typical types may be
- default - the stimulus is matched only from "nextstates"
- active - pro-active interaction with the user.
- entry - these stimuli are used for initial input from user
- switch - the start of a chained stimulus-response set of states to cater for learned behaviour in the user.
- ?????
Comments Please : What values should type have?
Should the default be "entry" and the nextstate be named?
Comments Please : Any other attributes for state tag?
For example, should we have a "DMSpecific" tag which
holds arbitrary information which we could migrate into the
attributes at a later date?
A state element will have
- one or more stimulus elements followed by
- one responseweight element followed by
- one or more response elements followed by
- zero or more signal elements followed by
- zero or more prestate elements followed by
- zero or more nextstate elements.
The DTD for state is:
<!ELEMENT state
(stimulus+,responseweight,response+,signal*,prestate*,nextstate*)
>
<!ATTLIST state
name CDATA #REQUIRED
type (active | entry | switch | default) "default"
>
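A sketch of a single state that follows the content model above, using the exchange from the Introduction. The state names are hypothetical. The response body is written as VHML markup without a <vhml> root, as the description of the response element states; taken literally, the declared content model (vhml) would instead require a <vhml> child, so this is an assumption about the intended form.

```xml
<!-- State names "talkingHeadsIntro" and "talkingHeadsDetails" are hypothetical. -->
<state name="talkingHeadsIntro" type="entry">
  <stimulus>Can you tell me about Talking Heads?</stimulus>
  <responseweight value="0.7"/>
  <!-- Assumption: VHML markup appears directly inside the response. -->
  <response>Yes, <smile>what would you like to know?</smile></response>
  <nextstate statename="talkingHeadsDetails"/>
</state>
```

The signal and prestate elements are omitted here, which the content model permits since both may occur zero times.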
Comments Please : What other elements should state have?
The stimulus is typically a question or an answer to a question, but it could also be
input from a facial recognition system, such as a shrug or a nod.
Case is significant in the input.
The DTD for stimulus is:
<!ELEMENT stimulus (#PCDATA)>
The response is typically an answer marked up in
VHML. The response could be
text, XHTML text, text plus EML, etc. For pro-active dialogues, the response could also be
a question.
The VHML is embedded without the vhml root tag.
The DTD for response is:
<!ELEMENT response (vhml)>
The response weight is a floating point number between 0.0 and 1.0, where 0.0 means
no confidence in the response and 1.0 means total confidence in the response.
A value of 0.7 could be typical for most responses which match. This leaves room for
another matching response to be given a higher priority because it is seen as being more
important in that situation.
The default value for a responseweight is 0.7.
The Dialogue Manager may ignore this value.
The DTD for responseweight is:
<!ELEMENT responseweight EMPTY>
<!ATTLIST responseweight
value CDATA "0.7"
>
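A sketch of two states that match the same stimulus with different weights. The state names, stimulus, and response text are hypothetical, and the response bodies are plain text on the assumption that the vhml root tag is omitted. A Dialogue Manager that honours the weights would presumably prefer the second state, although it is free to ignore them.

```xml
<!-- All names and texts are illustrative assumptions. -->
<subtopic name="greetings">
  <state name="helloPlain" type="entry">
    <stimulus>Hello</stimulus>
    <responseweight value="0.7"/>
    <response>Hello.</response>
  </state>
  <state name="helloWarm" type="entry">
    <stimulus>Hello</stimulus>
    <!-- Higher weight: seen as the more important response here. -->
    <responseweight value="0.9"/>
    <response>Hello, it is good to see you.</response>
  </state>
</subtopic>
```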
A signal tag lets a match send a signal or notification to the Dialogue Manager, which the Dialogue Manager may
choose to ignore. For example, if the match has determined that the user wants to
finish the dialogue, the signal tells the DM that it should finish.
The DTD for signal is:
<!ELEMENT signal EMPTY>
<!ATTLIST signal
signalnumber CDATA #REQUIRED
>
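A sketch of a state that signals the end of the dialogue. The state name and texts are hypothetical, and the meaning attached to signalnumber 1 is an assumption: the signal numbers and their meanings are not defined here.

```xml
<!-- Assumption: signalnumber 1 is taken to mean "user wants to finish". -->
<state name="goodbye" type="entry">
  <stimulus>Goodbye</stimulus>
  <responseweight value="0.7"/>
  <response>Goodbye, thanks for talking to me.</response>
  <signal signalnumber="1"/>
</state>
```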
Each prestate tag names a state; together the prestate tags specify the set of states which must match for this state to match the stimulus.
This caters for a specific "yes" answer that is accepted only as a reply to the prestate's question.
The DTD for prestate is:
<!ELEMENT prestate EMPTY>
<!ATTLIST prestate
statename CDATA #REQUIRED
>
Each nextstate tag names a state; together the nextstate tags specify the set of states to test for follow-up stimulus input. These states would
be checked first (perhaps with some increase in the response weighting?) before all other states.
This caters for a specific "yes" answer to this response.
The DTD for nextstate is:
<!ELEMENT nextstate EMPTY>
<!ATTLIST nextstate
statename CDATA #REQUIRED
>
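A sketch of a question state linked to a state that handles the "yes" reply, using both nextstate and prestate. State names and texts are hypothetical, and the response bodies are plain text on the assumption that the vhml root tag is omitted. Two stimulus elements are given for the reply because case is significant in the input.

```xml
<!-- All names and texts are illustrative assumptions. -->
<subtopic name="storytelling">
  <state name="offerStory" type="entry">
    <stimulus>What can you do?</stimulus>
    <responseweight value="0.7"/>
    <response>I can tell stories. Would you like to hear one?</response>
    <!-- Check the "yes" state first for the follow-up input. -->
    <nextstate statename="acceptStory"/>
  </state>
  <state name="acceptStory">
    <stimulus>yes</stimulus>
    <stimulus>Yes</stimulus>
    <responseweight value="0.7"/>
    <response>Once upon a time there was a Talking Head.</response>
    <!-- This "yes" only counts as a reply to the offerStory question. -->
    <prestate statename="offerStory"/>
  </state>
</subtopic>
```

The second state omits the type attribute, so it takes the declared default type, whose stimuli are matched only when reached through a nextstate reference.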
Dialogue XML Editor/Maintainer
The Dialogue Language Editor/Maintainer should be
system independent (i.e. Java, or POSIX C using Gtk as the GUI)
and should be a folding editor (i.e. the individual topics
may be edited, or a specific subtopic, etc.). The folding part
could be accomplished by the use of a tree GUI.
Rather than everyone having to learn a new editor, the chosen folded section of the XML document
should be exportable to the user's favourite editor. When the file changes, the new information is re-integrated
into the XML document. Most editors can be configured to allow for macros via function keys, etc., so that gross
structure manipulation can be performed (for example, adding new
<state name="xxx" type="xxx">
tags along with the associated
<stimulus> and <response> tags).
The Dialogue Language Editor/Maintainer must perform proper XML validation of the document, both initially and when
the document information is re-integrated from the editor changes.
Comments Please : Should it also parse the VHML?
If we have a DTD for it, it should be possible.