Integration and Synchronization of Input Modes During Multimodal Human-Computer Interaction
S. Oviatt, A. De Angeli, and K. Kuhn. In Proceedings of the ACL/EACL-97 SIGMEDIA Workshop on Referring Phenomena in a Multimedia Context and their Computational Treatment, Madrid, Spain, 1997.
Abstract
Our ability to develop robust multimodal systems will depend on knowledge of the natural integration patterns that typify people's combined use of different input modes. To provide a foundation for theory and design, the present research analyzed multimodal interaction while people spoke and wrote to a simulated dynamic map system. Task analysis revealed that multimodal interaction occurred most frequently during spatial location commands, and with intermediate frequency during selection commands. In addition, microanalysis of input signals identified sequential, simultaneous, point-and-speak, and compound integration patterns, as well as data on the temporal precedence of modes and on inter-modal lags. In synchronizing input streams, the temporal precedence of writing over speech was a major theme, with pen input conveying location information first in a sentence. Linguistic analysis also revealed that the spoken and written modes consistently supplied complementary rather than redundant semantic information. One long-term goal of this research is the development of predictive models of natural modality integration to guide the design of emerging multimodal architectures.
@inproceedings{OviattDeAngeliKuhn97ACL,
author = {Oviatt, Sharon and De Angeli, Antonella and Kuhn, Karen},
booktitle = {Proceedings of the ACL/EACL-97 SIGMEDIA Workshop on Referring Phenomena in a Multimedia Context and their Computational Treatment},
address = {Madrid, Spain},
title = {Integration and Synchronization of Input Modes During Multimodal Human-Computer Interaction},
url = {http://www.aclweb.org/anthology/W97-1401},
year = 1997
}