VIsual TRAnslator: Linking Perceptions and Natural Language Descriptions

Integration of Natural Language and Vision Processing: Computational Models and Systems, 1, Kluwer, Dordrecht, (1995)
DOI: 10.1007/978-94-011-0273-5_6

Abstract

Despite the fact that image understanding and natural language processing constitute two major areas of AI, there have only been a few attempts toward the integration of computer vision and the generation of natural language expressions for the description of image sequences. In this contribution we will report on practical experience gained in the project Vitra (VIsual TRAnslator) concerning the design and construction of integrated knowledge-based systems capable of translating visual information into natural language descriptions. In Vitra different domains, like traffic scenes and short sequences from soccer matches, have been investigated. Our approach towards simultaneous scene description emphasizes concurrent image sequence evaluation and natural language processing, carried out on an incremental basis, an important prerequisite for real-time performance. One major achievement of our cooperation with the vision group at the Fraunhofer Institute (IITB, Karlsruhe) is the automatic generation of natural language descriptions for recognized trajectories of objects in real world image sequences. In this survey, the different processes pertaining to high-level scene analysis and natural language generation will be discussed.
