Steps toward a Cognitive Vision System
AI Magazine 25 (2): 31--50 (2004)

An adequate natural language description of developments in a real-world scene can be taken as proof of understanding what is going on. An algorithmic system that generates natural language descriptions from video recordings of road traffic scenes can be said to 'understand' its input to the extent that the algorithmically generated text is acceptable to the humans judging it. A fuzzy metric-temporal Horn logic (FMTHL) provides a formalism for representing both schematic and instantiated conceptual knowledge about the depicted scene and its temporal development. The resulting conceptual representation mediates in a systematic manner between the spatiotemporal geometric descriptions extracted from video input and a module that generates natural language text. This article outlines a 30-year effort to create such a cognitive vision system, indicates its current status, summarizes lessons learned along the way, and discusses open problems against this background.