We propose a novel multimodal intervention strategy for Non-Critical Spontaneous Situations (NCSSs) in autonomous driving. The strategy combines speech and deictic gestures to instruct the car to perform desired interventions that include spatial references to the car's, and hence the driver's, current environment (e.g., “stop over [pointing] there” or “take [pointing] this parking lot”). Speech allows for specifying a large number of maneuvers and in-car functions (e.g., stop, park, etc.), whereas deictic gestures provide a natural way of indicating the spatial discourse referents used in these interventions (e.g., near this tree, that parking lot, etc.). The strategy thus exploits the advantages of each modality. Our multimodal system also supports a semi-immersive Virtual Reality environment enhanced by Semantic Entities to realize and test the proposed NCSS intervention strategy. The evaluation confirmed that our approach is more natural and intuitive, and also less cognitively demanding, than a combination of speech and touch, which could be seen as a straightforward alternative given that in-car touch screens already exist and are widely available.