ABSTRACT: This article explores the emergence of multimodality as intrinsic to the learning, teaching and assessment of English in the twenty-first century. In a subject whose traditions are tied to the study of language, literature and media, multimodal texts and new technologies are now accorded overdue recognition in English curriculum documents in several countries, though assessment tends to remain largely print-centric. Until assessment modes and practices align with the nature of multimodal text production, the value of multimodal texts as sites for inquiry in classroom practice will not be assured. The article takes up the question: What is involved in assessing the multimodal texts that students create? In exploring this question, we first consider central concepts of multimodality and what is involved in “working multimodally” to create a multimodal text. Here, “transmodal operation” and “staged multimodality” are considered central to “working multimodally”. Further, we suggest that these concepts challenge current understandings of the purposes of, and possibilities for, assessment of multimodal text production.