@sjbutler

Extracting Meaning from Abbreviated Identifiers

, , and . Seventh IEEE Int'l Working Conf. on Source Code Analysis and Manipulation., page 213--222. IEEE, (2007)
DOI: 10.1109/SCAM.2007.17

Abstract

Informative identifiers are made up of full (natural language) words and (meaningful) abbreviations. Readers of programs typically have little trouble understanding the purpose of identifiers composed of full words. In addition, those familiar with the code can (most often) determine the meaning of abbreviations used in identifiers. However, when faced with unfamiliar code, abbreviations often carry little useful information. Furthermore, tools that focus on the natural language used in the code have a hard time in the presence of abbreviations. One approach to providing meaning to programmers and tools is to translate (expand) abbreviations into full words. This paper presents a methodology for expanding identifiers and evaluates the process on a code based of just over 35 million lines of code. For example, using phrase extraction, fs_exists is expanded to file_status_exists illustrating how the expansion process can facilitate comprehension. On average, 16 percent of the identifiers in a program are expanded. Finally, as an example application, the approach is used to improve the syntactic identification of violations to Deissenbock and Pizka's rules for concise and consistent identifier construction.

Links and resources

Tags

community

  • @sjbutler
  • @dblp
@sjbutler's tags highlighted