Abstract

Identifiers represent an important source of information for programmers understanding and maintaining a system. Self-documenting identifiers reduce the time and effort necessary to obtain the level of understanding appropriate for the task at hand. While the role of the lexicon in program comprehension has long been recognized, only a few works have studied the quality and enhancement of the identifiers and no works have studied the evolution of the lexicon. In this paper, we characterize the evolution of program identifiers in terms of stability metrics and occurrences of renaming. We assess whether an evolution process similar to the one occurring for the program structure exists for identifiers. We report data and results about the evolution of three large systems, for which several releases are available. We have found evidence that the evolution of the lexicon is more limited and constrained than the evolution of the structure. We argue that the different evolution results from several factors including the lack of advanced tool support for lexicon construction, documentation, and evolution.

Description

The main conclusion is that the lexicon rarely changes during software evolution. The case study covers three systems (1 Java, 2 C++): Eclipse, Mozilla and Alice, with about 20 snapshots per version. The research questions and experimental set-up are good. Stemming is used to make the set of words smaller. The stability metric is defined as the cosine between vectors.
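To make the stability metric concrete, here is a minimal sketch of a cosine between the term vectors of two releases. It is an illustration under assumptions, not the paper's implementation: the function names (`split_identifier`, `term_vector`, `cosine_similarity`), the identifier-splitting rule, and the use of raw term counts as vector weights are all hypothetical, and the stemming step is left out.

```python
import math
import re
from collections import Counter


def split_identifier(identifier):
    """Split an identifier into lowercase terms (assumed rule: underscores and camelCase)."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return [part.lower() for part in parts]


def term_vector(identifiers):
    """Build a term-count vector for one release (raw counts are an assumption)."""
    counts = Counter()
    for identifier in identifiers:
        counts.update(split_identifier(identifier))
    return counts


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors; 0.0 if either is empty."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical identifier sets taken from two consecutive releases.
release_1 = ["getFileName", "file_size", "parseXMLFile"]
release_2 = ["getFileName", "fileSize", "parseJsonFile"]

stability = cosine_similarity(term_vector(release_1), term_vector(release_2))
print(f"lexicon stability between releases: {stability:.3f}")
```

A value close to 1 means the two releases use nearly the same terms with similar frequencies; a value close to 0 means the vocabularies barely overlap. A stemmer could be applied to each term inside `split_identifier` to merge inflected forms before counting.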
