On finding duplication and near-duplication in large software systems
B. S. Baker. Proceedings of the 2nd Working Conference on Reverse Engineering, pages 86-95. (July 1995)
DOI: 10.1109/WCRE.1995.514697
Abstract
This paper describes how a program called dup can be used to locate instances of duplication or near-duplication in a software system. Dup reports both textually identical sections of code and sections that are the same textually except for systematic substitution of one set of variable names and constants for another. Further processing locates longer sections of code that are the same except for other small modifications. Experimental results from running dup on millions of lines from two large software systems show dup to be both effective at locating duplication and fast. Applications could include identifying sections of code that should be replaced by procedures, elimination of duplication during reengineering of the system, redocumentation to include references to copies, and debugging.
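The near-duplication described in the abstract, where two sections match after a systematic substitution of variable names and constants, can be illustrated with a minimal Python sketch. This is not dup's algorithm (the keywords on this record point to a suffix-tree approach); it is a naive check on a pair of lines. The tokenizer, the keyword list, and the names tokenize, is_parameter, and parameterized_match are assumptions made for illustration, as is the requirement that the substitution be one-to-one.

```python
import re

# Hypothetical tokenizer: identifiers, integer literals, or any single
# non-space character (operators, punctuation).
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")

# Assumed list of reserved words that must match literally.
KEYWORDS = {"if", "else", "for", "while", "return"}


def tokenize(line):
    """Split a line of code into tokens."""
    return TOKEN.findall(line)


def is_parameter(tok):
    """Treat identifiers and numeric constants as substitutable parameters."""
    first = tok[0]
    looks_like_name_or_number = first.isalpha() or first == "_" or first.isdigit()
    return looks_like_name_or_number and tok not in KEYWORDS


def parameterized_match(a, b):
    """Return True if b equals a under a consistent one-to-one renaming
    of parameters, with all other tokens identical."""
    ta, tb = tokenize(a), tokenize(b)
    if len(ta) != len(tb):
        return False
    forward, backward = {}, {}
    for x, y in zip(ta, tb):
        if is_parameter(x) != is_parameter(y):
            return False
        if not is_parameter(x):
            if x != y:
                return False
            continue
        # Each parameter must map to the same partner every time,
        # in both directions.
        if forward.setdefault(x, y) != y:
            return False
        if backward.setdefault(y, x) != x:
            return False
    return True


print(parameterized_match("x = y + 1;", "a = b + 2;"))
print(parameterized_match("x = x + 1;", "a = b + 1;"))
```

The first call pairs x with a, y with b, and 1 with 2, so the lines count as a match; the second fails because x would have to stand for both a and b.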
%0 Conference Paper
%1 baker1995finding
%A Baker, B.S.
%B Proceedings of the 2nd Working Conference on Reverse Engineering
%D 1995
%K based code duplication string suffix tree
%P 86-95
%R 10.1109/WCRE.1995.514697
%T On finding duplication and near-duplication in large software systems
%U http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=514697&url=http%3A%2F%2Fieeexplore.ieee.org%2Fiel3%2F3936%2F11405%2F00514697.pdf%3Farnumber%3D514697
%X This paper describes how a program called dup can be used to locate instances of duplication or near-duplication in a software system. Dup reports both textually identical sections of code and sections that are the same textually except for systematic substitution of one set of variable names and constants for another. Further processing locates longer sections of code that are the same except for other small modifications. Experimental results from running dup on millions of lines from two large software systems show dup to be both effective at locating duplication and fast. Applications could include identifying sections of code that should be replaced by procedures, elimination of duplication during reengineering of the system, redocumentation to include references to copies, and debugging.
@inproceedings{baker1995finding,
abstract = {This paper describes how a program called dup can be used to locate instances of duplication or near-duplication in a software system. Dup reports both textually identical sections of code and sections that are the same textually except for systematic substitution of one set of variable names and constants for another. Further processing locates longer sections of code that are the same except for other small modifications. Experimental results from running dup on millions of lines from two large software systems show dup to be both effective at locating duplication and fast. Applications could include identifying sections of code that should be replaced by procedures, elimination of duplication during reengineering of the system, redocumentation to include references to copies, and debugging.},
added-at = {2014-02-28T14:34:21.000+0100},
author = {Baker, B. S.},
biburl = {https://www.bibsonomy.org/bibtex/2c749b857bca28e4bf9cb9168347bde17/s_nkeha},
booktitle = {Proceedings of the 2nd Working Conference on Reverse Engineering},
description = {IEEE Xplore Abstract - On finding duplication and near-duplication in large software systems},
doi = {10.1109/WCRE.1995.514697},
interhash = {27e268c01321938bea6d834235e165b4},
intrahash = {c749b857bca28e4bf9cb9168347bde17},
keywords = {based code duplication string suffix tree},
month = jul,
pages = {86--95},
timestamp = {2014-02-28T14:34:21.000+0100},
title = {On finding duplication and near-duplication in large software systems},
url = {http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=514697&url=http%3A%2F%2Fieeexplore.ieee.org%2Fiel3%2F3936%2F11405%2F00514697.pdf%3Farnumber%3D514697},
year = 1995
}