A Comparison of String Metrics for Matching Names and Records

W. Cohen, P. Ravikumar, and S. Fienberg. KDD Workshop on Data Cleaning and Object Consolidation, (2003)

Abstract

We describe an open-source Java toolkit of methods for matching names and records. We summarize results obtained from using various string distance metrics on the task of matching entity names. These metrics include distance functions proposed by several different communities, such as edit-distance metrics, fast heuristic string comparators, token-based distance metrics, and hybrid methods. We then describe an extension to the toolkit which allows records to be compared. We discuss some issues involved in performing a similar comparison for record-matching techniques, and finally present results for some baseline record-matching algorithms that aggregate string comparisons between fields.

Links and resources

BibTeX key: cohen2003comparison
entry type: inproceedings
booktitle: KDD Workshop on Data Cleaning and Object Consolidation
year: 2003
Document: https://www.cs.cmu.edu/afs/cs/Web/People/wcohen/postscript/kdd-2003-match-ws.pdf

