
Community Membership Identification from Small Seed Sets

I. Kloumann and J. Kleinberg. Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1366--1375. New York, NY, USA, ACM, (2014)
DOI: 10.1145/2623330.2623621

Abstract

In many applications we have a social network of people and would like to identify the members of an interesting but unlabeled group or community. We start with a small number of exemplar group members -- they may be followers of a political ideology or fans of a music genre -- and need to use those examples to discover the additional members. This problem gives rise to the seed expansion problem in community detection: given example community members, how can the social graph be used to predict the identities of remaining, hidden community members? In contrast with global community detection (graph partitioning or covering), seed expansion is best suited for identifying communities locally concentrated around nodes of interest. A growing body of work has used seed expansion as a scalable means of detecting overlapping communities. Yet despite growing interest in seed expansion, there are divergent approaches in the literature and there still isn't a systematic understanding of which approaches work best in different domains. Here we evaluate several variants and uncover subtle trade-offs between different approaches. We explore which properties of the seed set can improve performance, focusing on heuristics that one can control in practice. As a consequence of this systematic understanding we have found several opportunities for performance gains. We also consider an adaptive version in which requests are made for additional membership labels of particular nodes, such as one finds in field studies of social communities. This leads to interesting connections and contrasts with active learning and the trade-offs of exploration and exploitation. Finally, we explore topological properties of communities and seed sets that correlate with algorithm performance, and explain these empirical observations with theoretical ones. We evaluate our methods across multiple domains, using publicly available datasets with labeled, ground-truth communities.
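As a rough illustration of the seed expansion problem described in the abstract, the sketch below ranks non-seed nodes by a random walk that restarts at the seed set (personalized PageRank). This is one common technique chosen here for illustration; the abstract does not state which methods the paper evaluates. The function names, the restart probability `alpha`, the iteration count, and the toy graph are all assumptions, and every seed is assumed to appear in at least one edge.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, alpha=0.15, iterations=50):
    """Score nodes with a random walk that restarts at the seed set.

    Hypothetical helper: alpha is the restart probability and the
    graph is treated as undirected. Seeds are assumed to occur in edges.
    """
    # Build an undirected adjacency list from the edge pairs.
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    nodes = list(neighbors)

    # Restart distribution: uniform over the seeds, zero elsewhere.
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(restart)

    # Power iteration: restart with probability alpha, otherwise step
    # to a uniformly chosen neighbor.
    for _ in range(iterations):
        updated = {n: alpha * restart[n] for n in nodes}
        for u in nodes:
            share = (1.0 - alpha) * scores[u] / len(neighbors[u])
            for v in neighbors[u]:
                updated[v] += share
        scores = updated
    return scores


def expand_seeds(edges, seeds, k):
    """Return the k highest-scoring nodes that are not already seeds."""
    scores = personalized_pagerank(edges, seeds)
    candidates = [n for n in scores if n not in seeds]
    candidates.sort(key=lambda n: scores[n], reverse=True)
    return candidates[:k]


# Toy graph (made up for this example): two triangles joined by one edge.
toy_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),  # first triangle
    ("c", "d"),                          # bridge
    ("d", "e"), ("e", "f"), ("d", "f"),  # second triangle
]
print(expand_seeds(toy_edges, {"a"}, 2))
```

On this toy graph, with `a` as the only seed, the two other members of the first triangle rank ahead of the nodes beyond the bridge. Real evaluations would use labeled ground-truth communities, as the abstract notes.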

Description

Community membership identification from small seed sets

Links and resources

BibTeX key: kloumann2014community
entry type: inproceedings
address: New York, NY, USA
booktitle: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
year: 2014
pages: 1366--1375
publisher: ACM
series: KDD '14
location: New York, New York, USA
acmid: 2623621
isbn: 978-1-4503-2956-9
numpages: 10
DOI: 10.1145/2623330.2623621
url: http://doi.acm.org/10.1145/2623330.2623621


Cite this publication

%0 Conference Paper
%1 kloumann2014community
%A Kloumann, Isabel M.
%A Kleinberg, Jon M.
%B Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
%C New York, NY, USA
%D 2014
%I ACM
%K econ identification seed sot
%P 1366--1375
%R 10.1145/2623330.2623621
%T Community Membership Identification from Small Seed Sets
%U http://doi.acm.org/10.1145/2623330.2623621
%X In many applications we have a social network of people and would like to identify the members of an interesting but unlabeled group or community. We start with a small number of exemplar group members -- they may be followers of a political ideology or fans of a music genre -- and need to use those examples to discover the additional members. This problem gives rise to the seed expansion problem in community detection: given example community members, how can the social graph be used to predict the identities of remaining, hidden community members? In contrast with global community detection (graph partitioning or covering), seed expansion is best suited for identifying communities locally concentrated around nodes of interest. A growing body of work has used seed expansion as a scalable means of detecting overlapping communities. Yet despite growing interest in seed expansion, there are divergent approaches in the literature and there still isn't a systematic understanding of which approaches work best in different domains. Here we evaluate several variants and uncover subtle trade-offs between different approaches. We explore which properties of the seed set can improve performance, focusing on heuristics that one can control in practice. As a consequence of this systematic understanding we have found several opportunities for performance gains. We also consider an adaptive version in which requests are made for additional membership labels of particular nodes, such as one finds in field studies of social communities. This leads to interesting connections and contrasts with active learning and the trade-offs of exploration and exploitation. Finally, we explore topological properties of communities and seed sets that correlate with algorithm performance, and explain these empirical observations with theoretical ones. We evaluate our methods across multiple domains, using publicly available datasets with labeled, ground-truth communities.
%@ 978-1-4503-2956-9

@inproceedings{kloumann2014community,
  abstract = {In many applications we have a social network of people and would like to identify the members of an interesting but unlabeled group or community. We start with a small number of exemplar group members -- they may be followers of a political ideology or fans of a music genre -- and need to use those examples to discover the additional members. This problem gives rise to the seed expansion problem in community detection: given example community members, how can the social graph be used to predict the identities of remaining, hidden community members? In contrast with global community detection (graph partitioning or covering), seed expansion is best suited for identifying communities locally concentrated around nodes of interest. A growing body of work has used seed expansion as a scalable means of detecting overlapping communities. Yet despite growing interest in seed expansion, there are divergent approaches in the literature and there still isn't a systematic understanding of which approaches work best in different domains. Here we evaluate several variants and uncover subtle trade-offs between different approaches. We explore which properties of the seed set can improve performance, focusing on heuristics that one can control in practice. As a consequence of this systematic understanding we have found several opportunities for performance gains. We also consider an adaptive version in which requests are made for additional membership labels of particular nodes, such as one finds in field studies of social communities. This leads to interesting connections and contrasts with active learning and the trade-offs of exploration and exploitation. Finally, we explore topological properties of communities and seed sets that correlate with algorithm performance, and explain these empirical observations with theoretical ones. We evaluate our methods across multiple domains, using publicly available datasets with labeled, ground-truth communities.},
  acmid = {2623621},
  added-at = {2014-10-10T11:36:01.000+0200},
  address = {New York, NY, USA},
  author = {Kloumann, Isabel M. and Kleinberg, Jon M.},
  biburl = {https://www.bibsonomy.org/bibtex/2bce73e0651855a30f200ab6a974dd8f1/asmelash},
  booktitle = {Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
  description = {Community membership identification from small seed sets},
  doi = {10.1145/2623330.2623621},
  interhash = {f1427cdb8aa6f22ee4e182ae0030a6d6},
  intrahash = {bce73e0651855a30f200ab6a974dd8f1},
  isbn = {978-1-4503-2956-9},
  keywords = {econ identification seed sot},
  location = {New York, New York, USA},
  numpages = {10},
  pages = {1366--1375},
  publisher = {ACM},
  series = {KDD '14},
  timestamp = {2014-10-10T11:36:01.000+0200},
  title = {Community Membership Identification from Small Seed Sets},
  url = {http://doi.acm.org/10.1145/2623330.2623621},
  year = 2014
}
