Abstract
The community analysis algorithm proposed by Clauset, Newman, and Moore (the
CNM algorithm) finds community structure in social networks. Unfortunately, the
CNM algorithm does not scale well, and its use is practically limited to
networks of up to 500,000 nodes. The paper identifies that this inefficiency is
caused by merging communities in an unbalanced manner. The paper introduces
three kinds of metrics (consolidation ratios) that control the process of
community analysis by trying to balance the sizes of the communities being
merged. Three variants of the CNM algorithm are built incorporating those
metrics.
The proposed techniques are tested using data sets obtained from an existing
social networking service that hosts 5.5 million users. All the methods exhibit
a dramatic improvement in execution efficiency in comparison with the original
CNM algorithm and show high scalability. The fastest method processes a
network with 1 million nodes in 5 minutes and a network with 4 million nodes in
35 minutes. Another one processes a network with 500,000 nodes in 50 minutes
(7 times faster than the original algorithm), finds community structures that
have improved modularity, and scales to a network with 5.5 million nodes.
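
The abstract does not define the consolidation ratio, so the following is only a minimal sketch of the idea of balancing community sizes during merging. It assumes, for illustration, that the ratio is the smaller community size divided by the larger one and that it scales the modularity gain of a candidate merge. The names `consolidation_ratio`, `pick_merge`, `sizes`, and `candidates` are hypothetical and are not taken from the paper.

```python
def consolidation_ratio(size_a, size_b):
    """Assumed definition: smaller size over larger size, a value in (0, 1].

    A value of 1 means the two communities are the same size.
    """
    if size_a <= 0 or size_b <= 0:
        raise ValueError("community sizes must be positive")
    return min(size_a / size_b, size_b / size_a)


def pick_merge(candidates, sizes):
    """Pick the candidate pair with the largest size-balanced score.

    candidates: iterable of (a, b, delta_q) tuples, where delta_q is the
                modularity gain of merging communities a and b
    sizes:      dict mapping a community id to its size (assumed here to be
                the number of nodes)
    """
    return max(
        candidates,
        key=lambda c: c[2] * consolidation_ratio(sizes[c[0]], sizes[c[1]]),
    )


# Hypothetical data: one large community and two small ones.
sizes = {"A": 1000, "B": 2, "C": 3}
candidates = [("A", "B", 0.010), ("B", "C", 0.008)]

best = pick_merge(candidates, sizes)
```

With these made-up numbers, choosing by modularity gain alone would merge the large community A with the tiny community B, whereas the balanced score prefers merging the two similarly sized communities B and C.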