Article

Detecting Network Communities: An Application to Phylogenetic Analysis

R. Andrade, I. Rocha-Neto, L. Santos, C. de Santana, M. Diniz, T. Lobão, A. Goés-Neto, S. Pinho, and C. El-Hani.
PLoS Comput Biol, 7 (5): e1001131+ (May 5, 2011)
DOI: 10.1371/journal.pcbi.1001131

Abstract

This paper proposes a new method to identify communities in generally weighted complex networks and applies it to phylogenetic analysis. In this case, weights correspond to the similarity indexes among protein sequences, which can be used for network construction so that the network structure can be analyzed to recover phylogenetically useful information from its properties. The analyses discussed here are mainly based on the modular character of protein similarity networks, explored through the Newman-Girvan algorithm, with the help of the neighborhood matrix. The most relevant networks are found when the network topology changes abruptly, revealing distinct modules related to the sets of organisms to which the proteins belong. Sound biological information can be retrieved by the computational routines used in the network approach, without using biological assumptions other than those incorporated by BLAST. Usually, all the main bacterial phyla and, in some cases, also some bacterial classes corresponded totally (100%) or to a great extent (>70%) to the modules. We checked for internal consistency in the obtained results, and we scored close to 84% of matches for community pertinence when comparisons between the results were performed. To illustrate how to use the network-based method, we employed data for enzymes involved in the chitin metabolic pathway that are present in more than 100 organisms from an original data set containing 1,695 organisms, downloaded from GenBank on May 19, 2007. A preliminary comparison between the outcomes of the network-based method and the results of methods based on Bayesian, distance, likelihood, and parsimony criteria suggests that the former is as reliable as these commonly used methods. We conclude that the network-based method can be used as a powerful tool for retrieving modularity information from weighted networks, which is useful for phylogenetic analysis.

Complex weighted networks have been applied to uncover organizing principles of complex biological, technological, and social systems. We propose herein a new method to identify communities in such structures and apply it to phylogenetic analysis. Recent studies using this theory in genomics and proteomics contributed to the understanding of the structure and dynamics of cellular complex interaction webs. Three main distinct molecular networks have been investigated based on transcriptional and metabolic activity, and on protein interaction. Here we consider the evolutionary relationship between proteins throughout phylogeny, employing the complex network approach to perform a comparative study of the enzymes related to the chitin metabolic pathway. We show how the similarity index of protein sequences can be used for network construction, and how the underlying structure is analyzed by the computational routines of our method to recover useful and sound information for phylogenetic studies. By focusing on the modular character of protein similarity networks, we were successful in matching the identified network modules to main bacterial phyla, and even some bacterial classes. The network-based method reported here can be used as a new powerful tool for identifying communities in complex networks, retrieving useful information for phylogenetic studies.
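
The two steps named in the abstract, building a network from pairwise similarity indexes and splitting it with the Newman-Girvan algorithm, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' code: the six-protein similarity matrix is invented toy data, the threshold value 0.5 is arbitrary, and the function names are hypothetical. The neighborhood-matrix step mentioned in the abstract is not reproduced here.

```python
from collections import deque
from itertools import combinations


def threshold_network(similarity, threshold):
    """Link proteins i and j when their similarity index reaches the threshold."""
    n = len(similarity)
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if similarity[i][j] >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def components(adj):
    """Connected components of the network, found by breadth-first search."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in comp:
                    comp.add(v)
                    queue.append(v)
        seen |= comp
        comps.append(comp)
    return comps


def edge_betweenness(adj):
    """Shortest-path betweenness of every edge (Brandes-style accumulation)."""
    score = {(u, v): 0.0 for u in adj for v in adj[u] if u < v}
    for s in adj:
        dist = {s: 0}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        sigma[s] = 1
        preds = {v: [] for v in adj}  # predecessors on shortest paths
        order, queue = [], deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for u in preds[w]:
                share = sigma[u] / sigma[w] * (1.0 + delta[w])
                score[(min(u, w), max(u, w))] += share
                delta[u] += share
    return score


def girvan_newman_split(adj):
    """Remove highest-betweenness edges until the network gains a component."""
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    start = len(components(adj))
    while len(components(adj)) == start:
        score = edge_betweenness(adj)
        if not score:
            break
        u, v = max(score, key=score.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)


# Toy data (invented): two groups of three proteins, plus one weak cross-group link.
groups = [0, 0, 0, 1, 1, 1]
similarity = [[0.9 if groups[i] == groups[j] else 0.1 for j in range(6)]
              for i in range(6)]
similarity[0][3] = similarity[3][0] = 0.6

network = threshold_network(similarity, 0.5)
communities = girvan_newman_split(network)
print([sorted(c) for c in communities])  # [[0, 1, 2], [3, 4, 5]]
```

In this toy case the single cross-group link carries every shortest path between the two groups, so it has the highest betweenness and is removed first, leaving the two groups as separate modules. Raising the threshold above 0.6 removes that link at construction time, which is a small-scale analogue of the abrupt change in topology the abstract refers to.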

BibTeX key: Andrade2011Detecting
entry type: article
year: 2011
month: may
day: 5
journal: PLoS Comput Biol
number: 5
pages: e1001131+
publisher: Public Library of Science
volume: 7
citeulike-article-id: 9255911
citeulike-linkout-2: http://view.ncbi.nlm.nih.gov/pubmed/21573202
citeulike-linkout-1: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3088654/
citeulike-linkout-3: http://www.hubmed.org/display.cgi?uids=21573202
pmid: 21573202
priority: 5
posted-at: 2011-05-06 05:06:27
issn: 1553-7358
citeulike-linkout-0: http://dx.doi.org/10.1371/journal.pcbi.1001131
pmcid: PMC3088654
DOI: 10.1371/journal.pcbi.1001131
url: http://dx.doi.org/10.1371/journal.pcbi.1001131

Cite this publication

@article{Andrade2011Detecting,
  abstract = {This paper proposes a new method to identify communities in generally weighted complex networks and apply it to phylogenetic analysis. In this case, weights correspond to the similarity indexes among protein sequences, which can be used for network construction so that the network structure can be analyzed to recover phylogenetically useful information from its properties. The analyses discussed here are mainly based on the modular character of protein similarity networks, explored through the {Newman-Girvan} algorithm, with the help of the neighborhood matrix. The most relevant networks are found when the network topology changes abruptly revealing distinct modules related to the sets of organisms to which the proteins belong. Sound biological information can be retrieved by the computational routines used in the network approach, without using biological assumptions other than those incorporated by {BLAST}. Usually, all the main bacterial phyla and, in some cases, also some bacterial classes corresponded totally (100\%) or to a great extent (>70\%) to the modules. We checked for internal consistency in the obtained results, and we scored close to 84\% of matches for community pertinence when comparisons between the results were performed. To illustrate how to use the network-based method, we employed data for enzymes involved in the chitin metabolic pathway that are present in more than 100 organisms from an original data set containing 1,695 organisms, downloaded from {GenBank} on May 19, 2007. A preliminary comparison between the outcomes of the network-based method and the results of methods based on Bayesian, distance, likelihood, and parsimony criteria suggests that the former is as reliable as these commonly used methods. We conclude that the network-based method can be used as a powerful tool for retrieving modularity information from weighted networks, which is useful for phylogenetic analysis. Complex weighted networks have been applied to uncover organizing principles of complex biological, technological, and social systems. We propose herein a new method to identify communities in such structures and apply it to phylogenetic analysis. Recent studies using this theory in genomics and proteomics contributed to the understanding of the structure and dynamics of cellular complex interaction webs. Three main distinct molecular networks have been investigated based on transcriptional and metabolic activity, and on protein interaction. Here we consider the evolutionary relationship between proteins throughout phylogeny, employing the complex network approach to perform a comparative study of the enzymes related to the chitin metabolic pathway. We show how the similarity index of protein sequences can be used for network construction, and how the underlying structure is analyzed by the computational routines of our method to recover useful and sound information for phylogenetic studies. By focusing on the modular character of protein similarity networks, we were successful in matching the identified networks modules to main bacterial phyla, and even some bacterial classes. The network-based method reported here can be used as a new powerful tool for identifying communities in complex networks, retrieving useful information for phylogenetic studies.},
  added-at = {2018-12-02T16:09:07.000+0100},
  author = {Andrade, Roberto F. S. and Rocha-Neto, Ivan C. and Santos, Leonardo B. L. and de Santana, Charles N. and Diniz, Marcelo V. C. and Lob{\~a}o, Thierry P. and Go\'{e}s-Neto, Arist\'{o}teles and Pinho, Suani T. R. and El-Hani, Charbel N.},
  biburl = {https://www.bibsonomy.org/bibtex/275e17e8ce2a08df2c2daeb2b789534d6/karthikraman},
  citeulike-article-id = {9255911},
  citeulike-linkout-0 = {http://dx.doi.org/10.1371/journal.pcbi.1001131},
  citeulike-linkout-1 = {http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3088654/},
  citeulike-linkout-2 = {http://view.ncbi.nlm.nih.gov/pubmed/21573202},
  citeulike-linkout-3 = {http://www.hubmed.org/display.cgi?uids=21573202},
  day = 5,
  doi = {10.1371/journal.pcbi.1001131},
  interhash = {54b48947c768a426b8f6a5b02f15c69e},
  intrahash = {75e17e8ce2a08df2c2daeb2b789534d6},
  issn = {1553-7358},
  journal = {PLoS Comput Biol},
  keywords = {communities networks phylogeny},
  month = may,
  number = 5,
  pages = {e1001131+},
  pmcid = {PMC3088654},
  pmid = {21573202},
  posted-at = {2011-05-06 05:06:27},
  priority = {5},
  publisher = {Public Library of Science},
  timestamp = {2018-12-02T16:09:07.000+0100},
  title = {Detecting Network Communities: An Application to Phylogenetic Analysis},
  url = {http://dx.doi.org/10.1371/journal.pcbi.1001131},
  volume = 7,
  year = 2011
}
