<rdf:RDF xmlns:burst="http://xmlns.com/burst/0.1/" xmlns:admin="http://webns.net/mvcb/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:owl="http://www.w3.org/2002/07/owl#" xmlns:cc="http://web.resource.org/cc/" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" xmlns:swrc="http://swrc.ontoware.org/ontology#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns="http://purl.org/rss/1.0/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"><channel rdf:about="http://www.bibsonomy.org/burst/user/dzerbino/Analysis,"><title>BibSonomy publications for /user/dzerbino/Analysis,</title><link>http://www.bibsonomy.org/burst/user/dzerbino/Analysis,</link><description>BibSonomy BuRST Feed for /user/dzerbino/Analysis,</description><dc:date>2008-10-07T19:32:38+02:00</dc:date><items><rdf:Seq><rdf:li rdf:resource="http://www.bibsonomy.org/bibtex/2aa8bc1f2986f316dcdd470da0aa1588d/dzerbino"/><rdf:li rdf:resource="http://www.bibsonomy.org/bibtex/2150cd10c40aace2c238aaa20c8480e08/dzerbino"/><rdf:li rdf:resource="http://www.bibsonomy.org/bibtex/24296f5595607517ac76189523adbafdb/dzerbino"/><rdf:li rdf:resource="http://www.bibsonomy.org/bibtex/251d90db8174d7d1da830b654951e5ea0/dzerbino"/><rdf:li rdf:resource="http://www.bibsonomy.org/bibtex/2187f0308ae9e5fa2f47476b9ab180a20/dzerbino"/><rdf:li rdf:resource="http://www.bibsonomy.org/bibtex/238f73cc8ed9f2f976ae6a7360b532cfe/dzerbino"/><rdf:li rdf:resource="http://www.bibsonomy.org/bibtex/208726a1ee302ce26b55d2d0ae419d2b4/dzerbino"/><rdf:li rdf:resource="http://www.bibsonomy.org/bibtex/292574a7dfe4924f76351f0ac33eae32d/dzerbino"/><rdf:li rdf:resource="http://www.bibsonomy.org/bibtex/211f0d08db68b4f73ce33ffab95ffef98/dzerbino"/><rdf:li rdf:resource="http://www.bibsonomy.org/bibtex/2ee9d2aa5c67377101e1ad619b08eac15/dzerbino"/><rdf:li 
rdf:resource="http://www.bibsonomy.org/bibtex/29acd2b8b071ac8bb83b8007ef69b9d88/dzerbino"/><rdf:li rdf:resource="http://www.bibsonomy.org/bibtex/2502e0622b4d381412eafa06bb77d377e/dzerbino"/><rdf:li rdf:resource="http://www.bibsonomy.org/bibtex/21dc921e2ef4587944697d75bf48c2db4/dzerbino"/><rdf:li rdf:resource="http://www.bibsonomy.org/bibtex/2e05ca61552c8d9f5951978da7619860d/dzerbino"/><rdf:li rdf:resource="http://www.bibsonomy.org/bibtex/285853a4fe7db3494508d6631d10f55ca/dzerbino"/><rdf:li rdf:resource="http://www.bibsonomy.org/bibtex/26fefee7c3e5e1c27b6a90750a6c4c153/dzerbino"/></rdf:Seq></items></channel><item rdf:about="http://www.bibsonomy.org/bibtex/2aa8bc1f2986f316dcdd470da0aa1588d/dzerbino"><title>Generating consensus sequences from partial order multiple sequence alignment graphs</title><link>http://www.bibsonomy.org/bibtex/2aa8bc1f2986f316dcdd470da0aa1588d/dzerbino</link><dc:creator>dzerbino</dc:creator><dc:date>2007-09-17T20:19:41+02:00</dc:date><dc:subject>Profiling, Analysis, Sequence DNA, Humans, Gene Algorithms, Alignment, Software, Consensus Expression </dc:subject><content:encoded>&lt;span style=&#034;color:#555555;&#034;&gt;Christopher &lt;a href=&#034;http://www.bibsonomy.org/author/Lee&#034;&gt;Lee&lt;/a&gt;  &lt;/span&gt;&lt;em&gt;Bioinformatics&lt;/em&gt;&lt;em&gt;19(8):999--1008&lt;/em&gt;&lt;em&gt;May2003. 
&lt;/em&gt;</content:encoded><taxo:topics><rdf:Bag><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Profiling,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Analysis,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Sequence"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/DNA,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Humans,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Gene"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Algorithms,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Alignment,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Software,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Consensus"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Expression"/></rdf:Bag></taxo:topics><burst:publication><rdf:Description rdf:about="http://www.bibsonomy.org/bibtex/2aa8bc1f2986f316dcdd470da0aa1588d/dzerbino"><owl:sameAs rdf:resource="http://www.bibsonomy.org/uri/bibtex/2aa8bc1f2986f316dcdd470da0aa1588d/dzerbino"/><rdf:type rdf:resource="http://swrc.ontoware.org/ontology#Article"/><swrc:date>Mon Sep 17 20:19:41 CEST 2007</swrc:date><swrc:journal>Bioinformatics</swrc:journal><swrc:month>May</swrc:month><swrc:number>8</swrc:number><swrc:pages>999--1008</swrc:pages><swrc:title>Generating consensus sequences from partial order multiple sequence alignment graphs</swrc:title><swrc:volume>19</swrc:volume><swrc:year>2003</swrc:year><swrc:keywords>Profiling, Analysis, Sequence DNA, Humans, Gene Algorithms, Alignment, Software, Consensus Expression </swrc:keywords><swrc:abstract>MOTIVATION: Consensus sequence generation is important in many kinds of sequence analysis ranging from sequence assembly to profile-based iterative search methods. However, how can a consensus be constructed when its inherent assumption-that the aligned sequences form a single linear consensus-is not true? 
RESULTS: Partial Order Alignment (POA) enables construction and analysis of multiple sequence alignments as directed acyclic graphs containing complex branching structure. Here we present a dynamic programming algorithm (heaviest_bundle) for generating multiple consensus sequences from such complex alignments. The number and relationships of these consensus sequences reveals the degree of structural complexity of the source alignment. This is a powerful and general approach for analyzing and visualizing complex alignment structures, and can be applied to any alignment. We illustrate its value for analyzing expressed sequence alignments to detect alternative splicing, reconstruct full length mRNA isoform sequences from EST fragments, and separate paralog mixtures that can cause incorrect SNP predictions. AVAILABILITY: The heaviest_bundle source code is available at http://www.bioinformatics.ucla.edu/poa</swrc:abstract><swrc:hasExtraField><swrc:Field swrc:value="12761063" swrc:key="pmid"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="8" swrc:key="issue"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="UCLA-DOE Center for Genomics and Proteomics, Molecular Biology Institute Department of Chemistry, University of California, Los Angeles, Los Angeles, CA 90095-1570, USA. 
leec@mbi.ucla.edu" swrc:key="affiliation"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="English" swrc:key="language"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="papers://055852FE-1648-42FE-91D0-8CA474D2B905/Paper/p4" swrc:key="uri"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="file://localhost/Users/danielzerbino/Documents/Papers/2003/Lee/Bioinformatics%202003%20Lee.pdf" swrc:key="url"/></swrc:hasExtraField><swrc:author><rdf:Seq><rdf:_1><swrc:Person swrc:name="Christopher Lee"/></rdf:_1></rdf:Seq></swrc:author></rdf:Description></burst:publication></item><item rdf:about="http://www.bibsonomy.org/bibtex/2150cd10c40aace2c238aaa20c8480e08/dzerbino"><title>Minimus: a fast, lightweight genome assembler</title><link>http://www.bibsonomy.org/bibtex/2150cd10c40aace2c238aaa20c8480e08/dzerbino</link><dc:creator>dzerbino</dc:creator><dc:date>2007-09-17T20:19:41+02:00</dc:date><dc:subject>Algorithms, Mapping, Base Sequence, Data, DNA, Chromosome Interface Software, Sequence User-Computer Software Molecular Alignment, Analysis, Design, </dc:subject><content:encoded>&lt;span style=&#034;color:#555555;&#034;&gt;Daniel D &lt;a href=&#034;http://www.bibsonomy.org/author/Sommer&#034;&gt;Sommer&lt;/a&gt;  and Arthur L &lt;a href=&#034;http://www.bibsonomy.org/author/Delcher&#034;&gt;Delcher&lt;/a&gt;  and Steven L &lt;a href=&#034;http://www.bibsonomy.org/author/Salzberg&#034;&gt;Salzberg&lt;/a&gt;  and Mihai &lt;a href=&#034;http://www.bibsonomy.org/author/Pop&#034;&gt;Pop&lt;/a&gt;  &lt;/span&gt;&lt;em&gt;BMC Bioinformatics&lt;/em&gt;&lt;em&gt;Feb2007. 
&lt;/em&gt;</content:encoded><taxo:topics><rdf:Bag><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Algorithms,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Mapping,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Base"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Sequence,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Data,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/DNA,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Chromosome"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Interface"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Software,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Sequence"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/User-Computer"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Software"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Molecular"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Alignment,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Analysis,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Design,"/></rdf:Bag></taxo:topics><burst:publication><rdf:Description rdf:about="http://www.bibsonomy.org/bibtex/2150cd10c40aace2c238aaa20c8480e08/dzerbino"><owl:sameAs rdf:resource="http://www.bibsonomy.org/uri/bibtex/2150cd10c40aace2c238aaa20c8480e08/dzerbino"/><rdf:type rdf:resource="http://swrc.ontoware.org/ontology#Article"/><swrc:date>Mon Sep 17 20:19:41 CEST 2007</swrc:date><swrc:journal>BMC Bioinformatics</swrc:journal><swrc:month>Feb</swrc:month><swrc:pages>64</swrc:pages><swrc:title>Minimus: a fast, lightweight genome assembler</swrc:title><swrc:volume>8</swrc:volume><swrc:year>2007</swrc:year><swrc:keywords>Algorithms, Mapping, Base Sequence, Data, DNA, Chromosome Interface Software, Sequence User-Computer Software Molecular Alignment, Analysis, Design, </swrc:keywords><swrc:abstract>BACKGROUND: Genome assemblers have grown very large and complex in response to the need for algorithms to handle the challenges of large whole-genome 
sequencing projects. Many of the most common uses of assemblers, however, are best served by a simpler type of assembler that requires fewer software components, uses less memory, and is far easier to install and run. RESULTS: We have developed the Minimus assembler to address these issues, and tested it on a range of assembly problems. We show that Minimus performs well on several small assembly tasks, including the assembly of viral genomes, individual genes, and BAC clones. In addition, we evaluate Minimus&#039; performance in assembling bacterial genomes in order to assess its suitability as a component of a larger assembly pipeline. We show that, unlike other software currently used for these tasks, Minimus produces significantly fewer assembly errors, at the cost of generating a more fragmented assembly. CONCLUSION: We find that for small genomes and other small assembly tasks, Minimus is faster and far more flexible than existing tools. Due to its small size and modular design Minimus is perfectly suited to be a component of complex assembly pipelines. Minimus is released as an open-source software project and the code is available as part of the AMOS project at Sourceforge.</swrc:abstract><swrc:hasExtraField><swrc:Field swrc:value="1471-2105-8-64" swrc:key="pii"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="17324286" swrc:key="pmid"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD 20742, USA. 
dsommer@umiacs.umd.edu" swrc:key="affiliation"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="English" swrc:key="language"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="papers://055852FE-1648-42FE-91D0-8CA474D2B905/Paper/p22" swrc:key="uri"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="file://localhost/Users/danielzerbino/Documents/Papers/2007/Sommer/BMC%20Bioinformatics%202007%20Sommer.pdf" swrc:key="url"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="10.1186/1471-2105-8-64" swrc:key="doi"/></swrc:hasExtraField><swrc:author><rdf:Seq><rdf:_1><swrc:Person swrc:name="Daniel D Sommer"/></rdf:_1><rdf:_2><swrc:Person swrc:name="Arthur L Delcher"/></rdf:_2><rdf:_3><swrc:Person swrc:name="Steven L Salzberg"/></rdf:_3><rdf:_4><swrc:Person swrc:name="Mihai Pop"/></rdf:_4></rdf:Seq></swrc:author></rdf:Description></burst:publication></item><item rdf:about="http://www.bibsonomy.org/bibtex/24296f5595607517ac76189523adbafdb/dzerbino"><title>A novel method for multiple alignment of sequences with repeated and shuffled elements</title><link>http://www.bibsonomy.org/bibtex/24296f5595607517ac76189523adbafdb/dzerbino</link><dc:creator>dzerbino</dc:creator><dc:date>2007-09-17T20:19:41+02:00</dc:date><dc:subject>Algorithms, Alignment, Databases, Analysis, Genetic, Sequence Protein, Amino Software, Acid Homology, </dc:subject><content:encoded>&lt;span style=&#034;color:#555555;&#034;&gt;Benjamin &lt;a href=&#034;http://www.bibsonomy.org/author/Raphael&#034;&gt;Raphael&lt;/a&gt;  and Degui &lt;a href=&#034;http://www.bibsonomy.org/author/Zhi&#034;&gt;Zhi&lt;/a&gt;  and Haixu &lt;a href=&#034;http://www.bibsonomy.org/author/Tang&#034;&gt;Tang&lt;/a&gt;  and Pavel &lt;a href=&#034;http://www.bibsonomy.org/author/Pevzner&#034;&gt;Pevzner&lt;/a&gt;  &lt;/span&gt;&lt;em&gt;Genome Res&lt;/em&gt;&lt;em&gt;14(11):2336--46&lt;/em&gt;&lt;em&gt;Nov2004. 
&lt;/em&gt;</content:encoded><taxo:topics><rdf:Bag><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Algorithms,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Alignment,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Databases,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Analysis,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Genetic,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Sequence"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Protein,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Amino"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Software,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Acid"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Homology,"/></rdf:Bag></taxo:topics><burst:publication><rdf:Description rdf:about="http://www.bibsonomy.org/bibtex/24296f5595607517ac76189523adbafdb/dzerbino"><owl:sameAs rdf:resource="http://www.bibsonomy.org/uri/bibtex/24296f5595607517ac76189523adbafdb/dzerbino"/><rdf:type rdf:resource="http://swrc.ontoware.org/ontology#Article"/><swrc:date>Mon Sep 17 20:19:41 CEST 2007</swrc:date><swrc:journal>Genome Res</swrc:journal><swrc:month>Nov</swrc:month><swrc:number>11</swrc:number><swrc:pages>2336--46</swrc:pages><swrc:title>A novel method for multiple alignment of sequences with repeated and shuffled elements</swrc:title><swrc:volume>14</swrc:volume><swrc:year>2004</swrc:year><swrc:keywords>Algorithms, Alignment, Databases, Analysis, Genetic, Sequence Protein, Amino Software, Acid Homology, </swrc:keywords><swrc:abstract>We describe ABA (A-Bruijn alignment), a new method for multiple alignment of biological sequences. The major difference between ABA and existing multiple alignment methods is that ABA represents an alignment as a directed graph, possibly containing cycles. 
This representation provides more flexibility than does a traditional alignment matrix or the recently introduced partial order alignment (POA) graph by allowing a larger class of evolutionary relationships between the aligned sequences. Our graph representation is particularly well-suited to the alignment of protein sequences with shuffled and/or repeated domain structure, and allows one to construct multiple alignments of proteins containing (1) domains that are not present in all proteins, (2) domains that are present in different orders in different proteins, and (3) domains that are present in multiple copies in some proteins. In addition, ABA is useful in the alignment of genomic sequences that contain duplications and inversions. We provide several examples illustrating the applications of ABA.</swrc:abstract><swrc:hasExtraField><swrc:Field swrc:value="14/11/2336" swrc:key="pii"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="15520295" swrc:key="pmid"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="11" swrc:key="issue"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="Department of Computer Science and Engineering, University of California, San Diego, La Jolla, California 92093-0114, USA. 
braphael@ucsd.edu" swrc:key="affiliation"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="English" swrc:key="language"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="papers://055852FE-1648-42FE-91D0-8CA474D2B905/Paper/p8" swrc:key="uri"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="file://localhost/Users/danielzerbino/Documents/Papers/2004/Raphael/Genome%20Res%202004%20Raphael.pdf" swrc:key="url"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="10.1101/gr.2657504" swrc:key="doi"/></swrc:hasExtraField><swrc:author><rdf:Seq><rdf:_1><swrc:Person swrc:name="Benjamin Raphael"/></rdf:_1><rdf:_2><swrc:Person swrc:name="Degui Zhi"/></rdf:_2><rdf:_3><swrc:Person swrc:name="Haixu Tang"/></rdf:_3><rdf:_4><swrc:Person swrc:name="Pavel Pevzner"/></rdf:_4></rdf:Seq></swrc:author></rdf:Description></burst:publication></item><item rdf:about="http://www.bibsonomy.org/bibtex/251d90db8174d7d1da830b654951e5ea0/dzerbino"><title>De novo repeat classification and fragment assembly</title><link>http://www.bibsonomy.org/bibtex/251d90db8174d7d1da830b654951e5ea0/dzerbino</link><dc:creator>dzerbino</dc:creator><dc:date>2007-09-17T20:19:41+02:00</dc:date><dc:subject>Cluster (Genetics), Bacterial, Biology, Computational Multigene Sequences, Mapping, Acid, Artificial, Human, Repetitive Family Sequence Analysis, Linkage Alignment, Contig Nucleic Genome, Chromosomes, Humans, Algorithms, </dc:subject><content:encoded>&lt;span style=&#034;color:#555555;&#034;&gt;Pavel A &lt;a href=&#034;http://www.bibsonomy.org/author/Pevzner&#034;&gt;Pevzner&lt;/a&gt;  and Paul A &lt;a href=&#034;http://www.bibsonomy.org/author/Pevzner&#034;&gt;Pevzner&lt;/a&gt;  and Haixu &lt;a href=&#034;http://www.bibsonomy.org/author/Tang&#034;&gt;Tang&lt;/a&gt;  and Glenn &lt;a href=&#034;http://www.bibsonomy.org/author/Tesler&#034;&gt;Tesler&lt;/a&gt;  &lt;/span&gt;&lt;em&gt;Genome Res&lt;/em&gt;&lt;em&gt;14(9):1786--96&lt;/em&gt;&lt;em&gt;Sep2004. 
&lt;/em&gt;</content:encoded><taxo:topics><rdf:Bag><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Cluster"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/(Genetics),"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Bacterial,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Biology,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Computational"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Multigene"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Sequences,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Mapping,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Acid,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Artificial,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Human,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Repetitive"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Family"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Sequence"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Analysis,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Linkage"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Alignment,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Contig"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Nucleic"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Genome,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Chromosomes,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Humans,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Algorithms,"/></rdf:Bag></taxo:topics><burst:publication><rdf:Description rdf:about="http://www.bibsonomy.org/bibtex/251d90db8174d7d1da830b654951e5ea0/dzerbino"><owl:sameAs rdf:resource="http://www.bibsonomy.org/uri/bibtex/251d90db8174d7d1da830b654951e5ea0/dzerbino"/><rdf:type rdf:resource="http://swrc.ontoware.org/ontology#Article"/><swrc:date>Mon Sep 17 20:19:41 CEST 2007</swrc:date><swrc:journal>Genome 
Res</swrc:journal><swrc:month>Sep</swrc:month><swrc:number>9</swrc:number><swrc:pages>1786--96</swrc:pages><swrc:title>De novo repeat classification and fragment assembly</swrc:title><swrc:volume>14</swrc:volume><swrc:year>2004</swrc:year><swrc:keywords>Cluster (Genetics), Bacterial, Biology, Computational Multigene Sequences, Mapping, Acid, Artificial, Human, Repetitive Family Sequence Analysis, Linkage Alignment, Contig Nucleic Genome, Chromosomes, Humans, Algorithms, </swrc:keywords><swrc:abstract>Repetitive sequences make up a significant fraction of almost any genome, and an important and still open question in bioinformatics is how to represent all repeats in DNA sequences. We propose a new approach to repeat classification that represents all repeats in a genome as a mosaic of sub-repeats. Our key algorithmic idea also leads to new approaches to multiple alignment and fragment assembly. In particular, we show that our FragmentGluer assembler improves on Phrap and ARACHNE in assembly of BACs and bacterial genomes.</swrc:abstract><swrc:hasExtraField><swrc:Field swrc:value="14/9/1786" swrc:key="pii"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="15342561" swrc:key="pmid"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="9" swrc:key="issue"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="Department of Computer Science and Engineering, University of California, San Diego, La Jolla, California 92093, USA." 
swrc:key="affiliation"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="English" swrc:key="language"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="papers://055852FE-1648-42FE-91D0-8CA474D2B905/Paper/p26" swrc:key="uri"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="file://localhost/Users/danielzerbino/Documents/Papers/2004/Pevzner/Genome%20Res%202004%20Pevzner.pdf" swrc:key="url"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="10.1101/gr.2395204" swrc:key="doi"/></swrc:hasExtraField><swrc:author><rdf:Seq><rdf:_1><swrc:Person swrc:name="Pavel A Pevzner"/></rdf:_1><rdf:_2><swrc:Person swrc:name="Paul A Pevzner"/></rdf:_2><rdf:_3><swrc:Person swrc:name="Haixu Tang"/></rdf:_3><rdf:_4><swrc:Person swrc:name="Glenn Tesler"/></rdf:_4></rdf:Seq></swrc:author></rdf:Description></burst:publication></item><item rdf:about="http://www.bibsonomy.org/bibtex/2187f0308ae9e5fa2f47476b9ab180a20/dzerbino"><title>Fragment assembly with short reads</title><link>http://www.bibsonomy.org/bibtex/2187f0308ae9e5fa2f47476b9ab180a20/dzerbino</link><dc:creator>dzerbino</dc:creator><dc:date>2007-09-17T20:19:41+02:00</dc:date><dc:subject>Data, Feasibility Sequence Studies, Algorithms, DNA, Profiling, Contig Analysis, Expression Molecular Mapping, Gene Base Alignment, </dc:subject><content:encoded>&lt;span style=&#034;color:#555555;&#034;&gt;Mark &lt;a href=&#034;http://www.bibsonomy.org/author/Chaisson&#034;&gt;Chaisson&lt;/a&gt;  and Pavel &lt;a href=&#034;http://www.bibsonomy.org/author/Pevzner&#034;&gt;Pevzner&lt;/a&gt;  and Haixu &lt;a href=&#034;http://www.bibsonomy.org/author/Tang&#034;&gt;Tang&lt;/a&gt;  &lt;/span&gt;&lt;em&gt;Bioinformatics&lt;/em&gt;&lt;em&gt;20(13):2067--74&lt;/em&gt;&lt;em&gt;Sep2004. 
&lt;/em&gt;</content:encoded><taxo:topics><rdf:Bag><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Data,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Feasibility"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Sequence"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Studies,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Algorithms,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/DNA,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Profiling,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Contig"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Analysis,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Expression"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Molecular"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Mapping,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Gene"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Base"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Alignment,"/></rdf:Bag></taxo:topics><burst:publication><rdf:Description rdf:about="http://www.bibsonomy.org/bibtex/2187f0308ae9e5fa2f47476b9ab180a20/dzerbino"><owl:sameAs rdf:resource="http://www.bibsonomy.org/uri/bibtex/2187f0308ae9e5fa2f47476b9ab180a20/dzerbino"/><rdf:type rdf:resource="http://swrc.ontoware.org/ontology#Article"/><swrc:date>Mon Sep 17 20:19:41 CEST 2007</swrc:date><swrc:journal>Bioinformatics</swrc:journal><swrc:month>Sep</swrc:month><swrc:number>13</swrc:number><swrc:pages>2067--74</swrc:pages><swrc:title>Fragment assembly with short reads</swrc:title><swrc:volume>20</swrc:volume><swrc:year>2004</swrc:year><swrc:keywords>Data, Feasibility Sequence Studies, Algorithms, DNA, Profiling, Contig Analysis, Expression Molecular Mapping, Gene Base Alignment, </swrc:keywords><swrc:abstract>MOTIVATION: Current DNA sequencing technology produces reads of about 500-750 bp, with typical coverage under 10x. 
New sequencing technologies are emerging that produce shorter reads (length 80-200 bp) but allow one to generate significantly higher coverage (30x and higher) at low cost. Modern assembly programs and error correction routines have been tuned to work well with current read technology but were not designed for assembly of short reads. RESULTS: We analyze the limitations of assembling reads generated by these new technologies and present a routine for base-calling in reads prior to their assembly. We demonstrate that while it is feasible to assemble such short reads, the resulting contigs will require significant (if not prohibitive) finishing efforts. AVAILABILITY: Available from the web at http://www.cse.ucsd.edu/groups/bioinformatics/software.html</swrc:abstract><swrc:hasExtraField><swrc:Field swrc:value="bth205" swrc:key="pii"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="15059830" swrc:key="pmid"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="13" swrc:key="issue"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="Bioinformatics Program, University of California San Diego, La Jolla, CA 92093, USA. 
mchaisso@bioinf.ucsd.edu" swrc:key="affiliation"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="English" swrc:key="language"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="papers://055852FE-1648-42FE-91D0-8CA474D2B905/Paper/p25" swrc:key="uri"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="file://localhost/Users/danielzerbino/Documents/Papers/2004/Chaisson/Bioinformatics%202004%20Chaisson.pdf" swrc:key="url"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="10.1093/bioinformatics/bth205" swrc:key="doi"/></swrc:hasExtraField><swrc:author><rdf:Seq><rdf:_1><swrc:Person swrc:name="Mark Chaisson"/></rdf:_1><rdf:_2><swrc:Person swrc:name="Pavel Pevzner"/></rdf:_2><rdf:_3><swrc:Person swrc:name="Haixu Tang"/></rdf:_3></rdf:Seq></swrc:author></rdf:Description></burst:publication></item><item rdf:about="http://www.bibsonomy.org/bibtex/238f73cc8ed9f2f976ae6a7360b532cfe/dzerbino"><title>The fragment assembly string graph</title><link>http://www.bibsonomy.org/bibtex/238f73cc8ed9f2f976ae6a7360b532cfe/dzerbino</link><dc:creator>dzerbino</dc:creator><dc:date>2007-09-17T20:19:41+02:00</dc:date><dc:subject>Data, Base Mapping Chromosome Algorithms, Sequence Molecular Fragmentation, Analysis, Sequence, DNA, DNA </dc:subject><content:encoded>&lt;span style=&#034;color:#555555;&#034;&gt;Eugene W &lt;a href=&#034;http://www.bibsonomy.org/author/Myers&#034;&gt;Myers&lt;/a&gt;  &lt;/span&gt;&lt;em&gt;Bioinformatics&lt;/em&gt;&lt;em&gt;Sep2005. 
&lt;/em&gt;</content:encoded><taxo:topics><rdf:Bag><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Data,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Base"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Mapping"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Chromosome"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Algorithms,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Sequence"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Molecular"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Fragmentation,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Analysis,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Sequence,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/DNA,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/DNA"/></rdf:Bag></taxo:topics><burst:publication><rdf:Description rdf:about="http://www.bibsonomy.org/bibtex/238f73cc8ed9f2f976ae6a7360b532cfe/dzerbino"><owl:sameAs rdf:resource="http://www.bibsonomy.org/uri/bibtex/238f73cc8ed9f2f976ae6a7360b532cfe/dzerbino"/><rdf:type rdf:resource="http://swrc.ontoware.org/ontology#Article"/><swrc:date>Mon Sep 17 20:19:41 CEST 2007</swrc:date><swrc:journal>Bioinformatics</swrc:journal><swrc:month>Sep</swrc:month><swrc:pages>ii79--85</swrc:pages><swrc:title>The fragment assembly string graph</swrc:title><swrc:volume>21 Suppl 2</swrc:volume><swrc:year>2005</swrc:year><swrc:keywords>Data, Base Mapping Chromosome Algorithms, Sequence Molecular Fragmentation, Analysis, Sequence, DNA, DNA </swrc:keywords><swrc:abstract>We present a concept and formalism, the string graph, which represents all that is inferable about a DNA sequence from a collection of shotgun sequencing reads collected from it. We give time and space efficient algorithms for constructing a string graph given the collection of overlaps between the reads and, in particular, present a novel linear expected time algorithm for transitive reduction in this context. 
The result demonstrates that the decomposition of reads into kmers employed in the de Bruijn graph approach described earlier is not essential, and exposes its close connection to the unitig approach we developed at Celera. This paper is a preliminary piece giving the basic algorithm and results that demonstrate the efficiency and scalability of the method. These ideas are being used to build a next-generation whole genome assembler called BOA (Berkeley Open Assembler) that will easily scale to mammalian genomes.</swrc:abstract><swrc:hasExtraField><swrc:Field swrc:value="21/suppl_2/ii79" swrc:key="pii"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="16204131" swrc:key="pmid"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="Department of Computer Science, University of California Berkeley, CA, USA. gene@eecs.berkeley.edu" swrc:key="affiliation"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="English" swrc:key="language"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="papers://055852FE-1648-42FE-91D0-8CA474D2B905/Paper/p35" swrc:key="uri"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="file://localhost/Users/danielzerbino/Documents/Papers/2005/Myers/Bioinformatics%202005%20Myers.pdf" swrc:key="url"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="10.1093/bioinformatics/bti1114" swrc:key="doi"/></swrc:hasExtraField><swrc:author><rdf:Seq><rdf:_1><swrc:Person swrc:name="Eugene W Myers"/></rdf:_1></rdf:Seq></swrc:author></rdf:Description></burst:publication></item><item rdf:about="http://www.bibsonomy.org/bibtex/208726a1ee302ce26b55d2d0ae419d2b4/dzerbino"><title>Reconstructing large regions of an ancestral mammalian genome in silico</title><link>http://www.bibsonomy.org/bibtex/208726a1ee302ce26b55d2d0ae419d2b4/dzerbino</link><dc:creator>dzerbino</dc:creator><dc:date>2007-09-17T20:19:41+02:00</dc:date><dc:subject>Molecular, Fibrosis Evolution, Phylogeny, Genetic 
Simulation, Animals, Functions, Base Variation Cystic Sequence, (Genetics), Alignment, Regulator, Conductance Analysis, Genome, Models, Data, Mammals, Molecular Sequence Transmembrane DNA, Likelihood Computer </dc:subject><content:encoded>&lt;span style=&#034;color:#555555;&#034;&gt;Mathieu &lt;a href=&#034;http://www.bibsonomy.org/author/Blanchette&#034;&gt;Blanchette&lt;/a&gt;  and Eric D &lt;a href=&#034;http://www.bibsonomy.org/author/Green&#034;&gt;Green&lt;/a&gt;  and Webb &lt;a href=&#034;http://www.bibsonomy.org/author/Miller&#034;&gt;Miller&lt;/a&gt;  and David &lt;a href=&#034;http://www.bibsonomy.org/author/Haussler&#034;&gt;Haussler&lt;/a&gt;  &lt;/span&gt;&lt;em&gt;Genome Res&lt;/em&gt;&lt;em&gt;14(12):2412--23&lt;/em&gt;&lt;em&gt;Dec2004. &lt;/em&gt;</content:encoded><taxo:topics><rdf:Bag><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Molecular,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Fibrosis"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Evolution,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Phylogeny,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Genetic"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Simulation,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Animals,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Functions,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Base"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Variation"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Cystic"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Sequence,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/(Genetics),"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Alignment,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Regulator,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Conductance"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Analysis,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Genome,"/><rdf:li 
rdf:resource="http://www.bibsonomy.org/tag/Models,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Data,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Mammals,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Molecular"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Sequence"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Transmembrane"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/DNA,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Likelihood"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Computer"/></rdf:Bag></taxo:topics><burst:publication><rdf:Description rdf:about="http://www.bibsonomy.org/bibtex/208726a1ee302ce26b55d2d0ae419d2b4/dzerbino"><owl:sameAs rdf:resource="http://www.bibsonomy.org/uri/bibtex/208726a1ee302ce26b55d2d0ae419d2b4/dzerbino"/><rdf:type rdf:resource="http://swrc.ontoware.org/ontology#Article"/><swrc:date>Mon Sep 17 20:19:41 CEST 2007</swrc:date><swrc:journal>Genome Res</swrc:journal><swrc:month>Dec</swrc:month><swrc:number>12</swrc:number><swrc:pages>2412--23</swrc:pages><swrc:title>Reconstructing large regions of an ancestral mammalian genome in silico</swrc:title><swrc:volume>14</swrc:volume><swrc:year>2004</swrc:year><swrc:keywords>Molecular, Fibrosis Evolution, Phylogeny, Genetic Simulation, Animals, Functions, Base Variation Cystic Sequence, (Genetics), Alignment, Regulator, Conductance Analysis, Genome, Models, Data, Mammals, Molecular Sequence Transmembrane DNA, Likelihood Computer </swrc:keywords><swrc:abstract>It is believed that most modern mammalian lineages arose from a series of rapid speciation events near the Cretaceous-Tertiary boundary. It is shown that such a phylogeny makes the common ancestral genome sequence an ideal target for reconstruction. 
Simulations suggest that with methods currently available, we can expect to get 98% of the bases correct in reconstructing megabase-scale euchromatic regions of an eutherian ancestral genome from the genomes of approximately 20 optimally chosen modern mammals. Using actual genomic sequences from 19 extant mammals, we reconstruct 1.1 Mb of ancient genome sequence around the CFTR locus. Detailed examination suggests the reconstruction is accurate and that it allows us to identify features in modern species, such as remnants of ancient transposon insertions, that were not identified by direct analysis. Tracing the predicted evolutionary history of the bases in the reconstructed region, estimates are made of the amount of DNA turnover due to insertion, deletion, and substitution in the different placental mammalian lineages since the common eutherian ancestor, showing considerable variation between lineages. In coming years, such reconstructions may help in identifying and understanding the genetic features common to eutherian mammals and may shed light on the evolution of human or primate-specific traits.</swrc:abstract><swrc:hasExtraField><swrc:Field swrc:value="14/12/2412" swrc:key="pii"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="15574820" swrc:key="pmid"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="12" swrc:key="issue"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="Howard Hughes Medical Institute, University of California, Santa Cruz, California 95064, USA. 
blanchem@mcb.mcgill.ca" swrc:key="affiliation"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="English" swrc:key="language"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="papers://055852FE-1648-42FE-91D0-8CA474D2B905/Paper/p36" swrc:key="uri"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="file://localhost/Users/danielzerbino/Documents/Papers/2004/Blanchette/Genome%20Res%202004%20Blanchette.pdf" swrc:key="url"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="10.1101/gr.2800104" swrc:key="doi"/></swrc:hasExtraField><swrc:author><rdf:Seq><rdf:_1><swrc:Person swrc:name="Mathieu Blanchette"/></rdf:_1><rdf:_2><swrc:Person swrc:name="Eric D Green"/></rdf:_2><rdf:_3><swrc:Person swrc:name="Webb Miller"/></rdf:_3><rdf:_4><swrc:Person swrc:name="David Haussler"/></rdf:_4></rdf:Seq></swrc:author></rdf:Description></burst:publication></item><item rdf:about="http://www.bibsonomy.org/bibtex/292574a7dfe4924f76351f0ac33eae32d/dzerbino"><title>Assembling millions of short DNA sequences using SSAKE</title><link>http://www.bibsonomy.org/bibtex/292574a7dfe4924f76351f0ac33eae32d/dzerbino</link><dc:creator>dzerbino</dc:creator><dc:date>2007-09-17T20:19:41+02:00</dc:date><dc:subject>DNA, Sequence Algorithms, Molecular Data, Software, Mapping Base Mapping, Chromosome Sequence, Contig Analysis, </dc:subject><content:encoded>&lt;span style=&#034;color:#555555;&#034;&gt;Ren&amp;#233; L &lt;a href=&#034;http://www.bibsonomy.org/author/Warren&#034;&gt;Warren&lt;/a&gt;  and Granger G &lt;a href=&#034;http://www.bibsonomy.org/author/Sutton&#034;&gt;Sutton&lt;/a&gt;  and Steven J M &lt;a href=&#034;http://www.bibsonomy.org/author/Jones&#034;&gt;Jones&lt;/a&gt;  and Robert A &lt;a href=&#034;http://www.bibsonomy.org/author/Holt&#034;&gt;Holt&lt;/a&gt;  &lt;/span&gt;&lt;em&gt;Bioinformatics&lt;/em&gt;&lt;em&gt;23(4):500--1&lt;/em&gt;&lt;em&gt;Feb2007. 
&lt;/em&gt;</content:encoded><taxo:topics><rdf:Bag><rdf:li rdf:resource="http://www.bibsonomy.org/tag/DNA,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Sequence"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Algorithms,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Molecular"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Data,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Software,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Mapping"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Base"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Mapping,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Chromosome"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Sequence,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Contig"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Analysis,"/></rdf:Bag></taxo:topics><burst:publication><rdf:Description rdf:about="http://www.bibsonomy.org/bibtex/292574a7dfe4924f76351f0ac33eae32d/dzerbino"><owl:sameAs rdf:resource="http://www.bibsonomy.org/uri/bibtex/292574a7dfe4924f76351f0ac33eae32d/dzerbino"/><rdf:type rdf:resource="http://swrc.ontoware.org/ontology#Article"/><swrc:date>Mon Sep 17 20:19:41 CEST 2007</swrc:date><swrc:journal>Bioinformatics</swrc:journal><swrc:month>Feb</swrc:month><swrc:number>4</swrc:number><swrc:pages>500--1</swrc:pages><swrc:title>Assembling millions of short DNA sequences using SSAKE</swrc:title><swrc:volume>23</swrc:volume><swrc:year>2007</swrc:year><swrc:keywords>DNA, Sequence Algorithms, Molecular Data, Software, Mapping Base Mapping, Chromosome Sequence, Contig Analysis, </swrc:keywords><swrc:abstract>Novel DNA sequencing technologies with the potential for up to three orders of magnitude more sequence throughput than conventional Sanger sequencing are emerging. The instrument now available from Solexa Ltd produces millions of short DNA sequences of 25 nt each. 
Due to ubiquitous repeats in large genomes and the inability of short sequences to uniquely and unambiguously characterize them, the short read length limits applicability for de novo sequencing. However, given the sequencing depth and the throughput of this instrument, stringent assembly of highly identical sequences can be achieved. We describe SSAKE, a tool for aggressively assembling millions of short nucleotide sequences by progressively searching through a prefix tree for the longest possible overlap between any two sequences. SSAKE is designed to help leverage the information from short sequence reads by stringently assembling them into contiguous sequences that can be used to characterize novel sequencing targets. Availability: http://www.bcgsc.ca/bioinfo/software/ssake.</swrc:abstract><swrc:hasExtraField><swrc:Field swrc:value="btl629" swrc:key="pii"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="17158514" swrc:key="pmid"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="4" swrc:key="issue"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="British Columbia Cancer Agency, Genome Sciences Centre, 675 West 10th Avenue, Vancouver, BC V5Z 1L3, Canada. 
rwarren@bcgsc.ca" swrc:key="affiliation"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="English" swrc:key="language"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="papers://055852FE-1648-42FE-91D0-8CA474D2B905/Paper/p21" swrc:key="uri"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="file://localhost/Users/danielzerbino/Documents/Papers/2007/Warren/Bioinformatics%202007%20Warren.pdf" swrc:key="url"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="10.1093/bioinformatics/btl629" swrc:key="doi"/></swrc:hasExtraField><swrc:author><rdf:Seq><rdf:_1><swrc:Person swrc:name="Ren{\&#039;e} L Warren"/></rdf:_1><rdf:_2><swrc:Person swrc:name="Granger G Sutton"/></rdf:_2><rdf:_3><swrc:Person swrc:name="Steven J M Jones"/></rdf:_3><rdf:_4><swrc:Person swrc:name="Robert A Holt"/></rdf:_4></rdf:Seq></swrc:author></rdf:Description></burst:publication></item><item rdf:about="http://www.bibsonomy.org/bibtex/211f0d08db68b4f73ce33ffab95ffef98/dzerbino"><title>A whole-genome assembly of Drosophila</title><link>http://www.bibsonomy.org/bibtex/211f0d08db68b4f73ce33ffab95ffef98/dzerbino</link><dc:creator>dzerbino</dc:creator><dc:date>2007-09-17T20:19:41+02:00</dc:date><dc:subject>Computational Biology Drosophila Insect, Chromosome Genes, Genome, Data, Physical Sites, Chromatin, Sequence Acid, Animals, Contig Analysis, Sequences, Algorithms, Nucleic Molecular Repetitive DNA, Heterochromatin, Euchromatin, Tagged Mapping, melanogaster, </dc:subject><content:encoded>&lt;span style=&#034;color:#555555;&#034;&gt;E W &lt;a href=&#034;http://www.bibsonomy.org/author/Myers&#034;&gt;Myers&lt;/a&gt;  and G G &lt;a href=&#034;http://www.bibsonomy.org/author/Sutton&#034;&gt;Sutton&lt;/a&gt;  and A L &lt;a href=&#034;http://www.bibsonomy.org/author/Delcher&#034;&gt;Delcher&lt;/a&gt;  and I M &lt;a href=&#034;http://www.bibsonomy.org/author/Dew&#034;&gt;Dew&lt;/a&gt;  and D P &lt;a 
href=&#034;http://www.bibsonomy.org/author/Fasulo&#034;&gt;Fasulo&lt;/a&gt;  and M J &lt;a href=&#034;http://www.bibsonomy.org/author/Flanigan&#034;&gt;Flanigan&lt;/a&gt;  and S A &lt;a href=&#034;http://www.bibsonomy.org/author/Kravitz&#034;&gt;Kravitz&lt;/a&gt;  and C M &lt;a href=&#034;http://www.bibsonomy.org/author/Mobarry&#034;&gt;Mobarry&lt;/a&gt;  and K H &lt;a href=&#034;http://www.bibsonomy.org/author/Reinert&#034;&gt;Reinert&lt;/a&gt;  and K A &lt;a href=&#034;http://www.bibsonomy.org/author/Remington&#034;&gt;Remington&lt;/a&gt;  and E L &lt;a href=&#034;http://www.bibsonomy.org/author/Anson&#034;&gt;Anson&lt;/a&gt;  and R A &lt;a href=&#034;http://www.bibsonomy.org/author/Bolanos&#034;&gt;Bolanos&lt;/a&gt;  and H H &lt;a href=&#034;http://www.bibsonomy.org/author/Chou&#034;&gt;Chou&lt;/a&gt;  and C M &lt;a href=&#034;http://www.bibsonomy.org/author/Jordan&#034;&gt;Jordan&lt;/a&gt;  and A L &lt;a href=&#034;http://www.bibsonomy.org/author/Halpern&#034;&gt;Halpern&lt;/a&gt;  and S &lt;a href=&#034;http://www.bibsonomy.org/author/Lonardi&#034;&gt;Lonardi&lt;/a&gt;  and E M &lt;a href=&#034;http://www.bibsonomy.org/author/Beasley&#034;&gt;Beasley&lt;/a&gt;  and R C &lt;a href=&#034;http://www.bibsonomy.org/author/Brandon&#034;&gt;Brandon&lt;/a&gt;  and L &lt;a href=&#034;http://www.bibsonomy.org/author/Chen&#034;&gt;Chen&lt;/a&gt;  and P J &lt;a href=&#034;http://www.bibsonomy.org/author/Dunn&#034;&gt;Dunn&lt;/a&gt;  and Z &lt;a href=&#034;http://www.bibsonomy.org/author/Lai&#034;&gt;Lai&lt;/a&gt;  and Y &lt;a href=&#034;http://www.bibsonomy.org/author/Liang&#034;&gt;Liang&lt;/a&gt;  and D R &lt;a href=&#034;http://www.bibsonomy.org/author/Nusskern&#034;&gt;Nusskern&lt;/a&gt;  and M &lt;a href=&#034;http://www.bibsonomy.org/author/Zhan&#034;&gt;Zhan&lt;/a&gt;  and Q &lt;a href=&#034;http://www.bibsonomy.org/author/Zhang&#034;&gt;Zhang&lt;/a&gt;  and X &lt;a href=&#034;http://www.bibsonomy.org/author/Zheng&#034;&gt;Zheng&lt;/a&gt;  and G M &lt;a 
href=&#034;http://www.bibsonomy.org/author/Rubin&#034;&gt;Rubin&lt;/a&gt;  and M D &lt;a href=&#034;http://www.bibsonomy.org/author/Adams&#034;&gt;Adams&lt;/a&gt;  and J C &lt;a href=&#034;http://www.bibsonomy.org/author/Venter&#034;&gt;Venter&lt;/a&gt;  &lt;/span&gt;&lt;em&gt;Science&lt;/em&gt;&lt;em&gt;287(5461):2196--204&lt;/em&gt;&lt;em&gt;Mar2000. &lt;/em&gt;</content:encoded><taxo:topics><rdf:Bag><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Computational"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Biology"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Drosophila"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Insect,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Chromosome"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Genes,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Genome,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Data,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Physical"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Sites,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Chromatin,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Sequence"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Acid,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Animals,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Contig"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Analysis,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Sequences,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Algorithms,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Nucleic"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Molecular"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Repetitive"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/DNA,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Heterochromatin,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Euchromatin,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Tagged"/><rdf:li 
rdf:resource="http://www.bibsonomy.org/tag/Mapping,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/melanogaster,"/></rdf:Bag></taxo:topics><burst:publication><rdf:Description rdf:about="http://www.bibsonomy.org/bibtex/211f0d08db68b4f73ce33ffab95ffef98/dzerbino"><owl:sameAs rdf:resource="http://www.bibsonomy.org/uri/bibtex/211f0d08db68b4f73ce33ffab95ffef98/dzerbino"/><rdf:type rdf:resource="http://swrc.ontoware.org/ontology#Article"/><swrc:date>Mon Sep 17 20:19:41 CEST 2007</swrc:date><swrc:journal>Science</swrc:journal><swrc:month>Mar</swrc:month><swrc:number>5461</swrc:number><swrc:pages>2196--204</swrc:pages><swrc:title>A whole-genome assembly of Drosophila</swrc:title><swrc:volume>287</swrc:volume><swrc:year>2000</swrc:year><swrc:keywords>Computational Biology Drosophila Insect, Chromosome Genes, Genome, Data, Physical Sites, Chromatin, Sequence Acid, Animals, Contig Analysis, Sequences, Algorithms, Nucleic Molecular Repetitive DNA, Heterochromatin, Euchromatin, Tagged Mapping, melanogaster, </swrc:keywords><swrc:abstract>We report on the quality of a whole-genome assembly of Drosophila melanogaster and the nature of the computer algorithms that accomplished it. Three independent external data sources essentially agree with and support the assembly&#039;s sequence and ordering of contigs across the euchromatic portion of the genome. In addition, there are isolated contigs that we believe represent nonrepetitive pockets within the heterochromatin of the centromeres. Comparison with a previously sequenced 2.9-megabase region indicates that sequencing accuracy within nonrepetitive segments is greater than 99.99% without manual curation. 
As such, this initial reconstruction of the Drosophila sequence should be of substantial value to the scientific community.</swrc:abstract><swrc:hasExtraField><swrc:Field swrc:value="8395" swrc:key="pii"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="10731133" swrc:key="pmid"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="5461" swrc:key="issue"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="Celera Genomics, Inc., 45 West Gude Drive, Rockville, MD 20850, USA. Gene.Myers@celera.com" swrc:key="affiliation"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="English" swrc:key="language"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="papers://055852FE-1648-42FE-91D0-8CA474D2B905/Paper/p20" swrc:key="uri"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="file://localhost/Users/danielzerbino/Documents/Papers/2000/Myers/Science%202000%20Myers.pdf" swrc:key="url"/></swrc:hasExtraField><swrc:author><rdf:Seq><rdf:_1><swrc:Person swrc:name="E W Myers"/></rdf:_1><rdf:_2><swrc:Person swrc:name="G G Sutton"/></rdf:_2><rdf:_3><swrc:Person swrc:name="A L Delcher"/></rdf:_3><rdf:_4><swrc:Person swrc:name="I M Dew"/></rdf:_4><rdf:_5><swrc:Person swrc:name="D P Fasulo"/></rdf:_5><rdf:_6><swrc:Person swrc:name="M J Flanigan"/></rdf:_6><rdf:_7><swrc:Person swrc:name="S A Kravitz"/></rdf:_7><rdf:_8><swrc:Person swrc:name="C M Mobarry"/></rdf:_8><rdf:_9><swrc:Person swrc:name="K H Reinert"/></rdf:_9><rdf:_10><swrc:Person swrc:name="K A Remington"/></rdf:_10><rdf:_11><swrc:Person swrc:name="E L Anson"/></rdf:_11><rdf:_12><swrc:Person swrc:name="R A Bolanos"/></rdf:_12><rdf:_13><swrc:Person swrc:name="H H Chou"/></rdf:_13><rdf:_14><swrc:Person swrc:name="C M Jordan"/></rdf:_14><rdf:_15><swrc:Person swrc:name="A L Halpern"/></rdf:_15><rdf:_16><swrc:Person swrc:name="S Lonardi"/></rdf:_16><rdf:_17><swrc:Person swrc:name="E M Beasley"/></rdf:_17><rdf:_18><swrc:Person swrc:name="R C 
Brandon"/></rdf:_18><rdf:_19><swrc:Person swrc:name="L Chen"/></rdf:_19><rdf:_20><swrc:Person swrc:name="P J Dunn"/></rdf:_20><rdf:_21><swrc:Person swrc:name="Z Lai"/></rdf:_21><rdf:_22><swrc:Person swrc:name="Y Liang"/></rdf:_22><rdf:_23><swrc:Person swrc:name="D R Nusskern"/></rdf:_23><rdf:_24><swrc:Person swrc:name="M Zhan"/></rdf:_24><rdf:_25><swrc:Person swrc:name="Q Zhang"/></rdf:_25><rdf:_26><swrc:Person swrc:name="X Zheng"/></rdf:_26><rdf:_27><swrc:Person swrc:name="G M Rubin"/></rdf:_27><rdf:_28><swrc:Person swrc:name="M D Adams"/></rdf:_28><rdf:_29><swrc:Person swrc:name="J C Venter"/></rdf:_29></rdf:Seq></swrc:author></rdf:Description></burst:publication></item><item rdf:about="http://www.bibsonomy.org/bibtex/2ee9d2aa5c67377101e1ad619b08eac15/dzerbino"><title>Efficiently detecting polymorphisms during the fragment assembly process</title><link>http://www.bibsonomy.org/bibtex/2ee9d2aa5c67377101e1ad619b08eac15/dzerbino</link><dc:creator>dzerbino</dc:creator><dc:date>2007-09-17T20:19:41+02:00</dc:date><dc:subject>Restriction Fragmentation Polymorphism, Gene DNA Consensus Alignment, Variation Sequence DNA, Profiling, Base Fragment Expression Analysis, Molecular Genetic, Sequence, Data, Length, (Genetics), Algorithms, </dc:subject><content:encoded>&lt;span style=&#034;color:#555555;&#034;&gt;Daniel &lt;a href=&#034;http://www.bibsonomy.org/author/Fasulo&#034;&gt;Fasulo&lt;/a&gt;  and Aaron &lt;a href=&#034;http://www.bibsonomy.org/author/Halpern&#034;&gt;Halpern&lt;/a&gt;  and Ian &lt;a href=&#034;http://www.bibsonomy.org/author/Dew&#034;&gt;Dew&lt;/a&gt;  and Clark &lt;a href=&#034;http://www.bibsonomy.org/author/Mobarry&#034;&gt;Mobarry&lt;/a&gt;  &lt;/span&gt;&lt;em&gt;Bioinformatics&lt;/em&gt;&lt;em&gt;Jan2002. 
&lt;/em&gt;</content:encoded><taxo:topics><rdf:Bag><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Restriction"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Fragmentation"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Polymorphism,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Gene"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/DNA"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Consensus"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Alignment,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Variation"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Sequence"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/DNA,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Profiling,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Base"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Fragment"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Expression"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Analysis,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Molecular"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Genetic,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Sequence,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Data,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Length,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/(Genetics),"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Algorithms,"/></rdf:Bag></taxo:topics><burst:publication><rdf:Description rdf:about="http://www.bibsonomy.org/bibtex/2ee9d2aa5c67377101e1ad619b08eac15/dzerbino"><owl:sameAs rdf:resource="http://www.bibsonomy.org/uri/bibtex/2ee9d2aa5c67377101e1ad619b08eac15/dzerbino"/><rdf:type rdf:resource="http://swrc.ontoware.org/ontology#Article"/><swrc:date>Mon Sep 17 20:19:41 CEST 2007</swrc:date><swrc:journal>Bioinformatics</swrc:journal><swrc:month>Jan</swrc:month><swrc:pages>S294--302</swrc:pages><swrc:title>Efficiently detecting polymorphisms during the fragment assembly 
process</swrc:title><swrc:volume>18 Suppl 1</swrc:volume><swrc:year>2002</swrc:year><swrc:keywords>Restriction Fragmentation Polymorphism, Gene DNA Consensus Alignment, Variation Sequence DNA, Profiling, Base Fragment Expression Analysis, Molecular Genetic, Sequence, Data, Length, (Genetics), Algorithms, </swrc:keywords><swrc:abstract>MOTIVATION: Current genomic sequence assemblers assume that the input data is derived from a single, homogeneous source. However, recent whole-genome shotgun sequencing projects have violated this assumption, resulting in input fragments covering the same region of the genome whose sequences differ due to polymorphic variation in the population. While single-nucleotide polymorphisms (SNPs) do not pose a significant problem to state-of-the-art assembly methods, these methods do not handle insertion/deletion (indel) polymorphisms of more than a few bases. RESULTS: This paper describes an efficient method for detecting sequence discrepancies due to polymorphism that avoids resorting to global use of more costly, less stringent affine sequence alignments. Instead, the algorithm uses graph-based methods to determine the small set of fragments involved in each polymorphism and performs more sophisticated alignments only among fragments in that set. Results from the incorporation of this method into the Celera Assembler are reported for the D. melanogaster, H. sapiens, and M. musculus genomes.</swrc:abstract><swrc:hasExtraField><swrc:Field swrc:value="12169559" swrc:key="pmid"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="Informatics Research, Celera Genomics, 45 W. Gude Dr., Rockville MD 20850, USA. 
daniel.fasulo@celera.com" swrc:key="affiliation"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="English" swrc:key="language"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="papers://055852FE-1648-42FE-91D0-8CA474D2B905/Paper/p17" swrc:key="uri"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="file://localhost/Users/danielzerbino/Documents/Papers/2002/Fasulo/Bioinformatics%202002%20Fasulo.pdf" swrc:key="url"/></swrc:hasExtraField><swrc:author><rdf:Seq><rdf:_1><swrc:Person swrc:name="Daniel Fasulo"/></rdf:_1><rdf:_2><swrc:Person swrc:name="Aaron Halpern"/></rdf:_2><rdf:_3><swrc:Person swrc:name="Ian Dew"/></rdf:_3><rdf:_4><swrc:Person swrc:name="Clark Mobarry"/></rdf:_4></rdf:Seq></swrc:author></rdf:Description></burst:publication></item><item rdf:about="http://www.bibsonomy.org/bibtex/29acd2b8b071ac8bb83b8007ef69b9d88/dzerbino"><title>Occupancy modeling of coverage distribution for whole genome shotgun DNA sequencing</title><link>http://www.bibsonomy.org/bibtex/29acd2b8b071ac8bb83b8007ef69b9d88/dzerbino</link><dc:creator>dzerbino</dc:creator><dc:date>2007-09-17T20:19:41+02:00</dc:date><dc:subject>Analysis, Sequence DNA, Models, Animals, Genome, Statistical Humans, Algorithms, Genomics, </dc:subject><content:encoded>&lt;span style=&#034;color:#555555;&#034;&gt;Michael C &lt;a href=&#034;http://www.bibsonomy.org/author/Wendl&#034;&gt;Wendl&lt;/a&gt;  &lt;/span&gt;&lt;em&gt;Bull Math Biol&lt;/em&gt;&lt;em&gt;68(1):179--96&lt;/em&gt;&lt;em&gt;Jan2006. 
&lt;/em&gt;</content:encoded><taxo:topics><rdf:Bag><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Analysis,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Sequence"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/DNA,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Models,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Animals,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Genome,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Statistical"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Humans,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Algorithms,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Genomics,"/></rdf:Bag></taxo:topics><burst:publication><rdf:Description rdf:about="http://www.bibsonomy.org/bibtex/29acd2b8b071ac8bb83b8007ef69b9d88/dzerbino"><owl:sameAs rdf:resource="http://www.bibsonomy.org/uri/bibtex/29acd2b8b071ac8bb83b8007ef69b9d88/dzerbino"/><rdf:type rdf:resource="http://swrc.ontoware.org/ontology#Article"/><swrc:date>Mon Sep 17 20:19:41 CEST 2007</swrc:date><swrc:journal>Bull Math Biol</swrc:journal><swrc:month>Jan</swrc:month><swrc:number>1</swrc:number><swrc:pages>179--96</swrc:pages><swrc:title>Occupancy modeling of coverage distribution for whole genome shotgun DNA sequencing</swrc:title><swrc:volume>68</swrc:volume><swrc:year>2006</swrc:year><swrc:keywords>Analysis, Sequence DNA, Models, Animals, Genome, Statistical Humans, Algorithms, Genomics, </swrc:keywords><swrc:abstract>Expected-value models have long provided a rudimentary theoretical foundation for random DNA sequencing. Here, we are interested in improving characterization of genome coverage in terms of its underlying probability distributions. We find that the mathematical notion of occupancy serves as a good model for evolution of the coverage distribution function and reveals new insights related to sequence redundancy. 
Established concepts, such as &#034;full shotgun depth,&#034; have been assumed invariant, but actually depend on project size and decrease over time. For most microbial projects, the full shotgun milestone should be revised downward by about 30%. Accordingly, many already-completed genomes appear to have been over-sequenced. Results also suggest that read lengths for emerging high-throughput sequencing methods must be increased substantially before they can be considered as possible successors to the standard Sanger method. In particular, gains in throughput and sequence depth cannot be made to compensate for diminished read length. Limits are well approximated by a simple logarithmic equation, which should be useful in estimating maximum coverage-based redundancy for future projects.</swrc:abstract><swrc:hasExtraField><swrc:Field swrc:value="16794926" swrc:key="pmid"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="1" swrc:key="issue"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="Genome Sequencing Center, Washington University, 4444 Forest Park Boulevard, Campus Box 8501, St. Louis, MO 63108, USA. 
mwendl@wustl.edu" swrc:key="affiliation"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="English" swrc:key="language"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="papers://055852FE-1648-42FE-91D0-8CA474D2B905/Paper/p32" swrc:key="uri"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="file://localhost/Users/danielzerbino/Documents/Papers/2006/Wendl/Bull%20Math%20Biol%202006%20Wendl.pdf" swrc:key="url"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="10.1007/s11538-005-9021-4" swrc:key="doi"/></swrc:hasExtraField><swrc:author><rdf:Seq><rdf:_1><swrc:Person swrc:name="Michael C Wendl"/></rdf:_1></rdf:Seq></swrc:author></rdf:Description></burst:publication></item><item rdf:about="http://www.bibsonomy.org/bibtex/2502e0622b4d381412eafa06bb77d377e/dzerbino"><title>An analysis of the feasibility of short read sequencing</title><link>http://www.bibsonomy.org/bibtex/2502e0622b4d381412eafa06bb77d377e/dzerbino</link><dc:creator>dzerbino</dc:creator><dc:date>2007-09-17T20:19:41+02:00</dc:date><dc:subject>Sequence Genomics, Genome, Studies, Chromosomes, Human, Pair 1, Feasibility DNA, Humans, Viral, Analysis, Bacterial </dc:subject><content:encoded>&lt;span style=&#034;color:#555555;&#034;&gt;Nava &lt;a href=&#034;http://www.bibsonomy.org/author/Whiteford&#034;&gt;Whiteford&lt;/a&gt;  and Niall &lt;a href=&#034;http://www.bibsonomy.org/author/Haslam&#034;&gt;Haslam&lt;/a&gt;  and Gerald &lt;a href=&#034;http://www.bibsonomy.org/author/Weber&#034;&gt;Weber&lt;/a&gt;  and Adam &lt;a href=&#034;http://www.bibsonomy.org/author/Pr{\&amp;#034;u}gel-Bennett&#034;&gt;Pr&amp;#252;gel-Bennett&lt;/a&gt;  and Jonathan W &lt;a href=&#034;http://www.bibsonomy.org/author/Essex&#034;&gt;Essex&lt;/a&gt;  and Peter L &lt;a href=&#034;http://www.bibsonomy.org/author/Roach&#034;&gt;Roach&lt;/a&gt;  and Mark &lt;a href=&#034;http://www.bibsonomy.org/author/Bradley&#034;&gt;Bradley&lt;/a&gt;  and Cameron &lt;a 
href=&#034;http://www.bibsonomy.org/author/Neylon&#034;&gt;Neylon&lt;/a&gt;  &lt;/span&gt;&lt;em&gt;Nucleic Acids Res&lt;/em&gt;&lt;em&gt;33(19):e171&lt;/em&gt;&lt;em&gt;Nov2005. &lt;/em&gt;</content:encoded><taxo:topics><rdf:Bag><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Sequence"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Genomics,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Genome,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Studies,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Chromosomes,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Human,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Pair"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/1,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Feasibility"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/DNA,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Humans,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Viral,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Analysis,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Bacterial"/></rdf:Bag></taxo:topics><burst:publication><rdf:Description rdf:about="http://www.bibsonomy.org/bibtex/2502e0622b4d381412eafa06bb77d377e/dzerbino"><owl:sameAs rdf:resource="http://www.bibsonomy.org/uri/bibtex/2502e0622b4d381412eafa06bb77d377e/dzerbino"/><rdf:type rdf:resource="http://swrc.ontoware.org/ontology#Article"/><swrc:date>Mon Sep 17 20:19:41 CEST 2007</swrc:date><swrc:journal>Nucleic Acids Res</swrc:journal><swrc:month>Nov</swrc:month><swrc:number>19</swrc:number><swrc:pages>e171</swrc:pages><swrc:title>An analysis of the feasibility of short read sequencing</swrc:title><swrc:volume>33</swrc:volume><swrc:year>2005</swrc:year><swrc:keywords>Sequence Genomics, Genome, Studies, Chromosomes, Human, Pair 1, Feasibility DNA, Humans, Viral, Analysis, Bacterial </swrc:keywords><swrc:abstract>Several methods for ultra high-throughput DNA sequencing are currently under investigation. 
Many of these methods yield very short blocks of sequence information (reads). Here we report on an analysis showing the level of genome sequencing possible as a function of read length. It is shown that re-sequencing and de novo sequencing of the majority of a bacterial genome is possible with read lengths of 20-30 nt, and that reads of 50 nt can provide reconstructed contigs (a contiguous fragment of sequence data) of 1000 nt and greater that cover 80% of human chromosome 1.</swrc:abstract><swrc:hasExtraField><swrc:Field swrc:value="33/19/e171" swrc:key="pii"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="16275781" swrc:key="pmid"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="19" swrc:key="issue"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="School of Chemistry, University of Southampton, Southampton SO17 1BJ, UK." swrc:key="affiliation"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="English" swrc:key="language"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="papers://055852FE-1648-42FE-91D0-8CA474D2B905/Paper/p10" swrc:key="uri"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="file://localhost/Users/danielzerbino/Documents/Papers/2005/Whiteford/Nucleic%20Acids%20Res%202005%20Whiteford.pdf" swrc:key="url"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="10.1093/nar/gni170" swrc:key="doi"/></swrc:hasExtraField><swrc:author><rdf:Seq><rdf:_1><swrc:Person swrc:name="Nava Whiteford"/></rdf:_1><rdf:_2><swrc:Person swrc:name="Niall Haslam"/></rdf:_2><rdf:_3><swrc:Person swrc:name="Gerald Weber"/></rdf:_3><rdf:_4><swrc:Person swrc:name="Adam Pr{\&#034;u}gel-Bennett"/></rdf:_4><rdf:_5><swrc:Person swrc:name="Jonathan W Essex"/></rdf:_5><rdf:_6><swrc:Person swrc:name="Peter L Roach"/></rdf:_6><rdf:_7><swrc:Person swrc:name="Mark Bradley"/></rdf:_7><rdf:_8><swrc:Person swrc:name="Cameron 
Neylon"/></rdf:_8></rdf:Seq></swrc:author></rdf:Description></burst:publication></item><item rdf:about="http://www.bibsonomy.org/bibtex/21dc921e2ef4587944697d75bf48c2db4/dzerbino"><title>Gene maps linearization using genomic rearrangement distances</title><link>http://www.bibsonomy.org/bibtex/21dc921e2ef4587944697d75bf48c2db4/dzerbino</link><dc:creator>dzerbino</dc:creator><dc:date>2007-09-17T20:19:41+02:00</dc:date><dc:subject>Sequence Algorithms, Biology, Computational Software Genomics, Analysis, Genome, DNA, </dc:subject><content:encoded>&lt;span style=&#034;color:#555555;&#034;&gt;Guillaume &lt;a href=&#034;http://www.bibsonomy.org/author/Blin&#034;&gt;Blin&lt;/a&gt;  and Eric &lt;a href=&#034;http://www.bibsonomy.org/author/Blais&#034;&gt;Blais&lt;/a&gt;  and Danny &lt;a href=&#034;http://www.bibsonomy.org/author/Hermelin&#034;&gt;Hermelin&lt;/a&gt;  and Pierre &lt;a href=&#034;http://www.bibsonomy.org/author/Guillon&#034;&gt;Guillon&lt;/a&gt;  and Mathieu &lt;a href=&#034;http://www.bibsonomy.org/author/Blanchette&#034;&gt;Blanchette&lt;/a&gt;  and Nadia &lt;a href=&#034;http://www.bibsonomy.org/author/El-Mabrouk&#034;&gt;El-Mabrouk&lt;/a&gt;  &lt;/span&gt;&lt;em&gt;J Comput Biol&lt;/em&gt;&lt;em&gt;14(4):394--407&lt;/em&gt;&lt;em&gt;May2007. 
&lt;/em&gt;</content:encoded><taxo:topics><rdf:Bag><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Sequence"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Algorithms,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Biology,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Computational"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Software"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Genomics,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Analysis,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Genome,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/DNA,"/></rdf:Bag></taxo:topics><burst:publication><rdf:Description rdf:about="http://www.bibsonomy.org/bibtex/21dc921e2ef4587944697d75bf48c2db4/dzerbino"><owl:sameAs rdf:resource="http://www.bibsonomy.org/uri/bibtex/21dc921e2ef4587944697d75bf48c2db4/dzerbino"/><rdf:type rdf:resource="http://swrc.ontoware.org/ontology#Article"/><swrc:date>Mon Sep 17 20:19:41 CEST 2007</swrc:date><swrc:journal>J Comput Biol</swrc:journal><swrc:month>May</swrc:month><swrc:number>4</swrc:number><swrc:pages>394--407</swrc:pages><swrc:title>Gene maps linearization using genomic rearrangement distances</swrc:title><swrc:volume>14</swrc:volume><swrc:year>2007</swrc:year><swrc:keywords>Sequence Algorithms, Biology, Computational Software Genomics, Analysis, Genome, DNA, </swrc:keywords><swrc:abstract>A preliminary step to most comparative genomics studies is the annotation of chromosomes as ordered sequences of genes. Different genetic mapping techniques often give rise to different maps with unequal gene content and sets of unordered neighboring genes. Only partial orders can thus be obtained from combining such maps. However, once a total order O is known for a given genome, it can be used as a reference to order genes of a closely related species characterized by a partial order P. Our goal is to find a linearization of P that is as close as possible to O, in term of a given genomic distance. 
We first prove NP-completeness complexity results considering the breakpoint and the common interval distances. We then focus on the breakpoint distance and give a dynamic programming algorithm whose running time is exponential for general partial orders, but polynomial when the partial order is derived from a bounded number of genetic maps. A time-efficient greedy heuristic is then given for the general case and is empirically shown to produce solutions within 10% of the optimal solution, on simulated data. Applications to the analysis of grass genomes are presented.</swrc:abstract><swrc:hasExtraField><swrc:Field swrc:value="17572019" swrc:key="pmid"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="4" swrc:key="issue"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="IGM-LabInfo, UMR CNRS 8049, Universit{\&#039;e" swrc:key="affiliation"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="English" swrc:key="language"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="papers://055852FE-1648-42FE-91D0-8CA474D2B905/Paper/p37" swrc:key="uri"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="file://localhost/Users/danielzerbino/Documents/Papers/2007/Blin/J%20Comput%20Biol%202007%20Blin.pdf" swrc:key="url"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="10.1089/cmb.2007.A002" swrc:key="doi"/></swrc:hasExtraField><swrc:author><rdf:Seq><rdf:_1><swrc:Person swrc:name="Guillaume Blin"/></rdf:_1><rdf:_2><swrc:Person swrc:name="Eric Blais"/></rdf:_2><rdf:_3><swrc:Person swrc:name="Danny Hermelin"/></rdf:_3><rdf:_4><swrc:Person swrc:name="Pierre Guillon"/></rdf:_4><rdf:_5><swrc:Person swrc:name="Mathieu Blanchette"/></rdf:_5><rdf:_6><swrc:Person swrc:name="Nadia El-Mabrouk"/></rdf:_6></rdf:Seq></swrc:author></rdf:Description></burst:publication></item><item rdf:about="http://www.bibsonomy.org/bibtex/2e05ca61552c8d9f5951978da7619860d/dzerbino"><title>A pseudo-boolean framework for 
computing rearrangement distances between genomes with duplicates</title><link>http://www.bibsonomy.org/bibtex/2e05ca61552c8d9f5951978da7619860d/dzerbino</link><dc:creator>dzerbino</dc:creator><dc:date>2007-09-17T20:19:41+02:00</dc:date><dc:subject>Analysis, DNA, Biology, Sequence Gene Genome, Gammaproteobacteria, Duplication, Computational Bacterial Algorithms, Software, </dc:subject><content:encoded>&lt;span style=&#034;color:#555555;&#034;&gt;S&amp;#233;bastien &lt;a href=&#034;http://www.bibsonomy.org/author/Angibaud&#034;&gt;Angibaud&lt;/a&gt;  and Guillaume &lt;a href=&#034;http://www.bibsonomy.org/author/Fertin&#034;&gt;Fertin&lt;/a&gt;  and Irena &lt;a href=&#034;http://www.bibsonomy.org/author/Rusu&#034;&gt;Rusu&lt;/a&gt;  and St&amp;#233;phane &lt;a href=&#034;http://www.bibsonomy.org/author/Vialette&#034;&gt;Vialette&lt;/a&gt;  &lt;/span&gt;&lt;em&gt;J Comput Biol&lt;/em&gt;&lt;em&gt;14(4):379--93&lt;/em&gt;&lt;em&gt;May2007. &lt;/em&gt;</content:encoded><taxo:topics><rdf:Bag><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Analysis,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/DNA,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Biology,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Sequence"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Gene"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Genome,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Gammaproteobacteria,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Duplication,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Computational"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Bacterial"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Algorithms,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Software,"/></rdf:Bag></taxo:topics><burst:publication><rdf:Description rdf:about="http://www.bibsonomy.org/bibtex/2e05ca61552c8d9f5951978da7619860d/dzerbino"><owl:sameAs 
rdf:resource="http://www.bibsonomy.org/uri/bibtex/2e05ca61552c8d9f5951978da7619860d/dzerbino"/><rdf:type rdf:resource="http://swrc.ontoware.org/ontology#Article"/><swrc:date>Mon Sep 17 20:19:41 CEST 2007</swrc:date><swrc:journal>J Comput Biol</swrc:journal><swrc:month>May</swrc:month><swrc:number>4</swrc:number><swrc:pages>379--93</swrc:pages><swrc:title>A pseudo-boolean framework for computing rearrangement distances between genomes with duplicates</swrc:title><swrc:volume>14</swrc:volume><swrc:year>2007</swrc:year><swrc:keywords>Analysis, DNA, Biology, Sequence Gene Genome, Gammaproteobacteria, Duplication, Computational Bacterial Algorithms, Software, </swrc:keywords><swrc:abstract>Computing genomic distances between whole genomes is a fundamental problem in comparative genomics. Recent researches have resulted in different genomic distance definitions, for example, number of breakpoints, number of common intervals, number of conserved intervals, and Maximum Adjacency Disruption number. Unfortunately, it turns out that, in presence of duplications, most problems are NP-hard, and hence several heuristics have been recently proposed. However, while it is relatively easy to compare heuristics between them, until now very little is known about the absolute accuracy of these heuristics. Therefore, there is a great need for algorithmic approaches that compute exact solutions for these genomic distances. In this paper, we present a novel generic pseudo-boolean approach for computing the exact genomic distance between two whole genomes in presence of duplications, and put strong emphasis on common intervals under the maximum matching model. 
Of particular importance, we show three heuristics which provide very good results on a well-known public dataset of gamma-Proteobacteria.</swrc:abstract><swrc:hasExtraField><swrc:Field swrc:value="17572018" swrc:key="pmid"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="4" swrc:key="issue"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="Laboratoire d&#039;Informatique de Nantes-Atlantique, FRE CNRS 2729, Universit{\&#039;e" swrc:key="affiliation"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="English" swrc:key="language"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="papers://055852FE-1648-42FE-91D0-8CA474D2B905/Paper/p39" swrc:key="uri"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="file://localhost/Users/danielzerbino/Documents/Papers/2007/Angibaud/J%20Comput%20Biol%202007%20Angibaud.pdf" swrc:key="url"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="10.1089/cmb.2007.A001" swrc:key="doi"/></swrc:hasExtraField><swrc:author><rdf:Seq><rdf:_1><swrc:Person swrc:name="S{\&#039;e}bastien Angibaud"/></rdf:_1><rdf:_2><swrc:Person swrc:name="Guillaume Fertin"/></rdf:_2><rdf:_3><swrc:Person swrc:name="Irena Rusu"/></rdf:_3><rdf:_4><swrc:Person swrc:name="St{\&#039;e}phane Vialette"/></rdf:_4></rdf:Seq></swrc:author></rdf:Description></burst:publication></item><item rdf:about="http://www.bibsonomy.org/bibtex/285853a4fe7db3494508d6631d10f55ca/dzerbino"><title>An Eulerian path approach to DNA fragment assembly</title><link>http://www.bibsonomy.org/bibtex/285853a4fe7db3494508d6631d10f55ca/dzerbino</link><dc:creator>dzerbino</dc:creator><dc:date>2007-09-17T20:19:41+02:00</dc:date><dc:subject>Analysis, Genome, Alignment, Software, Theoretical, lactis Campylobacter jejuni, meningitidis, Neisseria Algorithms, DNA, Sequence Contig Bacterial, Lactococcus Mapping, Models, </dc:subject><content:encoded>&lt;span style=&#034;color:#555555;&#034;&gt;P A &lt;a 
href=&#034;http://www.bibsonomy.org/author/Pevzner&#034;&gt;Pevzner&lt;/a&gt;  and H &lt;a href=&#034;http://www.bibsonomy.org/author/Tang&#034;&gt;Tang&lt;/a&gt;  and M S &lt;a href=&#034;http://www.bibsonomy.org/author/Waterman&#034;&gt;Waterman&lt;/a&gt;  &lt;/span&gt;&lt;em&gt;Proc Natl Acad Sci USA&lt;/em&gt;&lt;em&gt;98(17):9748--53&lt;/em&gt;&lt;em&gt;Aug2001. &lt;/em&gt;</content:encoded><taxo:topics><rdf:Bag><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Analysis,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Genome,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Alignment,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Software,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Theoretical,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/lactis"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Campylobacter"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/jejuni,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/meningitidis,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Neisseria"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Algorithms,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/DNA,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Sequence"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Contig"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Bacterial,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Lactococcus"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Mapping,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Models,"/></rdf:Bag></taxo:topics><burst:publication><rdf:Description rdf:about="http://www.bibsonomy.org/bibtex/285853a4fe7db3494508d6631d10f55ca/dzerbino"><owl:sameAs rdf:resource="http://www.bibsonomy.org/uri/bibtex/285853a4fe7db3494508d6631d10f55ca/dzerbino"/><rdf:type rdf:resource="http://swrc.ontoware.org/ontology#Article"/><swrc:date>Mon Sep 17 20:19:41 CEST 2007</swrc:date><swrc:journal>Proc Natl Acad Sci 
USA</swrc:journal><swrc:month>Aug</swrc:month><swrc:number>17</swrc:number><swrc:pages>9748--53</swrc:pages><swrc:title>An Eulerian path approach to DNA fragment assembly</swrc:title><swrc:volume>98</swrc:volume><swrc:year>2001</swrc:year><swrc:keywords>Analysis, Genome, Alignment, Software, Theoretical, lactis Campylobacter jejuni, meningitidis, Neisseria Algorithms, DNA, Sequence Contig Bacterial, Lactococcus Mapping, Models, </swrc:keywords><swrc:abstract>For the last 20 years, fragment assembly in DNA sequencing followed the &#034;overlap-layout-consensus&#034; paradigm that is used in all currently available assembly tools. Although this approach proved useful in assembling clones, it faces difficulties in genomic shotgun assembly. We abandon the classical &#034;overlap-layout-consensus&#034; approach in favor of a new EULER algorithm that, for the first time, resolves the 20-year-old &#034;repeat problem&#034; in fragment assembly. Our main result is the reduction of the fragment assembly to a variation of the classical Eulerian path problem that allows one to generate accurate solutions of large-scale sequencing problems. EULER, in contrast to the CELERA assembler, does not mask such repeats but uses them instead as a powerful fragment assembly tool.</swrc:abstract><swrc:hasExtraField><swrc:Field swrc:value="98/17/9748" swrc:key="pii"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="11504945" swrc:key="pmid"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="17" swrc:key="issue"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="Department of Computer Science and Engineering, University of California, San Diego, La Jolla, USA." 
swrc:key="affiliation"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="English" swrc:key="language"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="papers://055852FE-1648-42FE-91D0-8CA474D2B905/Paper/p15" swrc:key="uri"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="file://localhost/Users/danielzerbino/Documents/Papers/2001/Pevzner/Proc%20Natl%20Acad%20Sci%20USA%202001%20Pevzner.pdf" swrc:key="url"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="10.1073/pnas.171285098" swrc:key="doi"/></swrc:hasExtraField><swrc:author><rdf:Seq><rdf:_1><swrc:Person swrc:name="P A Pevzner"/></rdf:_1><rdf:_2><swrc:Person swrc:name="H Tang"/></rdf:_2><rdf:_3><swrc:Person swrc:name="M S Waterman"/></rdf:_3></rdf:Seq></swrc:author></rdf:Description></burst:publication></item><item rdf:about="http://www.bibsonomy.org/bibtex/26fefee7c3e5e1c27b6a90750a6c4c153/dzerbino"><title>Using guide trees to construct multiple-sequence evolutionary HMMs</title><link>http://www.bibsonomy.org/bibtex/26fefee7c3e5e1c27b6a90750a6c4c153/dzerbino</link><dc:creator>dzerbino</dc:creator><dc:date>2007-09-17T20:19:41+02:00</dc:date><dc:subject>Regulation, Homology Chains, Software, Profiling, Alignment, Molecular, Gene Statistical, DNA, Markov Expression Sequence Protein, Genetic, Analysis, Models, Evolution, Cluster Algorithms, </dc:subject><content:encoded>&lt;span style=&#034;color:#555555;&#034;&gt;I &lt;a href=&#034;http://www.bibsonomy.org/author/Holmes&#034;&gt;Holmes&lt;/a&gt;  &lt;/span&gt;&lt;em&gt;Bioinformatics&lt;/em&gt;&lt;em&gt;Jan2003. 
&lt;/em&gt;</content:encoded><taxo:topics><rdf:Bag><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Regulation,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Homology"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Chains,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Software,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Profiling,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Alignment,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Molecular,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Gene"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Statistical,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/DNA,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Markov"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Expression"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Sequence"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Protein,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Genetic,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Analysis,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Models,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Evolution,"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Cluster"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/Algorithms,"/></rdf:Bag></taxo:topics><burst:publication><rdf:Description rdf:about="http://www.bibsonomy.org/bibtex/26fefee7c3e5e1c27b6a90750a6c4c153/dzerbino"><owl:sameAs rdf:resource="http://www.bibsonomy.org/uri/bibtex/26fefee7c3e5e1c27b6a90750a6c4c153/dzerbino"/><rdf:type rdf:resource="http://swrc.ontoware.org/ontology#Article"/><swrc:date>Mon Sep 17 20:19:41 CEST 2007</swrc:date><swrc:journal>Bioinformatics</swrc:journal><swrc:month>Jan</swrc:month><swrc:pages>i147--57</swrc:pages><swrc:title>Using guide trees to construct multiple-sequence evolutionary HMMs</swrc:title><swrc:volume>19 Suppl 1</swrc:volume><swrc:year>2003</swrc:year><swrc:keywords>Regulation, Homology Chains, Software, Profiling, 
Alignment, Molecular, Gene Statistical, DNA, Markov Expression Sequence Protein, Genetic, Analysis, Models, Evolution, Cluster Algorithms, </swrc:keywords><swrc:abstract>MOTIVATION: Score-based progressive alignment algorithms do dynamic programming on successive branches of a guide tree. The analogous probabilistic construct is an Evolutionary HMM. This is a multiple-sequence hidden Markov model (HMM) made by combining transducers (conditionally normalised Pair HMMs) on the branches of a phylogenetic tree. METHODS: We present general algorithms for constructing an Evolutionary HMM from any Pair HMM and for doing dynamic programming to any Multiple-sequence HMM. RESULTS: Our prototype implementation, Handel, is based on the Thorne-Kishino-Felsenstein evolutionary model and is benchmarked using structural reference alignments.</swrc:abstract><swrc:hasExtraField><swrc:Field swrc:value="12855451" swrc:key="pmid"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="Department of Statistics, University of Oxford. 1 South Parks Road, Oxford OX1 3TG, UK." swrc:key="affiliation"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="English" swrc:key="language"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="papers://055852FE-1648-42FE-91D0-8CA474D2B905/Paper/p6" swrc:key="uri"/></swrc:hasExtraField><swrc:hasExtraField><swrc:Field swrc:value="file://localhost/Users/danielzerbino/Documents/Papers/2003/Holmes/Bioinformatics%202003%20Holmes.pdf" swrc:key="url"/></swrc:hasExtraField><swrc:author><rdf:Seq><rdf:_1><swrc:Person swrc:name="I Holmes"/></rdf:_1></rdf:Seq></swrc:author></rdf:Description></burst:publication></item></rdf:RDF>