<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>gb-2009-10-2-401</ui>
   <ji>GBJ</ji>
   <fm>
      <dochead>Correspondence</dochead>
      <bibl>
         <title>
            <p>Annotations for all by all - the BioSapiens network</p>
         </title>
         <aug>
            <au ca="yes" id="A1">
               <snm>Thornton</snm>
               <fnm>Janet</fnm>
               <insr iid="I1"/>
               <email>thornton@ebi.ac.uk</email>
            </au>
            <au id="A2" type="on_behalf">
               <cnm>the BioSapiens Network</cnm>
               <insr iid="I1"/>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>European Bioinformatics Institute, Hinxton CB10 1SD, UK</p>
            </ins>
         </insg>
         <source>Genome Biology</source>
         <issn>1465-6906</issn>
         <pubdate>2009</pubdate>
         <volume>10</volume>
         <issue>2</issue>
         <fpage>401</fpage>
         <url>http://genomebiology.com/2009/10/2/401</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">19232072</pubid>
               <pubid idtype="doi">10.1186/gb-2009-10-2-401</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <pub>
            <date>
               <day>10</day>
               <month>02</month>
               <year>2009</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2009</year>
         <collab>BioMed Central Ltd</collab>
      </cpyrt>
      <shorttitle>
         <p>Annotations for all by all - the BioSapiens network</p>
      </shorttitle>
      <shortabs>
         <p>The BioSapiens network has developed a distributed infrastructure for genome and proteome annotation by laboratories anywhere in the world.</p>
      </shortabs>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <p>The BioSapiens network has developed a distributed infrastructure for genome and proteome annotation by laboratories anywhere in the world.</p>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification id="BioSapiens" subtype="theme_series_title" type="BMC">BioSapiens</classification>
         <classification id="BioSapiens" subtype="theme_series_editor" type="BMC"/>
         <classification id="30010002" subtype="man_spc_id" type="BMC">Bioinformatics</classification>
         <classification id="30010009" subtype="man_spc_id" type="BMC">Genetics</classification>
         <classification id="30010010" subtype="man_spc_id" type="BMC">Genome studies</classification>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p/>
         </st>
         <p>Over the last five years, the BioSapiens network has developed a distributed infrastructure to facilitate the combined annotation of genomes and proteomes by laboratories scattered throughout Europe. In a series of four review articles, published in <it>Genome Biology </it><abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr></abbrgrp>, members of the consortium have collaborated to provide an overview of current methods and challenges for the future.</p>
         <p>Thousands of completed genomes are now in the public domain, and with the second revolution in DNA sequencing technology many more will be determined. However, a DNA sequence is merely a string of letters; it must be interpreted in terms of the RNA and proteins that it encodes and the promoter and regulatory regions that control transcription and translation. Annotation can be described as the process of 'defining the biological role of a molecule in all its complexity' and mapping this knowledge onto the relevant gene products encoded by genomes (Figure <figr fid="F1">1</figr>).</p>
         <fig id="F1">
            <title>
               <p>Figure 1</p>
            </title>
            <caption>
               <p>Steps in the analysis and annotation of genomes</p>
            </caption>
            <text>
               <p>Steps in the analysis and annotation of genomes.</p>
            </text>
            <graphic file="gb-2009-10-2-401-1"/>
         </fig>
         <p>The main objective of BioSapiens, a Network of Excellence funded by the European Commission, is to provide an infrastructure and tools to support a large-scale, concerted effort to annotate genome and proteome data by laboratories distributed around Europe. The network brought together 26 laboratories in Europe to create a Virtual Institute for Genome Annotation, divided into nodes, each focused on one aspect of genome annotation. It provides a focus for annotation and, through the organization of meetings and workshops, encourages cooperation rather than duplication of effort. All annotations generated are available in the public domain and are easily accessible through a single web portal <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>.</p>
         <p>The review by Harrow <it>et al</it>. <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> tackles the challenge of identifying protein-coding genes from genomic sequences. Even the concept of a 'gene' is under revision. The review focuses on the strategies being applied to delineate a number of reference human gene sets - the ones most widely used by researchers in biology - and to assess their quality and completeness. Once the genes are defined, the next challenge is to unravel how regulatory information is encoded in the genome. Gene-expression data has illuminated the consequences of transcriptional activation and propelled the quest to find common regulatory sequences in coexpressed groups of genes. Vingron <it>et al</it>. <abbrgrp><abbr bid="B2">2</abbr></abbrgrp> attempt to summarize progress in integrating these approaches for the purpose of identifying regulatory sequence elements and their function. The other two reviews focus on annotating the proteins and their functions. As reviewed by Juncker <it>et al</it>. <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>, these tasks include identifying functionally important residues, such as those involved in catalysis or binding, and predicting post-translational modifications and cellular localization. Finally, Loewenstein <it>et al</it>. <abbrgrp><abbr bid="B4">4</abbr></abbrgrp> show how both sequence and structural data can be used to illuminate the function of the protein by recognizing a homolog. A recent trend is that many prediction tools are combined in complex workflows and pipelines that facilitate the analysis of feature combinations and use a variety of data and methods.</p>
         <p>A key to integrated annotation is the ability to combine annotations of different types from different laboratories. Within BioSapiens, the Distributed Annotation System (DAS) is used as a lightweight data-integration infrastructure. Originally developed by Dowell <it>et al</it>. <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> for genomic sequences, DAS defines a framework for the annotation of reference sequences by multiple independent sites. The DAS concept was extended <abbrgrp><abbr bid="B7">7</abbr></abbrgrp> from genomic sequences to protein sequences, structures, and protein interactions. DAS clients such as DASTY <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr></abbrgrp> now visualize the results of many different approaches for functional protein annotation in a consistent framework. One consequence of this extension was the need to develop an ontology for annotating sequences <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>, so that annotations from different laboratories are consistent.</p>
         <p>This infrastructure is open to all, allowing any laboratory to generate its own annotations for proteins or genes and to view its results in the light of annotations derived in other laboratories. More detail is available in a book written by the consortium <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>.</p>
      </sec>
      <sec>
         <st>
            <p>Author information</p>
         </st>
         <p>Members of the BioSapiens Network: Janet Thornton, Ewan Birney, Alvis Brazma, Rolf Apweiler, Kim Henrick, European Bioinformatics Institute, Hinxton CB10 1SD, UK; Peer Bork, European Molecular Biology Laboratory, D-69117 Heidelberg, Germany; Jacques van Helden, BiGRe - Universit&#233; Libre de Bruxelles, Campus Plaine, Bvd du Triomphe - CP263, B-1050 Bruxelles, Belgium; Alfonso Valencia, Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Melchor Fern&#225;ndez Almagro, 3, E-28029, Madrid, Spain; Roderic Guig&#243;, Centre de Regulaci&#243; Gen&#242;mica, Institut Municipal d'Investigaci&#243; M&#232;dica, Universitat Pompeu Fabra, E-08003 Barcelona, Catalonia, Spain; Richard Durbin, Tim Hubbard, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK; Thomas Lengauer, Max-Planck-Institut f&#252;r Informatik, 66123 Saarbr&#252;cken, Germany; Martin Vingron, Computational Molecular Biology, Max-Planck-Institut f&#252;r molekulare Genetik, Ihnestrasse 73, D-14195 Berlin, Germany; Dmitrij Frishman, Helmholtz Zentrum, German Research Center for Environmental Health, Munich 85764, Germany; Michal Linial, Department of Biological Chemistry, The Hebrew University of Jerusalem, Sudarsky Center, Jerusalem 91904, Israel; Anna Tramontano, Department of Biochemical Sciences, University of Rome "La Sapienza", Rome 00185, Italy; Gunnar von Heijne, Center for Biomembrane Research and Stockholm Bioinformatics Center, Department of Biochemistry and Biophysics, Stockholm University, SE-106 91 Stockholm, Sweden; Richard Mott, Bioinformatics and Statistical Genetics, University of Oxford, Wellcome Trust Centre for Human Genetics, Roosevelt Drive, Oxford OX3 7BN, UK; Christine Orengo, Research Department of Structural and Molecular Biology, University College, London WC1E, UK; Gert Vriend, Radboud University Medical Centre, 6500 HB Nijmegen, The Netherlands; Christos Ouzounis, Centre for 
Research and Technology, Hellas (CERTH), Thermi Road, Thessaloniki, Greece; Anne-Lise Veuthey, Swiss Institute of Bioinformatics, rue Michel Servet, CH-1211 Geneva, Switzerland; S&#248;ren Brunak, Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, DK-2800 Lyngby, Denmark; Esko Ukkonen, Helsinki Institute for Information Technology, Helsinki University of Technology and University of Helsinki, 00014 Helsinki, Finland; Stylianos Antonarakis, Department of Genetic Medicine and Development, University of Geneva Medical School and University Hospitals of Geneva, Geneva 1211, Switzerland; L&#225;szl&#243; Patthy, Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, H-1113 Budapest, Hungary; Dietmar Schomburg, Department of Bioinformatics and Biochemistry, Institute for Biochemistry and Biotechnology, Technical University of Braunschweig, Langer Kamp, D-38106 Braunschweig, Germany; Antoine Danchin, Institut Pasteur, rue du Docteur Roux, Paris CEDEX 15, France; Leszek Rychlewski, BioInfoBank Institute, Pozna&#241; Limanowskiego 24A16 60-744, Poland; Vincent Schachter, Genoscope Centre National de Sequencage Institut de genomique, Direction des Sciences du vivant, rue Gaston Cremieux, CP5706 91 057 Evry Cedex, France.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>The BioSapiens project is funded by the European Commission within its FP6 Programme, under the thematic area 'Life sciences, genomics and biotechnology for health', contract number LSHG-CT-2003-503265.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Identifying protein-coding genes in genomic sequences.</p>
            </title>
            <aug>
               <au>
                  <snm>Harrow</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Nagy</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Reymond</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Alioto</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Patthy</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Antonarakis</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Guig&#243;</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2009</pubdate>
            <volume>10</volume>
            <fpage>201</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1186/gb-2009-10-1-201</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Integrating sequence, evolution and functional genomics in regulatory genomics.</p>
            </title>
            <aug>
               <au>
                  <snm>Vingron</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Brazma</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Coulson</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Helden</snm>
                  <fnm>Jv</fnm>
               </au>
               <au>
                  <snm>Manke</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Palin</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Sand</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Ukkonen</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2009</pubdate>
            <volume>10</volume>
            <fpage>202</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1186/gb-2009-10-1-202</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Sequence-based feature prediction and annotation of proteins.</p>
            </title>
            <aug>
               <au>
                  <snm>Juncker</snm>
                  <fnm>AS</fnm>
               </au>
               <au>
                  <snm>Jensen</snm>
                  <fnm>LJ</fnm>
               </au>
               <au>
                  <snm>Pierleoni</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Bernsel</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Tress</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Heijne</snm>
                  <fnm>Gv</fnm>
               </au>
               <au>
                  <snm>Valencia</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Ouzounis</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Casadio</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Brunak</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2009</pubdate>
            <volume>10</volume>
            <fpage>206</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1186/gb-2009-10-2-206</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Protein function annotation by homology-based inference.</p>
            </title>
            <aug>
               <au>
                  <snm>Loewenstein</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Raimondo</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Redfern</snm>
                  <fnm>OC</fnm>
               </au>
               <au>
                  <snm>Watson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Frishman</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Linial</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Orengo</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Thornton</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Tramontano</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2009</pubdate>
            <volume>10</volume>
            <fpage>207</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1186/gb-2009-10-2-207</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>A European virtual institute for genome annotation</p>
            </title>
            <url>http://www.biosapiens.info/</url>
         </bibl>
         <bibl id="B6">
            <title>
               <p>The distributed annotation system</p>
            </title>
            <aug>
               <au>
                  <snm>Dowell</snm>
                  <fnm>RD</fnm>
               </au>
               <au>
                  <snm>Jokerst</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Day</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Eddy</snm>
                  <fnm>SR</fnm>
               </au>
               <au>
                  <snm>Stein</snm>
                  <fnm>L</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2001</pubdate>
            <volume>2</volume>
            <fpage>7</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">11667947</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-2-7</pubid>
                  <pubid idtype="pmcid">58584</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Integrating biological data - the Distributed Annotation System</p>
            </title>
            <aug>
               <au>
                  <snm>Jenkinson</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Albrecht</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Birney</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Blankenburg</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Down</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Finn</snm>
                  <fnm>RD</fnm>
               </au>
               <au>
                  <snm>Hermjakob</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Hubbard</snm>
                  <fnm>TJ</fnm>
               </au>
               <au>
                  <snm>Jimenez</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Jones</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>K&#228;h&#228;ri</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kulesha</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Mac&#237;as</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Reeves</snm>
                  <fnm>GA</fnm>
               </au>
               <au>
                  <snm>Prlic</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2008</pubdate>
            <volume>9 Suppl 8</volume>
            <fpage>S3</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmpid" link="fulltext">18673527</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-9-S8-S3</pubid>
                  <pubid idtype="pmcid">2500094</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Dasty2, an Ajax protein DAS client.</p>
            </title>
            <aug>
               <au>
                  <snm>Jimenez</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Quinn</snm>
                  <fnm>AF</fnm>
               </au>
               <au>
                  <snm>Garcia</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Labarga</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>O'Neill</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Martinez</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Salazar</snm>
                  <fnm>GA</fnm>
               </au>
               <au>
                  <snm>Hermjakob</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2008</pubdate>
            <volume>24</volume>
            <fpage>2119</fpage>
            <lpage>2121</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btn387</pubid>
                  <pubid idtype="pmpid" link="fulltext">18694895</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Dasty2</p>
            </title>
            <url>http://www.ebi.ac.uk/dasty</url>
         </bibl>
         <bibl id="B10">
            <title>
               <p>The Protein Feature Ontology: A Tool for the Unification of Protein Feature Annotations.</p>
            </title>
            <aug>
               <au>
                  <snm>Reeves</snm>
                  <fnm>GA</fnm>
               </au>
               <au>
                  <snm>Eilbeck</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Magrane</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>O'Donovan</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Montecchi-Palazzi</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Harris</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Orchard</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Jimenez</snm>
                  <fnm>RC</fnm>
               </au>
               <au>
                  <snm>Prlic</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hubbard</snm>
                  <fnm>TJP</fnm>
               </au>
               <au>
                  <snm>Hermjakob</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Thornton</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2008</pubdate>
            <volume>24</volume>
            <fpage>2767</fpage>
            <lpage>2772</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btn528</pubid>
                  <pubid idtype="pmpid" link="fulltext">18936051</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <aug>
               <au>
                  <snm>Frishman</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Valencia</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <cnm>(Eds)</cnm>
               </au>
            </aug>
            <source>Modern Genome Annotation. The BioSapiens Network</source>
            <publisher>New York: Springer</publisher>
            <pubdate>2009</pubdate>
         </bibl>
      </refgrp>
   </bm>
</art>
