Open Access Research

A de novo assembly of the newt transcriptome combined with proteomic validation identifies new protein families expressed during tissue regeneration

Mario Looso1, Jens Preussner1, Konstantinos Sousounis3, Marc Bruckskotten1, Christian S Michel1, Ettore Lignelli1, Richard Reinhardt2, Sabrina Höffner1, Marcus Krüger1, Panagiotis A Tsonis3*, Thilo Borchardt1* and Thomas Braun1*

Author Affiliations

1 Max-Planck-Institute for Heart and Lung Research, Ludwigstrasse 43, 61231 Bad Nauheim, Germany

2 Max-Planck Genome Centre Cologne, Carl-von-Linné-Weg 10, 50829 Köln, Germany

3 Department of Biology and Center for Tissue Regeneration and Engineering at Dayton, University of Dayton, OH 45469-2320, USA


Genome Biology 2013, 14:R16  doi:10.1186/gb-2013-14-2-r16

Published: 20 February 2013



Notophthalmus viridescens, a urodelian amphibian, is an excellent model organism for studying regenerative processes, but mechanistic insight into the molecular processes driving regeneration has been hindered by the paucity and poor annotation of coding nucleotide sequences. The enormous genome size and the lack of a closely related reference genome have so far prevented assembly of the urodelian genome.


We describe the de novo assembly of the transcriptome of the newt Notophthalmus viridescens and its experimental validation. RNA pools covering embryonic and larval development, different stages of heart, appendage and lens regeneration, as well as a collection of different undamaged tissues, were used to generate sequencing datasets on Sanger, Illumina and 454 platforms. Through a sequential de novo assembly strategy, the hybrid datasets were merged into one comprehensive transcriptome comprising 120,922 non-redundant transcripts with an N50 of 975. From this, 38,384 putative transcripts were annotated, and around 15,000 transcripts were experimentally validated as protein coding by mass spectrometry-based proteomics. Bioinformatic analysis of coding transcripts identified 826 proteins specific to urodeles. Several newly identified proteins establish novel protein families based on the presence of new sequence motifs without counterparts in public databases, while others, which contain known protein domains, extend existing families and also constitute new ones.
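For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch in Python, assuming the common definition: the largest length L such that sequences of length L or longer together cover at least half of the total assembled length. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the study's pipeline or data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Assumes the common definition: walk the lengths from longest to
    shortest and report the length at which the running sum first
    reaches at least half of the total length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running to total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy example (hypothetical lengths, not data from the study).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_lengths))
```

In practice the lengths would be read from the assembled transcript sequences; N50 summarizes contiguity only and says nothing about the correctness of the assembly.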


We demonstrate that our multistep assembly approach allows de novo assembly of the newt transcriptome with an annotation grade comparable to that of well-characterized organisms. Our data provide the groundwork for mechanistic experiments to answer the question of whether urodeles utilize proprietary sets of genes for tissue regeneration.