As a representative eukaryotic example, the non-redundant set of human proteins (NCBI's RefSeq database) was compared using BLAST to a data set containing all proteins from 224 prokaryotic genomes: (a) 24 archaebacteria and (b) 200 eubacteria. In each panel, individual genomes are represented by columns and individual proteins by rows; numbers of proteins are indicated on the left and percentage amino-acid identity by the color scale shown on the right. BLAST hits with an e-value ≤ 10⁻²⁰ and ≥ 20% amino-acid identity were recorded. The percent identity of the best BLAST hit for each human protein in each prokaryote was color coded according to that scale and plotted with MATLAB©. The 31 proteins that were used in the recent tree of life are marked with ticks in column (c). A table containing the numbers, genes, and species underlying the figure is available as additional data file 1.
Dagan and Martin Genome Biology 2006 7:118 doi:10.1186/gb-2006-7-10-118
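The filtering described in the caption (e-value ≤ 10⁻²⁰, ≥ 20% identity, best hit per human protein per genome) can be sketched as follows. This is a minimal illustration, not the authors' pipeline. It assumes the hits are available in BLAST's 12-column tabular layout (identity in column 3, e-value in column 11, bit score in column 12), that "best" means highest bit score, and that a hypothetical `genome_of` dictionary maps each prokaryotic protein ID to its genome; none of these details are stated in the source.

```python
from typing import Dict, Iterable, Tuple

MAX_EVALUE = 1e-20    # e-value threshold given in the caption
MIN_IDENTITY = 20.0   # percent amino-acid identity threshold given in the caption


def best_hit_identity(
    rows: Iterable[str],
    genome_of: Dict[str, str],  # hypothetical mapping: subject protein ID -> genome
) -> Dict[Tuple[str, str], float]:
    """Return {(human protein, genome): percent identity of the best hit}.

    Assumption: rows are tab-separated with 12 columns, and the "best" hit
    is the one with the highest bit score among hits passing both thresholds.
    """
    best: Dict[Tuple[str, str], Tuple[float, float]] = {}
    for row in rows:
        if not row.strip() or row.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = row.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        identity = float(fields[2])
        evalue = float(fields[10])
        bitscore = float(fields[11])
        if evalue > MAX_EVALUE or identity < MIN_IDENTITY:
            continue  # hit fails one of the recording thresholds
        genome = genome_of.get(subject)
        if genome is None:
            continue  # subject protein not assigned to any genome
        key = (query, genome)
        if key not in best or bitscore > best[key][0]:
            best[key] = (bitscore, identity)
    return {key: identity for key, (_, identity) in best.items()}


# Toy data, invented purely for illustration.
example_genomes = {
    "ecoli_p1": "E_coli",
    "ecoli_p2": "E_coli",
    "mjan_p1": "M_jannaschii",
    "mjan_p2": "M_jannaschii",
}
example_rows = [
    "\t".join(["humanA", "ecoli_p1", "45.0", "200", "110", "0", "1", "200", "1", "200", "1e-50", "180.0"]),
    "\t".join(["humanA", "ecoli_p2", "60.0", "100", "40", "0", "1", "100", "1", "100", "1e-30", "120.0"]),
    "\t".join(["humanA", "mjan_p1", "18.0", "150", "123", "0", "1", "150", "1", "150", "1e-25", "90.0"]),
    "\t".join(["humanB", "mjan_p2", "30.0", "180", "126", "0", "1", "180", "1", "180", "1e-21", "95.0"]),
    "\t".join(["humanB", "ecoli_p1", "50.0", "60", "30", "0", "1", "60", "1", "60", "1e-5", "40.0"]),
]
matrix = best_hit_identity(example_rows, example_genomes)
```

Each key of `matrix` corresponds to one cell of the figure (row: human protein, column: genome); pairs with no qualifying hit are simply absent and would be left uncolored when plotted.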