A novel sodium bicarbonate cotransporter-like gene in an ancient duplicated region: SLC4A9 at 5q31
1 Department of Molecular Biotechnology, University of Washington, Seattle, WA 98195, USA
2 Departments of Medicine and Genetics, University of Washington, Seattle, WA 98195, USA
Genome Biology 2001, 2:research0011-research0011.13 doi:10.1186/gb-2001-2-4-research0011Published: 22 March 2001
Sodium bicarbonate cotransporter (NBC) genes encode proteins that execute coupled Na+ and HCO3- transport across epithelial cell membranes. We report the discovery, characterization, and genomic context of a novel human NBC-like gene, SLC4A9, on chromosome 5q31.
SLC4A9 was initially discovered by genomic sequence annotation and further characterized by sequencing of long-insert cDNA library clones. The predicted protein of 990 amino acids has 12 transmembrane domains and high sequence similarity to other NBCs. The 23-exon gene has 14 known mRNA isoforms. In three regions, mRNA sequence variation is generated by the inclusion or exclusion of portions of an exon. Noncoding SLC4A9 cDNAs were recovered multiple times from different libraries. The 3' untranslated region is fragmented into six alternatively spliced exons and contains expressed Alu, LINE and MER repeats. SLC4A9 has two alternative stop codons and six polyadenylation sites. Its expression is largely restricted to the kidney. In silico approaches were used to characterize two additional novel SLC4A genes and to place SLC4A9 within the context of multiple paralogous gene clusters containing members of the epidermal growth factor (EGF), ankyrin (ANK) and fibroblast growth factor (FGF) families. Seven human EGF-SLC4A-ANK-FGF clusters were found.
The novel sodium bicarbonate cotransporter-like gene SLC4A9 demonstrates abundant alternative mRNA processing. It belongs to a growing class of functionally diverse genes characterized by inefficient highly variable splicing. The evolutionary history of the EGF-SLC4A-ANK-FGF gene clusters involves multiple rounds of duplication, apparently followed by large insertions and deletions at paralogous loci and genome-wide gene shuffling.