A quantitative literature-curated gold standard for kinase-substrate pairs
1 Department of Molecular Genetics, The Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, 160 College Street, Toronto, M5S 3E1, Canada
2 Banting and Best Department of Medical Research, University of Toronto, 112 College Street, Toronto, M5G 1L6, Canada
3 Department of Cell and Systems Biology, University of Toronto, 25 Harbord Street, Toronto, M5S 3G5, Canada
4 Centre for the Analysis of Genome Evolution and Function, University of Toronto, 25 Willcocks Street, Toronto, M5S 3B2, Canada
Genome Biology 2011, 12:R39 doi:10.1186/gb-2011-12-4-r39. Published: 14 April 2011
Additional file 1:
Supplementary tables. Table S1: kinases in Yeast KID. The list of kinases was compiled from the review by Rubenstein and Schmidt. Kinases highlighted in blue were not curated in full. Table S2: distribution of kinase interactions in Yeast KID. Of the 127 kinases in budding yeast, all have been curated for the HTP categories and 108 have been curated for the LTP categories in Yeast KID; the remaining 19 are in progress (highlighted in blue). The mitogen-activated protein kinases (MAPKs) have the highest number of interactions, whereas less well characterized kinases (for example, Rio1) have only a few interactions entered. Table S3: positive training set of curated kinase-substrate pairs. List of bona fide kinase-substrate pairs defined by curators' consensus. PMIDs for all pairs and the types of interaction used for selection are shown.
Format: XLSX Size: 42KB
Additional file 2:
Supplementary figures. Figure S1: Yeast KID user interface. A screenshot of the Yeast KID homepage is shown. Experimental categories are displayed hierarchically and can be queried individually or in combination using the color box (left). Kinases, genes/proteins or PMIDs can be queried either individually or in combination, as single or multiple genes/proteins separated by commas or spaces. For multiple queries, overlapping interactions can be searched using the 'compute gene overlap' and 'compute kinase overlap' functions. The definition of each category and function is displayed by clicking on the small bubble icon for that category. See text for details. Figure S2: hierarchical division of Yeast KID categories. Chart showing 31 experimental categories organized hierarchically in three levels: 1) HTP and LTP categories (green); 2) overall subdivision into genetic, phenotypic, chemical, physical, cell biological or biochemical approaches (blue); 3) specific experimental assays (purple). Figure S3: KID weights of different LTP and HTP experimental categories. Relative contribution of different experimental categories in identifying the positive training kinase-substrate set. The bar graph indicates the contribution of each category to the KID score. Bars marked with a red star are significant when categories are compared with a random assignment of positive classes. The total number of interactions entered in each KID category is also presented. Red, genetic; pink, physical; blue, biochemical; yellow, phenotypic; purple, cell biological; orange, chemical. Figure S4: distribution of kinase substrates in Yeast KID. The graph shows the distribution of kinase targets reported in Yeast KID at the stringent cutoff (P < 0.01). The Cdc28, Cdc5, Snf1 and Pho85 kinases have the largest number of targets in the literature. Thirty-seven curated kinases have no targets in Yeast KID at the stringent cutoff and are not represented on the graph.
Figure S5: assessing the quality of HTP datasets in identifying LTP interactions of the same type. Overlap of reported HTP interactions with the respective LTP interactions from equivalent assays. HTP assays enriched for their LTP counterparts are shown in bold. P-values indicate significance. Figure S6: KID schema. The back-end is managed through a customized user control panel that uses a relational database schema to enforce consistent entries. Curated interactions are compiled in a single interaction table that is used to calibrate the contribution score for each category and the overall KID score. Whole-database backups are also generated, including a log of curator modifications. The front-end of the database queries the relational back-end schema via Ajax, allowing rapid feedback of requested information. Queries submitted through the customized query system (which allows multiple inputs) are then parsed by the server to find the appropriate interactions to display on the KID interface. KID output can be downloaded in three different formats for further data manipulation.
Format: PDF Size: 511KB
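The legend for Figure S5 reports P-values for the overlap between HTP interactions and their LTP counterparts but does not name the test used. As an illustration only, the sketch below assumes a one-sided hypergeometric test, a common choice for this kind of overlap enrichment. The function name `overlap_p_value`, its parameters and the example counts are hypothetical and do not come from Yeast KID.

```python
from math import comb


def overlap_p_value(population: int, ltp_total: int, htp_total: int, overlap: int) -> float:
    """Upper-tail hypergeometric probability P(X >= overlap).

    ASSUMPTION: the test and all names here are illustrative; the source
    does not state how the Figure S5 P-values were computed.

    population: number of candidate kinase-protein pairs considered
    ltp_total:  pairs supported by the LTP assay
    htp_total:  pairs reported by the equivalent HTP assay
    overlap:    pairs reported by both
    """
    if not (0 <= ltp_total <= population and 0 <= htp_total <= population):
        raise ValueError("set sizes must lie between 0 and the population size")

    max_overlap = min(ltp_total, htp_total)
    if overlap > max_overlap:
        return 0.0  # an overlap this large cannot occur

    # Sum the probability of every overlap at least as large as the observed one.
    # math.comb returns 0 when k > n, so impossible terms contribute nothing.
    tail = sum(
        comb(ltp_total, i) * comb(population - ltp_total, htp_total - i)
        for i in range(max(overlap, 0), max_overlap + 1)
    )
    return tail / comb(population, htp_total)


# Hypothetical counts, chosen only to show the call.
p = overlap_p_value(population=1000, ltp_total=50, htp_total=40, overlap=10)
print(f"P(overlap >= 10) = {p:.3g}")
```

A small P-value under this assumption would indicate that the HTP assay recovers more LTP interactions than expected by chance, which is the sense in which the legend describes an assay as enriched.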