We created this webpage in Nov 2016. Since the end of 2019 we have been gradually updating and extending the information. In July 2020 we updated the URLs and licensing terms of most of the resources. Most of the "pypath methods" and the years of releases are out of date. We will keep updating these; if you find any wrong information, please notify us at omnipathdb@gmail.com. For updates of the OmniPath database content, please refer to our archive.
This collection is a byproduct of the development of OmniPath, a database built from more than 100 resources. Initially OmniPath focused on literature curated activity flow networks. Today it covers a much broader range of molecular interaction data, and besides its network database OmniPath has four other databases: enzyme-PTM relationships, protein complexes, molecular annotations (function, localization, structure, etc.) and intercellular communication roles. The "omnipath" dataset of the network database follows the principles of the initial release of OmniPath, focusing on high quality, manually curated signaling pathways. The descriptions here cite the relevant sentences about the curation protocols from the original articles and webpages. URLs pointing to the articles and the webpages, and some additional metadata, are provided where available. The resources with a green title are included by default in OmniPath. pypath methods are listed where available; to learn more, please see the pypath documentation.
How did we collect the license information? We searched for license information in the main, About, Download and FAQ sections of the webpages, and ran Google searches for the database name and license. Where we could not find anything about licensing, we assumed no license. Unfortunately, due to today's restrictive copyright legislation, users don't have the freedom to use, modify and redistribute the data without a license explicitly granting these rights to them, despite the clear intention of the authors to make their data public and statements on the webpage like "free to use" or "available for download". In these cases we contacted the authors for permission to redistribute their data.
Category || Subcategory >>> Literature curated || Complexes
We retrieved all Biological Units from the PDB (October 2005), which are the protein complexes in their physiological state, according to the PDB curators. [...] After applying these filters, we obtained 21,037 structures, which we use throughout this study.
Category || Subcategory >>> Undefined || Undefined
Contact:
License: GNU General Public License version 2 (GPLv2)
Category || Subcategory >>> Literature curated || Reaction
Released in years: 2008, 2014, 2015, 2016
Created by Curie
Contact:
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Taxons: Human
The map curator studies the body of literature dedicated to the biological process or molecular mechanism of interest. The initial sources of information are the major review articles from high-impact journals that represent the consensus view on the studied topic and also provide a list of original references. The map curator extracts information from review papers and represents it in the form of biochemical reactions in CellDesigner. This level of details reflects the ‘canonical’ mechanisms. Afterwards, the curator extends the search and analyses original papers from the list provided in the review articles and beyond. This information is used to enrich the map with details from the recent discoveries in the field. The rule for confident acceptance and inclusion of a biochemical reaction or a process is the presence of sufficient evidences from more than two studies, preferably from different scientific groups. The content of ACSN is also verified and compared with publicly available databases such as REACTOME, KEGG, WikiPathways, BioCarta, Cell Signalling and others to ensure comprehensive representation of consensus pathways and links on PMIDs of original articles confirmed annotated molecular interactions.
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Category || Subcategory >>> Undefined || Undefined
Contact:
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Literature curated || Activity flow
Released in years: 2012, 2015
Created by Tokyo Bioinf
Contact:
License: Creative Commons Attribution 3.0 International (CC BY 3.0)
We collected 123 review articles related to AD accessible from PubMed. We then manually curated these review articles, and have built an AD pathway map by using CellDesigner. Molecules are distinguished by the following types: proteins, complexes, simple molecules, genes, RNAs, ions, degraded products, and phenotypes. Gene symbols are pursuant to the HGNC symbols. Reactions are also distinguished by the following categories: state transition, transcription, translation, heterodimer association, dissociation, transport, unknown transition, and omitted transition. All the reactions have evidences to the references in PubMed ID using the MIRIAM scheme. All the references used for constructing the AlzPathway are listed in the ‘References for AlzPathway’. Cellular types are distinguished by the followings: neuron, astrocyte, and microglial cells. Cellular compartments are also distinguished by the followings: brain blood barrier, presynaptic, postsynaptic, and their inner cellular localizations.
References can be fetched only from the XML formats, not from the SIF file. Besides approx. 150 protein-protein interactions, it also contains interactions of many small molecules, denoted by PubChem IDs.
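To illustrate why a SIF file cannot carry references: each SIF line holds only a source node, an interaction type and one or more targets. The following is a minimal sketch, assuming the common Cytoscape-style SIF layout. The sample lines, the node names and the `pubchem:` prefix convention are made up for illustration and are not taken from the AlzPathway files.

```python
def parse_sif(lines):
    """Yield (source, interaction_type, target) triples from SIF lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Prefer tabs when present, otherwise split on any whitespace.
        fields = line.split('\t') if '\t' in line else line.split()
        if len(fields) < 3:
            # A lone node without any interaction carries no edge.
            continue
        source, itype, targets = fields[0], fields[1], fields[2:]
        for target in targets:
            yield source, itype, target


def is_small_molecule(node_id):
    """Hypothetical convention: small molecules carry a 'pubchem:' prefix."""
    return node_id.lower().startswith('pubchem:')


# Invented sample lines; identifiers are placeholders.
sample = [
    'PROT_A\tstate_transition\tPROT_B',
    'pubchem:0000\ttransport\tPROT_C\tPROT_D',
    'LONELY_NODE',
]

edges = list(parse_sif(sample))
protein_edges = [
    e for e in edges
    if not is_small_molecule(e[0]) and not is_small_molecule(e[2])
]
```

Nothing in a triple like this has room for a PubMed ID, which is why the XML formats are needed to recover the references.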
Data integration in pypath: static
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Literature curated || Activity flow
Released in years: 2014
Created by Korcsmaros Group
Contact:
License: Creative Commons Attribution-ShareAlike 3.0 International (CC BY-SA 3.0)
Taxons: Human
From Korcsmaros 2010: ... we first listed signaling proteins and interactions from reviews and then added further signaling interactions of the listed proteins. We used reviews as a starting point, manually looked up interactions three times, and manually searched for interactions of known signaling proteins with no signaling interactions so far in the database.
Category || Subcategory >>> High-throughput || Interaction
Created by Shaw Lab
Contact:
License: Creative Commons Attribution 2.5 International (CC BY 2.5)
Taxons: Human
In order to expand the interaction dataset, we added relevant direct protein–protein interactions from currently available human protein–protein interaction networks (Rual et al., 2005; Stelzl et al., 2005). We also searched public databases, including BIND (Bader et al., 2003), DIP (Xenarios et al., 2002), HPRD (Peri et al., 2003), MINT (Zanzoni et al., 2002), and MIPS (Pagel et al., 2005), to identify literature-based binary interactions involving the 54 ataxia-associated baits and the 561 interacting prey proteins. We identified 4796 binary protein–protein interactions for our Y2H baits and prey proteins (Table S4) and incorporated them in the Y2H protein–protein interaction map (Figures 4A–4C).
The Ataxia network doesn't contain original manual curation effort. The integrated data are very old.
Category || Subcategory >>> Literature curated || Activity flow
Created by Wang Group
Contact:
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Direct data import from: BioCarta, CA1
To construct the human cellular signalling network, we manually curated signalling pathways from literature. The signalling data source for our pathways is the BioCarta database (http://www.biocarta.com/genes/allpathways.asp), which, so far, is the most comprehensive database for human cellular signalling pathways. Our curated pathway database recorded gene names and functions, cellular locations of each gene and relationships between genes such as activation, inhibition, translocation, enzyme digestion, gene transcription and translation, signal stimulation and so on. To ensure the accuracy and the consistency of the database, each referenced pathway was cross-checked by different researchers and finally all the documented pathways were checked by one researcher. In total, 164 signalling pathways were documented (supplementary Table 2). Furthermore, we merged the curated data with another literature-mined human cellular signalling network. As a result, the merged network contains nearly 1100 proteins (SupplementaryNetworkFile). To construct a signalling network, we considered relationships of proteins as links (activation or inactivation as directed links and physical interactions in protein complexes as neutral links) and proteins as nodes.
Category || Subcategory >>> Undefined || Undefined
Contact:
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Nucleic Acids Research Open Access (NAR Open Access)
"The Nucleic Acids Research Database issue requires the databases to be freely usable, hence any resource published in this journal deemed to be free for both commercial and non-profit use."
Category || Subcategory >>> Literature curated || Activity flow
Released in years: 2006
Created by Community
Contact:
License: BioCarta License
Taxons: Human
Community built pathway database based on expert curation.
This resource includes a huge number of pathways, each curated by experts from a few reviews. The data is not available for download from the original webpage, only second hand, for example from NCI-PID, in NCI-XML format. However, these files don't contain any references, which makes the use of the BioCarta dataset problematic. Also, some pathways were reviewed a long time ago and are possibly outdated. The company and the website look like they were abandoned around 2003-2006.
Category || Subcategory >>> High-throughput || Interaction
Released in years: 2003, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019
Created by Tyers Lab
Contact:
License: MIT License (MIT)
Category || Subcategory >>> Literature curated || Activity flow
Released in years: 2005
Created by Iyengar Lab
Contact:
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Taxons: Human, Mouse
Nodes: 545, Edges: 1259
We used published research literature to identify the key components of signaling pathways and cellular machines, and their binary interactions. Most components (~80%) have been described in hippocampal neurons or related neuronal cells. Other components are from other cells, but are included because they are key components in processes known to occur in hippocampal neurons, such as translation. We then established that these interactions were both direct and functionally relevant. All of the connections were individually verified by at least one of the authors of this paper by reading the relevant primary paper(s). We developed a system made of 545 components (nodes) and 1259 links (connections). We used arbitrary but consistent rules to sort components into various groups. For instance, transcription factors are considered as part of the transcriptional machinery, although it may also be equally valid to consider them as the most downstream component of the central signaling network. Similarly the AMPA receptor-channel (AMPAR) is considered part of the ion channels in the electrical response system since its activity is essential to defining the postsynaptic response, although it binds to and is activated by glutamate, and hence can be also considered a ligand gated receptor-channel in the plasma membrane. The links were specified by two criteria: function and biochemical mechanism. Three types of functional links were specified. This follows the rules used for representation of pathways in Science’s STKE (S1). Links may be activating, inhibitory or neutral. Neutral links do not specify directionality between components, and are mostly used to represent scaffolding and anchoring undirected or bidirectional interactions. The biochemical specification includes defining the reactions as non-covalent binding interactions or enzymatic reactions. Within the enzymatic category, reactions were further specified as phosphorylation, dephosphorylation, hydrolysis, etc.
These two criteria for specification are independent and were defined for all interactions. For the analyses in this study we only used the functional criteria: activating, inhibitory or neutral specifications. We chose papers that demonstrated direct interactions that were supported by either biochemical or physiological effects of the interactions. From these papers we identified the components and interactions that make up the system we analyzed. During this specification process we did not consider whether these interactions would come together to form higher order organizational units. Each component and interaction was validated by a reference from the primary literature (1202 papers were used). A list of authors who read the papers to validate the components and interactions is provided under authors contributions.
One of the earliest manually curated networks, available in easily accessible tabular format, including UniProt IDs and PubMed references.
Data integration in pypath: dynamic
Category || Subcategory >>> Literature curated || Interaction
Created by Bader Lab
Contact:
License: Creative Commons Attribution 2.5 International (CC BY 2.5)
Taxons: Human, Mouse, Rat
Manually curated data, unpublished. A team of M.Sc. and Ph.D. biologists at the Institute of Bioinformatics in Bangalore, India read original research papers and hand-entered the pathway data into our database. The quality of the Cancer Cell Map pathways is very high. Half of the pathways were reviewed by experts at Memorial Sloan-Kettering Cancer Center and were found to contain only a few errors, which were subsequently fixed. A pathway is a collection of all genes/proteins that have been described as pathway members in any publication and all the interactions between them that can be found described in the literature.
One of the earliest manually curated datasets, now only available second hand, e.g. from PathwayCommons. It is included in many other resources and contains binary interactions with PubMed references.
Data integration in pypath: dynamic
Category || Subcategory >>> Undefined || Undefined
License: Unspecified Non-commercial Redistributable (Unspecified NC-SA)
"This is not a real license but a temporary label meaning either that we are working on clarifying the licensing terms (because the copyright holders hasn't specified it anywhere) or we as OmniPath got a permission from the copyright holders to redistribute the data from the resource."
Category || Subcategory >>> Undefined || Undefined
Contact:
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Category || Subcategory >>> Literature curated || Pathway
Contact:
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
Contact:
License: Artistic License 2.0 (Artistic-2.0)
Category || Subcategory >>> Undefined || Undefined
Created by Bader Lab
Contact:
License: CellCellInteractions License
Category || Subcategory >>> Undefined || Undefined
Contact:
License: GNU General Public License version 3 (GPLv3)
Category || Subcategory >>> Undefined || Undefined
Contact:
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Category || Subcategory >>> Literature curated || Complexes, Annotations, Network, Intercell
License: MIT License (MIT)
Heteromeric receptors and ligands (that is, proteins that are complexes of multiple gene products) were annotated by reviewing the literature and Uniprot descriptions.
Category || Subcategory >>> Undefined || Undefined
License: No license (No license)
Category || Subcategory >>> Undefined || Undefined
Contact:
License: GNU General Public License version 3 (GPLv3)
Category || Subcategory >>> Undefined || Undefined
Contact:
License: Creative Commons Attribution 3.0 International (CC BY 3.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution-ShareAlike 3.0 International (CC BY-SA 3.0)
Category || Subcategory >>> Undefined || Undefined
License: EMBL-EBI terms of use (EMBL-EBI)
Category || Subcategory >>> Undefined || Undefined
License: Constituting databases carry their own licenses (Composite)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Literature curated, Prediction || Complexes
License: Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Literature-based complex information was retrieved from databases such as CORUM, PINdb, CYC2008, GO, KEGG, and Drosophila AP-MS pulldown complexes (table S1). With the exception of protein complexes that are annotated by GO, all the other complexes were mapped across human, Drosophila, and yeast. [...] We applied CFinder to identify protein complexes from human, Drosophila, and yeast PPI networks. We filtered the PPI networks using co-expression values or colocalization information to remove low-confidence PPIs
Category || Subcategory >>> Literature curated || Complexes
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
The Complex Portal is a manually curated, encyclopaedic database that collates and summarizes information on stable, macromolecular complexes of known function.
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Category || Subcategory >>> Undefined || Undefined
Contact:
License: MIT License (MIT)
Category || Subcategory >>> Literature curated || Activity flow
Released in years: 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015
Contact:
License: Constituting databases carry their own licenses (Composite)
Taxons: Human, Mouse, Yeast
Interaction data in ConsensusPathDB currently originates from 12 interaction databases and comprises physical interactions, biochemical reactions and gene regulations. Importantly, the source of physical entities and interactions is always recorded, which allows linking to the original data in the source database.
ConsensusPathDB comprises data from 32 resources. The format is an easy to use, tab delimited text file with UniProtKB names and PubMed IDs. However, the dataset is extremely large, and several databases containing HTP data are included.
Category || Subcategory >>> Literature curated || Complexes
Released in years: 2007, 2009
Contact:
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Taxons: Human, Mouse, Rat
The CORUM database is a collection of experimentally verified mammalian protein complexes. Information is manually derived by critical reading of the scientific literature from expert annotators. Information about protein complexes includes protein complex names, subunits, literature references as well as the function of the complexes. In order to provide a high-quality dataset of mammalian protein complexes, all entries are manually created. Only protein complexes which have been isolated and characterized by reliable experimental evidence are included in CORUM. To be considered for CORUM, a protein complex has to be isolated as one molecule and must not be a construct derived from several experiments. Also, artificial constructs of subcomplexes are not taken into account. Since information from high-throughput experiments contains a significant fraction of false-positive results, this type of data is excluded. References for relevant articles were mainly found in general review articles, cross-references to related protein complexes within analysed literature and comments on referenced articles in UniProt. In order to obtain a high quality and reliability of the data, we only include protein complexes that have been isolated and characterized in individual experiments. [...] Experienced biocurators critically extract information from the scientific literature and transfer it into CORUM using established vocabularies and stable identifiers from well-known resources such as UniProt and Gene Ontology.
CORUM is not part of the OmniPath pathways network, because we did not apply any complex expansion. It is part of the OmniPath protein complex database.
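For readers unfamiliar with the term, "complex expansion" means turning the member list of a complex into binary interactions. Below is a minimal sketch of two common variants, matrix and spoke expansion, shown only to illustrate what was deliberately not done for CORUM. The function names and the example identifiers are hypothetical and are not part of pypath.

```python
from itertools import combinations


def matrix_expansion(members):
    """All unordered pairs of members: every subunit linked to every other."""
    return list(combinations(sorted(set(members)), 2))


def spoke_expansion(bait, members):
    """Link one chosen member (the 'bait') to each of the other members."""
    return [(bait, m) for m in sorted(set(members)) if m != bait]


complex_members = ['P1', 'P2', 'P3', 'P4']  # hypothetical identifiers

matrix_pairs = matrix_expansion(complex_members)
spoke_pairs = spoke_expansion('P1', complex_members)
```

With n subunits, matrix expansion yields n(n-1)/2 pairs, many of which need not be direct contacts, so expanded pairs should not be read as evidence of direct binding.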
Data integration in pypath: dynamic
Category || Subcategory >>> Undefined || Undefined
License: COSMIC License
Category || Subcategory >>> Undefined || Undefined
Contact:
License: No license (No license)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Literature curated || Activity flow
Released in years: 2005, 2015
Created by CST
Contact:
License: No license (No license)
On these resource pages you can find signaling pathway diagrams, research overviews, relevant antibody products, publications, and other research resources organized by topic. The pathway diagrams associated with these topics have been assembled by CST scientists and outside experts to provide succinct and current overviews of selected signaling pathways.
The pathway diagrams are based on good quality, manually curated data, probably from review articles. However, they are available only in graphical (PDF and InDesign) formats. There is no programmatic way to obtain the interactions and references, as was confirmed by the authors, whom I contacted by mail. Wang's HumanSignalingNetwork includes the data from this resource, which has probably been entered manually, but Wang's data doesn't have source annotations, despite being compiled from multiple sources. The start date of this project was estimated using the Internet Archive's Wayback Machine.
Category || Subcategory >>> Undefined || Undefined
License: Unspecified Non-commercial Redistributable (Unspecified NC-SA)
"This is not a real license but a temporary label meaning either that we are working on clarifying the licensing terms (because the copyright holders hasn't specified it anywhere) or we as OmniPath got a permission from the copyright holders to redistribute the data from the resource."
Category || Subcategory >>> Literature curated || Activity flow
Created by Wang Group
Contact:
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Taxons: Human
Nodes: 1528, Edges: 4249
Direct data import from: Awan2007, CancerCellMap
To build up the human signaling network, we manually curated the signaling molecules (most of them are proteins) and the interactions between these molecules from the most comprehensive signaling pathway database, BioCarta (http://www.biocarta.com/). The pathways in the database are illustrated as diagrams. We manually recorded the names, functions, cellular locations, biochemical classifications and the regulatory (including activating and inhibitory) and interaction relations of the signaling molecules for each signaling pathway. To ensure the accuracy of the curation, all the data have been crosschecked four times by different researchers. After combining the curated information with another literature‐mined signaling network that contains ∼500 signaling molecules (Ma'ayan et al, 2005)[this is the CA1], we obtained a signaling network containing ∼1100 proteins (Awan et al, 2007). We further extended this network by extracting and adding the signaling molecules and their relations from the Cancer Cell Map (http://cancer.cellmap.org/cellmap/), a database that contains 10 manually curated signaling pathways for cancer. As a result, the network contains 1634 nodes and 5089 links that include 2403 activation links (positive links), 741 inhibitory links (negative links), 1915 physical links (neutral links) and 30 links whose types are unknown (Supplementary Table 9). To our knowledge, this network is the biggest cellular signaling network at present.
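The link counts quoted above are internally consistent, as a quick check shows. The dictionary keys are our own labels; the numbers are the ones given in the quotation.

```python
# Link counts by type, as stated in the quoted description.
link_counts = {
    'activation': 2403,
    'inhibition': 741,
    'physical': 1915,
    'unknown': 30,
}

# The four categories should add up to the reported 5089 links.
total_links = sum(link_counts.values())

# Share of links that carry a sign (activation or inhibition).
signed_fraction = (
    link_counts['activation'] + link_counts['inhibition']
) / total_links
```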
Excellent signaling network with good topology for all those who don't mind using data of unknown origin. Supposedly a manually curated network, but the data files don't include article references. Merging the CA1 network with CancerCellMap and BioCarta (also without references) makes the origin of the data untraceable.
Category || Subcategory >>> Undefined || Undefined
Contact:
License: CytoSig License
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Literature curated || Ptm
Released in years: 2005, 2009, 2012, 2015
Created by ISBLab
Contact:
License: Nucleic Acids Research Open Access (NAR Open Access)
"The Nucleic Acids Research Database issue requires the databases to be freely usable, hence any resource published in this journal deemed to be free for both commercial and non-profit use."
Taxons: Human, Metazoa, Bacteria, Plants, Yeast
Due to the inaccessibility of database contents in several online PTM resources, a total 11 biological databases related to PTMs are integrated in dbPTM, including UniProtKB/SwissProt, version 9.0 of Phospho.ELM, PhosphoSitePlus, PHOSIDA, version 6.0 of O-GLYCBASE, dbOGAP, dbSNO, version 1.0 of UbiProt, PupDB, version 1.1 of SysPTM and release 9.0 of HPRD.
Data integration in pypath: dynamic
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
Category || Subcategory >>> Literature curated || Activity flow
Released in years: 2011, 2012
Created by Myoungji University
Contact:
License: Nucleic Acids Research Open Access (NAR Open Access)
"The Nucleic Acids Research Database issue requires the databases to be freely usable, hence any resource published in this journal deemed to be free for both commercial and non-profit use."
Taxons: Human
Nodes: 99, Edges: 175
The PubMed database was used as the primary source for collecting information and constructing the DD database. After finding synonyms for each of the 99 DD superfamily proteins using UniProtKB and Entrez Gene, we obtained a list of articles using each name of the proteins and its synonyms on a PubMed search, and we selected the articles that contained evidence for physical binding among the proteins denoted. We also manually screened information that was in other databases, such as DIP, IntAct, MINT, STRING and Entrez Gene. All of the 295 articles used for database construction are listed on our database website.
Detailed dataset with many references. Sadly the data can be extracted only by parsing HTML. This is no more difficult than parsing XML formats, but the HTML pages are not intended to be used for this purpose.
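A minimal sketch of what extracting interaction rows from HTML can look like with the Python standard library alone. The table layout, column order and sample values below are invented for illustration; they do not reproduce the actual pages of the resource or the pypath code.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of <td> cells, one list per <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, if any
        self._cell = None  # text fragments of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self._row = []
        elif tag == 'td' and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == 'td' and self._cell is not None:
            self._row.append(''.join(self._cell).strip())
            self._cell = None
        elif tag == 'tr' and self._row is not None:
            # Header rows contain only <th> cells and stay empty.
            if self._row:
                self.rows.append(self._row)
            self._row = None


# Invented page fragment; names and PMIDs are placeholders.
html = """
<table>
  <tr><th>Protein A</th><th>Protein B</th><th>PMID</th></tr>
  <tr><td>PROT_A</td><td>PROT_B</td><td><a href="#">0000001</a></td></tr>
  <tr><td>PROT_A</td><td>PROT_C</td><td>0000002</td></tr>
</table>
"""

parser = TableRowParser()
parser.feed(html)
interactions = [tuple(row) for row in parser.rows]
```

The fragile part of such a parser is its dependence on the page layout: any redesign of the website silently breaks the extraction, which is not the case for a documented exchange format.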
Data integration in pypath: static
Category || Subcategory >>> Literature curated || Post-translational modification
Released in years: 2013, 2014, 2016
Created by EMBL & EMBL-EBI
Contact:
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Taxons: Human
DEPOD the human DEPhOsphorylation Database (version 1.0) is a manually curated database collecting human active phosphatases, their experimentally verified protein and non-protein substrates and dephosphorylation site information, and pathways in which they are involved. It also provides links to popular kinase databases and protein-protein interaction databases for these phosphatases and substrates. DEPOD aims to be a valuable resource for studying human phosphatases and their substrate specificities and molecular mechanisms; phosphatase-targeted drug discovery and development; connecting phosphatases with kinases through their common substrates; completing the human phosphorylation/dephosphorylation network.
Nice manually curated dataset with PubMed references, in easily accessible MITAB format with UniProt IDs. It comprises 832 dephosphorylation reactions on protein substrates, and a few hundred on small molecules.
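A minimal sketch of reading such a file, assuming the standard PSI-MI TAB 2.5 column order (interactor A in column 1, interactor B in column 2, publication identifiers in column 9) and the usual `uniprotkb:` and `pubmed:` prefixes. The sample row is made up and the helper names are ours, not pypath's.

```python
import csv
import io


def strip_prefix(field, prefix):
    """Return the values in a '|'-separated MITAB field that carry `prefix`."""
    return [
        item[len(prefix):]
        for item in field.split('|')
        if item.startswith(prefix)
    ]


def read_mitab(handle):
    """Yield (uniprot_a, uniprot_b, [pubmed_ids]) from a MITAB stream."""
    for row in csv.reader(handle, delimiter='\t', quoting=csv.QUOTE_NONE):
        # Skip empty lines, the '#' header and truncated rows.
        if not row or row[0].startswith('#') or len(row) < 9:
            continue
        ids_a = strip_prefix(row[0], 'uniprotkb:')
        ids_b = strip_prefix(row[1], 'uniprotkb:')
        if ids_a and ids_b:
            yield ids_a[0], ids_b[0], strip_prefix(row[8], 'pubmed:')


# Invented two-line file: a header and one record with placeholder IDs.
sample = (
    '#ID A\tID B\tAlt A\tAlt B\tAlias A\tAlias B\tMethod\tAuthor\tPublication\n'
    'uniprotkb:X00001\tuniprotkb:X00002\t-\t-\t-\t-\t-\t-\t'
    'pubmed:0000001|pubmed:0000002\n'
)

records = list(read_mitab(io.StringIO(sample)))
```

Rows whose interactors carry no `uniprotkb:` identifier, such as small molecule substrates, are skipped by this sketch; whether that is desirable depends on the use case.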
Category || Subcategory >>> Undefined || Undefined
License: MIT License (MIT)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Literature curated || Interaction
Released in years: 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016
Created by UCLA, Eisenberg Group
Contact:
License: Creative Commons Attribution-NoDerivatives 3.0 International (CC BY-ND 3.0)
In the beginning (near 2000), it was an entirely manually curated database:
The 'core' dataset contains manually curated interactions from small-scale studies. Interactions are well annotated with PubMed IDs, evidence, and mechanism (binding, chemical reaction, etc.). The format is easily accessible (MITAB).
Data integration in pypath: static
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Disease-gene associations mined from the literature
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Unspecified Non-commercial Redistributable (Unspecified NC-SA)
"This is not a real license but a temporary label meaning either that we are working on clarifying the licensing terms (because the copyright holders hasn't specified it anywhere) or we as OmniPath got a permission from the copyright holders to redistribute the data from the resource."
Category || Subcategory >>> Literature curated || Ptm
Released in years: 2006
Created by Cesareni Group
Contact:
License: Creative Commons Attribution 2.5 International (CC BY 2.5)
Taxons: Human, Yeast, C. elegans, Mouse, Rat, HIV, D. melanogaster, A. thaliana, X. laevis, B. taurus, G. gallus, O. cuniculus, Plasmodium falciparum
DOMINO aims at annotating all the available information about domain-peptide and domain–domain interactions. The core of DOMINO, of July 24, 2006 consists of more than 3900 interactions extracted from peer-reviewed articles and annotated by expert biologists. A total of 717 manuscripts have been processed, thus covering a large fraction of the published information about domain–peptide interactions. The curation effort has focused on the following domains: SH3, SH2, 14-3-3, PDZ, PTB, WW, EVH, VHS, FHA, EH, FF, BRCT, Bromo, Chromo and GYF. However, interactions mediated by as many as 150 different domain families are stored in DOMINO.
Category || Subcategory >>> Undefined || Undefined
License: Constituting databases carry their own licenses (Composite)
Category || Subcategory >>> Undefined || Undefined
License: Constituting databases carry their own licenses (Composite)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Category || Subcategory >>> Literature curated || Post-translational modifications
Released in years: 2003, 2008, 2009, 2012, 2013, 2014, 2016
Created by ELM Consortium
Contact:
License: ELM Software License Agreement
Data integration in pypath: dynamic
Category || Subcategory >>> Undefined || Undefined
Contact:
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Inter-cellular ligand-receptor interaction networks were calculated based on the publically available database collated by Ramilowski and co-workers (Ramilowski et al., 2015).
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Academic Free License v3.0 (AFL 3.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Literature curated || Activity flow, Complexes
Released in years: 2007, 2008, 2009, 2011, 2013, 2014, 2015, 2016
Contact:
License: Creative Commons Attribution-ShareAlike 3.0 International (CC BY-SA 3.0)
Presently, the resource describes the interactions between target proteins and 6064 distinct ligand entities (Table 1). Ligands are listed against targets by their action (e.g. activator, inhibitor), and also classified according to substance types and their status as approved drugs. Classes include metabolites (a general category for all biogenic, non-peptide, organic molecules including lipids, hormones and neurotransmitters), synthetic organic chemicals (e.g. small molecule drugs), natural products, mammalian endogenous peptides, synthetic and other peptides including toxins from non-mammalian organisms, antibodies, inorganic substances and other, not readily classifiable compounds.
Category || Subcategory >>> Undefined || Undefined
License: No license (No license)
Category || Subcategory >>> Original experiment, Prediction || Complexes
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
To isolate human protein complexes in a sensitive and unbiased manner, we subjected cytoplasmic and nuclear soluble protein extracts isolated from human HeLa S3 and HEK293 cells grown as suspension and adherent cultures, respectively, to extensive complementary biochemical fractionation procedures. [...] Because physically interacting cocomplexed proteins often perform related biological functions and are often evolutionarily coconserved, we devised a machine learning procedure to score and select higher-confidence physical interactions based on both the experimentally measured coelution profiles and the existence of additional supporting functional association evidence inferred from correlated evolutionary rates and functional genomics data sets compiled for H. sapiens, S. cerevisiae, D. melanogaster, and C. elegans.
Category || Subcategory >>> Undefined || Undefined
License: HGNC License
"It is a condition of our funding from NIH and the Wellcome Trust that the nomenclature and information we provide is freely available to all."
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution-ShareAlike 3.0 International (CC BY-SA 3.0)
Category || Subcategory >>> Literature curated || Activity flow, Complexes, Annotations, Intercell
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
This review provides a global categorization of most known signal transduction-associated receptors as enzymes, recruiters, and latent transcription factors. [...] The human receptor families involved in signaling (with the exception of channels) are presented in the Human Plasma Membrane Receptome database.
Category || Subcategory >>> Undefined || Undefined
License: HPO License
Category || Subcategory >>> Literature curated || Post-translational modification
Released in years: 2002, 2005, 2009, 2010
Contact:
License: HPRD License
The information about protein-protein interactions was cataloged after a critical reading of the published literature. Exhaustive searches were done based on keywords and medical subject headings (MeSH) by using Entrez. The type of experiments that served as the basis for establishing protein-protein interactions was also annotated. Experiments such as coimmunoprecipitation were designated in vivo, GST fusion and similar “pull-down” type of experiments were designated in vitro, and those identified by yeast two-hybrid were annotated as yeast two-hybrid.
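The annotation rule quoted above can be sketched as a small lookup table. This is only an illustration: the three categories come from the quoted text, while the method spellings, the function name and the "unclassified" fallback are our own assumptions and not part of HPRD.

```python
# Hypothetical method names mapped to the three categories named in the text.
EVIDENCE_CATEGORY = {
    "coimmunoprecipitation": "in vivo",
    "gst pull-down": "in vitro",
    "yeast two-hybrid": "yeast two-hybrid",
}


def classify_experiment(method: str) -> str:
    """Return the evidence category for an experiment name.

    Names not present in the table fall back to "unclassified"
    (an assumption of this sketch, not an HPRD category).
    """
    key = method.strip().lower()
    return EVIDENCE_CATEGORY.get(key, "unclassified")
```

A real mapping would need many more method names and synonyms; the table here covers only the examples mentioned in the description.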
Category || Subcategory >>> Undefined || Undefined
License: GNU Lesser General Public License version 3 (LGPLv3)
Category || Subcategory >>> Literature curated, Original experiment || Complexes
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Here, through the synthesis of over 9,000 published mass spectrometry experiments, we present hu.MAP, the most comprehensive and accurate human protein complex map to date, containing > 4,600 total complexes, > 7,700 proteins, and > 56,000 unique interactions, including thousands of confident protein interactions not identified by the original publications. [...] Second, we developed a machine learning framework that can easily incorporate new data types to build more comprehensive protein complex maps by integrating evidence across many experiments.
Category || Subcategory >>> Literature curated, Original experiment || Complexes
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
Contact:
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> High throughput and literature curated || Post-translational modification
Released in years: 2012, 2015
Created by Cesareni Group
Contact:
License: No license (No license)
In order to offer a proteome-wide perspective of the phosphatase interactome, we have embarked on an extensive text-mining-assisted literature curation effort to extend phosphatase interaction information that was not yet covered by protein–protein interaction (PPI) databases. Interaction evidence captured by expert curators was annotated in the protein interaction database MINT according to the rapid curation standard. This data set was next integrated with protein interaction information from three additional major PPI databases, IntAct, BioGRID and DIP. These databases are part of the PSIMEx consortium and adopt a common data model and common controlled vocabularies, thus facilitating data integration. Duplicated entries were merged and redundant interactions have been removed.
The database is updated dynamically, so it is current at any given time. This is why it is marked as up to date in 2015, even though it has had no new release since 2012.
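The merging of duplicated entries across source databases, mentioned in the description of this resource, could look roughly like the sketch below. It is not the authors' procedure: the record layout, the function name and the placeholder identifiers are assumptions made for illustration.

```python
from collections import defaultdict


def merge_interactions(records):
    """Merge records that describe the same unordered protein pair.

    records: iterable of (protein_a, protein_b, source_db, reference) tuples.
    Returns a dict mapping each sorted pair to its sources and references.
    """
    merged = defaultdict(lambda: {"sources": set(), "references": set()})
    for a, b, source, reference in records:
        pair = tuple(sorted((a, b)))  # A-B and B-A count as one interaction
        merged[pair]["sources"].add(source)
        merged[pair]["references"].add(reference)
    return dict(merged)


# Placeholder identifiers, not real accessions or PubMed IDs.
example = [
    ("PROT_A", "PROT_B", "MINT", "ref1"),
    ("PROT_B", "PROT_A", "IntAct", "ref1"),
    ("PROT_A", "PROT_B", "BioGRID", "ref2"),
    ("PROT_C", "PROT_D", "DIP", "ref3"),
]
merged = merge_interactions(example)
```

Here the four input records collapse into two interactions, and each merged entry keeps track of which databases and references supported it.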
Category || Subcategory >>> High-throughput || Yeast 2 hybrid
Released in years: 2012, 2014, 2016
Contact:
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> High-throughput || Yeast 2 hybrid
Released in years: 2013, 2017
Created by CCSB
Contact:
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
High-quality non-systematic Literature dataset. In 2013, we extracted interaction data from BIND, BioGRID, DIP, HPRD, MINT, IntAct, and PDB to generate a high-quality binary literature dataset comprising ~11,000 protein-protein interactions that are binary and supported by at least two traceable pieces of evidence (publications and/or methods) (Rolland et al Cell 2014). Although this dataset does not result from a systematic investigation of the interactome search space and should thus be used with caution for any network topology analyses, it represents valuable interactions for targeted studies and is freely available to the research community through the search engine or via download.
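The "at least two traceable pieces of evidence" criterion can be illustrated with a short filter. We assume here that one piece of evidence is a distinct (publication, method) combination; the original pipeline may count evidence differently, and all names below are hypothetical.

```python
from collections import defaultdict


def multi_evidence_pairs(evidence, min_evidence=2):
    """Keep protein pairs supported by at least `min_evidence` pieces of evidence.

    evidence: iterable of (protein_a, protein_b, publication, method) tuples.
    A piece of evidence is taken to be a distinct (publication, method)
    combination, which is an assumption of this sketch.
    """
    by_pair = defaultdict(set)
    for a, b, publication, method in evidence:
        by_pair[frozenset((a, b))].add((publication, method))
    return {pair for pair, items in by_pair.items() if len(items) >= min_evidence}


# Placeholder identifiers, for illustration only.
rows = [
    ("PROT_A", "PROT_B", "ref1", "two hybrid"),
    ("PROT_B", "PROT_A", "ref2", "pull down"),
    ("PROT_C", "PROT_D", "ref3", "two hybrid"),
    ("PROT_C", "PROT_D", "ref3", "two hybrid"),  # exact duplicate, counted once
]
kept = multi_evidence_pairs(rows)
```

Only the PROT_A/PROT_B pair passes the filter, because the duplicated PROT_C/PROT_D row contributes a single piece of evidence.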
Category || Subcategory >>> Undefined || Undefined
License: I2D License
Category || Subcategory >>> Literature curated || Activity flow, Annotations, Intercell, Complexes
Contact:
License: GNU General Public License version 3 (GPLv3)
ICELLNET, a transcriptomic-based framework integrating: 1) an original expert-curated database of ligand-receptor interactions accounting for multiple subunits expression, 2) quantification of communication scores, 3) the possibility to connect a cell population of interest with 31 reference human cell types (BioGPS), and 4) three visualization modes to facilitate biological interpretation.
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 2.5 International (CC BY 2.5)
Category || Subcategory >>> Literature curated || Interaction
Released in years: 2008, 2010, 2013, 2014, 2015
Created by Brinkman Lab, Hancock Lab, Lynn Group
Contact:
License: Design Science License (DSL)
InnateDB (www.innatedb.com) is a database and integrated analysis platform specifically designed to facilitate systems-level analyses of the mammalian innate immune response (Lynn et al. 2008; 2010, 2013). To enrich our knowledge of innate immunity networks and pathways, the InnateDB curation team has contextually annotated >25,000 human and mouse innate immunity-relevant molecular interactions through the review of >5,000 biomedical articles. Curation adheres to the MIMIx guidelines and new interactions are added weekly. Importantly, interactions are curated between molecules with a documented role in an innate immunity relevant biological process or pathway and all other interactors regardless of whether the interacting molecule has any known role in innate immunity. This approach captures interactions between the innate immune system and other systems. InnateDB is not limited to data on the innate immune system. It is a comprehensive database of human, mouse and bovine molecular interactions and pathways, consisting of more than 300,000 molecular interactions and 3,000+ pathways, integrated from major public molecular interaction and pathway databases. InnateDB is also an analysis platform offering user-friendly bioinformatics tools, including pathway and ontology analysis, network visualization and analysis and the ability to upload and analyze user-supplied gene expression or other quantitative data in a network and/or pathway context. The platform has a global profile and is utilised by >10,000 users per annum and is widely cited. A mirror of the site hosted in Australia is also available at innatedb.sahmri.com. Note that new interactions and gene annotations are added to InnateDB on an almost weekly basis so the data is being continuously updated.
Probably the largest manually curated binary protein interaction dataset, developed by a dedicated full-time team of curators. Formats are clear and accessible, comprising UniProt IDs, PubMed references, experimental evidence and mechanisms.
Data integration in pypath: static
Category || Subcategory >>> Literature curated and high-throughput || Interaction
Released in years: 2003, 2006, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019
Created by EBI
Contact:
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Direct data import from: InnateDB, MINT
The information within the IntAct database primarily consists of protein–protein interaction (PPI) data. The majority of the PPI data within the database is annotated to IMEx standards, as agreed by the IMEx consortium. All such records contain a full description of the experimental conditions in which the interaction was observed. This includes full details of the constructs used in each experiment, such as the presence and position of tags, the minimal binding region defined by deletion mutants and the effect of any point mutations, referenced to UniProtKB, the underlying protein sequence database. Protein interactions can be described down to the isoform level, or indeed to the post-translationally cleaved mature peptide level if such information is available in the publication, using the appropriate UniProtKB identifiers.
We cannot draw a sharp distinction between low- and high-throughput methods, and we agree that throughput is neither the only nor the best measure of the quality of experimental data. IntAct offers a good solution for estimating the confidence of interactions: the mi-score system synthesizes information from multiple experiments and weights interactions according to experimental method, interaction type, and number of pieces of evidence.
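To make the idea of such a combined confidence score concrete, here is a toy sketch. It is not the mi-score formula: the weights, the saturating function and all names are invented for illustration, and only the three ingredients (number of publications, experimental method, interaction type) follow the comment above.

```python
# Hypothetical weights, chosen only for illustration.
METHOD_WEIGHT = {"biochemical": 1.0, "two hybrid": 0.5}
TYPE_WEIGHT = {"direct interaction": 1.0, "association": 0.5}


def saturating(x, half=2.0):
    """Map a non-negative value into [0, 1) with diminishing returns."""
    return x / (x + half)


def confidence(evidences):
    """Toy confidence score in [0, 1) for one interaction.

    evidences: list of (publication, method, interaction_type) tuples.
    Unknown methods and types get a small default weight of 0.1.
    """
    n_pubs = len({pub for pub, _, _ in evidences})
    method_sum = sum(METHOD_WEIGHT.get(m, 0.1) for _, m, _ in evidences)
    type_sum = sum(TYPE_WEIGHT.get(t, 0.1) for _, _, t in evidences)
    parts = (saturating(n_pubs), saturating(method_sum), saturating(type_sum))
    return sum(parts) / len(parts)


one = [("ref1", "two hybrid", "association")]
two = one + [("ref2", "biochemical", "direct interaction")]
```

With these made-up weights, `confidence(two)` is larger than `confidence(one)`, reflecting the intuition that independent publications and more specific methods should raise confidence while each additional piece of evidence adds less than the previous one.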
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Zero 1.0 Universal (CC0 1.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Zero 1.0 Universal (CC0 1.0)
Category || Subcategory >>> Undefined || Undefined
Contact:
License: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
Category || Subcategory >>> Undefined || Undefined
Contact:
License: Creative Commons Attribution-NonCommercial-ShareAlike 3.0 International (CC BY-NC-SA 3.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Unspecified Non-commercial Redistributable (Unspecified NC-SA)
"This is not a real license but a temporary label meaning either that we are working on clarifying the licensing terms (because the copyright holders hasn't specified it anywhere) or we as OmniPath got a permission from the copyright holders to redistribute the data from the resource."
Category || Subcategory >>> Literature curated || Process description
Released in years: 2000, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016
Contact:
License: KEGG License
Since 2011, KEGG data has not been freely available. The downloadable KGML files contain binary interactions, most of them between large complexes. No references are available.
Category || Subcategory >>> Literature curated || Activity flow, Complexes
Released in years: 2017, 2018, 2019
License: Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Kinase.com License (Kinase.com)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution-NonCommercial-NoDerivs 3.0 International (CC BY-NC-ND 3.0)
Category || Subcategory >>> Undefined || Undefined
License: Elsevier User License (EUL)
Category || Subcategory >>> Combined || Mixed
Released in years: 2014
Created by Laudanna Lab
Contact:
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Direct data import from: BioGRID, ConsensusPathDB, dbPTM, DIP, HumanSignalingNetwork, IntAct, MINT, MPPI, PathwayCommons, phospho.ELM, PhosphoPoint, PhosphoSite, SignaLink
Data sets are compiled from public databases and from literature and manually curated for accuracy. They are intended for network reconstruction, topological and multidimensional analysis in cell biology.
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution-NonCommercial 3.0 International (CC BY-NC 3.0)
Category || Subcategory >>> Literature curated || Post-translational modifications
Released in years: 2015
Created by Bose Institute
Contact:
License: Creative Commons Attribution-NonCommercial 2.0 International (CC BY-NC 2.0)
LMPID (Linear Motif mediated Protein Interaction Database) is a manually curated database which provides comprehensive experimentally validated information about the LMs mediating PPIs from all organisms on a single platform. About 2200 entries have been compiled by detailed manual curation of PubMed abstracts, of which about 1000 LM entries were being annotated for the first time, as compared with the Eukaryotic LM resource.
Category || Subcategory >>> Undefined || Undefined
License: Nucleic Acids Research Open Access (NAR Open Access)
"The Nucleic Acids Research Database issue requires the databases to be freely usable, hence any resource published in this journal deemed to be free for both commercial and non-profit use."
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Nucleic Acids Research Open Access (NAR Open Access)
"The Nucleic Acids Research Database issue requires the databases to be freely usable, hence any resource published in this journal deemed to be free for both commercial and non-profit use."
Category || Subcategory >>> Undefined || Undefined
Contact:
License: GNU General Public License version 3 (GPLv3)
Category || Subcategory >>> Literature curated || Activity flow
Contact:
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Ongoing analysis of macrophage-related datasets and an interest in consolidating our knowledge of a number of signalling pathways directed our choice of pathways to be mapped (see Figure 1). Public and proprietary databases were initially used as resources for data mining, but ultimately all molecular interaction data was sourced from published literature. Manual curation of the literature was performed to firstly evaluate the quality of the evidence supporting an interaction and secondly, to extract the necessary and additional pieces of information required to 'understand' the pathway and construct an interaction diagram. We have drawn pathways based on our desire to model pathways active in a human macrophage and therefore all components have been depicted using standard human gene nomenclature (HGNC). However, our understanding of the pathway components and the interactions between them, have been drawn largely from a consensus view of literature knowledge. As such the pathways presented here are based on data derived from a range of different cellular systems and mammalian species (human and mouse).
Data integration in pypath: static
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Category || Subcategory >>> Literature curated || Interaction
Released in years: 2009, 2011, 2015
Contact:
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Taxons: Mammalia
Protein data were imported from the UniProtKB/Swiss-Prot database (Bairoch et al., 2005) and identified by UniProtKB/SwissProt accession numbers. In order to list all the partners of a protein, interactions are associated by default to the accession number of the human protein. The actual source species used in experiments is indicated in the page reporting interaction data. Intracellular and membrane proteins were included to obtain a comprehensive network of the partners of extracellular molecules. Indeed, ECM proteins and GAGs bind to a number of membrane proteins or cell-associated proteoglycans and some of them interact with intracellular partners upon internalization (Dixelius et al., 2000). ECM proteins were identified by the UniProtKB/Swiss-Prot keyword ‘extracellular matrix’ and by the GO terms ‘extracellular matrix’, ‘proteinaceous extracellular matrix’ and their child terms. The proteins annotated with the GO terms ‘extracellular region’ and ‘extracellular space’, which are used for proteins found in biological fluids, were not included because circulating molecules do not directly contribute to the extracellular scaffold. Additionally, 96 proteins were manually (re-)annotated through literature curation. MatrixDB integrates 1378 interactions from the Human Protein Reference Database (HPRD, Prasad et al., 2009), 211 interactions from the Molecular INTeraction database (MINT, Chatr-Aryamontri et al., 2007), 46 interactions from the Database of Interacting Proteins (DIP, Salwinski et al., 2004), 232 interactions from IntAct (Kerrien et al., 2007a) and 839 from BioGRID (Breitkreutz et al., 2008) involving at least one extracellular biomolecule of mammalian origin. We added 283 interactions from manual literature curation and 65 interactions from protein and GAG array experiments.
The interactions imported from IMEx databases or any other database are collected separately in the PSICQUIC-extended dataset. The MatrixDB-core dataset is curated manually by the MatrixDB team.
Data integration in pypath: static
Category || Subcategory >>> Undefined || Undefined
License: Unspecified Non-commercial Redistributable (Unspecified NC-SA)
"This is not a real license but a temporary label meaning either that we are working on clarifying the licensing terms (because the copyright holders hasn't specified it anywhere) or we as OmniPath got a permission from the copyright holders to redistribute the data from the resource."
Category || Subcategory >>> Undefined || Undefined
License: Apache License, version 2.0 (Apache 2.0)
Category || Subcategory >>> Undefined || Undefined
License: GNU Lesser General Public License version 3 (LGPLv3)
Category || Subcategory >>> Literature curated and high-throughput || Interaction
Released in years: 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015
Contact:
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution-NonCommercial 2.0 International (CC BY-NC 2.0)
Category || Subcategory >>> Undefined || Undefined
Contact:
License: Creative Commons Zero 1.0 Universal (CC0 1.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Category || Subcategory >>> Undefined || Undefined
License: miRecords License
"All information available from this site is within the public domain."
Category || Subcategory >>> Undefined || Undefined
License: miRTarBase License
Category || Subcategory >>> Literature curated || Interaction
Released in years: 2000, 2005
Created by MIPS Munich
Contact:
License: MPPI License
Taxons: Human, Mammalia
The first and foremost principle of our MPPI database is to favor quality over completeness. Therefore, we decided to include only published experimental evidence derived from individual experiments as opposed to large-scale surveys. High-throughput data may be integrated later, but will be marked to distinguish it from evidence derived from individual experiments.
This database contains hundreds of interactions curated manually from original papers. The format is clean, with UniProt IDs and PubMed references.
Data integration in pypath: static
Category || Subcategory >>> Undefined || Undefined
Contact:
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Literature curated || Process description
Released in years: 2008, 2012
Created by NCI
Contact:
License: 3-clause BSD License (BSD)
Taxons: Human
Direct data import from: BioCarta, Reactome
In curating, editors synthesize meaningful networks of events into defined pathways and adhere to the PID data model for consistency in data representation: molecules and biological processes are annotated with standardized names and unambiguous identifiers; and signaling and regulatory events are annotated with evidence codes and references. To ensure accurate data representation, editors assemble pathways from data that is principally derived from primary research publications. The majority of data in PID is human; however, if a finding discovered in another mammal is also deemed to occur in humans, editors may decide to include this finding, but will also record that the evidence was inferred from another species. Prior to publication, all pathways are reviewed by one or more experts in a field for accuracy and completeness.
From the NCI-XML, interactions with references, directions and signs can be extracted. Complexes are omitted.
Since the end of 2015, the original NCI-PID webpage has no longer been accessible; the data is available through the NDEx webserver and API.
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Category || Subcategory >>> Literature curated || Negative
Contact:
License: No license (No license)
Annotation of the manual dataset was performed analogous to the annotation of protein–protein interactions and protein complexes in previous projects published by our group. Information about NIPs was extracted from scientific literature using only data from individual experiments but not from high-throughput experiments. Only mammalian proteins were considered. Data from high-throughput experiments were omitted in order to maintain the highest possible standard of reliability.
Data integration in pypath: static
Category || Subcategory >>> Literature curated || Process description
Released in years: 2010, 2011, 2012, 2013, 2014, 2015
Created by Pandey Lab, IOB Bangalore
Contact:
License: Creative Commons Attribution 2.5 International (CC BY 2.5)
Direct data import from: CancerCellMap
Includes data from: CancerCellMap
The initial annotation process of any signaling pathway involves gathering and reading of review articles to achieve a brief overview of the pathway. This process is followed by listing all the molecules that are reported to be involved in the pathway under annotation. Information regarding potential pathway authorities are also gathered at this initial stage. Pathway experts are involved in initial screening of the molecules listed to check for any obvious omissions. In the second phase, annotators manually perform extensive literature searches using search keys, which include all the alternative names of the molecules involved, the name of the pathway, the names of reactions, and so on. In addition, the iHOP resource is also used to perform advanced PubMed-based literature searches to collect the reactions that were reported to be implicated in a given pathway. The collected reactions are manually entered using the PathBuilder annotation interface, which is subjected to an internal review process involving PhD level scientists with expertise in the areas of molecular biology, immunology and biochemistry. However, there are instances where a molecule has been implicated in a pathway in a published report but the associated experimental evidence is either weak or differs from experiments carried out by other groups. For this purpose, we recruit several investigators as pathway authorities based on their expertise in individual signaling pathways. The review by pathway authorities occasionally leads to correction of errors or, more commonly, to inclusion of additional information. Finally, the pathway authorities help in assessing whether the work of all major laboratories has been incorporated for the given signaling pathway.
The formats are unclear. The tab-delimited file contains the pathway memberships of genes and PubMed references, but not the interaction partners. The Excel file is not actually an Excel table and contains only a few rows from the tab-delimited file. The PSI-MI XML is much better: with a simple parser, many details can be extracted.
Data integration in pypath: dynamic
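As a rough idea of what a simple parser for the PSI-MI XML mentioned in the comment on this resource could look like, the sketch below reads interactors and interactions with the Python standard library. The embedded sample is a heavily simplified, hypothetical fragment in the compact PSI-MI 2.5 layout (interactors listed once and referenced by ID), not an actual NetPath file, and this is not the parser used in pypath.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up fragment for illustration only.
SAMPLE = """<entrySet xmlns="net:sf:psidev:mi" level="2" version="5">
  <entry>
    <interactorList>
      <interactor id="1"><names><shortLabel>PROT_A</shortLabel></names></interactor>
      <interactor id="2"><names><shortLabel>PROT_B</shortLabel></names></interactor>
    </interactorList>
    <interactionList>
      <interaction id="10">
        <participantList>
          <participant id="100"><interactorRef>1</interactorRef></participant>
          <participant id="101"><interactorRef>2</interactorRef></participant>
        </participantList>
      </interaction>
    </interactionList>
  </entry>
</entrySet>"""


def local(tag):
    """Strip the XML namespace so the code does not depend on the schema version."""
    return tag.rsplit("}", 1)[-1]


def parse_interactions(xml_text):
    """Return a list of tuples with the short labels of each interaction's participants."""
    root = ET.fromstring(xml_text)
    labels = {}
    interactions = []
    for elem in root.iter():
        name = local(elem.tag)
        if name == "interactor":
            label = next(
                (e.text for e in elem.iter() if local(e.tag) == "shortLabel"), None
            )
            labels[elem.get("id")] = label
        elif name == "interaction":
            refs = [e.text for e in elem.iter() if local(e.tag) == "interactorRef"]
            interactions.append(refs)
    # Resolve references after the full pass, so element order does not matter.
    return [tuple(labels.get(r, r) for r in refs) for refs in interactions]
```

Real files carry much more information (cross-references, experiment descriptions, participant roles), and some use the expanded layout with interactors nested inside participants, which this sketch does not handle.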
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: GNU Lesser General Public License version 3 (LGPLv3)
Category || Subcategory >>> Literature curated || Activity flow
Released in years: 2013
Created by Korcsmaros Group
Contact:
License: Creative Commons Attribution-ShareAlike 3.0 International (CC BY-SA 3.0)
Taxons: Human
From Korcsmaros 2010: ... we first listed signaling proteins and interactions from reviews and then added further signaling interactions of the listed proteins. We used reviews as a starting point, manually looked up interactions three times, and manually searched for interactions of known signaling proteins with no signaling interactions so far in the database.
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: No license (No license)
Drug side effects and drug-drug interactions were mined from publicly available data. OffSIDES is a database of drug side-effects that were found, but are not listed on the official FDA label. TwoSIDES is the only comprehensive database of drug-drug-effect relationships. Over 3,300 drugs and 63,000 combinations connected to millions of potential adverse reactions.
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution-ShareAlike 3.0 International (CC BY-SA 2.5)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Open Targets mixed license
Category || Subcategory >>> Undefined || Undefined
License: Apache License, version 2.0 (Apache 2.0)
Category || Subcategory >>> Literature curated & high throughput || Transcription regulation
Contact:
License: GNU Lesser General Public License version 3 (LGPLv3)
One of the largest TF-target databases. Covers at least 18 organisms and contains data from literature curation and many screening technologies and in silico prediction.
Category || Subcategory >>> Undefined || Undefined
Contact:
License: Informal license (Informal)
Category || Subcategory >>> Literature curated || Process description
Released in years: 2000, 2001, 2002, 2003, 2005, 2006, 2010, 2011, 2012, 2014, 2016
Contact:
License: GNU General Public License version 2 (GPLv2)
References are captured at three levels. First, each pathway as a whole requires a reference. For signaling pathways, at least three references, usually review papers, are required in order to provide a more objective view of the scope of the pathway. For metabolic pathways, a textbook reference is usually sufficient. Second, references are often associated to each molecule class in the pathway. Most of these references are OMIM records or review papers. Third, references are provided to support association of specific protein sequences with a particular molecule class, e.g., the SWISS-PROT sequence P53_HUMAN annotated as an instance of the molecule class “P53” appearing in the pathway class “P53 pathway”. These are usually research papers that report the experimental evidence that a particular protein or gene participates in the reactions represented in the pathway diagram.
Category || Subcategory >>> Undefined || Undefined
License: No license (No license)
A database of pathogens and their phenotypes for diagnostic support in infections.
Category || Subcategory >>> Combined || Interaction
Released in years: 2010, 2011, 2012, 2013, 2014, 2015, 2016
Created by Bader Lab, MSKCC cBio
Contact:
License: PathwayCommons License
Direct data import from: Reactome, NCI-PID, CancerCellMap, BioCarta, HPRD, PhosphoSite, PANTHER, DIP, IntAct, BioGRID, BIND, CORUM
Pathway Commons is a collection of publicly available pathway information from multiple organisms. It provides researchers with convenient access to a comprehensive collection of biological pathways from multiple sources represented in a common language for gene and metabolic pathway analysis.
Pathway Commons integrates a number of pathway and molecular interaction databases supporting BioPAX and PSI-MI formats into one large BioPAX model, which can be queried using our web API (documented below). This API can be used by computational biologists to download custom subsets of Pathway Commons for analysis, or can be used to incorporate powerful biological pathway and network information retrieval and query functionality into websites, scripts and software. For computational biologists looking for comprehensive biological pathway data for analysis, we also make available batch downloads of the data in several formats.
Warehouse data (canonical molecules, ontologies) are converted to BioPAX utility classes, such as EntityReference, ControlledVocabulary, EntityFeature sub-classes, and saved as the initial BioPAX model, which forms the foundation for integrating pathway data and for id-mapping.
Pathway and binary interaction data (interactions, participants) are normalized next and merged into the database. Original reference molecules are replaced with the corresponding BioPAX warehouse objects.
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Literature curated & high throughput || Transcription regulation
Released in years: 2007
Created by Wasserman Lab
Contact:
License: GNU Lesser General Public License version 3 (LGPLv3)
One of the oldest and largest TF-target databases. From the Wasserman Lab, who also developed JASPAR and many other tools. Unfortunately, the website is down at the moment (April 2019). At the time of its release (2007) it had a remarkably nice and innovative design for a molecular database webpage.
Category || Subcategory >>> Original experiment, Literature curated || Complexes
License: PDB License
"Data files contained in the PDB archive (ftp://ftp.wwpdb.org) are free of all copyright restrictions and made fully and freely available for both non-commercial and commercial use."
The Research Collaboratory for Structural Bioinformatics (RCSB), the Macromolecular Structure Database (MSD) at the European Bioinformatics Institute (EBI) and the Protein Data Bank Japan (PDBj) at the Institute for Protein Research in Osaka University will serve as custodians of the wwPDB, with the goal of maintaining a single archive of macromolecular structural data that is freely and publicly available to the global community.
Category || Subcategory >>> Literature curated || Activity flow
Released in years: 2004
Created by Weinstein Group
Contact:
License: No license (No license)
Taxons: Human
PDZBase is a database that aims to contain all known PDZ-domain-mediated protein-protein interactions. Currently, PDZBase contains approximately 300 such interactions, which have been manually extracted from >200 articles. PDZBase currently contains ∼300 interactions, all of which have been manually extracted from the literature, and have been independently verified by two curators. The extracted information comes from in vivo (co-immunoprecipitation) or in vitro experiments (GST-fusion or related pull-down experiments). Interactions identified solely from high throughput methods (e.g. yeast two-hybrid or mass spectrometry) were not included in PDZBase. Other prerequisites for inclusion in the database are: that knowledge of the binding sites on both interacting proteins must be available (for instance through a truncation or mutagenesis experiment); that interactions must be mediated directly by the PDZ-domain, and not by any other possible domain within the protein.
Category || Subcategory >>> Undefined || Undefined
License: Unspecified Non-commercial Redistributable (Unspecified NC-SA)
"This is not a real license but a temporary label meaning either that we are working on clarifying the licensing terms (because the copyright holders haven't specified them anywhere) or that we as OmniPath obtained permission from the copyright holders to redistribute the data from the resource."
Being a composite database of about 80 original resources, Pharos points to the licenses of its constituent databases, just as OmniPath does: https://pharos.nih.gov/about. That means that at least redistribution with attribution and non-commercial use should be fine with all these resources.
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Literature curated || Ptm
Released in years: 2004, 2007, 2010
Contact:
License: ELM Software License Agreement (ELM)
Phospho.ELM http://phospho.elm.eu.org is a new resource containing experimentally verified phosphorylation sites manually curated from the literature and is developed as part of the ELM (Eukaryotic Linear Motif) resource. Phospho.ELM constitutes the largest searchable collection of phosphorylation sites available to the research community. The Phospho.ELM entries store information about substrate proteins with the exact positions of residues known to be phosphorylated by cellular kinases. Additional annotation includes literature references, subcellular compartment, tissue distribution, and information about the signaling pathways involved as well as links to the molecular interaction database MINT. Phospho.ELM version 2.0 contains 1,703 phosphorylation site instances for 556 phosphorylated proteins. (Diella 2004)
Data integration in pypath: dynamic
Category || Subcategory >>> Undefined || Undefined
License: PhosphoNetworks License
"The content in the database is free to academic and non-profit organizations. Users for commercial purpose please contact the authors before download the data sets."
Category || Subcategory >>> Literature curated and prediction || Post-translational modification
Contact:
License: Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Taxons: Human
We have integrated three existing databases, including Phospho.ELM (release 6.0, total 9236 phosphorylation sites), HPRD (release 6, total 8992 phosphorylation sites), SwissProt (release 51.5, total 6529 phosphorylation sites), and our manually curated 400 kinase–substrate pairs, which are primarily from review articles.
It contains 400 manually curated interactions and many more from HTP methods. The manually curated set cannot be distinguished in the data formats offered.
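The sketch below shows one way a merged kinase-substrate table could keep its curated subset distinguishable: each pair carries the set of sources that report it. It is an illustration only, not the procedure used by the resource or by pypath; the function `merge_with_provenance`, the label "manual" and all kinase and substrate names are hypothetical placeholders.

```python
from collections import defaultdict


def merge_with_provenance(datasets):
    """Merge kinase-substrate pairs, recording which sources report each pair."""
    merged = defaultdict(set)
    for source, pairs in datasets.items():
        for pair in pairs:
            merged[pair].add(source)
    return dict(merged)


# Made-up records; the source labels only mimic the databases named above.
datasets = {
    "Phospho.ELM": [("KIN1", "SUB1")],
    "HPRD": [("KIN1", "SUB1"), ("KIN2", "SUB2")],
    "manual": [("KIN2", "SUB2"), ("KIN3", "SUB3")],
}

merged = merge_with_provenance(datasets)

# With provenance kept, the manually curated subset can be recovered.
curated = sorted(pair for pair, sources in merged.items() if "manual" in sources)
print(curated)
```

Without a per-record source label like this, a pair reported both by curation and by an HTP screen is indistinguishable from one reported by the screen alone.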
Data integration in pypath: static
Category || Subcategory >>> Literature curated and high throughput || Post-translational modification
Released in years: 2011, 2015, 2016
Created by CST
Contact:
License: Creative Commons Attribution-NonCommercial-ShareAlike 3.0 International (CC BY-NC-SA 3.0)
Taxons: Human, Mouse, Eubacteria, Eukarya
PSP integrates both low- and high-throughput (LTP and HTP) data sources into a single reliable and comprehensive resource. Nearly 10,000 journal articles, including both LTP and HTP reports, have been manually curated by expert scientists from over 480 different journals since 2001.
Data integration in pypath: dynamic
Category || Subcategory >>> Undefined || Undefined
License: Apache License, version 2.0 (Apache 2.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Unspecified Non-commercial Redistributable (Unspecified NC-SA)
"This is not a real license but a temporary label meaning either that we are working on clarifying the licensing terms (because the copyright holders haven't specified them anywhere) or that we as OmniPath obtained permission from the copyright holders to redistribute the data from the resource."
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Literature curated || Process description
Released in years: 2004, 2008, 2010, 2012, 2014, 2015, 2016
Contact:
License: Creative Commons Zero 1.0 Universal (CC0 1.0)
Once the content of the module is approved by the author and curation staff, it is peer-reviewed on the development web-site, by one or more bench biologists selected by the curator in consultation with the author. The peer review is open and the reviewers are acknowledged in the database by name. Any issues raised in the review are resolved, and the new module is scheduled for release.
No binary interactions can be exported programmatically from any format of the Reactome dataset. Reactome's curation method doesn't cover binary interactions; the inferred lists on the webpage are based on automatic expansion of complexes and reactions, and are thus unreliable. Lacking this information, references cannot be assigned to interactions.
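To illustrate why binary interactions inferred by automatic expansion are hard to trust, the sketch below expands a single complex with the two common strategies: matrix expansion (every member paired with every other) and spoke expansion (one chosen member paired with each of the rest). The member names and the functions `matrix_expansion` and `spoke_expansion` are hypothetical and are not part of Reactome or pypath.

```python
from itertools import combinations


def matrix_expansion(members):
    """Pair every member of a complex with every other member."""
    return sorted(combinations(sorted(set(members)), 2))


def spoke_expansion(members, bait):
    """Pair one chosen member (the bait) with each remaining member."""
    return sorted((bait, m) for m in set(members) if m != bait)


# Hypothetical four-member complex; the names are placeholders.
complex_members = ["A", "B", "C", "D"]

matrix_pairs = matrix_expansion(complex_members)
spoke_pairs = spoke_expansion(complex_members, bait="A")

print(len(matrix_pairs), "pairs from matrix expansion")
print(len(spoke_pairs), "pairs from spoke expansion")
```

Neither strategy knows which members are in direct contact, so the same complex yields a different set of candidate pairs depending on the choice made, and none of the pairs is backed by its own reference.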
Data integration in pypath: dynamic
Category || Subcategory >>> Undefined || Undefined
License: GNU Lesser General Public License version 3 (LGPLv3)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
Contact:
License: MIT License (MIT)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
Category || Subcategory >>> Literature curated || Activity flow
Released in years: 2010, 2012, 2016, 2021
Created by Korcsmaros Group
Contact:
License: Creative Commons Attribution-ShareAlike 3.0 International (CC BY-SA 3.0)
Taxons: Human, D. melanogaster, C. elegans
In each of the three organisms, we first listed signaling proteins and interactions from reviews (and from WormBook in C.elegans) and then added further signaling interactions of the listed proteins. To identify additional interactions in C.elegans, we examined all interactions (except for transcription regulation) of the signaling proteins listed in WormBase and added only those to SignaLink that we could manually identify in the literature as an experimentally verified signaling interaction. For D.melanogaster, we added to SignaLink those genetic interactions from FlyBase that were also reported in at least one yeast-2-hybrid experiment. For humans, we manually checked the reliability and directions for the PPIs found with the search engines iHop and Chilibot.
For OmniPath we used the literature curated part of version 3 of SignaLink, which is not yet published. Version 2 is publicly available, and pypath has format definitions to load version 2 as an alternative.
Category || Subcategory >>> Literature curated || Activity flow, Complexes
Released in years: 2015
Created by Cesareni Group
Contact:
License: Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Direct data import from: SignaLink3, PhosphoSite
SIGNOR, the SIGnaling Network Open Resource, organizes and stores in a structured format signaling information published in the scientific literature. The captured information is stored as binary causative relationships between biological entities and can be represented graphically as activity flow. The entire network can be freely downloaded and used to support logic modeling or to interpret high content datasets. The core of this project is a collection of more than 11000 manually-annotated causal relationships between proteins that participate in signal transduction. Each relationship is linked to the literature reporting the experimental evidence. In addition each node is annotated with the chemical inhibitors that modulate its activity. The signaling information is mapped to the human proteome even if the experimental evidence is based on experiments on mammalian model organisms. SIGNOR 2.0 now stores almost 23 000 manually-annotated causal relationships between proteins and other biologically relevant entities: chemicals, phenotypes, complexes, etc.
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Literature curated || Activity flow
Released in years: 2008, 2011, 2012
Created by Shamir Group, Shiloh Group
Contact:
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
SPIKE’s data on relationships between entities come from three sources: (i) Highly curated data submitted directly to SPIKE database by SPIKE curators and experts in various biomedical domains. (ii) Data imported from external signaling pathway databases. At present, SPIKE database imports such data from Reactome, KEGG, NetPath and The Transcription Factor Encyclopedia (http://www.cisreg.ca/cgi-bin/tfe/home.pl). (iii) Data on protein–protein interactions (PPIs) imported either directly from wide-scale studies that recorded such interactions [to date, PPI data were imported from Stelzl et al., Rual et al. and Lim et al.] or from external PPI databases [IntAct and MINT]. Relationship data coming from these different sources vary greatly in their quality and this is reflected by a quality level attribute, which is attached to each relationship in SPIKE database (Supplementary Data). Each relationship in SPIKE is linked to at least one PubMed reference that supports it.
Data integration in pypath: static
Category || Subcategory >>> High-throughput and prediction || Interaction
Released in years: 2000, 2003, 2005, 2007, 2009, 2011, 2013, 2015, 2016
Created by Bork Lab
Contact:
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
Contact:
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
Contact:
License: GNU General Public License version 3 (GPLv3)
Category || Subcategory >>> Undefined || Undefined
Created by Saier Lab
Contact:
License: Creative Commons Attribution-ShareAlike 3.0 International (CC BY-SA 3.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution-ShareAlike 3.0 International (CC BY-SA 3.0)
Category || Subcategory >>> Literature curated || Model
License: No license (No license)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Nucleic Acids Research Open Access (NAR Open Access)
"The Nucleic Acids Research Database issue requires the databases to be freely usable, hence any resource published in this journal deemed to be free for both commercial and non-profit use."
Category || Subcategory >>> Literature curated || Activity flow
Released in years: 2010, 2012
Contact:
License: Creative Commons Attribution-NonCommercial-ShareAlike 3.0 International (CC BY-NC-SA 3.0)
Taxons: Human, Mouse, Rat
Nodes: 468, Edges: 744
The literature on TRP channel PPIs found in the PubMed database serve as the primary information source for constructing the TRIP Database. First, a list of synonyms for the term ‘TRP channels’ was constructed from UniprotKB, Entrez Gene, membrane protein databases (Supplementary Table S2) and published review papers for nomenclature. Second, using these synonyms, a list of articles was obtained through a PubMed search. Third, salient articles were collected through a survey of PubMed abstracts and subsequently by search of full-text papers. Finally, we selected articles that contain evidence for physical binding among the proteins denoted. To prevent omission of relevant papers, we manually screened information in other databases, such as DIP, IntAct, MINT, STRING, BioGRID, Entrez Gene, IUPHAR-DB and ISI Web of Knowledge (from Thomson Reuters). All 277 articles used for database construction are listed in our database website.
A good manually curated dataset focusing on TRP channel proteins, with ~800 binary interactions. The provided formats are not well suited for bioinformatics use because of the non-standard protein names, which contain Greek letters and formulas understandable only to humans. By processing the HTML of 5-6 different tables, with a couple of hundred lines of code, one has a chance to compile a usable table.
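One small step of such a cleanup can be sketched as follows: replacing Greek letters in protein labels with their spelled-out Latin names, so that the labels can be matched against gene symbol synonyms. This is a minimal sketch, not the code used in pypath; the mapping `GREEK_TO_LATIN`, the function `latinize` and the example labels are illustrative assumptions and are not taken from the TRIP tables.

```python
# Partial, hand-written mapping; extend it with further letters as needed.
GREEK_TO_LATIN = {
    "α": "alpha",
    "β": "beta",
    "γ": "gamma",
    "δ": "delta",
    "ε": "epsilon",
    "ζ": "zeta",
    "η": "eta",
    "θ": "theta",
    "κ": "kappa",
}


def latinize(name):
    """Replace each mapped Greek letter in a label with its Latin name."""
    return "".join(GREEK_TO_LATIN.get(ch, ch) for ch in name)


# Illustrative labels only.
for label in ["PLCγ1", "PKCα", "TRPC3"]:
    print(label, "->", latinize(label))
```

The resulting strings still need a synonym lookup to reach standard identifiers such as UniProt IDs; the replacement only removes one obstacle to that mapping.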
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Category || Subcategory >>> Undefined || Undefined
License: No license (No license)
Category || Subcategory >>> Undefined || Undefined
Contact:
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Literature curated || Activity flow
Released in years: 2009, 2010, 2011, 2012, 2013, 2014
Created by Wang Group
Contact:
License: Creative Commons Attribution-NonCommercial 3.0 International (CC BY-NC 3.0)
Taxons: Human, Mouse, Rat
Direct data import from: Cui2007, BioCarta, CST, NCI-PID, iHOP
Composed of multiple manually curated datasets, and includes the authors' own manual curation effort. The methods are unclear, and the dataset has not been published in a peer-reviewed paper. Based on Cui et al. 2007.
This network aims to merge multiple manually curated networks. Unfortunately, a precise description of the sources and methods is missing, and the dataset does not include the references. Moreover, the data file lacks a header and a key, so users can only guess the meaning of the columns and values.
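When a file comes without a header, one practical approach is to assign explicit placeholder column names at load time and to state clearly that they are guesses. The sketch below does this with Python's `csv.DictReader`. Everything specific here is an assumption made for illustration: the tab delimiter, the three column names in `ASSUMED_COLUMNS` and the made-up rows standing in for the real data file.

```python
import csv
import io

# Placeholder column names: guesses for illustration only,
# not documented by the resource.
ASSUMED_COLUMNS = ["source", "target", "value"]

# Made-up rows standing in for the headerless data file.
raw = "GENE_A\tGENE_B\t1\nGENE_B\tGENE_C\t-1\n"

# Passing fieldnames makes DictReader treat the first line as data.
reader = csv.DictReader(
    io.StringIO(raw),
    fieldnames=ASSUMED_COLUMNS,
    delimiter="\t",
)
rows = list(reader)

for row in rows:
    print(row["source"], row["target"], row["value"])
```

Keeping the guessed names in one constant makes it easy to correct them later if the authors publish a key for the file.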
Data integration in pypath: dynamic
Category || Subcategory >>> Literature curated || Process description
Released in years: 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016
Contact:
License: Creative Commons Attribution 3.0 International (CC BY 3.0)
The goal of WikiPathways is to capture knowledge about biological pathways (the elements, their interactions and layout) in a form that is both human readable and amenable to computational analysis.
Category || Subcategory >>> Undefined || Undefined
Contact:
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Category || Subcategory >>> Literature curated || Activity flow
Created by Wang Lab
Contact:
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
The human signaling network (Version 4, containing more than 6,000 genes and more than 50,000 relations) includes our previous data obtained from manually curated signaling networks (Awan et al., 2007; Cui et al., 2007; Li et al., 2012) and by PID (http://pid.nci.nih.gov/) and our recent manual curations using the iHOP database (http://www.ihop-net.org/UniPub/iHOP/).
Category || Subcategory >>> Undefined || Undefined
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Dénes Türei, Nicolàs Palacio, Olga Ivanova, Saez Lab 2016-2020. Feedback: omnipathdb@gmail.com