Metadata about resources in OmniPath and pypath

We created this webpage in Nov 2016. Since the end of 2019 we have been gradually updating and extending the information. In July 2020 we updated the URLs and licensing terms of most of the resources. Most of the "pypath methods" and the years of releases are out of date. We will keep updating these; if you find any wrong information, please notify us at omnipathdb@gmail.com. For updates of the OmniPath database content, please refer to our archive.

This collection is a byproduct of the development of OmniPath, a database built from more than 100 resources. Initially OmniPath focused on literature curated activity flow networks. Today it covers a much broader range of molecular interaction data, and besides its network database OmniPath has four other databases: enzyme-PTM relationships, protein complexes, molecular annotations (function, localization, structure, etc.) and intercellular communication roles. The "omnipath" dataset of the network database follows the principles of the initial release of OmniPath, focusing on high quality, manually curated signaling pathways. The descriptions here cite the relevant sentences about the curation protocols from the original articles and webpages. URLs pointing to the articles and the webpages, and some additional metadata, are provided where available. The resources with green titles are included by default in OmniPath. pypath methods are listed where available; to learn more, please see the pypath documentation.

How did we collect the license information? We searched for license information in the main, About, Download and FAQ sections of the webpages, and ran Google searches for the database name and license. Where we could not find anything about licensing, we assumed no license. Unfortunately, due to today's restrictive copyright legislation, users don't have the freedom to use, modify and redistribute the data without a license explicitly granting these rights to them, despite the clear intention of the authors to make their data public and statements on the webpage like "free to use" or "available for download". In these cases we contacted the authors for permission to redistribute their data.
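The license of each resource is given below on a line of the form "License: full name (short code)". As a minimal sketch, assuming only that line format, the license lines can be parsed and the Creative Commons clauses (NC, ND, SA) flagged as follows. The function name `parse_license` and the keys of the returned dictionary are illustrative choices, not part of pypath, and the flags are no substitute for reading the license text itself.

```python
import re

# Matches lines such as
# "License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)"
# The short code in parentheses is optional ("License: BioCarta License").
LICENSE_RE = re.compile(
    r'^License:\s*(?P<name>.+?)(?:\s*\((?P<short>[^()]*)\))?\s*$'
)


def parse_license(line):
    """Describe one 'License:' line; return None for any other line."""
    match = LICENSE_RE.match(line.strip())
    if match is None:
        return None
    name = match.group('name')
    short = match.group('short') or name
    # Creative Commons short codes look like "CC BY-NC-SA 4.0":
    # the clauses are the hyphen separated tokens of the second word.
    clauses = set()
    parts = short.split()
    if len(parts) >= 2 and parts[0] == 'CC':
        clauses = set(parts[1].split('-'))
    return {
        'name': name,
        'short': short,
        'no_license': short == 'No license',
        'non_commercial': 'NC' in clauses,
        'no_derivatives': 'ND' in clauses,
        'share_alike': 'SA' in clauses,
    }


if __name__ == '__main__':
    line = ('License: Creative Commons Attribution-NonCommercial 4.0 '
            'International (CC BY-NC 4.0)')
    print(parse_license(line))
```

Such flags only help to pre-sort the resources, for example to list the ones that carry a non-commercial clause or no license at all.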


ABS – Annotated Regulatory Binding Sites

Category || Subcategory >>> Undefined || Undefined

Contact:

License: GNU General Public License version 2 (GPLv2)

Webpages


ACSN – Atlas of Cancer Signalling Networks

Category || Subcategory >>> Literature curated || Reaction

Released in years: 2008, 2014, 2015, 2016

Created by Curie

Contact:

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages

Articles

PubMed

Taxons: Human

Quotes

The map curator studies the body of literature dedicated to the biological process or molecular mechanism of interest. The initial sources of information are the major review articles from high-impact journals that represent the consensus view on the studied topic and also provide a list of original references. The map curator extracts information from review papers and represents it in the form of biochemical reactions in CellDesigner. This level of details reflects the ‘canonical’ mechanisms. Afterwards, the curator extends the search and analyses original papers from the list provided in the review articles and beyond. This information is used to enrich the map with details from the recent discoveries in the field. The rule for confident acceptance and inclusion of a biochemical reaction or a process is the presence of sufficient evidences from more than two studies, preferably from different scientific groups. The content of ACSN is also verified and compared with publicly available databases such as REACTOME, KEGG, WikiPathways, BioCarta, Cell Signalling and others to ensure comprehensive representation of consensus pathways and links on PMIDs of original articles confirmed annotated molecular interactions.

Methods in pypath

Data source (URLs and files)

Data format definition

Data input methods

Interactions


Adhesome

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages


Almen2009

Category || Subcategory >>> Undefined || Undefined

Contact:

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages


AlzPathway

Category || Subcategory >>> Literature curated || Activity flow

Released in years: 2012, 2015

Created by Tokyo Bioinf

Contact:

License: Creative Commons Attribution 3.0 International (CC BY 3.0)

Articles

Webpages

PubMed

Quotes

We collected 123 review articles related to AD accessible from PubMed. We then manually curated these review articles, and have built an AD pathway map by using CellDesigner. Molecules are distinguished by the following types: proteins, complexes, simple molecules, genes, RNAs, ions, degraded products, and phenotypes. Gene symbols are pursuant to the HGNC symbols. Reactions are also distinguished by the following categories: state transition, transcription, translation, heterodimer association, dissociation, transport, unknown transition, and omitted transition. All the reactions have evidences to the references in PubMed ID using the MIRIAM scheme. All the references used for constructing the AlzPathway are listed in the ‘References for AlzPathway’. Cellular types are distinguished by the followings: neuron, astrocyte, and microglial cells. Cellular compartments are also distinguished by the followings: brain blood barrier, presynaptic, postsynaptic, and their inner cellular localizations.

Notes

References can be fetched only from the XML formats, not from the SIF file. Besides approx. 150 protein-protein interactions, it also contains interactions of many small molecules, denoted by PubChem IDs.
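To illustrate why a SIF file cannot carry references: in the simple interaction format each line holds a source node, an interaction type and one or more target nodes, and nothing else. The sketch below parses such lines with the standard library only. The node names and interaction labels in the example are made up, not taken from AlzPathway.

```python
def parse_sif(text):
    """Parse SIF text into (source, interaction_type, target) triples."""
    edges = []
    for line in text.splitlines():
        # SIF fields are tab delimited, or whitespace delimited if no tab is used.
        fields = line.split('\t') if '\t' in line else line.split()
        if len(fields) < 3:
            continue  # blank line, or a node listed without any edge
        source, interaction = fields[0], fields[1]
        # One line may list several targets for the same source and type.
        for target in fields[2:]:
            edges.append((source, interaction, target))
    return edges


if __name__ == '__main__':
    # Made-up example rows; there is no column that could hold a PubMed ID.
    example = 'n1\tactivates\tn2\nn3\tbinds\tn4\tn5\n'
    for edge in parse_sif(example):
        print(edge)
```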

Data integration in pypath: static

Methods in pypath

Data source (URLs and files)

Data format definition


ARACNe-GTEx

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)


ARN – Autophagy Regulatory Network

Category || Subcategory >>> Literature curated || Activity flow

Released in years: 2014

Created by NetBiol Group

Contact:

License: Creative Commons Attribution-NonCommercial-ShareAlike 3.0 International (CC BY-NC-SA 3.0)

Webpages

Articles

PubMed

Taxons: Human

Quotes

From Korcsmaros 2010: ... we first listed signaling proteins and interactions from reviews and then added further signaling interactions of the listed proteins. We used reviews as a starting point, manually looked up interactions three times, and manually searched for interactions of known signaling proteins with no signaling interactions so far in the database.


Ataxia

Category || Subcategory >>> High-throughput || Interaction

Created by Shaw Lab

Contact:

License: Creative Commons Attribution 2.5 International (CC BY 2.5)

Webpages

Articles

Taxons: Human

Quotes

In order to expand the interaction dataset, we added relevant direct protein–protein interactions from currently available human protein–protein interaction networks (Rual et al., 2005; Stelzl et al., 2005). We also searched public databases, including BIND (Bader et al., 2003), DIP (Xenarios et al., 2002), HPRD (Peri et al., 2003), MINT (Zanzoni et al., 2002), and MIPS (Pagel et al., 2005), to identify literature-based binary interactions involving the 54 ataxia-associated baits and the 561 interacting prey proteins. We identified 4796 binary protein–protein interactions for our Y2H baits and prey proteins (Table S4) and incorporated them in the Y2H protein–protein interaction map (Figures 4A–4C).

Notes

The Ataxia network doesn't contain original manual curation effort. The integrated data are very old.


Awan 2007

Category || Subcategory >>> Literature curated || Activity flow

Created by Wang Group

Contact:

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Articles

PubMed

Direct data import from: BioCarta, CA1

Quotes

To construct the human cellular signalling network, we manually curated signalling pathways from literature. The signalling data source for our pathways is the BioCarta database (http://www.biocarta.com/genes/allpathways.asp), which, so far, is the most comprehensive database for human cellular signalling pathways. Our curated pathway database recorded gene names and functions, cellular locations of each gene and relationships between genes such as activation, inhibition, translocation, enzyme digestion, gene transcription and translation, signal stimulation and so on. To ensure the accuracy and the consistency of the database, each referenced pathway was cross-checked by different researchers and finally all the documented pathways were checked by one researcher. In total, 164 signalling pathways were documented (supplementary Table 2). Furthermore, we merged the curated data with another literature-mined human cellular signalling network. As a result, the merged network contains nearly 1100 proteins (SupplementaryNetworkFile). To construct a signalling network, we considered relationships of proteins as links (activation or inactivation as directed links and physical interactions in protein complexes as neutral links) and proteins as nodes.
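The network model described in the last sentence of the quote (proteins as nodes, activation or inactivation as directed links, physical interactions as neutral links) can be sketched in a few lines. This is only an illustration under our own assumptions: the `Link` tuple, the relation labels and the numeric encoding of the effect are hypothetical and do not reproduce the data format of the original authors.

```python
from collections import namedtuple

# effect: +1 activation, -1 inactivation, 0 neutral (physical interaction)
Link = namedtuple('Link', ['source', 'target', 'effect', 'directed'])


def make_link(a, b, relation):
    """Build a link between proteins a and b from a relation label."""
    if relation == 'activation':
        return Link(a, b, 1, True)
    if relation == 'inactivation':
        return Link(a, b, -1, True)
    if relation == 'physical':
        # Neutral links are undirected: store the endpoints in sorted
        # order so that (a, b) and (b, a) give the same link.
        first, second = sorted((a, b))
        return Link(first, second, 0, False)
    raise ValueError('unknown relation: %s' % relation)


if __name__ == '__main__':
    # Placeholder protein names, not real data.
    network = {
        make_link('P1', 'P2', 'activation'),
        make_link('P3', 'P2', 'inactivation'),
        make_link('P2', 'P1', 'physical'),
        make_link('P1', 'P2', 'physical'),  # duplicate of the previous link
    }
    nodes = {p for link in network for p in (link.source, link.target)}
    print(len(nodes), 'nodes,', len(network), 'links')
```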


Baccin 2019

Category || Subcategory >>> Undefined || Undefined

Contact:

License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)

Webpages


BEL-Large-Corpus

Category || Subcategory >>> Undefined || Undefined

License: Nucleic Acids Research Open Access (NAR Open Access)

"The Nucleic Acids Research Database issue requires the databases to be freely usable, hence any resource published in this journal deemed to be free for both commercial and non-profit use."


BioCarta

Category || Subcategory >>> Literature curated || Activity flow

Released in years: 2006

Created by Community

Contact:

License: BioCarta License

Webpages

Taxons: Human

Quotes

Community built pathway database based on expert curation.

Notes

This resource includes a huge number of pathways, each curated by experts from a few reviews. The data is not available for download from the original webpage, only second hand, for example from NCI-PID, in NCI-XML format. However, these files don't contain any references, which makes the use of the BioCarta dataset problematic. Also, some pathways were reviewed a long time ago and are possibly outdated. The company and the website look like they were abandoned around 2003-2006.


BioGRID – Biological General Repository for Interaction Datasets

Category || Subcategory >>> High-throughput || Interaction

Released in years: 2003, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019

Created by Tyers Lab

Contact:

License: MIT License (MIT)

Webpages

Articles

PubMed

Collections

Methods in pypath

Data source (URLs and files)

Data format definition

Interactions


Ma'ayan 2005 – Human Hippocampal CA1 Region Neurons Signaling Network

Category || Subcategory >>> Literature curated || Activity flow

Released in years: 2005

Created by Iyengar Lab

Contact:

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Articles

PubMed

Taxons: Human, Mouse

Nodes: 545, Edges: 1259

Quotes

We used published research literature to identify the key components of signaling pathways and cellular machines, and their binary interactions. Most components (~80%) have been described in hippocampal neurons or related neuronal cells. Other components are from other cells, but are included because they are key components in processes known to occur in hippocampal neurons, such as translation. We then established that these interactions were both direct and functionally relevant. All of the connections were individually verified by at least one of the authors of this paper by reading the relevant primary paper(s). We developed a system made of 545 components (nodes) and 1259 links (connections). We used arbitrary but consistent rules to sort components into various groups. For instance, transcription factors are considered a as part of the transcriptional machinery, although it may also be equally valid to consider them as the most downstream component of the central signaling network. Similarly the AMPA receptor-channel (AMPAR) is considered part of the ion channels in the electrical response system since its activity is essential to defining the postsynaptic response, although it binds to and is activated by glutamate, and hence can be also considered a ligand gated receptor-channel in the plasma membrane. The links were specified by two criteria: function and biochemical mechanism. Three types of functional links were specified. This follows the rules used for representation of pathways in Science’s STKE (S1). Links may be activating, inhibitory or neutral. Neutral links do not specify directionality between components, and are mostly used to represent scaffolding and anchoring undirected or bidirectional interactions. The biochemical specification includes defining the reactions as non-covalent binding interactions or enzymatic reactions. Within the enzymatic category, reactions were further specified as phosphorylation, dephosphorylation, hydrolysis, etc. 
These two criteria for specification are independent and were defined for all interactions. For the analyses in this study we only used the functional criteria: activating, inhibitory or neutral specifications. We chose papers that demonstrated direct interactions that were supported by either biochemical or physiological effects of the interactions. From these papers we identified the components and interactions that make up the system we analyzed. During this specification process we did not consider whether these interactions would come together to form higher order organizational units. Each component and interaction was validated by a reference from the primary literature (1202 papers were used). A list of authors who read the papers to validate the components and interactions is provided under authors contributions.

Notes

One of the earliest manually curated networks, available in easily accessible tabular format, including UniProt IDs and PubMed references.

Data integration in pypath: dynamic

Methods in pypath

Data source (URLs and files)

Data format definition

Interactions


CancerCellMap

Category || Subcategory >>> Literature curated || Interaction

Created by Bader Lab

Contact:

License: Creative Commons Attribution 2.5 International (CC BY 2.5)

Webpages

Collections

Taxons: Human, Mouse, Rat

Quotes

Manually curated data, unpublished. A team of M.Sc. and Ph.D. biologists at the Institute of Bioinformatics in Bangalore, India read original research papers and hand-entered the pathway data into our database. The quality of the Cancer Cell Map pathways is very high. Half of the pathways were reviewed by experts at Memorial Sloan-Kettering Cancer Center and were found to contain only a few errors, which were subsequently fixed. A pathway is a collection of all genes/proteins that have been described as pathway members in any publication and all the interactions between them that can be found described in the literature.

Notes

One of the earliest manually curated datasets, now only available from second hand, e.g. from PathwayCommons. Included in many other resources. Contains binary interactions with PubMed references.

Data integration in pypath: dynamic

Methods in pypath

Data source (URLs and files)

Data format definition

Interactions


CancerSEA

Category || Subcategory >>> Undefined || Undefined

Contact:

License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)

Webpages


CARFMAP

Category || Subcategory >>> Literature curated || Pathway

Contact:

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Articles

Webpages

PubMed


CellCellInteractions

Category || Subcategory >>> Undefined || Undefined

Created by Bader Lab

Contact:

License: CellCellInteractions License

Webpages


CellPhoneDB

Category || Subcategory >>> Undefined || Undefined

License: MIT License (MIT)

Webpages


cellsignal.com

Category || Subcategory >>> Undefined || Undefined

License: No license (No license)


CFinder

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)


Compleat – COMPLEAT protein COMPLex Enrichment Analysis Tool

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

Webpages


ComplexPortal

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages


ComPPI

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

Webpages


ConsensusPathDB

Category || Subcategory >>> Literature curated || Activity flow

Released in years: 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015

Contact:

License: Constituting databases carry their own licenses (Composite)

Webpages

Articles

PubMed

Collections

Taxons: Human, Mouse, Yeast

Quotes

Interaction data in ConsensusPathDB currently originates from 12 interaction databases and comprises physical interactions, biochemical reactions and gene regulations. Importantly, the source of physical entities and interactions is always recorded, which allows linking to the original data in the source database.

Notes

ConsensusPathDB comprises data from 32 resources. The format is an easy to use, tab delimited text file with UniProtKB names and PubMed IDs. However, the dataset is extremely large, and several databases containing HTP data are included.
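Because high-throughput sources are mixed in, a typical first step with such a tab delimited file is to keep only the rows that come from selected source databases. The sketch below shows this with the standard `csv` module. The column order (source databases, PubMed IDs, participants), the comma as inner separator and the example rows are assumptions made for illustration, so check them against the actual download before use.

```python
import csv
import io


def read_interactions(handle, allowed_sources):
    """Yield (sources, pubmed_ids, participants) for rows from allowed sources.

    Assumed layout: tab delimited columns, comma separated values within
    a column, comment or header lines starting with '#'.
    """
    for row in csv.reader(handle, delimiter='\t'):
        if len(row) < 3 or row[0].startswith('#'):
            continue
        sources = set(row[0].split(','))
        if not sources & set(allowed_sources):
            continue  # none of the sources of this row is wanted
        pubmed_ids = [p for p in row[1].split(',') if p]
        participants = row[2].split(',')
        yield sources, pubmed_ids, participants


if __name__ == '__main__':
    # Made-up rows with placeholder database and protein names.
    sample = (
        '# source_databases\tpublications\tparticipants\n'
        'DB_A,DB_B\t111,222\tP1_HUMAN,P2_HUMAN\n'
        'DB_C\t\tP3_HUMAN,P4_HUMAN\n'
    )
    for record in read_interactions(io.StringIO(sample), {'DB_A'}):
        print(record)
```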


CORUM – Comprehensive Resource of Mammalian protein complexes

Category || Subcategory >>> Literature curated || Complexes

Released in years: 2007, 2009

Contact:

License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)

Articles

Webpages

PubMed

Collections

Taxons: Human, Mouse, Rat

Quotes

The CORUM database is a collection of experimentally verified mammalian protein complexes. Information is manually derived by critical reading of the scientific literature from expert annotators. Information about protein complexes includes protein complex names, subunits, literature references as well as the function of the complexes. In order to provide a high-quality dataset of mammalian protein complexes, all entries are manually created. Only protein complexes which have been isolated and characterized by reliable experimental evidence are included in CORUM. To be considered for CORUM, a protein complex has to be isolated as one molecule and must not be a construct derived from several experiments. Also, artificial constructs of subcomplexes are not taken into account. Since information from high-throughput experiments contains a significant fraction of false-positive results, this type of data is excluded. References for relevant articles were mainly found in general review articles, cross-references to related protein complexes within analysed literature and comments on referenced articles in UniProt.

Notes

CORUM is not part of the OmniPath pathways network, because we did not apply any complex expansion. However, it has an interface built into the pypath module.

Data integration in pypath: dynamic

Methods in pypath

Data source (URLs and files)

Data input methods


COSMIC

Category || Subcategory >>> Undefined || Undefined

License: COSMIC License

Webpages


CPAD

Category || Subcategory >>> Undefined || Undefined

Contact:

License: No license (No license)

Webpages


CSPA – Cell Surface Protein Atlas

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages


CST Pathways – Cell Signaling Technology Pathways

Category || Subcategory >>> Literature curated || Activity flow

Released in years: 2005, 2015

Created by CST

Contact:

License: No license (No license)

Webpages

Quotes

On these resource pages you can find signaling pathway diagrams, research overviews, relevant antibody products, publications, and other research resources organized by topic. The pathway diagrams associated with these topics have been assembled by CST scientists and outside experts to provide succinct and current overviews of selected signaling pathways.

Notes

The pathway diagrams are based on good quality, manually curated data, probably from review articles. However, those are available only in graphical (PDF and InDesign) formats. There is no programmatic way to obtain the interactions and references, as was confirmed by the authors, whom I contacted by mail. Wang's HumanSignalingNetwork includes the data from this resource, which probably has been entered manually, but Wang's data doesn't have source annotations, despite being compiled from multiple sources. The date of the beginning of this project is estimated using the Internet Archive Wayback Machine.


Cui 2007

Category || Subcategory >>> Literature curated || Activity flow

Created by Wang Group

Contact:

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Articles

PubMed

Taxons: Human

Nodes: 1528, Edges: 4249

Direct data import from: Awan2007, CancerCellMap

Quotes

To build up the human signaling network, we manually curated the signaling molecules (most of them are proteins) and the interactions between these molecules from the most comprehensive signaling pathway database, BioCarta (http://www.biocarta.com/). The pathways in the database are illustrated as diagrams. We manually recorded the names, functions, cellular locations, biochemical classifications and the regulatory (including activating and inhibitory) and interaction relations of the signaling molecules for each signaling pathway. To ensure the accuracy of the curation, all the data have been crosschecked four times by different researchers. After combining the curated information with another literature‐mined signaling network that contains ∼500 signaling molecules (Ma'ayan et al, 2005)[this is the CA1], we obtained a signaling network containing ∼1100 proteins (Awan et al, 2007). We further extended this network by extracting and adding the signaling molecules and their relations from the Cancer Cell Map (http://cancer.cellmap.org/cellmap/), a database that contains 10 manually curated signaling pathways for cancer. As a result, the network contains 1634 nodes and 5089 links that include 2403 activation links (positive links), 741 inhibitory links (negative links), 1915 physical links (neutral links) and 30 links whose types are unknown (Supplementary Table 9). To our knowledge, this network is the biggest cellular signaling network at present.
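As a quick consistency check, the link type counts given at the end of the quote can be added up and compared with the quoted total of 5089 links. The dictionary keys below are our own labels for the four link types named in the quote.

```python
# Link counts by type, as given in the quote above.
link_counts = {
    'activation': 2403,
    'inhibitory': 741,
    'physical': 1915,
    'unknown': 30,
}

total = sum(link_counts.values())

# The quoted total number of links in the network.
assert total == 5089

for link_type, count in link_counts.items():
    share = 100.0 * count / total
    print('%-10s %5d  %5.1f%%' % (link_type, count, share))
```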

Notes

Excellent signaling network with good topology for those who don't mind using data of unknown origin. Supposedly a manually curated network, but the data files don't include article references. Merging the CA1 network with CancerCellMap and BioCarta (also without references) makes the origin of the data untraceable.


dbPTM

Category || Subcategory >>> Literature curated || Ptm

Released in years: 2005, 2009, 2012, 2015

Created by ISBLab

Contact:

License: Nucleic Acids Research Open Access (NAR Open Access)

"The Nucleic Acids Research Database issue requires the databases to be freely usable, hence any resource published in this journal deemed to be free for both commercial and non-profit use."

Webpages

Articles

PubMed

Collections

Taxons: Human, Metazoa, Bacteria, Plants, Yeast

Quotes

Due to the inaccessibility of database contents in several online PTM resources, a total 11 biological databases related to PTMs are integrated in dbPTM, including UniProtKB/SwissProt, version 9.0 of Phospho.ELM, PhosphoSitePlus, PHOSIDA, version 6.0 of O-GLYCBASE, dbOGAP, dbSNO, version 1.0 of UbiProt, PupDB, version 1.1 of SysPTM and release 9.0 of HPRD.

Data integration in pypath: dynamic

Methods in pypath

Data source (URLs and files)

Data format definition

Data input methods

Interactions

Enzyme-substrate relationships and PTMs


DeathDomain

Category || Subcategory >>> Literature curated || Activity flow

Released in years: 2011, 2012

Created by Myongji University

Contact:

License: Nucleic Acids Research Open Access (NAR Open Access)

"The Nucleic Acids Research Database issue requires the databases to be freely usable, hence any resource published in this journal deemed to be free for both commercial and non-profit use."

Articles

Webpages

PubMed

Collections

Taxons: Human

Nodes: 99, Edges: 175

Quotes

The PubMed database was used as the primary source for collecting information and constructing the DD database. After finding synonyms for each of the 99 DD superfamily proteins using UniProtKB and Entrez Gene, we obtained a list of articles using each name of the proteins and its synonyms on a PubMed search, and we selected the articles that contained evidence for physical binding among the proteins denoted. We also manually screened information that was in other databases, such as DIP, IntAct, MINT, STRING and Entrez Gene. All of the 295 articles used for database construction are listed on our database website.

Notes

Detailed dataset with many references. Sadly, the data can be extracted only by parsing HTML. This is no more difficult than parsing XML formats, but HTML pages are not intended to be used for this purpose.

Data integration in pypath: static

Methods in pypath

Data source (URLs and files)

Data format definition


DEPOD – Human Dephosphorylation Database

Category || Subcategory >>> Literature curated || Post-translational modification

Released in years: 2013, 2014, 2016

Created by EMBL & EMBL-EBI

Contact:

License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)

Articles

Webpages

PubMed

Collections

Taxons: Human

Quotes

DEPOD the human DEPhOsphorylation Database (version 1.0) is a manually curated database collecting human active phosphatases, their experimentally verified protein and non-protein substrates and dephosphorylation site information, and pathways in which they are involved. It also provides links to popular kinase databases and protein-protein interaction databases for these phosphatases and substrates. DEPOD aims to be a valuable resource for studying human phosphatases and their substrate specificities and molecular mechanisms; phosphatase-targeted drug discovery and development; connecting phosphatases with kinases through their common substrates; completing the human phosphorylation/dephosphorylation network.

Notes

Nice manually curated dataset with PubMed references, in easily accessible MITAB format with UniProt IDs. It comprises 832 dephosphorylation reactions on protein substrates, and a few hundred on small molecules.
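MITAB (PSI-MI TAB) files are tab delimited, with one interaction per line. In MITAB 2.5 the first two columns hold the identifiers of the interactors and the ninth column the publication identifiers, each written as "database:identifier" and separated by "|" when there are several. A minimal sketch of reading these three columns follows. The function names and the example line with placeholder identifiers are ours, and the sketch does not describe how pypath itself processes the file.

```python
def split_field(field):
    """Split a MITAB field like 'pubmed:1|pubmed:2' into (db, id) pairs."""
    pairs = []
    if field == '-':  # '-' marks an empty field in MITAB
        return pairs
    for item in field.split('|'):
        db, _, value = item.partition(':')
        # Drop an optional '(description)' suffix and surrounding quotes.
        value = value.split('(')[0].strip('"')
        pairs.append((db, value))
    return pairs


def parse_mitab_line(line):
    """Extract interactor IDs and PubMed IDs from one MITAB 2.5 line."""
    cols = line.rstrip('\n').split('\t')
    return {
        'id_a': split_field(cols[0]),
        'id_b': split_field(cols[1]),
        'pubmed': [v for db, v in split_field(cols[8]) if db == 'pubmed'],
    }


if __name__ == '__main__':
    # Made-up line with placeholder identifiers; unused columns are empty ('-').
    columns = ['uniprotkb:ACC_A', 'uniprotkb:ACC_B'] + ['-'] * 6
    columns += ['pubmed:10000001|pubmed:10000002'] + ['-'] * 6
    print(parse_mitab_line('\t'.join(columns)))
```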

Methods in pypath

Data source (URLs and files)

Data format definition

Data input methods

Enzyme-substrate relationships and PTMs


DGIdb

Category || Subcategory >>> Undefined || Undefined

License: MIT License (MIT)

Webpages


Dinarello 2013

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)


DIP – Database of Interacting Proteins

Category || Subcategory >>> Literature curated || Interaction

Released in years: 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016

Created by UCLA, Eisenberg Group

Contact:

License: Creative Commons Attribution-NoDerivatives 3.0 International (CC BY-ND 3.0)

Articles

Webpages

PubMed

Collections

Quotes

In the beginning (around 2000), it was an entirely manually curated database:

Notes

The 'core' dataset contains manually curated interactions from small-scale studies. Interactions are well annotated with PubMed IDs, evidence, and mechanism (binding, chemical reaction, etc.). The format is easily accessible (MITAB).

Data integration in pypath: static

Methods in pypath

Data source (URLs and files)

Data format definition


DisGeNet

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)

Webpages


DOMINO

Category || Subcategory >>> Literature curated || Ptm

Released in years: 2006

Created by Cesareni Group

Contact:

License: Creative Commons Attribution 2.5 International (CC BY 2.5)

Webpages

Articles

PubMed

Collections

Taxons: Human, Yeast, C. elegans, Mouse, Rat, HIV, D. melanogaster, A. thaliana, X. laevis, B. taurus, G. gallus, O. cuniculus, Plasmodium falciparum

Quotes

DOMINO aims at annotating all the available information about domain-peptide and domain–domain interactions. The core of DOMINO, of July 24, 2006 consists of more than 3900 interactions extracted from peer-reviewed articles and annotated by expert biologists. A total of 717 manuscripts have been processed, thus covering a large fraction of the published information about domain–peptide interactions. The curation effort has focused on the following domains: SH3, SH2, 14-3-3, PDZ, PTB, WW, EVH, VHS, FHA, EH, FF, BRCT, Bromo, Chromo and GYF. However, interactions mediated by as many as 150 different domain families are stored in DOMINO.

Methods in pypath

Data source (URLs and files)

Domain-domain interactions

Domain-motif interactions

Data format definition

Data input methods

Interactions

Enzyme-substrate relationships and PTMs


DoRothEA

Category || Subcategory >>> Undefined || Undefined

License: Constituting databases carry their own licenses (Composite)

Webpages


DoRothEA-reviews

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)


ELM – Eukaryotic Linear Motif resource

Category || Subcategory >>> Literature curated || Post-translational modifications

Released in years: 2003, 2008, 2009, 2012, 2013, 2014, 2016

Created by ELM Consortium

Contact:

License: ELM Software License Agreement

Webpages

Articles

PubMed

Collections

Data integration in pypath: dynamic

Methods in pypath

Data source (URLs and files)

Domain-motif interactions

Data format definition

Data input methods

Interactions


EMBRACE

Category || Subcategory >>> Undefined || Undefined

Contact:

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages


ENCODE – Encyclopedia of DNA Elements

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages


Exocarta

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages


Fantom4

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages


GO – Gene Ontology

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages


GPCRdb

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)


Guide to Pharmacology

Category || Subcategory >>> Literature curated || Activity flow

Released in years: 2007, 2008, 2009, 2011, 2013, 2014, 2015, 2016

Contact:

License: Creative Commons Attribution-ShareAlike 3.0 International (CC BY-SA 3.0)

Webpages

Articles

PubMed

Collections

Quotes

Presently, the resource describes the interactions between target proteins and 6064 distinct ligand entities (Table 1). Ligands are listed against targets by their action (e.g. activator, inhibitor), and also classified according to substance types and their status as approved drugs. Classes include metabolites (a general category for all biogenic, non-peptide, organic molecules including lipids, hormones and neurotransmitters), synthetic organic chemicals (e.g. small molecule drugs), natural products, mammalian endogenous peptides, synthetic and other peptides including toxins from non-mammalian organisms, antibodies, inorganic substances and other, not readily classifiable compounds.

Methods in pypath

Data source (URLs and files)

Data format definition


Havugimana 2012 – Census of Human Soluble Protein Complexes

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages


HGNC – Human Gene Nomenclature Committee

Category || Subcategory >>> Undefined || Undefined

License: HGNC License

"It is a condition of our funding from NIH and the Wellcome Trust that the nomenclature and information we provide is freely available to all."

Webpages


HIPPIE – Human Integrated Protein-Protein Interaction rEference

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages


HOCOMOCO

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)


HPA – Human Protein Atlas

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution-ShareAlike 3.0 International (CC BY-SA 3.0)

Webpages


HPMR – Human Plasma Membrane Receptome

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)

Webpages


HPRD – Human Protein Reference Database

Category || Subcategory >>> Literature curated || Post-translational modification

Released in years: 2002, 2005, 2009, 2010

Contact:

License: HPRD License

Webpages

Articles

PubMed

Collections

Quotes

The information about protein-protein interactions was cataloged after a critical reading of the published literature. Exhaustive searches were done based on keywords and medical subject headings (MeSH) by using Entrez. The type of experiments that served as the basis for establishing protein-protein interactions was also annotated. Experiments such as coimmunoprecipitation were designated in vivo, GST fusion and similar “pull-down” type of experiments were designated in vitro, and those identified by yeast two-hybrid were annotated as yeast two-hybrid.

Methods in pypath

Data source (URLs and files)

Data format definition

Data input methods

Interactions

Enzyme-substrate relationships and PTMs


HTRIdb – Human Transcriptional Reference Interactome

Category || Subcategory >>> Undefined || Undefined

License: GNU Lesser General Public License version 3 (LGPLv3)


hu.MAP – Human Protein Complex Map

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages


HuPho – Human Phosphatase Portal

Category || Subcategory >>> High throughput and literature curated || Post-translational modification

Released in years: 2012, 2015

Created by Cesareni Group

Contact:

License: No license (No license)

Webpages

Articles

PubMed

Collections

Quotes

In order to offer a proteome-wide perspective of the phosphatase interactome, we have embarked on an extensive text-mining-assisted literature curation effort to extend phosphatase interaction information that was not yet covered by protein–protein interaction (PPI) databases. Interaction evidence captured by expert curators was annotated in the protein interaction database MINT according to the rapid curation standard. This data set was next integrated with protein interaction information from three additional major PPI databases, IntAct, BioGRID and DIP. These databases are part of the PSIMEx consortium and adopt a common data model and common controlled vocabularies, thus facilitating data integration. Duplicated entries were merged and redundant interactions have been removed.
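
The merging step mentioned at the end of this quote can be illustrated with a minimal sketch. The record layout (two protein identifiers, a source database and a PubMed ID) and the function name `merge_interactions` are assumptions made for illustration only; they do not reflect the actual HuPho or MINT pipeline.

```python
from collections import defaultdict


def merge_interactions(records):
    """Merge records that describe the same undirected protein pair.

    Each record is a hypothetical (protein_a, protein_b, database, pmid)
    tuple; source databases and references are collected per pair.
    """
    merged = defaultdict(lambda: {"databases": set(), "pmids": set()})
    for a, b, database, pmid in records:
        key = tuple(sorted((a, b)))  # A-B and B-A count as the same pair
        merged[key]["databases"].add(database)
        merged[key]["pmids"].add(pmid)
    return dict(merged)


# Made-up example records, not real database content
records = [
    ("P1", "P2", "MINT", "111"),
    ("P2", "P1", "IntAct", "111"),
    ("P1", "P3", "BioGRID", "222"),
]
merged = merge_interactions(records)
```

Sorting the pair before using it as a key is what removes the redundancy between records that list the same partners in a different order.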

Notes

The database is dynamically updated, so it is up to date at any given time. That is why it is marked as up to date in 2015, although it has had no new release since 2012.


HuRI HI-III – Human Reference Interactome

Category || Subcategory >>> High-throughput || Yeast 2 hybrid

Released in years: 2012, 2014, 2016

Contact:

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Articles

Webpages

PubMed


HuRI Lit-BM-17 – Human Reference Interactome Literature Benchmark

Category || Subcategory >>> High-throughput || Yeast 2 hybrid

Released in years: 2013, 2017

Created by CCSB

Contact:

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Articles

Webpages

PubMed

Quotes

High-quality non-systematic Literature dataset. In 2013, we extracted interaction data from BIND, BioGRID, DIP, HPRD, MINT, IntAct, and PDB to generate a high-quality binary literature dataset comprising ~11,000 protein-protein interactions that are binary and supported by at least two traceable pieces of evidence (publications and/or methods) (Rolland et al Cell 2014). Although this dataset does not result from a systematic investigation of the interactome search space and should thus be used with caution for any network topology analyses, it represents valuable interactions for targeted studies and is freely available to the research community through the search engine or via download.
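
The selection rule described in this quote (binary interactions supported by at least two pieces of evidence) might be sketched as follows. The record layout, the `is_binary` flag and the choice to count each distinct (publication, method) combination as one piece of evidence are assumptions for illustration, not the procedure used by the authors.

```python
from collections import defaultdict

# Hypothetical evidence records: (protein_a, protein_b, pmid, method, is_binary)
evidence = [
    ("A", "B", "1001", "two hybrid", True),
    ("B", "A", "1002", "pull down", True),
    ("A", "C", "1003", "two hybrid", True),
    ("C", "D", "1004", "coimmunoprecipitation", False),
    ("C", "D", "1005", "coimmunoprecipitation", False),
]


def multi_evidence_pairs(evidence, min_evidence=2):
    """Return pairs with at least `min_evidence` distinct binary evidence items."""
    support = defaultdict(set)
    for a, b, pmid, method, is_binary in evidence:
        if is_binary:  # non-binary evidence is ignored entirely
            support[frozenset((a, b))].add((pmid, method))
    return {pair for pair, items in support.items() if len(items) >= min_evidence}


selected = multi_evidence_pairs(evidence)
```

In this made-up example only the A-B pair passes: A-C has a single piece of evidence, and C-D has no binary evidence at all.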

Methods in pypath

Data source (URLs and files)

Data input methods


I2D – Interologous Interaction Database

Category || Subcategory >>> Undefined || Undefined

License: I2D License


ICELLNET

Category || Subcategory >>> Undefined || Undefined

Contact:

License: GNU General Public License version 3 (GPLv3)

Webpages


iMEX

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)


IMEx – International Molecular Interaction Exchange Consortium

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 2.5 International (CC BY 2.5)

Webpages


InnateDB

Category || Subcategory >>> Literature curated || Interaction

Released in years: 2008, 2010, 2013, 2014, 2015

Created by Brinkman Lab, Hancock Lab, Lynn Group

Contact:

License: Design Science License (DSL)

Articles

Webpages

PubMed

Collections

Quotes

InnateDB (www.innatedb.com) is a database and integrated analysis platform specifically designed to facilitate systems-level analyses of the mammalian innate immune response (Lynn et al. 2008; 2010, 2013). To enrich our knowledge of innate immunity networks and pathways, the InnateDB curation team has contextually annotated >25,000 human and mouse innate immunity-relevant molecular interactions through the review of >5,000 biomedical articles. Curation adheres to the MIMIx guidelines and new interactions are added weekly. Importantly, interactions are curated between molecules with a documented role in an innate immunity relevant biological process or pathway and all other interactors regardless of whether the interacting molecule has any known role in innate immunity. This approach captures interactions between the innate immune system and other systems. InnateDB is not limited to data on the innate immune system. It is a comprehensive database of human, mouse and bovine molecular interactions and pathways, consisting of more than 300,000 molecular interactions and 3,000+ pathways, integrated from major public molecular interaction and pathway databases. InnateDB is also an analysis platform offering user-friendly bioinformatics tools, including pathway and ontology analysis, network visualization and analysis and the ability to upload and analyze user-supplied gene expression or other quantitative data in a network and/or pathway context. The platform has a global profile and is utilised by >10,000 users per annum and is widely cited. A mirror of the site hosted in Australia is also available at innatedb.sahmri.com. Note that new interactions and gene annotations are added to InnateDB on an almost weekly basis so the data is being continuously updated.

Notes

Probably the largest manually curated binary protein interaction dataset, developed by a dedicated full-time team of curators. Formats are clear and accessible, comprising UniProt IDs, PubMed references, experimental evidence and mechanisms.

Data integration in pypath: static

Methods in pypath

Data source (URLs and files)

Data format definition


IntAct – IntAct Molecular Interaction Database

Category || Subcategory >>> Literature curated and high-throughput || Interaction

Released in years: 2003, 2006, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019

Created by EBI

Contact:

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Articles

Webpages

PubMed

Collections

Direct data import from: InnateDB, MINT

Quotes

The information within the IntAct database primarily consists of protein–protein interaction (PPI) data. The majority of the PPI data within the database is annotated to IMEx standards, as agreed by the IMEx consortium. All such records contain a full description of the experimental conditions in which the interaction was observed. This includes full details of the constructs used in each experiment, such as the presence and position of tags, the minimal binding region defined by deletion mutants and the effect of any point mutations, referenced to UniProtKB, the underlying protein sequence database. Protein interactions can be described down to the isoform level, or indeed to the post-translationally cleaved mature peptide level if such information is available in the publication, using the appropriate UniProtKB identifiers.

Notes

We cannot draw a sharp distinction between low- and high-throughput methods, and throughput is neither the only nor the best measure of the quality of experimental data. IntAct came up with a good solution for estimating the confidence of interactions: the mi-score system provides a comprehensive way to synthesize information from multiple experiments and to weight interactions according to experimental method, interaction type and number of pieces of evidence.
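
The idea in the note above, combining the number of publications, the experimental method and the interaction type into one confidence value, can be illustrated with a toy score. This is not the formula used by IntAct: the weights, the category names, the cap of 7 publications and the plain average are all arbitrary assumptions chosen for the example.

```python
import math

# Hypothetical weights, chosen for illustration only
METHOD_WEIGHTS = {"biophysical": 1.0, "two hybrid": 0.5, "unknown": 0.1}
TYPE_WEIGHTS = {
    "direct interaction": 1.0,
    "physical association": 0.7,
    "association": 0.3,
}


def saturating(count, cap):
    """Map a count to [0, 1] on a log scale, reaching 1 at `cap`."""
    return min(1.0, math.log(count + 1, cap + 1))


def toy_confidence(evidence, pub_cap=7):
    """Average three components: publications, best method, best type."""
    if not evidence:
        return 0.0
    pub_score = saturating(len({e["pmid"] for e in evidence}), pub_cap)
    method_score = max(METHOD_WEIGHTS.get(e["method"], 0.1) for e in evidence)
    type_score = max(TYPE_WEIGHTS.get(e["type"], 0.3) for e in evidence)
    return (pub_score + method_score + type_score) / 3


# Made-up evidence records
one = [{"pmid": "1", "method": "two hybrid", "type": "physical association"}]
two = one + [{"pmid": "2", "method": "biophysical", "type": "direct interaction"}]
```

With these made-up records, `toy_confidence(two)` is higher than `toy_confidence(one)`, because the second record adds a publication, a stronger method and a more specific interaction type.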

Methods in pypath

Data source (URLs and files)

Data format definition


Integrins

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages


IntOGen

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)

Webpages


iPTMnet

Category || Subcategory >>> Undefined || Undefined

Contact:

License: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)

Webpages


iTALK

Category || Subcategory >>> Undefined || Undefined

Contact:

License: Creative Commons Attribution-NonCommercial-ShareAlike 3.0 International (CC BY-NC-SA 3.0)

Webpages


JASPAR

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)


KEA – Kinase Enrichment Analysis

Category || Subcategory >>> Undefined || Undefined

License: Unspecified Non-commercial Redistributable (Unspecified NC-SA)

"This is not a real license but a temporary label meaning either that we are working on clarifying the licensing terms (because the copyright holders have not specified them anywhere) or that we as OmniPath obtained permission from the copyright holders to redistribute the data from the resource."

Webpages


KEGG – Kyoto Encyclopedia of Genes and Genomes

Category || Subcategory >>> Literature curated || Process description

Released in years: 2000, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016

Contact:

License: KEGG License

Webpages

Articles

Collections

Quotes

Notes

Since 2011, KEGG data has not been freely available. The downloadable KGML files contain binary interactions, most of them between large complexes. No references are available.
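
As a rough illustration of how binary interactions can be read from a KGML file, here is a minimal sketch using Python's standard XML parser. The embedded snippet is a made-up example that follows the general KGML layout (`entry` and `relation` elements); the function name `kgml_relations` and the choice to expand multi-gene entries into all pairwise combinations are assumptions, not the method used in pypath.

```python
import xml.etree.ElementTree as ET

# Made-up KGML fragment; identifiers are placeholders, not real KEGG data
KGML = """<pathway name="path:hsa00000" org="hsa" number="00000" title="Example">
  <entry id="1" name="hsa:1111" type="gene"/>
  <entry id="2" name="hsa:2222 hsa:3333" type="gene"/>
  <relation entry1="1" entry2="2" type="PPrel">
    <subtype name="activation" value="--&gt;"/>
  </relation>
</pathway>"""


def kgml_relations(kgml_text):
    """Return (source, target, relation_type, subtypes) tuples."""
    root = ET.fromstring(kgml_text)
    # An entry may list several gene IDs separated by spaces
    genes = {e.get("id"): e.get("name").split() for e in root.iter("entry")}
    relations = []
    for rel in root.iter("relation"):
        subtypes = [s.get("name") for s in rel.findall("subtype")]
        for source in genes.get(rel.get("entry1"), []):
            for target in genes.get(rel.get("entry2"), []):
                relations.append((source, target, rel.get("type"), subtypes))
    return relations


relations = kgml_relations(KGML)
```

Expanding every multi-gene entry into pairs is the simplest choice, but it can inflate the number of interactions when the entries represent large complexes, as mentioned in the note above.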

Methods in pypath

Data source (URLs and files)

Data input methods

Miscellaneous


KEGG-MEDICUS

Category || Subcategory >>> Undefined || Undefined

Released in years: 2017, 2018, 2019

License: Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

Webpages


kinase.com

Category || Subcategory >>> Undefined || Undefined

License: Kinase.com License (Kinase.com)

Webpages


Kinexus

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)


Kirouac 2010

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution-NonCommercial-NoDerivs 3.0 International (CC BY-NC-ND 3.0)

Webpages


Laudanna – Compiled Datasets for Network Analysis from Laudanna Lab

Category || Subcategory >>> Combined || Mixed

Released in years: 2014

Created by Laudanna Lab

Contact:

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages

Direct data import from: BioGRID, ConsensusPathDB, dbPTM, DIP, HumanSignalingNetwork, IntAct, MINT, MPPI, PathwayCommons, phospho.ELM, PhosphoPoint, PhosphoSite, SignaLink

Quotes

Notes

Data sets are compiled from public databases and from the literature, and are manually curated for accuracy. They are intended for network reconstruction, topological and multidimensional analysis in cell biology.

Methods in pypath

Data source (URLs and files)

Data input methods

Miscellaneous


Li 2012 – Human Phosphotyrosine Signaling Network

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution-NonCommercial 3.0 International (CC BY-NC 3.0)


LMPID

Category || Subcategory >>> Literature curated || Post-translational modifications

Released in years: 2015

Created by Bose Institute

Contact:

License: Creative Commons Attribution-NonCommercial 2.0 International (CC BY-NC 2.0)

Webpages

Articles

PubMed

Collections

Quotes

LMPID (Linear Motif mediated Protein Interaction Database) is a manually curated database which provides comprehensive experimentally validated information about the LMs mediating PPIs from all organisms on a single platform. About 2200 entries have been compiled by detailed manual curation of PubMed abstracts, of which about 1000 LM entries were being annotated for the first time, as compared with the Eukaryotic LM resource.

Methods in pypath

Data source (URLs and files)

Domain-motif interactions

Data format definition

Data input methods

Interactions


lncrnadb

Category || Subcategory >>> Undefined || Undefined

License: Nucleic Acids Research Open Access (NAR Open Access)

"The Nucleic Acids Research Database issue requires the databases to be freely usable, hence any resource published in this journal is deemed to be free for both commercial and non-profit use."

Webpages


lncRNADisease

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)

Webpages


LOCATE

Category || Subcategory >>> Undefined || Undefined

License: Nucleic Acids Research Open Access (NAR Open Access)

"The Nucleic Acids Research Database issue requires the databases to be freely usable, hence any resource published in this journal is deemed to be free for both commercial and non-profit use."

Webpages


LRdb

Category || Subcategory >>> Undefined || Undefined

Contact:

License: GNU General Public License version 3 (GPLv3)

Webpages


Macrophage

Category || Subcategory >>> Literature curated || Activity flow

Contact:

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Articles

Webpages

PubMed

Collections

Quotes

Ongoing analysis of macrophage-related datasets and an interest in consolidating our knowledge of a number of signalling pathways directed our choice of pathways to be mapped (see Figure 1). Public and propriety databases were initially used as resources for data mining, but ultimately all molecular interaction data was sourced from published literature. Manual curation of the literature was performed to firstly evaluate the quality of the evidence supporting an interaction and secondly, to extract the necessary and additional pieces of information required to 'understand' the pathway and construct an interaction diagram. We have drawn pathways based on our desire to model pathways active in a human macrophage and therefore all components have been depicted using standard human gene nomenclature (HGNC). However, our understanding of the pathway components and the interactions between them, have been drawn largely from a consensus view of literature knowledge. As such the pathways presented here are based on data derived from a range of different cellular systems and mammalian species (human and mouse).

Data integration in pypath: static

Methods in pypath

Data source (URLs and files)

Data format definition


Matrisome

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)

Webpages


MatrixDB

Category || Subcategory >>> Literature curated || Interaction

Released in years: 2009, 2011, 2015

Contact:

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Articles

Webpages

PubMed

Collections

Taxons: Mammalia

Quotes

Protein data were imported from the UniProtKB/Swiss-Prot database (Bairoch et al., 2005) and identified by UniProtKB/SwissProt accession numbers. In order to list all the partners of a protein, interactions are associated by default to the accession number of the human protein. The actual source species used in experiments is indicated in the page reporting interaction data. Intracellular and membrane proteins were included to obtain a comprehensive network of the partners of extracellular molecules. Indeed, ECM proteins and GAGs bind to a number of membrane proteins or cell-associated proteoglycans and some of them interact with intracellular partners upon internalization (Dixelius et al., 2000). ECM proteins were identified by the UniProtKB/Swiss-Prot keyword ‘extracellular matrix’ and by the GO terms ‘extracellular matrix’, ‘proteinaceous extracellular matrix’ and their child terms. The proteins annotated with the GO terms ‘extracellular region’ and ‘extracellular space’, which are used for proteins found in biological fluids, were not included because circulating molecules do not directly contribute to the extracellular scaffold. Additionally, 96 proteins were manually (re-)annotated through literature curation. MatrixDB integrates 1378 interactions from the Human Protein Reference Database (HPRD, Prasad et al., 2009), 211 interactions from the Molecular INTeraction database (MINT, Chatr-Aryamontri et al., 2007), 46 interactions from the Database of Interacting Proteins (DIP, Salwinski et al., 2004), 232 interactions from IntAct (Kerrien et al., 2007a) and 839 from BioGRID (Breitkreutz et al., 2008) involving at least one extracellular biomolecule of mammalian origin. We added 283 interactions from manual literature curation and 65 interactions from protein and GAG array experiments.

Notes

The interactions imported from IMEx databases or any other database are collected separately in the PSICQUIC-extended dataset. The MatrixDB-core dataset is curated manually by the MatrixDB team.

Data integration in pypath: static

Methods in pypath

Data source (URLs and files)

Data format definition


MCAM – Mammalian Cell Adhesion Molecule Database

Category || Subcategory >>> Undefined || Undefined

License: Unspecified Non-commercial Redistributable (Unspecified NC-SA)

"This is not a real license but a temporary label meaning either that we are working on clarifying the licensing terms (because the copyright holders have not specified them anywhere) or that we as OmniPath obtained permission from the copyright holders to redistribute the data from the resource."

Webpages


Membranome

Category || Subcategory >>> Undefined || Undefined

License: Apache License, version 2.0 (Apache 2.0)

Webpages


MIMP – Mutations IMpact on Phosphorylation

Category || Subcategory >>> Undefined || Undefined

License: GNU Lesser General Public License version 3 (LGPLv3)

Webpages


MINT – Molecular Interaction Database

Category || Subcategory >>> Literature curated and high-throughput || Interaction

Released in years: 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015

Contact:

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages

Articles

Collections


miR2Disease

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution-NonCommercial 2.0 International (CC BY-NC 2.0)

Webpages


miRBase

Category || Subcategory >>> Undefined || Undefined

Contact:

License: Creative Commons Zero 1.0 Universal (CC0 1.0)

Webpages


miRDeathDB

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)

Webpages


miRecords

Category || Subcategory >>> Undefined || Undefined

License: miRecords License

"All information available from this site is within the public domain."

Webpages


miRTarBase

Category || Subcategory >>> Undefined || Undefined

License: miRTarBase License

Webpages


MPPI – The MIPS Mammalian Protein-Protein Interaction Database

Category || Subcategory >>> Literature curated || Interaction

Released in years: 2000, 2005

Created by MIPS Munich

Contact:

License: MPPI License

Articles

Webpages

PubMed

Collections

Taxons: Human, Mammalia

Quotes

The first and foremost principle of our MPPI database is to favor quality over completeness. Therefore, we decided to include only published experimental evidence derived from individual experiments as opposed to large-scale surveys. High-throughput data may be integrated later, but will be marked to distinguish it from evidence derived from individual experiments.

Notes

This database contains hundreds of interactions curated manually from original papers. The format is clean, with UniProt IDs and PubMed references.

Data integration in pypath: static

Methods in pypath

Data source (URLs and files)

Data format definition


MSigDB – Molecular Signatures Database

Category || Subcategory >>> Undefined || Undefined

Contact:

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages


NCI-PID – NCI-Nature Pathway Interaction Database

Category || Subcategory >>> Literature curated || Process description

Released in years: 2008, 2012

Created by NCI

Contact:

License: 3-clause BSD License (BSD)

Webpages

Articles

PubMed

Collections

Taxons: Human

Direct data import from: BioCarta, Reactome

Quotes

In curating, editors synthesize meaningful networks of events into defined pathways and adhere to the PID data model for consistency in data representation: molecules and biological processes are annotated with standardized names and unambiguous identifiers; and signaling and regulatory events are annotated with evidence codes and references. To ensure accurate data representation, editors assemble pathways from data that is principally derived from primary research publications. The majority of data in PID is human; however, if a finding discovered in another mammal is also deemed to occur in humans, editors may decide to include this finding, but will also record that the evidence was inferred from another species. Prior to publication, all pathways are reviewed by one or more experts in a field for accuracy and completeness.

Notes

From the NCI-XML, interactions with references, directions and signs can be extracted. Complexes are omitted.

Since the end of 2015, the original NCI-PID webpage has no longer been accessible; the data is available through the NDEx web server and API.

Methods in pypath

Data source (URLs and files)

Data format definition

Data input methods

Interactions


ncRDeathDB

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)


Negatome

Category || Subcategory >>> Literature curated || Negative

Contact:

License: No license (No license)

Articles

Webpages

PubMed

Collections

Quotes

Annotation of the manual dataset was performed analogous to the annotation of protein–protein interactions and protein complexes in previous projects published by our group. Information about NIPs was extracted from scientific literature using only data from individual experiments but not from high-throughput experiments. Only mammalian proteins were considered. Data from high-throughput experiments were omitted in order to maintain the highest possible standard of reliability.

Data integration in pypath: static

Methods in pypath

Data source (URLs and files)

Data format definition

Miscellaneous


NetPath

Category || Subcategory >>> Literature curated || Process description

Released in years: 2010, 2011, 2012, 2013, 2014, 2015

Created by Pandey Lab, IOB Bangalore

Contact:

License: Creative Commons Attribution 2.5 International (CC BY 2.5)

Articles

Webpages

PubMed

Collections

Direct data import from: CancerCellMap

Includes data from: CancerCellMap

Quotes

The initial annotation process of any signaling pathway involves gathering and reading of review articles to achieve a brief overview of the pathway. This process is followed by listing all the molecules that are reported to be involved in the pathway under annotation. Information regarding potential pathway authorities are also gathered at this initial stage. Pathway experts are involved in initial screening of the molecules listed to check for any obvious omissions. In the second phase, annotators manually perform extensive literature searches using search keys, which include all the alternative names of the molecules involved, the name of the pathway, the names of reactions, and so on. In addition, the iHOP resource is also used to perform advanced PubMed-based literature searches to collect the reactions that were reported to be implicated in a given pathway. The collected reactions are manually entered using the PathBuilder annotation interface, which is subjected to an internal review process involving PhD level scientists with expertise in the areas of molecular biology, immunology and biochemistry. However, there are instances where a molecule has been implicated in a pathway in a published report but the associated experimental evidence is either weak or differs from experiments carried out by other groups. For this purpose, we recruit several investigators as pathway authorities based on their expertise in individual signaling pathways. The review by pathway authorities occasionally leads to correction of errors or, more commonly, to inclusion of additional information. Finally, the pathway authorities help in assessing whether the work of all major laboratories has been incorporated for the given signaling pathway.

Notes

The formats are unclear. The tab-delimited format contains the pathway memberships of genes and PubMed references, but not the interaction partners. The Excel file is odd: it is not actually an Excel table, and it contains only a few rows from the tab-delimited file. The PSI-MI XML is much better; a simple parser can extract many details from it.
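
A simple PSI-MI XML parser of the kind mentioned in the note above might look like the following sketch. It assumes the compact form of PSI-MI XML 2.5 (interactors and experiments declared once and referenced by ID) and the `net:sf:psidev:mi` namespace; the embedded sample is made up, and whether NetPath files follow exactly this layout is an assumption. The function name `parse_psimi` is hypothetical and is not a pypath method.

```python
import xml.etree.ElementTree as ET

NS = {"mi": "net:sf:psidev:mi"}

# Made-up sample in compact PSI-MI 2.5 style; not real NetPath content
SAMPLE = """<entrySet xmlns="net:sf:psidev:mi" level="2" version="5">
  <entry>
    <experimentList>
      <experimentDescription id="1">
        <bibref><xref><primaryRef db="pubmed" id="12345"/></xref></bibref>
      </experimentDescription>
    </experimentList>
    <interactorList>
      <interactor id="10"><names><shortLabel>GENE1</shortLabel></names></interactor>
      <interactor id="11"><names><shortLabel>GENE2</shortLabel></names></interactor>
    </interactorList>
    <interactionList>
      <interaction id="100">
        <experimentList><experimentRef>1</experimentRef></experimentList>
        <participantList>
          <participant id="1000"><interactorRef>10</interactorRef></participant>
          <participant id="1001"><interactorRef>11</interactorRef></participant>
        </participantList>
      </interaction>
    </interactionList>
  </entry>
</entrySet>"""


def parse_psimi(xml_text):
    """Return a list of (partner_labels, pubmed_ids) per interaction."""
    root = ET.fromstring(xml_text)
    # Interactor ID -> short label
    labels = {
        i.get("id"): i.findtext("mi:names/mi:shortLabel", namespaces=NS)
        for i in root.iterfind(".//mi:interactorList/mi:interactor", NS)
    }
    # Experiment ID -> PubMed ID
    pubmed = {}
    for exp in root.iterfind(".//mi:experimentDescription", NS):
        ref = exp.find("mi:bibref/mi:xref/mi:primaryRef", NS)
        if ref is not None and ref.get("db") == "pubmed":
            pubmed[exp.get("id")] = ref.get("id")
    interactions = []
    for ia in root.iterfind(".//mi:interactionList/mi:interaction", NS):
        partners = [
            labels.get(r.text)
            for r in ia.iterfind("mi:participantList/mi:participant/mi:interactorRef", NS)
        ]
        refs = [
            pubmed.get(r.text)
            for r in ia.iterfind("mi:experimentList/mi:experimentRef", NS)
        ]
        interactions.append((partners, refs))
    return interactions


interactions = parse_psimi(SAMPLE)
```

Unlike the tab-delimited file described above, this representation keeps the interaction partners together with their literature references.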

Data integration in pypath: dynamic

Methods in pypath

Data source (URLs and files)

Data format definition

Data input methods

Interactions


NetworkBlast

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)


NetworKIN

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)


NFIRegulomeDB

Category || Subcategory >>> Undefined || Undefined

License: GNU Lesser General Public License version 3 (LGPLv3)

Webpages


NRF2ome

Category || Subcategory >>> Literature curated || Activity flow

Released in years: 2013

Created by NetBiol Group

Contact:

License: Creative Commons Attribution-NonCommercial-ShareAlike 3.0 International (CC BY-NC-SA 3.0)

Webpages

Articles

PubMed

Taxons: Human

Quotes

From Korcsmaros 2010: ... we first listed signaling proteins and interactions from reviews and then added further signaling interactions of the listed proteins. We used reviews as a starting point, manually looked up interactions three times, and manually searched for interactions of known signaling proteins with no signaling interactions so far in the database.


OmniPath

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)


OPM – Orientations of Proteins in Membranes

Category || Subcategory >>> Undefined || Undefined

License: Apache License, version 2.0 (Apache 2.0)

Webpages


ORegAnno – Open Regulatory Annotation

Category || Subcategory >>> Literature curated & high throughput || Transcription regulation

Contact:

License: GNU Lesser General Public License version 3 (LGPLv3)

Articles

Webpages

PubMed

Quotes

Notes

One of the largest TF-target databases. It covers at least 18 organisms and contains data from literature curation, many screening technologies and in silico prediction.

Methods in pypath

Data source (URLs and files)

Data input methods


PANTHER – Pathway Analysis Through Evolutionary Relationships

Category || Subcategory >>> Literature curated || Process description

Released in years: 2000, 2001, 2002, 2003, 2005, 2006, 2010, 2011, 2012, 2014, 2016

Contact:

License: GNU General Public License version 2 (GPLv2)

Articles

Webpages

PubMed

Collections

Quotes

References are captured at three levels. First, each pathway as a whole requires a reference. For signaling pathways, at least three references, usually review papers, are required in order to provide a more objective view of the scope of the pathway. For metabolic pathways, a textbook reference is usually sufficient. Second, references are often associated to each molecule class in the pathway. Most of these references are OMIM records or review papers. Third, references are provided to support association of specific protein sequences with a particular molecule class, e.g., the SWISS-PROT sequence P53_HUMAN annotated as an instance of the molecule class ‘‘P53’’ appearing in the pathway class ‘‘P53 pathway’’. These are usually research papers that report the experimental evidence that a particular protein or gene participates in the reactions represented in the pathway diagram.


PathwayCommons

Category || Subcategory >>> Combined || Interaction

Released in years: 2010, 2011, 2012, 2013, 2014, 2015, 2016

Created by Bader Lab, MSKCC cBio

Contact:

License: PathwayCommons License

Webpages

Articles

PubMed

Collections

Direct data import from: Reactome, NCI-PID, CancerCellMap, BioCarta, HPRD, PhosphoSite, PANTHER, DIP, IntAct, BioGRID, BIND, CORUM

Quotes

Notes

Pathway Commons is a collection of publicly available pathway information from multiple organisms. It provides researchers with convenient access to a comprehensive collection of biological pathways from multiple sources represented in a common language for gene and metabolic pathway analysis.

Pathway Commons integrates a number of pathway and molecular interaction databases supporting BioPAX and PSI-MI formats into one large BioPAX model, which can be queried using our web API (documented below). This API can be used by computational biologists to download custom subsets of Pathway Commons for analysis, or can be used to incorporate powerful biological pathway and network information retrieval and query functionality into websites, scripts and software. For computational biologists looking for comprehensive biological pathway data for analysis, we also make available batch downloads of the data in several formats.

Warehouse data (canonical molecules, ontologies) are converted to BioPAX utility classes, such as EntityReference, ControlledVocabulary, EntityFeature sub-classes, and saved as the initial BioPAX model, which forms the foundation for integrating pathway data and for id-mapping.

Pathway and binary interaction data (interactions, participants) are normalized next and merged into the database. Original reference molecules are replaced with the corresponding BioPAX warehouse objects.


PAZAR – A Public Database of Transcription Factor and Regulatory Sequence Annotation

Category || Subcategory >>> Literature curated & high throughput || Transcription regulation

Released in years: 2007

Created by Wasserman Lab

Contact:

License: GNU Lesser General Public License version 3 (LGPLv3)

Articles

Webpages

PubMed

Quotes

Notes

One of the oldest and largest TF-target databases, from the Wasserman Lab, who also developed JASPAR and many other tools. Unfortunately, the website was down as of April 2019. At the time of its release (2007), the website had a remarkably clean and innovative design for a molecular database.

Methods in pypath

Data source (URLs and files)

Data input methods


PDB – Protein Data Bank

Category || Subcategory >>> Undefined || Undefined

License: PDB License

"Data files contained in the PDB archive (ftp://ftp.wwpdb.org) are free of all copyright restrictions and made fully and freely available for both non-commercial and commercial use."

Webpages


PDZBase

Category || Subcategory >>> Literature curated || Activity flow

Released in years: 2004

Created by Weinstein Group

Contact:

License: No license (No license)

Webpages

Articles

PubMed

Collections

Taxons: Human

Quotes

PDZBase is a database that aims to contain all known PDZ-domain-mediated protein-protein interactions. Currently, PDZBase contains approximately 300 such interactions, which have been manually extracted from >200 articles. PDZBase currently contains ∼300 interactions, all of which have been manually extracted from the literature, and have been independently verified by two curators. The extracted information comes from in vivo (co-immunoprecipitation) or in vitro experiments (GST-fusion or related pull-down experiments). Interactions identified solely from high throughput methods (e.g. yeast two-hybrid or mass spectrometry) were not included in PDZBase. Other prerequisites for inclusion in the database are: that knowledge of the binding sites on both interacting proteins must be available (for instance through a truncation or mutagenesis experiment); that interactions must be mediated directly by the PDZ-domain, and not by any other possible domain within the protein.

Methods in pypath

Data source (URLs and files)

Data format definition

Interactions


Phobius

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)


Phosphatome

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages


phospho.ELM

Category || Subcategory >>> Literature curated || Ptm

Released in years: 2004, 2007, 2010

Contact:

License: ELM Software License Agreement (ELM)

Webpages

Articles

PubMed

Collections

Quotes

Phospho.ELM http://phospho.elm.eu.org is a new resource containing experimentally verified phosphorylation sites manually curated from the literature and is developed as part of the ELM (Eukaryotic Linear Motif) resource. Phospho.ELM constitutes the largest searchable collection of phosphorylation sites available to the research community. The Phospho.ELM entries store information about substrate proteins with the exact positions of residues known to be phosphorylated by cellular kinases. Additional annotation includes literature references, subcellular compartment, tissue distribution, and information about the signaling pathways involved as well as links to the molecular interaction database MINT. Phospho.ELM version 2.0 contains 1,703 phosphorylation site instances for 556 phosphorylated proteins. (Diella 2004)

Data integration in pypath: dynamic

Methods in pypath

Data source (URLs and files)

Data format definition

Data input methods

Interactions

Enzyme-substrate relationships and PTMs


PhosphoNetworks

Category || Subcategory >>> Undefined || Undefined

License: PhosphoNetworks License

"The content in the database is free to academic and non-profit organizations. Users for commercial purpose please contact the authors before download the data sets."

Webpages


PhosphoPoint

Category || Subcategory >>> Literature curated and prediction || Post-translational modification

Contact:

License: Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

Articles

Webpages

PubMed

Collections

Taxons: Human

Quotes

We have integrated three existing databases, including Phospho.ELM (release 6.0, total 9236 phosphorylation sites), HPRD (release 6, total 8992 phosphorylation sites), SwissProt (release 51.5, total 6529 phosphorylation sites), and our manually curated 400 kinase–substrate pairs, which are primarily from review articles.

Notes

It contains 400 manually curated interactions and many more from HTP methods. The manually curated set cannot be distinguished in the data formats offered.

Data integration in pypath: static

Methods in pypath

Data source (URLs and files)

Data format definition

Data input methods

Miscellaneous


PhosphoSite – PhosphoSitePlus

Category || Subcategory >>> Literature curated and high throughput || Post-translational modification

Released in years: 2011, 2015, 2016

Created by CST

Contact:

License: Creative Commons Attribution-NonCommercial-ShareAlike 3.0 International (CC BY-NC-SA 3.0)

Articles

Webpages

PubMed

Collections

Taxons: Human, Mouse, Eubacteria, Eukarya

Quotes

PSP integrates both low- and high-throughput (LTP and HTP) data sources into a single reliable and comprehensive resource. Nearly 10,000 journal articles, including both LTP and HTP reports, have been manually curated by expert scientists from over 480 different journals since 2001.

Data integration in pypath: dynamic

Methods in pypath

Data source (URLs and files)

Data format definition

Data input methods

Interactions

Miscellaneous

Enzyme-substrate relationships and PTMs


ProtMapper

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages


Ramilowski 2015

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages


REACH

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)


Reactome

Category || Subcategory >>> Literature curated || Process description

Released in years: 2004, 2008, 2010, 2012, 2014, 2015, 2016

Contact:

License: Creative Commons Zero 1.0 Universal (CC0 1.0)

Webpages

Articles

PubMed

Collections

Quotes

Once the content of the module is approved by the author and curation staff, it is peer-reviewed on the development web-site, by one or more bench biologists selected by the curator in consultation with the author. The peer review is open and the reviewers are acknowledged in the database by name. Any issues raised in the review are resolved, and the new module is scheduled for release.

Notes

No binary interactions can be exported programmatically from any format of the Reactome dataset. Reactome's curation method doesn't cover binary interactions; the inferred lists on the webpage are based on automatic expansion of complexes and reactions, and are therefore unreliable. For lack of this information, references cannot be assigned to interactions.
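
To illustrate why lists inferred by expanding complexes should be treated with caution, here is a minimal sketch of one common expansion scheme (all-against-all, often called matrix expansion). It is not Reactome's or pypath's actual procedure; the function name and the five-member complex are hypothetical.

```python
from itertools import combinations


def matrix_expansion(members):
    """Return every unordered pair among the members of one complex.

    This is an illustrative all-against-all expansion, not the
    procedure used by Reactome or pypath.
    """
    unique_members = sorted(set(members))
    return list(combinations(unique_members, 2))


# Hypothetical complex with five members.
pairs = matrix_expansion(["A", "B", "C", "D", "E"])
print(len(pairs))  # 10
```

A complex of n members yields n(n-1)/2 pairs, although the curated record only states that the members occur together, not that each pair binds directly. None of the generated pairs carries its own literature reference.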

Data integration in pypath: dynamic

Methods in pypath

Data source (URLs and files)

Data format definition

Data input methods

Interactions


RegNetwork

Category || Subcategory >>> Undefined || Undefined

License: GNU Lesser General Public License version 3 (LGPLv3)

Webpages


ReMap

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)


RLIMS-P

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)


SignaLink

Category || Subcategory >>> Literature curated || Activity flow

Released in years: 2010, 2012, 2016

Created by NetBiol Group

Contact:

License: Creative Commons Attribution-NonCommercial-ShareAlike 3.0 International (CC BY-NC-SA 3.0)

Webpages

Articles

PubMed

Collections

Taxons: Human, D. melanogaster, C. elegans

Quotes

In each of the three organisms, we first listed signaling proteins and interactions from reviews (and from WormBook in C.elegans) and then added further signaling interactions of the listed proteins. To identify additional interactions in C.elegans, we examined all interactions (except for transcription regulation) of the signaling proteins listed in WormBase and added only those to SignaLink that we could manually identify in the literature as an experimentally verified signaling interaction. For D.melanogaster, we added to SignaLink those genetic interactions from FlyBase that were also reported in at least one yeast-2-hybrid experiment. For humans, we manually checked the reliability and directions for the PPIs found with the search engines iHop and Chilibot.

Notes

For OmniPath we used the literature curated part of version 3 of SignaLink, which is not yet published. Version 2 is publicly available, and format definitions exist in pypath to load version 2 as an alternative.


SIGNOR – Signaling Network Open Resource

Category || Subcategory >>> Literature curated || Activity flow

Released in years: 2015

Created by Cesareni Group

Contact:

License: Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

Webpages

Articles

PubMed

Collections

Direct data import from: SignaLink3, PhosphoSite

Quotes

SIGNOR, the SIGnaling Network Open Resource, organizes and stores in a structured format signaling information published in the scientific literature. The captured information is stored as binary causative relationships between biological entities and can be represented graphically as activity flow. The entire network can be freely downloaded and used to support logic modeling or to interpret high content datasets. The core of this project is a collection of more than 11000 manually-annotated causal relationships between proteins that participate in signal transduction. Each relationship is linked to the literature reporting the experimental evidence. In addition each node is annotated with the chemical inhibitors that modulate its activity. The signaling information is mapped to the human proteome even if the experimental evidence is based on experiments on mammalian model organisms.

Methods in pypath

Data source (URLs and files)

Data format definition

Interactions

Enzyme-substrate relationships and PTMs


Sparser

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)


SPIKE – Signaling Pathway Integrated Knowledge Engine

Category || Subcategory >>> Literature curated || Activity flow

Released in years: 2008, 2011, 2012

Created by Shamir Group, Shiloh Group

Contact:

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Articles

Webpages

PubMed

Collections

Quotes

SPIKE’s data on relationships between entities come from three sources: (i) Highly curated data submitted directly to SPIKE database by SPIKE curators and experts in various biomedical domains. (ii) Data imported from external signaling pathway databases. At present, SPIKE database imports such data from Reactome, KEGG, NetPath and The Transcription Factor Encyclopedia (http://www.cisreg.ca/cgi-bin/tfe/home.pl). (iii) Data on protein–protein interactions (PPIs) imported either directly from wide-scale studies that recorded such interactions [to date, PPI data were imported from Stelzl et al., Rual et al. and Lim et al.] or from external PPI databases [IntAct and MINT]. Relationship data coming from these different sources vary greatly in their quality and this is reflected by a quality level attribute, which is attached to each relationship in SPIKE database (Supplementary Data). Each relationship in SPIKE is linked to at least one PubMed reference that supports it.

Data integration in pypath: static

Methods in pypath

Data format definition


STRING

Category || Subcategory >>> High-throughput and prediction || Interaction

Released in years: 2000, 2003, 2005, 2007, 2009, 2011, 2013, 2015, 2016

Created by Bork Lab

Contact:

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages

Articles

PubMed

Collections


Surfaceome

Category || Subcategory >>> Undefined || Undefined

Contact:

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages


TCDB – Transporter Classification Database

Category || Subcategory >>> Undefined || Undefined

Created by Saier Lab

Contact:

License: Creative Commons Attribution-ShareAlike 3.0 International (CC BY-SA 3.0)

Webpages


TfactS

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages


TFcensus – Census of Human Transcription Factors

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages


TFe – Transcription Factor Encyclopedia

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution-ShareAlike 3.0 International (CC BY-SA 3.0)

Webpages


TLR

Category || Subcategory >>> Literature curated || Model

License: No license (No license)

Articles


TopDB – Topology Data Bank of Transmembrane Proteins

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)

Webpages


TransmiR

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)

Webpages


TRED

Category || Subcategory >>> Undefined || Undefined

License: Nucleic Acids Research Open Access (NAR Open Access)

"The Nucleic Acids Research Database issue requires the databases to be freely usable, hence any resource published in this journal deemed to be free for both commercial and non-profit use."

Webpages


TRIP – Mammalian Transient Receptor Potential Channel-Interacting Protein Database

Category || Subcategory >>> Literature curated || Activity flow

Released in years: 2010, 2012

Contact:

License: Creative Commons Attribution-NonCommercial-ShareAlike 3.0 International (CC BY-NC-SA 3.0)

Articles

Webpages

PubMed

Collections

Taxons: Human, Mouse, Rat

Nodes: 468, Edges: 744

Quotes

The literature on TRP channel PPIs found in the PubMed database serve as the primary information source for constructing the TRIP Database. First, a list of synonyms for the term ‘TRP channels’ was constructed from UniprotKB, Entrez Gene, membrane protein databases (Supplementary Table S2) and published review papers for nomenclature. Second, using these synonyms, a list of articles was obtained through a PubMed search. Third, salient articles were collected through a survey of PubMed abstracts and subsequently by search of full-text papers. Finally, we selected articles that contain evidence for physical binding among the proteins denoted. To prevent omission of relevant papers, we manually screened information in other databases, such as DIP, IntAct, MINT, STRING, BioGRID, Entrez Gene, IUPHAR-DB and ISI Web of Knowledge (from Thomson Reuters). All 277 articles used for database construction are listed in our database website.

Notes

A good manually curated dataset focusing on TRP channel proteins, with ~800 binary interactions. The provided formats are not well suited for bioinformatics use because of the non-standard protein names, which contain Greek letters and formulas only a human can interpret. By processing the HTML of 5-6 different tables with a couple of hundred lines of code, one has a chance of compiling a usable table.
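
One small part of such a cleanup is replacing Greek letters in protein names with ASCII spellings. The sketch below only illustrates the idea under stated assumptions: it is not the pypath implementation, the function and mapping names are hypothetical, and the example names merely show the style of name meant here. Translating the cleaned names to standard identifiers would still be a separate step.

```python
# Partial, illustrative mapping; extend as needed.
GREEK_TO_ASCII = {
    "α": "alpha",
    "β": "beta",
    "γ": "gamma",
    "δ": "delta",
    "ε": "epsilon",
    "ζ": "zeta",
    "η": "eta",
    "θ": "theta",
    "κ": "kappa",
    "λ": "lambda",
    "μ": "mu",
}


def ascii_name(name):
    """Spell out Greek letters in a protein name, leaving other characters as-is."""
    return "".join(GREEK_TO_ASCII.get(char, char) for char in name).strip()


print(ascii_name("PKCα"))  # PKCalpha
print(ascii_name("Gβγ"))   # Gbetagamma
```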

Methods in pypath

Data source (URLs and files)

Data format definition

Data input methods

Interactions


TRRD

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages


TRRUST

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)


UniProt

Category || Subcategory >>> Undefined || Undefined

Contact:

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages


Vesiclepedia

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages


HumanSignalingNetwork – Human Signaling Network version 6

Category || Subcategory >>> Literature curated || Activity flow

Released in years: 2009, 2010, 2011, 2012, 2013, 2014

Created by Wang Group

Contact:

License: Creative Commons Attribution-NonCommercial 3.0 International (CC BY-NC 3.0)

Webpages

Taxons: Human, Mouse, Rat

Direct data import from: Cui2007, BioCarta, CST, NCI-PID, iHOP

Quotes

Composed from multiple manually curated datasets, and contains its own manual curation effort. Methods are unclear, and the dataset has not been published in a peer-reviewed paper. Based on Cui et al. 2007.

Notes

This network aims to merge multiple manually curated networks. Unfortunately a precise description of the sources and methods is missing. Also, the dataset does not include the references. Moreover, the data file lacks a header and a key, so users can only guess at the meaning of columns and values.
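
When a delimited file comes without a header, a quick profile of the distinct values in each column can help with guessing what the columns mean. The following is a minimal sketch assuming a tab-delimited file; the function name and the three-row sample are invented for illustration and do not reproduce the actual data file.

```python
import csv
import io
from collections import Counter


def profile_columns(handle, delimiter="\t"):
    """Count the distinct values in each column of a headerless delimited file."""
    counters = []
    for row in csv.reader(handle, delimiter=delimiter):
        # Grow the list of counters if a row is wider than any seen so far.
        while len(counters) < len(row):
            counters.append(Counter())
        for counter, value in zip(counters, row):
            counter[value] += 1
    return counters


# Invented sample standing in for a headerless file.
sample = io.StringIO("1\t2\tx\n1\t3\ty\n4\t2\tx\n")
for index, counter in enumerate(profile_columns(sample)):
    print(index, len(counter), counter.most_common(3))
```

A column with only a handful of distinct values is likely a categorical code, such as an interaction type, while columns with many distinct values are more likely identifiers. This remains a guess until the authors document the format.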

Data integration in pypath: dynamic

Methods in pypath

Data source (URLs and files)

Data format definition

Interactions


WikiPathways

Category || Subcategory >>> Literature curated || Process description

Released in years: 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016

Contact:

License: Creative Commons Attribution 3.0 International (CC BY 3.0)

Webpages

Articles

Collections

Quotes

The goal of WikiPathways is to capture knowledge about biological pathways (the elements, their interactions and layout) in a form that is both human readable and amenable to computational analysis.


Zaman 2013

Category || Subcategory >>> Literature curated || Activity flow

Created by Wang Lab

Contact:

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Articles

PubMed

Quotes

The human signaling network (Version 4, containing more than 6,000 genes and more than 50,000 relations) includes our previous data obtained from manually curated signaling networks (Awan et al., 2007; Cui et al., 2007; Li et al., 2012) and by PID (http://pid.nci.nih.gov/) and our recent manual curations using the iHOP database (http://www.ihop-net.org/UniPub/iHOP/).


Zhong 2015

Category || Subcategory >>> Undefined || Undefined

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Webpages



Dénes Türei, Nicolàs Palacio, Olga Ivanova, Saez Lab 2016-2020. Feedback: omnipathdb@gmail.com
