

High-Throughput Omics (HTO) experimental platforms, which include protein, Single-Nucleotide Polymorphism (SNP) and gene expression microarrays, Genome-Wide Association Studies (GWAS) and Next-Generation Sequencing (NGS), can investigate thousands of genes simultaneously in a single experiment. In addition, experimental techniques such as yeast two-hybrid (Y2H) and mass spectrometry (MS) are commonly used to detect interacting proteins. HTO platforms have promoted the holistic view that both complex and common diseases are due to the interactions of several mutated genes or proteins, in contrast with the older view that a disease is due to the mutation of a single gene.

High-throughput technologies are producing an increasing volume of data that requires large amounts of data storage, effective data models and efficient, possibly parallel, analysis algorithms. Pathway and interactomics data are represented as graphs and add a new dimension of analysis, allowing, among other features, graph-based comparison of organisms’ properties. Biological networks such as Protein–Protein Interaction (PPI) networks represent the biochemical interactions among proteins by using nodes that model the proteins of a given organism and edges that model the protein–protein interactions, whereas pathway networks represent the biochemical-reaction cascades that take place within cells or tissues. For instance, in a biological pathway representation, the nodes can represent proteins, RNA and lipid molecules, while the edges represent the interactions between molecules. In this paper, we discuss the main models for the standard representation of pathways and PPI networks, the data models for the representation and exchange of pathway and protein interaction data, the main databases in which they are stored, and the alignment algorithms for the comparison of pathways and PPI networks of different organisms. We have identified that network alignment presents many open problems worthy of further investigation, especially concerning pathway alignment. Finally, we discuss the challenges and the limitations of pathway and PPI network representation and analysis.
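To make the graph representation concrete, the following is a minimal sketch of a PPI network as an undirected graph stored as an adjacency map. The protein identifiers (`P1`, `P2`, `P3`) are hypothetical placeholders, not entries from any real interaction database:

```python
from collections import defaultdict

class PPINetwork:
    """Undirected graph: nodes are proteins, edges are interactions."""

    def __init__(self):
        # Each protein maps to the set of its interaction partners.
        self.adj = defaultdict(set)

    def add_interaction(self, protein_a, protein_b):
        """Record an undirected protein-protein interaction."""
        self.adj[protein_a].add(protein_b)
        self.adj[protein_b].add(protein_a)

    def neighbors(self, protein):
        """Return the set of proteins interacting with the given one."""
        return set(self.adj[protein])

    def degree(self, protein):
        """Return the number of distinct interaction partners."""
        return len(self.adj[protein])

# Example with three hypothetical interactions:
net = PPINetwork()
net.add_interaction("P1", "P2")
net.add_interaction("P1", "P3")
net.add_interaction("P2", "P3")
print(net.degree("P1"))             # prints 2
print(sorted(net.neighbors("P1")))  # prints ['P2', 'P3']
```

Because the graph is undirected, each interaction is stored from both endpoints; topological features such as node degree, which many alignment algorithms use as a similarity signal, then fall out directly from the adjacency map.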
