
Systems Biology Approaches for Autoimmune Diseases


Autoimmune diseases are complex conditions defined by immune system dysregulation, which leads to abnormal immunological reactions to self-antigens. A systems biology approach is needed to fully understand the underlying processes and the interactions between the many components of the immune system. The four omics disciplines, namely genomics, transcriptomics, proteomics, and metabolomics, have each opened a new path to the study of autoimmune diseases, and integrating data across them is a necessary step toward a thorough understanding of these conditions. New developments in systems biology methods for autoimmune disease, such as single-cell technologies, individualized treatment, machine learning techniques, multi-omics data integration, and computational (in silico) modeling, have broadened the available treatment options. This review describes systems biology and explains its function and significance in the understanding of autoimmune diseases.

Systems biology is an approach to interrogate complex biological systems through large-scale quantification of numerous biomolecules. The immune system involves >1,500 genes/proteins in many interconnected pathways and processes, and a systems-level approach is critical in broadening our understanding of the immune response to vaccination. Changes in molecular pathways can be detected using high-throughput omics datasets (e.g., transcriptomics, proteomics, and metabolomics) by using methods such as pathway enrichment, network analysis, machine learning, etc. Importantly, integration of multiple omic datasets is becoming key to revealing novel biological insights. In this perspective article, we highlight the use of protein-protein interaction (PPI) networks as a multi-omics integration approach to unravel information flow and mechanisms during complex biological events, with a focus on the immune system. This involves a combination of tools, including: InnateDB, a database of curated interactions between genes and protein products involved in the innate immunity; NetworkAnalyst, a visualization and analysis platform for InnateDB interactions; and MetaBridge, a tool to integrate metabolite data into PPI networks. The application of these systems techniques is demonstrated for a variety of biological questions, including: the developmental trajectory of neonates during the first week of life, mechanisms in host-pathogen interaction, disease prognosis, biomarker discovery, and drug discovery and repurposing. Overall, systems biology analyses of omics data have been applied to a variety of immunology-related questions, and here we demonstrate the numerous ways in which PPI network analysis can be a powerful tool in contributing to our understanding of the immune system and the study of vaccines.
The use of multiple omics techniques (i.e., genomics, transcriptomics, proteomics, and metabolomics) is becoming increasingly popular in all facets of life science. Omics techniques provide a more holistic molecular perspective of studied biological systems compared to traditional approaches. However, due to their inherent data differences, integrating multiple omics platforms remains an ongoing challenge for many researchers. As metabolites represent the downstream products of multiple interactions between genes, transcripts, and proteins, metabolomics and the tools and approaches routinely used in this field could assist with the integration of these complex multi-omics data sets. The question is, how? Here we provide some answers (in terms of methods, software tools and databases) along with a variety of recommendations and a list of continuing challenges as identified during a peer session on multi-omics integration that was held at the recent ‘Australian and New Zealand Metabolomics Conference’ (ANZMET 2018) in Auckland, New Zealand (Sept. 2018). We envisage that this document will serve as a guide to metabolomics researchers and other members of the community wishing to perform multi-omics studies. We also believe that these ideas may allow the full promise of integrated multi-omics research and, ultimately, of systems biology to be realized.
Systems biology is an integrative discipline connecting the molecular components within a single biological scale and also among different scales (e.g. cells, tissues and organ systems) to physiological functions and organismal phenotypes through quantitative reasoning, computational models and high-throughput experimental technologies. Systems biology uses a wide range of quantitative experimental and computational methodologies to decode information flow from genes, proteins and other subcellular components of signaling, regulatory and functional pathways to control cell, tissue, organ and organismal level functions. The computational methods used in systems biology provide systems-level insights to understand interactions and dynamics at various scales, within cells, tissues, organs and organisms. In recent years, the systems biology framework has enabled research in quantitative and systems pharmacology and precision medicine for complex diseases. Here, we present a brief overview of current experimental and computational methods used in systems biology.
Ninety per cent of the world's data have been generated in the last 5 years (Machine learning: the power and promise of computers that learn by example Report no. DES4702. Issued April 2017. Royal Society). A small fraction of these data is collected with the aim of validating specific hypotheses. These studies are led by the development of mechanistic models focused on the causality of input-output relationships. However, the vast majority is aimed at supporting statistical or correlation studies that bypass the need for causality and focus exclusively on prediction. Along these lines, there has been a vast increase in the use of machine learning models, in particular in the biomedical and clinical sciences, to try and keep pace with the rate of data generation. Recent successes now beg the question of whether mechanistic models are still relevant in this area. Said otherwise, why should we try to understand the mechanisms of disease progression when we can use machine learning tools to directly predict disease outcome?
Detailed insights into the biological functions of the liver and an understanding of its crosstalk with other human tissues and the gut microbiota can be used to develop novel strategies for the prevention and treatment of liver-associated diseases, including fatty liver disease, cirrhosis, hepatocellular carcinoma and type 2 diabetes mellitus. Biological network models, including metabolic, transcriptional regulatory, protein-protein interaction, signalling and co-expression networks, can provide a scaffold for studying the biological pathways operating in the liver in connection with disease development in a systematic manner. Here, we review studies in which biological network models were used to integrate multiomics data to advance our understanding of the pathophysiological responses of complex liver diseases. We also discuss how this mechanistic approach can contribute to the discovery of potential biomarkers and novel drug targets, which might lead to the design of targeted and improved treatment strategies. Finally, we present a roadmap for the successful integration of models of the liver and other human tissues with the gut microbiota to simulate whole-body metabolic functions in health and disease.
Rheumatoid arthritis is an autoimmune disease that affects several organs and tissues, predominantly the synovial joints. The pathogenesis of this disease is not completely understood, but it may involve genomic variations, gene expression, protein translation and post-translational modifications. These system variations in genomics, transcriptomics and proteomics are dynamic in nature and their crosstalk is overwhelmingly complex, thus analyzing them separately may not be very informative. However, various '-omics' techniques developed in recent years have opened up new possibilities for clarifying disease pathways and thereby facilitating early diagnosis and specific therapies. This review examines how recent advances in the fields of genomics, transcriptomics and proteomics have contributed to our understanding of rheumatoid arthritis.
Proteomics involves the application of technologies for the identification and quantification of the overall protein content of a cell, tissue or organism. It supplements the other “omics” technologies, such as genomics and transcriptomics, to establish the identity of the proteins of an organism and to understand the structure and functions of a particular protein. Proteomics-based technologies are utilized in various capacities for different research settings, such as detection of diagnostic markers, identification of candidates for vaccine production, understanding of pathogenicity mechanisms, alteration of expression patterns in response to different signals, and interpretation of functional protein pathways in different diseases. Proteomics is practically intricate because it includes the analysis and categorization of the overall protein signatures of a genome. Mass spectrometry, with LC-MS/MS and MALDI-TOF/TOF being widely used, is central to current proteomics. However, utilization of proteomics facilities, including the software for equipment, databases and the requirement for skilled personnel, substantially increases the costs and therefore limits their wider use, especially in the developing world. Furthermore, the proteome is highly dynamic because of the complex regulatory systems that control the expression levels of proteins. This review attempts to describe the various proteomics approaches, the recent developments and their application in research and analysis.
Background: Deep learning (DL) is a representation learning approach ideally suited for image analysis challenges in digital pathology (DP). The variety of image analysis tasks in the context of DP includes detection and counting (e.g., mitotic events), segmentation (e.g., nuclei), and tissue classification (e.g., cancerous vs. non-cancerous). Unfortunately, issues with slide preparation, variations in staining and scanning across sites and vendor platforms, as well as biological variance, such as the presentation of different grades of disease, make these image analysis tasks particularly challenging. Traditional approaches, wherein domain-specific cues are manually identified and developed into task-specific “handcrafted” features, can require extensive tuning to accommodate these variances. However, DL takes a more domain agnostic approach combining both feature discovery and implementation to maximally discriminate between the classes of interest. While DL approaches have performed well in a few DP related image analysis tasks, such as detection and tissue classification, the currently available open source tools and tutorials do not provide guidance on challenges such as (a) selecting appropriate magnification, (b) managing errors in annotations in the training (or learning) dataset, and (c) identifying a suitable training set containing information rich exemplars. These foundational concepts, which are needed to successfully translate the DL paradigm to DP tasks, are non-trivial for (i) DL experts with minimal digital histology experience, and (ii) DP and image processing experts with minimal DL experience, to derive on their own, thus meriting a dedicated tutorial. Aims: This paper investigates these concepts through seven unique DP tasks as use cases to elucidate techniques needed to produce results comparable, and in many cases superior, to those from state-of-the-art hand-crafted feature-based classification approaches.
Results: Specifically, in this tutorial on DL for DP image analysis, we show how an open source framework (Caffe), with a singular network architecture, can be used to address: (a) nuclei segmentation (F-score of 0.83 across 12,000 nuclei), (b) epithelium segmentation (F-score of 0.84 across 1735 regions), (c) tubule segmentation (F-score of 0.83 from 795 tubules), (d) lymphocyte detection (F-score of 0.90 across 3064 lymphocytes), (e) mitosis detection (F-score of 0.53 across 550 mitotic events), (f) invasive ductal carcinoma detection (F-score of 0.7648 on 50 k testing patches), and (g) lymphoma classification (classification accuracy of 0.97 across 374 images). Conclusion: This paper represents the largest comprehensive study of DL approaches in DP to date, with over 1200 DP images used during evaluation. The supplemental online material that accompanies this paper consists of step-by-step instructions for the usage of the supplied source code, trained models, and input data.
The measurement of autoantibodies in the clinical care of autoimmune patients allows for diagnosis, monitoring, and even disease prediction. Despite their clinical utility, the functional significance of autoantibody target proteins in many autoimmune diseases remains unclear. Here we present a comprehensive review of 52 autoantigens commonly employed for the serological diagnosis of 24 autoimmune diseases. We discuss their function, whether they have extracellular-exposed epitopes, and whether antibodies to these proteins are known to be pathogenic. Transcriptomics (RNA-Seq) datasets were mined to display messenger RNA (mRNA) expression of the autoantigens across 32 tissues and organs. This analysis revealed that autoantigens cluster into one of three groups: expression in the tissue most strongly affected in the disease (Group I), ubiquitous expression with enrichment in immune tissues (Group II), or expression in other tissues not typically associated with the clinical presentation (Group III). Clustering demonstrated that the autoantigens within Group I were often proteins containing extracellular epitopes, many of which are targets of pathogenic autoantibodies. Group II autoantigens were targets for several rheumatological diseases, including Sjögren syndrome, systemic lupus erythematosus, myositis, and systemic sclerosis, and were ubiquitously expressed with enrichment in immune-rich tissues. This raises the possibility that immune cells in Group II disorders may be the source of autoimmunization and/or targets of immune cell responses. Since tissues showing enriched autoantigen gene expression may contribute to the development of autoantibodies and subsequent autoimmunity, the emergent patterns arising from the autoantigen transcriptomic profiles may provide a new heuristic framework to deconvolute these complex disorders.
When considering the variation in the genome, transcriptome, proteome and metabolome, and their interaction with the environment, every individual can be rightfully considered as a unique biological entity. Individualized medicine promises to take this uniqueness into account to optimize disease treatment and thereby improve health benefits for every patient. The success of individualized medicine relies on a precise understanding of the genotype-phenotype relationship. Although omics technologies advance rapidly, there are several challenges that need to be overcome: Next generation sequencing can efficiently decipher genomic sequences, epigenetic changes, and transcriptomic variation in patients, but it does not automatically indicate how or whether the identified variation will cause pathological changes. This is likely due to the inability to account for (1) the consequences of gene-gene and gene-environment interactions, and (2) (post)transcriptional as well as (post)translational processes that eventually determine the concentration of key metabolites. The technologies to accurately measure changes in these latter layers are still under development, and such measurements in humans are also mainly restricted to blood and circulating cells. Despite these challenges, it is already possible to track dynamic changes in the human interactome in healthy and diseased states by using the integration of multi-omics data. In this review, we evaluate the potential value of current major bioinformatics and systems biology-based approaches, including genome wide association studies, epigenetics, gene regulatory and protein-protein interaction networks, and genome-scale metabolic modeling. Moreover, we address the question whether integrative analysis of personal multi-omics data will help understanding of personal genotype-phenotype relationships.
Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
Along with traditional biomarkers, an empathetic understanding of users can help create new value and also help guide and focus the discovery and development process. Copyright © 2014. Published by Elsevier Ltd.
The rise of systems biology is intertwined with that of genomics, yet their primordial relationship to one another is ill-defined. We discuss how the growth of genomics provided a critical boost to the popularity of systems biology. We describe the parts of genomics that share common areas of interest with systems biology today in the areas of gene expression, network inference, chromatin state analysis, pathway analysis, personalized medicine, and upcoming areas of synergy as genomics continues to expand its scope across all biomedical fields.
Because autoimmune diseases (AIDs) result from a complex combination of genetic and epigenetic factors, as well as an altered immune response to endogenous or exogenous antigens, systems biology approaches have been widely applied. The use of multi-omics approaches, including blood transcriptomics, genomics, epigenetics, proteomics, and metabolomics, not only allows for the discovery of a number of biomarkers but will also provide new directions for further translational AID applications. Systems biology approaches rely on high-throughput techniques with data analysis platforms that leverage the assessment of genes, proteins, and metabolites, and network analysis of the complex biological pathways implicated in specific AID conditions. To facilitate the discovery of validated and qualified biomarkers, better-coordinated multi-omics approaches and standardized translational research, in combination with the skills of biologists, clinicians, engineers, and bioinformaticians, are required.
Recent advances in microscope automation provide new opportunities for high-throughput cell biology, such as image-based screening. Highly complex image analysis tasks often make the implementation of static and predefined processing rules a cumbersome effort. Machine-learning methods, instead, seek to use intrinsic data structure, as well as the expert annotations of biologists, to infer models that can be used to solve versatile data analysis tasks. Here, we explain how machine-learning methods work and what needs to be considered for their successful application in cell biology. We outline how microscopy images can be converted into a data representation suitable for machine learning, and then introduce various state-of-the-art machine-learning algorithms, highlighting recent applications in image-based screening. Our Commentary aims to provide the biologist with a guide to the application of machine learning to microscopy assays and we therefore include extensive discussion on how to optimize experimental workflow as well as the data analysis pipeline.
Drugs are designed for therapy, but medication-related adverse events are common, and risk/benefit analysis is critical for determining clinical use. Rosiglitazone, an efficacious antidiabetic drug, is associated with increased myocardial infarctions (MIs), thus limiting its usage. Because diabetic patients are often prescribed multiple drugs, we searched for usage of a second drug ("drug B") in the Food and Drug Administration's Adverse Event Reporting System (FAERS) that could mitigate the risk of rosiglitazone ("drug A")-associated MI. In FAERS, rosiglitazone usage is associated with increased occurrence of MI, but its combination with exenatide significantly reduces rosiglitazone-associated MI. Clinical data from the Mount Sinai Data Warehouse support the observations from FAERS. Analysis for confounding factors using logistic regression showed that they were not responsible for the observed effect. Using cell biological networks, we predicted that the mitigating effect of exenatide on rosiglitazone-associated MI could occur through clotting regulation. Data we obtained from the db/db mouse model agreed with the network prediction. To determine whether polypharmacology could generally be a basis for adverse event mitigation, we analyzed the FAERS database for other drug combinations wherein drug B reduced serious adverse events reported with drug A usage such as anaphylactic shock and suicidality. This analysis revealed 19,133 combinations that could be further studied. We conclude that this type of crowdsourced approach of using databases like FAERS can help to identify drugs that could potentially be repurposed for mitigation of serious adverse events.
Transcriptomics is one of the most developed fields in the post-genomic era. The transcriptome is the complete set of RNA transcripts in a specific cell type or tissue at a certain developmental stage and/or under a specific physiological condition, including messenger RNA, transfer RNA, ribosomal RNA, and other non-coding RNAs. Transcriptomics focuses on gene expression at the RNA level and offers genome-wide information on gene structure and gene function in order to reveal the molecular mechanisms involved in specific biological processes. With the development of next-generation high-throughput sequencing technology, transcriptome analysis has been progressively improving our understanding of RNA-based gene regulatory networks. Here, we discuss the concept, history, and especially the recent advances in this inspiring field of study.
Neuroinformatics is the field that merges the power of computational analysis with neuroscience. It is a discipline that has evolved from the original use of computers for data organization to the current development and application of sophisticated computational tools for large-scale data and image management, analysis and modeling of brain function in health and disease. Neuroinformatics has the potential to be a powerful instrument in the discovery of biological markers of neurological diseases, as well as in the development of new and more effective therapies. Owing to the exponential growth in size and complexity of the information available in the neurosciences, neuroinformatic methods are becoming indispensable in modern neurological research. We predict that, in the near future, they will also be essential at the bedside.
Cancer is associated with mutated genes, and analysis of tumour-linked genetic alterations is increasingly used for diagnostic, prognostic and treatment purposes. The genetic profile of solid tumours is currently obtained from surgical or biopsy specimens; however, the latter procedure cannot always be performed routinely owing to its invasive nature. Information acquired from a single biopsy provides a spatially and temporally limited snap-shot of a tumour and might fail to reflect its heterogeneity. Tumour cells release circulating free DNA (cfDNA) into the blood, but the majority of circulating DNA is often not of cancerous origin, and detection of cancer-associated alleles in the blood has long been impossible to achieve. Technological advances have overcome these restrictions, making it possible to identify both genetic and epigenetic aberrations. A liquid biopsy, or blood sample, can provide the genetic landscape of all cancerous lesions (primary and metastases) as well as offering the opportunity to systematically track genomic evolution. This Review will explore how tumour-associated mutations detectable in the blood can be used in the clinic after diagnosis, including the assessment of prognosis, early detection of disease recurrence, and as surrogates for traditional biopsies with the purpose of predicting response to treatments and the development of acquired resistance.
Faced with unsustainable costs and enormous amounts of under-utilized data, health care needs more efficient practices, research, and tools to harness the full benefits of personal health and healthcare-related data. Imagine visiting your physician's office with a list of concerns and questions. What if you could walk out of the office with a personalized assessment of your health? What if you could have a personalized disease management and wellness plan? These are the goals and vision of the work discussed in this paper. The timing is right for such a research direction, given the changes in health care, reimbursement, reform, meaningful use of electronic health care data, and the patient-centered outcome mandate. We present the foundations of work that takes a Big Data driven approach towards personalized healthcare, and demonstrate its applicability to patient-centered outcomes, meaningful use, and reducing re-admission rates.
Autoimmune diseases (ADs) are chronic conditions initiated by the loss of immunological tolerance to self-antigens and represent a heterogeneous group of disorders that afflict specific target organs or multiple organ systems [1]. The chronic nature of these diseases places a significant burden on the utilization of medical care, direct and indirect economic costs, and quality of life. The fact that ADs share several clinical signs and symptoms (i.e., subphenotypes), physiopathological mechanisms, and genetic factors has been called autoimmune tautology and indicates that they have common mechanisms [2–8].
Autoimmune diseases have a complex etiology and despite great progress having been made in our comprehension of these disorders, there has been limited success in the development of approved medications based on these insights. Development of drugs and strategies for application in translational research and medicine are hampered by an inadequate molecular definition of the human autoimmune phenotype and the organizational models that are necessary to clarify this definition.
Invasive nontyphoid Salmonella (iNTS) disease is common and severe in adults with human immunodeficiency virus (HIV) infection in Africa. We previously observed that ex vivo macrophages from HIV-infected subjects challenged with Salmonella Typhimurium exhibit dysregulated proinflammatory cytokine responses. We studied the transcriptional response in whole blood from HIV-positive patients during acute and convalescent iNTS disease compared to other invasive bacterial diseases, and to HIV-positive and -negative controls. During iNTS disease, there was a remarkable lack of a coordinated inflammatory or innate immune signaling response. Few interferon γ (IFNγ)-induced genes or Toll-like receptor/transcription factor nuclear factor κB (TLR/NFκB) gene pathways were upregulated in expression. Ex vivo lipopolysaccharide (LPS) or flagellin stimulation of whole blood, however, showed that convalescent iNTS subjects and controls were competent to mount prominent TLR/NFκB-associated patterns of mRNA expression. In contrast, HIV-positive patients with other invasive bacterial infections (Escherichia coli and Streptococcus pneumoniae) displayed a pronounced proinflammatory innate immune transcriptional response. There was also upregulated mRNA expression in cell cycle, DNA replication, translation and repair, and viral replication pathways during iNTS. These patterns persisted for up to 2 months into convalescence. Attenuation of NFκB-mediated inflammation and dysregulation of cell cycle and DNA-function gene pathway expression are key features of the interplay between iNTS and HIV.
Selective immunoglobulin A deficiency (IgAD) is the most common primary immunodeficiency in Caucasians. It has previously been suggested to be associated with a variety of concomitant autoimmune diseases. In this review, we present data on the prevalence of IgAD in patients with Graves disease (GD), systemic lupus erythematosus (SLE), type 1 diabetes (T1D), celiac disease (CD), myasthenia gravis (MG) and rheumatoid arthritis (RA) on the basis of both our own recent large-scale screening results and literature data. Genetic factors are important for the development of both IgAD and various autoimmune disorders, including GD, SLE, T1D, CD, MG and RA, and a strong association with the major histocompatibility complex (MHC) region has been reported. In addition, non-MHC genes, such as interferon-induced helicase 1 (IFIH1) and c-type lectin domain family 16, member A (CLEC16A), are also associated with the development of IgAD and some of the above diseases. This indicates a possible common genetic background. In this review, we present suggestive evidence for a shared genetic predisposition between these disorders.
Alopecia areata is an autoimmune disease that results in non-scarring hair loss, and it is clinically characterised by small patches of baldness on the scalp and/or around the body. It can later progress to total loss of scalp hair (Alopecia totalis) and/or total loss of all body hair (Alopecia universalis). The rapid rate of hair loss and the disfigurement caused by the condition cause anxiety in patients and increase the risk of developing psychological and psychiatric complications. Hair loss in alopecia areata is caused by lymphocytic infiltrations around the hair follicles and IFN-γ. IgG antibodies against the hair follicle cells are also found in alopecia areata sufferers. In addition, the disease coexists with other autoimmune disorders and can come secondary to infections or inflammation. However, despite the growing knowledge about alopecia areata, the aetiology and pathophysiology of the disease are not well defined. In this review we discuss various genetic and environmental factors that cause autoimmunity and describe the immune mechanisms that lead to hair loss in alopecia areata patients.
Multiple sclerosis (MS) is a neuroinflammatory disorder characterized by autoimmune-mediated inflammatory lesions in the CNS leading to myelin damage and axonal loss. MS is a heterogeneous disease with a variable and unpredictable disease course. Due to its complex nature, MS is difficult to diagnose and responses to specific treatments may vary between individuals. Therefore, there is an indisputable need for biomarkers for early diagnosis, prediction of disease exacerbations, monitoring the progression of disease, and for measuring responses to therapy. Genomic and proteomic studies have sought to understand the molecular basis of MS and find biomarker candidates. Advances in next-generation sequencing and mass-spectrometry techniques have yielded an unprecedented amount of genomic and proteomic data; yet, translation of the results into the clinic has been underwhelming. This has prompted the development of novel data science techniques for exploring these large datasets to identify biologically relevant relationships and ultimately point towards useful biomarkers. Herein we discuss optimization of omics study designs, advances in the generation of omics data, and systems biology approaches aimed at improving biomarker discovery and translation to the clinic for MS.
Machine learning, a collection of data-analytical techniques aimed at building predictive models from multi-dimensional datasets, is becoming integral to modern biological research. By enabling one to generate models that learn from large datasets and make predictions on likely outcomes, machine learning can be used to study complex cellular systems such as biological networks. Here, we provide a primer on machine learning for life scientists, including an introduction to deep learning. We discuss opportunities and challenges at the intersection of machine learning and network biology, which could impact disease biology, drug discovery, microbiome research, and synthetic biology. Machine-learning approaches are essential for pulling information out of the vast datasets that are being collected across biology and biomedicine. This Review considers the opportunities and challenges at the intersection of network biology and data science.
There have been major advances in our knowledge of the contribution of DNA sequence variations to cardiovascular disease and stroke. However, the inner workings of the body reflect the complex interplay of factors beyond the DNA sequence, including epigenetic modifications, RNA transcripts, proteins, and metabolites, which together can be considered the "expressed genome." The emergence of high-throughput technologies, including epigenomics, transcriptomics, proteomics, and metabolomics, is now making it possible to address the contributions of the expressed genome to cardiovascular disorders. This statement describes how the expressed genome can currently and, in the future, potentially be used to diagnose diseases and to predict who will develop diseases such as coronary artery disease, stroke, heart failure, and arrhythmias.
Efforts to understand autoimmunity have been pursued relentlessly for several decades. It has become apparent that the immune system evolved multiple mechanisms for controlling self-reactivity, and defects in one or more of these mechanisms can lead to a breakdown of tolerance. Among the multitude of lesions associated with disease, the most common seem to affect peripheral tolerance rather than central tolerance. The initial trigger for both systemic autoimmune disorders and organ-specific autoimmune disorders probably involves the recognition of self or foreign molecules, especially nucleic acids, by innate sensors. Such recognition, in turn, triggers inflammatory responses and the engagement of previously quiescent autoreactive T cells and B cells. Here we summarize the most prominent autoimmune pathways and identify key issues that require resolution for full understanding of pathogenic autoimmunity.
Metabolomics generates a profile of small molecules that are derived from cellular metabolism and can directly reflect the outcome of complex networks of biochemical reactions, thus providing insights into multiple aspects of cellular physiology. Technological advances have enabled rapid and increasingly expansive data acquisition with samples as small as single cells; however, substantial challenges in the field remain. In this primer we provide an overview of metabolomics, especially mass spectrometry (MS)-based metabolomics, which uses liquid chromatography (LC) for separation, and discuss its utilities and limitations. We identify and discuss several areas at the frontier of metabolomics. Our goal is to give the reader a sense of what might be accomplished when conducting a metabolomics experiment, now and in the near future.
Transcriptomics, the high-throughput characterization of RNAs, has been instrumental in defining pathogenic signatures in human autoimmunity and autoinflammation. It enabled the identification of new therapeutic targets in IFN-, IL-1- and IL-17-mediated diseases. Applied to immunomonitoring, transcriptomics is starting to unravel diagnostic and prognostic signatures that stratify patients, track molecular changes associated with disease activity, define personalized treatment strategies, and generally inform clinical practice. Herein, we review the use of transcriptomics to define mechanistic, diagnostic, and predictive signatures in human autoimmunity and autoinflammation. We discuss some of the analytical approaches applied to extract biological knowledge from high-dimensional data sets. Finally, we touch upon emerging applications of transcriptomics to study eQTLs, B and T cell repertoire diversity, and isoform usage. Expected final online publication date for the Annual Review of Immunology Volume 35 is April 26, 2017.
Personalized medicine encompasses a broad and evolving field informed by a patient's distinctive information and biomarker profile. Although terminology is evolving and some semantic interpretations exist (e.g., personalized, individualized, precision), in a broad sense personalized medicine can be coined as: "To practice medicine as it once used to be in the past using the current biotechnological tools." A humanized approach to personalized medicine would offer the possibility of exploiting systems biology and its concept of P5 medicine, where predictive factors for developing a disease should be examined within populations in order to establish preventive measures for at-risk individuals, for whom healthcare should be personalized and participatory. Herein, the process of personalized medicine is presented together with the options that can be offered in health care systems with limited resources for diseases like rheumatoid arthritis and type 1 diabetes.
Metabolomics is a rapidly growing field consisting of the analysis of a large number of metabolites at a system scale. The two major goals of metabolomics are the identification of the metabolites characterizing each organism state and the measurement of their dynamics under different situations (e.g. pathological conditions, environmental factors). Knowledge about metabolites is crucial for the understanding of most cellular phenomena, but this information alone is not sufficient to gain a comprehensive view of all the biological processes involved. Integrated approaches combining metabolomics with transcriptomics and proteomics are thus required to obtain much deeper insights than any of these techniques alone. Although this information is available, multilevel integration of different 'omics' data is still a challenge. The handling, processing, analysis and integration of these data require specialized mathematical, statistical and bioinformatics tools, and several technical problems still hamper rapid progress in the field. Here, we review four main tools, selected by number of users or provided features (MetaCore(TM), MetaboAnalyst, InCroMAP and 3Omics), out of the several available for metabolomic data analysis and integration with other 'omics' data, highlighting their strong and weak aspects; a number of related issues affecting data analysis and integration are also identified and discussed. Overall, we provide an objective description of how some of the main currently available software packages work, which may help the experimental practitioner in the choice of a robust pipeline for metabolomic data analysis and integration.
This chapter reviews the evidence that associates epigenetic mechanisms with the toxic and carcinogenic effects of certain environmental agents. Many of these chemicals have been shown to modify the same or similar epigenetic marks found in patients with disease states associated with that agent. For example, the increase in global DNA hypomethylation levels and promoter specific hypermethylation of tumor suppressor genes in response to arsenic exposure is consistent with the observed promoter specific hypermethylation and widespread loss in DNA methylation levels exhibited by cancer cells. Exposure to environmental factors such as metals, the semi-metal selenium, peroxisome proliferators, radiation, particulate matter, tobacco smoke, benzene, endocrine disruptors, and polycyclic aromatic hydrocarbons, perturb global and gene-specific DNA methylation levels as well as histone posttranslational modifications. Although the mechanism(s) by which these chemicals perturb the epigenome is unknown, many studies suggest an alteration in the expression and/or activity of enzymes that modify DNA and histone tails such as DNA methyltransferases and histone methyltransferases, deacetylases, and demethylases. Therefore, the evidence emphasizes the importance of applying to toxicological research the understanding that genotoxic mechanisms are not the sole mechanism underlying the changes in gene expression leading to cancer and other disease states.
In this paper, we provide an introduction to machine learning tasks that address important problems in genomic medicine. One of the goals of genomic medicine is to determine how variations in the DNA of individuals can affect the risk of different diseases, and to find causal explanations so that targeted therapies can be designed. Here we focus on how machine learning can help to model the relationship between DNA and the quantities of key molecules in the cell, with the premise that these quantities, which we refer to as cell variables, may be associated with disease risks. Modern biology allows high-throughput measurement of many such cell variables, including gene expression, splicing, and proteins binding to nucleic acids, which can all be treated as training targets for predictive models. With the growing availability of large-scale data sets and advanced computational techniques such as deep learning, researchers can help to usher in a new era of effective genomic medicine.
In this review, we discuss whether novel techniques in mass spectrometry, from ultrahigh resolution detection to data-independent MS/MS and ion mobility methods, have advanced so far that selectivity and sensitivity of untargeted analyses have indeed reached a point at which hypothesis-driven validation studies can be conducted by accurate mass profiling methods rather than classic triple-quadrupole multiple-reaction monitoring. To this end, we reviewed original MS-based metabolomics and lipidomics papers published mainly in years 2012–2015 with focus on sample extraction, LC separation, MS detection, and data processing.
Spurred by advances in processing power, memory, storage, and an unprecedented wealth of data, computers are being asked to tackle increasingly complex learning tasks, often with astonishing success. Computers have now mastered a popular variant of poker, learned the laws of physics from experimental data, and become experts in video games - tasks that would have been deemed impossible not too long ago. In parallel, the number of companies centered on applying complex data analysis to varying industries has exploded, and it is thus unsurprising that some analytic companies are turning attention to problems in health care. The purpose of this review is to explore what problems in medicine might benefit from such learning approaches and use examples from the literature to introduce basic concepts in machine learning. It is important to note that seemingly large enough medical data sets and adequate learning algorithms have been available for many decades, and yet, although there are thousands of papers applying machine learning algorithms to medical data, very few have contributed meaningfully to clinical care. This lack of impact stands in stark contrast to the enormous relevance of machine learning to many other industries. Thus, part of my effort will be to identify what obstacles there may be to changing the practice of medicine through statistical learning approaches, and discuss how these might be overcome.
Multiple sclerosis (MS) is a chronic inflammatory disease of the central nervous system and common cause of non-traumatic neurological disability in young adults. The likelihood for an individual to develop MS is strongly influenced by her or his ethnic background and family history of disease, suggesting that genetic susceptibility is a key determinant of risk. Over 100 loci have been firmly associated with susceptibility, whereas the main signal genome-wide maps to the class II region of the human leukocyte antigen (HLA) gene cluster and explains up to 10.5% of the genetic variance underlying risk. HLA-DRB1*15:01 has the strongest effect with an average odds ratio of 3.08. However, complex allelic hierarchical lineages, cis/trans haplotypic effects, and independent protective signals in the class I region of the locus have been described as well. Despite the remarkable molecular dissection of the HLA region in MS, further studies are needed to generate unifying models to account for the role of the MHC in disease pathogenesis. Driven by the discovery of combinatorial associations of Killer-cell Immunoglobulin-like Receptor (KIR) and HLA alleles with infectious, autoimmune diseases, transplantation outcome and pregnancy, multi-locus immunogenomic research is now thriving. Central to immunity and critically important for human health, KIR molecules and their HLA ligands are encoded by complex genetic systems with extraordinarily high levels of sequence and structural variation and complex expression patterns. However, studies to-date of KIR in MS have been few and limited to very low resolution genotyping. Application of modern sequencing methodologies coupled with state of the art bioinformatics and analytical approaches will permit us to fully appreciate the impact of HLA and KIR variation in MS.
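For readers unfamiliar with the odds ratio quoted in the abstract above, the standard definition from a 2×2 case-control table can be written as follows. The cell counts a, b, c and d are generic symbols introduced here for illustration only (cases carrying the allele, cases not carrying it, controls carrying it, controls not carrying it); they are not taken from the cited study.

```latex
\[
\mathrm{OR} \;=\; \frac{a/b}{c/d} \;=\; \frac{a\,d}{b\,c}
\]
```

Under this definition, an odds ratio of about 3 means that the odds of carrying the allele are roughly three times higher among cases than among controls.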
Rheumatoid arthritis (RA) is a chronic, inflammatory joint disease that mainly attacks synovial joints. However, the underlying systematic relationship among the different genes and biological processes involved in its pathogenesis is still unclear. By analyzing and comparing the transcriptional profiles of RA patients, OA (osteoarthritis) patients and ND (normal donors) with bioinformatics methods, we aimed to uncover the potential molecular networks and critical genes that play important roles in RA and OA development. Initially, hierarchical clustering was performed to classify the overall transcriptional profiles. Differentially Expressed Genes (DEGs) between ND and RA or OA patients were identified. Furthermore, PPI networks were constructed, functional modules were extracted, and functional annotation was applied. Our functional analysis identifies 22 biological processes and 2 KEGG pathways enriched in the commonly regulated gene set. However, we found that the number of genes differentially expressed only between RA and ND reaches 244, indicating that this gene set may specifically account for progression to RA. Additionally, 142 biological processes and 19 KEGG pathways are over-represented by these 244 genes. Meanwhile, although another 21 genes are differentially expressed only between OA and ND, no biological process or pathway is over-represented by them.
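The comparison described in the abstract above (genes differentially expressed in both RA and OA versus ND, only in RA, or only in OA) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the gene names and expression values are toy data, the function names `call_degs` and `partition_degs` are hypothetical, and a plain log2 fold-change cutoff stands in for whatever statistical test the cited study actually used.

```python
import math
from statistics import mean


def log2_fold_change(case, control, pseudocount=1.0):
    """log2 ratio of mean expression (case over control), with a pseudocount."""
    return math.log2((mean(case) + pseudocount) / (mean(control) + pseudocount))


def call_degs(expr, case, control, min_abs_lfc=1.0):
    """Return genes whose |log2 fold change| between two groups meets the cutoff.

    expr maps gene -> {group label -> list of expression values}.
    A fold-change cutoff alone is a simplification (assumption); a real
    analysis would add a statistical test and multiple-testing correction.
    """
    return {
        gene
        for gene, groups in expr.items()
        if abs(log2_fold_change(groups[case], groups[control])) >= min_abs_lfc
    }


def partition_degs(ra_degs, oa_degs):
    """Split two DEG sets into shared, RA-specific and OA-specific genes."""
    return {
        "common": ra_degs & oa_degs,
        "ra_only": ra_degs - oa_degs,
        "oa_only": oa_degs - ra_degs,
    }


# Toy expression table (hypothetical genes and values, not real data).
expr = {
    "GENE_A": {"ND": [10, 12, 11], "RA": [40, 44, 48], "OA": [42, 39, 45]},
    "GENE_B": {"ND": [20, 22, 21], "RA": [90, 85, 95], "OA": [21, 19, 23]},
    "GENE_C": {"ND": [50, 52, 48], "RA": [49, 51, 50], "OA": [10, 12, 11]},
    "GENE_D": {"ND": [30, 31, 29], "RA": [30, 32, 28], "OA": [31, 30, 29]},
}

ra_degs = call_degs(expr, case="RA", control="ND")
oa_degs = call_degs(expr, case="OA", control="ND")
parts = partition_degs(ra_degs, oa_degs)
```

Each of the three resulting gene sets could then be passed separately to an enrichment tool, which is how a disease-specific set can show over-represented pathways while another set shows none.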
Regulatory T (Treg) cells play a vital role in the prevention of autoimmunity and the maintenance of self-tolerance, but these cells also have an active role in inhibiting immune responses during viral, bacterial, and parasitic infections. Although excessive Treg activity can lead to immunodeficiency, chronic infection, and cancer, too little Treg activity results in autoimmunity and immunopathology and impairs the quality of pathogen-specific responses. Recent studies have helped define the homeostatic mechanisms that support the diverse pool of peripheral Treg cells under steady-state conditions and delineate how the abundance and function of Treg cells changes during inflammation. These findings are highly relevant for developing effective strategies to manipulate Treg cell activity to promote allograft tolerance and treat autoimmunity, chronic infection, and cancer.
We are currently witnessing the advent of a revolutionary new tool for biomedical research. Complex biochemically, biophysically and pharmacologically detailed mathematical models of ‘living cells’ are being arranged in morphologically representative tissue assemblies, and, using large-scale supercomputers, utilized to produce anatomically structured models of integrated tissue and organ function. This provides biomedical sciences with a radical new tool: ‘in silico’ organs, organ systems and, ultimately, organisms. In silico models will be a crucial tool for biomedical research and development in the new millennium, extracting knowledge from the vast amount of increasingly detailed data, and integrating this into a comprehensive analytical description of biological function with predictive power: the Physiome. Our review will illustrate this approach using the example of the cardiovascular system, which, along with neurophysiology, has been at the forefront of analytical bio-mathematical modelling for many years and which is about to deliver the first anatomico-physiological model of a whole organ. Already, electrophysiologically detailed cardiac cell models have been incorporated into mathematical descriptions of representative ventricular tissue architecture and anatomy, including the coronary vasculature, and assimilated to realistic representation of ventricular active and passive mechanical properties. This is being extended by matching atrial models and linked to an artificial torso to compute the body surface electrocardiogram as a function of sub-cellular activity during various (patho-)physiological conditions. We will illustrate the utility of in silico biological research in the context of refinement and partial replacement of in vivo and in vitro experimental work, show the potential of this approach for devising patient-specific treatment strategies, and try to forecast the impact of this new technology on biomedical research, health-care, and related industries.
Multisystem autoimmune rheumatic diseases are heterogeneous rare disorders associated with substantial morbidity and mortality. Efforts to create international consensus within the past decade have resulted in the publication of new classification or nomenclature criteria for several autoimmune rheumatic diseases, specifically for systemic lupus erythematosus, Sjögren's syndrome, and the systemic vasculitides. Substantial progress has been made in the formulation of new criteria in systemic sclerosis and idiopathic inflammatory myositis. Although the autoimmune rheumatic diseases share many common features and clinical presentations, differentiation between the diseases is crucial because of important distinctions in clinical course, appropriate drugs, and prognoses. We review some of the dilemmas in the diagnosis of these autoimmune rheumatic diseases, and focus on the importance of new classification criteria, clinical assessment, and interpretation of autoimmune serology. In this era of improvement of mortality rates for patients with autoimmune rheumatic diseases, we pay particular attention to the effect of leading complications, specifically cardiovascular manifestations and cancer, and we update epidemiology and prognosis.
Genome-wide analyses and high-throughput screening were long reserved for biomedical applications and genetic model organisms. With the rapid development of massively parallel sequencing nanotechnology (or next-generation sequencing) and simultaneous maturation of bioinformatic tools, this situation has dramatically changed. Genome-wide thinking is forging its way into disciplines like evolutionary biology or molecular ecology that were historically confined to small-scale genetic approaches. Accessibility to genome-scale information is transforming these fields, as it allows us to answer long-standing questions like the genetic basis of local adaptation and speciation or the evolution of gene expression profiles that until recently were out of reach. Many in the eco-evolutionary sciences will be working with large-scale genomic data sets, and a basic understanding of the concepts and underlying methods is necessary to judge the work of others. Here, I briefly introduce next-generation sequencing and then focus on transcriptome shotgun sequencing (RNA-seq). This article gives a broad overview and provides practical guidance for the many steps involved in a typical RNA-seq work flow from sampling, to RNA extraction, library preparation and data analysis. I focus on principles, present useful tools where appropriate and point out where caution is needed or progress is to be expected. This tutorial is mostly targeted at beginners, but also contains potentially useful reflections for the more experienced.
Biosensors are a cunning combination of biological molecules and microelectronics that can be used to measure blood glucose levels, pollutants in the environment or food-borne pathogens in the food supply. In a comprehensive TechView, Anthony Turner takes us on a tour of historical developments and the latest innovations in biosensor research.
Systems pharmacology approaches can be used to identify and predict drug-induced adverse events. Disease-centered networks within the human interactome allow us to predict which drugs may produce a similar pathophysiology. Such predictions can be tested in animal models.
Autoimmune diseases (AIDs) are believed to be multifactorial diseases that commonly involve multiple organ systems. About three fourths of the patients afflicted with AIDs are women, suggesting that sex differences impact the incidence of AID. However, the proportion of females to males suffering from AID varies depending on the disease. The response to some AID therapeutics also differs in females versus males, suggesting that enrollment of adequate numbers of women and men is important in clinical trials for development of AID drugs. It has long been known that genetic factors are important contributors to AID susceptibility. Currently available information suggests that multiple genes with modest association to AID contribute to susceptibility to AID. Also, the associations may differ for the various ethnicities. The major histocompatibility (MHC) locus appears to be a major genetic factor that confers susceptibility to multiple AIDs, even though the locus is complex and has the highest density of genes in the human genome. Thus, the association of different AIDs could be with different genes in the MHC locus. Among the non-MHC genes, some of the risk alleles are shared between different AIDs, but may not be common to all AIDs. For example, genetic polymorphisms in the Protein Tyrosine Phosphatase-22 (PTPN22) gene have reproducibly been shown to be associated with systemic lupus erythematosus (SLE), Graves' disease (GD), rheumatoid arthritis (RA) and multiple sclerosis (MS), but not with psoriasis. Identification of the factors responsible for the risk of developing AID and of the pathways underlying these diseases is likely to help understand subsets of disease, identify responders to a specific treatment and develop better therapeutics for AID.
CD56(bright) NK cells, which may play a role in immunoregulation, are expanded in multiple sclerosis (MS) patients treated with immunomodulatory therapies such as daclizumab and interferon-beta (IFNβ). Yet, whether this NK cell subset is directly involved in the therapeutic effect is unknown. As NK receptor (NKR) expression by subsets of NK cells and CD8+ T lymphocytes is related to MS clinical course, we addressed whether CD56(bright) NK cells and NKR in IFNβ-treated MS patients differ according to the clinical response. IFNβ was associated with lower LILRB1+ and KIR+ NK cell proportions and higher NKG2A+ NK cell proportions, an immunophenotypic pattern mainly found in responders. After IFNβ treatment, a CD56(bright) NK cell expansion was significantly related to a positive clinical response. Our results reveal that, in responders, IFNβ may promote changes in the NK cell immunophenotype corresponding to the profile found at early maturation stages of this lymphocyte lineage.