Disease variant prediction with deep generative models of evolutionary data – Nature.com

  • 1.

    Van Hout, C. V. et al. Exome sequencing and characterization of 49,960 individuals in the UK Biobank. Nature 586, 749–756 (2020).

  • 2.

    Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).

  • 3.

    Landrum, M. J. & Kattman, B. L. ClinVar at five years: delivering on the promise. Hum. Mutat. 39, 1623–1630 (2018).

  • 4.

    Raimondi, D. et al. DEOGEN2: prediction and interactive visualization of single amino acid variant deleteriousness in human proteins. Nucleic Acids Res. 45, W201–W206 (2017).

  • 5.

    Feng, B. J. PERCH: a unified framework for disease gene prioritization. Hum. Mutat. 38, 243–251 (2017).

  • 6.

    Ioannidis, N. M. et al. REVEL: an ensemble method for predicting the pathogenicity of rare missense variants. Am. J. Hum. Genet. 99, 877–885 (2016).

  • 7.

    Ionita-Laza, I., McCallum, K., Xu, B. & Buxbaum, J. D. A spectral approach integrating functional genomic annotations for coding and noncoding variants. Nat. Genet. 48, 214–220 (2016).

  • 8.

    Jagadeesh, K. A. et al. M-CAP eliminates a majority of variants of uncertain significance in clinical exomes at high sensitivity. Nat. Genet. 48, 1581–1586 (2016).

  • 9.

    Rentzsch, P., Witten, D., Cooper, G. M., Shendure, J. & Kircher, M. CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 47, D886–D894 (2019).

  • 10.

    Adzhubei, I. A. et al. A method and server for predicting damaging missense mutations. Nat. Methods 7, 248–249 (2010).

  • 11.

    Richards, S. et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet. Med. 17, 405–424 (2015).

  • 12.

    Findlay, G. M. et al. Accurate classification of BRCA1 variants with saturation genome editing. Nature 562, 217–222 (2018).

  • 13.

    Glazer, A. M. et al. High-throughput reclassification of SCN5A variants. Am. J. Hum. Genet. 107, 111–123 (2020).

  • 14.

    Giacomelli, A. O. et al. Mutational processes shape the landscape of TP53 mutations in human cancer. Nat. Genet. 50, 1381–1387 (2018).

  • 15.

    Mighell, T. L., Evans-Dutson, S. & O’Roak, B. J. A saturation mutagenesis approach to understanding PTEN lipid phosphatase activity and genotype–phenotype relationships. Am. J. Hum. Genet. 102, 943–955 (2018).

  • 16.

    Jia, X. et al. Massively parallel functional testing of MSH2 missense variants conferring Lynch syndrome risk. Am. J. Hum. Genet. 108, 163–175 (2021).

  • 17.

    Cao, Y. et al. The ChinaMAP analytics of deep whole genome sequences in 10,588 individuals. Cell Res. 30, 717–731 (2020).

  • 18.

    Gudbjartsson, D. F. et al. Large-scale whole-genome sequencing of the Icelandic population. Nat. Genet. 47, 435–444 (2015).

  • 19.

    Esposito, D. et al. MaveDB: an open-source platform to distribute and interpret data from multiplexed assays of variant effect. Genome Biol. 20, 223 (2019).

  • 20.

    Trenkmann, M. Putting genetic variants to a fitness test. Nat. Rev. Genet. 19, 667 (2018).

  • 21.

    Rehm, H. L. et al. ClinGen—the Clinical Genome Resource. N. Engl. J. Med. 372, 2235–2242 (2015).

  • 22.

    Grimm, D. G. et al. The evaluation of tools used to predict the impact of missense variants is hindered by two types of circularity. Hum. Mutat. 36, 513–523 (2015).

  • 23.

    Hopf, T. A. et al. Mutation effects predicted from sequence co-variation. Nat. Biotechnol. 35, 128–135 (2017).

  • 24.

    Marks, D. S. et al. Protein 3D structure computed from evolutionary sequence variation. PLoS ONE 6, e28766 (2011).

  • 25.

    Hopf, T. A. et al. Sequence co-evolution gives 3D contacts and structures of protein complexes. eLife 3, e03430 (2014).

  • 26.

    Lapedes, A., Giraud, B. & Jarzynski, C. Using sequence alignments to predict protein structure and stability with high accuracy. Preprint at https://arxiv.org/abs/1207.2484v1 (2012).

  • 27.

    Vaser, R., Adusumalli, S., Leng, S. N., Sikic, M. & Ng, P. C. SIFT missense predictions for genomes. Nat. Protoc. 11, 1–9 (2016).

  • 28.

    Reva, B., Antipin, Y. & Sander, C. Predicting the functional impact of protein mutations: application to cancer genomics. Nucleic Acids Res. 39, e118 (2011).

  • 29.

    Rezende, D. J., Mohamed, S. & Wierstra, D. in Proceedings of the 31st International Conference on Machine Learning vol. 32 (eds Xing, E. P. & Jebara, T.) 1278–1286 (PMLR, 2014).

  • 30.

    Kingma, D. P. & Welling, M. Auto-encoding variational Bayes. Preprint at https://arxiv.org/abs/1312.6114 (2013).

  • 31.

    Riesselman, A. J., Ingraham, J. B. & Marks, D. S. Deep generative models of genetic variation capture the effects of mutations. Nat. Methods 15, 816–822 (2018).

  • 32.

    Suzek, B. E., Wang, Y., Huang, H., McGarvey, P. B. & Wu, C. H. UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches. Bioinformatics 31, 926–932 (2015).

  • 33.

    Kalia, S. S. et al. Recommendations for reporting of secondary findings in clinical exome and genome sequencing, 2016 update (ACMG SF v2.0): a policy statement of the American College of Medical Genetics and Genomics. Genet. Med. 19, 249–255 (2017).

  • 34.

    Frigo, G. et al. Homozygous SCN5A mutation in Brugada syndrome with monomorphic ventricular tachycardia and structural heart abnormalities. Europace 9, 391–397 (2007).

  • 35.

    Itoh, H. et al. Asymmetry of parental origin in long QT syndrome: preferential maternal transmission of KCNQ1 variants linked to channel dysfunction. Eur. J. Hum. Genet. 24, 1160–1166 (2016).

  • 36.

    Glazer, A. M. et al. Deep mutational scan of an SCN5A voltage sensor. Circ. Genom. Precis. Med. 13, e002786 (2020).

  • 37.

    Bouvet, D. et al. Methylation tolerance-based functional assay to assess variants of unknown significance in the MLH1 and MSH2 genes and identify patients with Lynch syndrome. Gastroenterology 157, 421–431 (2019).

  • 38.

    Pan, X. et al. Structure of the human voltage-gated sodium channel Nav1.4 in complex with β1. Science 362, eaau2486 (2018).

  • 39.

    Fishel, R. et al. The human mutator gene homolog MSH2 and its association with hereditary nonpolyposis colon cancer. Cell 75, 1027–1038 (1993).

  • 40.

    Peltomaki, P. Role of DNA mismatch repair defects in the pathogenesis of human cancer. J. Clin. Oncol. 21, 1174–1179 (2003).

  • 41.

    Warren, J. J. et al. Structure of the human MutSα DNA lesion recognition complex. Mol. Cell 26, 579–592 (2007).

  • 42.

    Brnich, S. E. et al. Recommendations for application of the functional evidence PS3/BS3 criterion using the ACMG/AMP sequence variant interpretation framework. Genome Med. 12, 3 (2019).

  • 43.

    Lewontin, R. C. The Genetic Basis of Evolutionary Change (Columbia Univ. Press, 1974).

  • 44.

    Kreitman, M. Nucleotide polymorphism at the alcohol dehydrogenase locus of Drosophila melanogaster. Nature 304, 412–417 (1983).

  • 45.

    Sunyaev, S. et al. Prediction of deleterious human alleles. Hum. Mol. Genet. 10, 591–597 (2001).

  • 46.

    IUCN. The IUCN red list of threatened species. IUCN https://www.iucnredlist.org (2020).