AlphaFold

Figure: Amino acids fold to form a protein (three individual polypeptide chains at different levels of folding, and a cluster of chains).

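AlphaFold is a protein structure prediction program developed by DeepMind, a subsidiary of Google, which is itself part of Alphabet.[1] The program is designed as a deep learning system.[2]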

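The AlphaFold AI has two major versions: AlphaFold 1 (2018) and AlphaFold 2 (2020). A team using AlphaFold 1 placed first in the rankings of the 13th CASP (Critical Assessment of protein Structure Prediction) in December 2018. The program was particularly successful at producing the most accurate structures for the targets that the competition organisers rated as most difficult, for which no existing template structures were available from proteins with partially similar sequences.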

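Proteins coil and fold into three-dimensional structures, and a protein's function is determined by its structure. Understanding protein structures helps in developing drugs to treat diseases.[3] According to DeepMind, AlphaFold can identify the shape of a protein within days, whereas researchers previously often needed years to do so.[4] In November 2020, at the 14th CASP competition,[5] AlphaFold 2 performed well, achieving a median score of 92.4 out of 100.[6] Its accuracy was far higher than that of any other program.[7]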

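On 15 July 2021, the AlphaFold 2 paper was published in Nature as an advance access publication, together with open-source software and a searchable database of species proteomes.[8][9][10]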

The protein folding problem

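A protein consists of a chain of amino acids (its primary structure), which spontaneously folds, in the process known as protein folding, into the protein's tertiary structure. This structure is crucial to the protein's biological function. However, understanding how the amino acid sequence determines the tertiary structure is extremely challenging, and this is known as the "protein folding problem".[11] The protein folding problem covers the thermodynamics of the interatomic forces that stabilise the folded structure, the mechanism and pathway by which a protein reaches its final folded state so quickly, and how to predict a protein's native structure from its amino acid sequence.[12]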

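Protein structures have historically been determined experimentally using techniques such as X-ray crystallography, cryo-electron microscopy and nuclear magnetic resonance, which are both expensive and time-consuming.[11]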

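Over the past 60 years, such efforts have determined the structures of only about 170,000 proteins, while more than 200 million proteins are known across all forms of life.[13]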

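If protein structures could be predicted from the amino acid sequence alone, it would greatly advance scientific research. However, Levinthal's paradox shows that although a protein can fold within milliseconds, the time needed to compute all possible structures at random in order to find the true native structure would be longer than the age of the known universe. This has made protein structure prediction a grand challenge in biology.[11]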
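The scale of Levinthal's paradox can be illustrated with a back-of-the-envelope calculation. The sketch below is not taken from the cited sources: the chain length, the number of conformations per residue and the sampling rate are round, illustrative assumptions of the kind usually used for this estimate.

```python
# Illustrative assumptions (not from the article): a 100-residue chain,
# 3 possible conformations per residue, 10^13 conformations tried per second.
RESIDUES = 100
STATES_PER_RESIDUE = 3
SAMPLES_PER_SECOND = 1e13

SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.38e10  # approximate age of the universe


def exhaustive_search_years(residues, states_per_residue, samples_per_second):
    """Years needed to visit every conformation once at a fixed sampling rate."""
    conformations = states_per_residue ** residues  # exact Python integer
    return conformations / samples_per_second / SECONDS_PER_YEAR


years = exhaustive_search_years(RESIDUES, STATES_PER_RESIDUE, SAMPLES_PER_SECOND)
print(f"conformations to try: {STATES_PER_RESIDUE ** RESIDUES:.2e}")
print(f"exhaustive search:    {years:.2e} years")
print(f"multiples of the age of the universe: {years / AGE_OF_UNIVERSE_YEARS:.2e}")
```

Under these assumptions the search would take on the order of 10^27 years, which is why random enumeration cannot be how real proteins fold and why it is not a practical prediction strategy.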

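Over the years, researchers have applied many computational methods to the protein structure prediction problem, but except for small, simple proteins their accuracy remained far below that of experimental techniques, which limited their value for scientific research.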

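CASP was launched in 1994 to challenge the scientific community to produce its best protein structure predictions. As of 2016, predictions for the most difficult proteins reached GDT scores of only about 40 out of 100.[13]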
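To make the GDT score concrete, the following is a minimal sketch of the GDT_TS variant, which averages the fraction of residues lying within 1, 2, 4 and 8 Å of their experimental positions. It assumes the predicted and experimental coordinates are already superimposed; the real measure also searches over superpositions, which this sketch does not attempt. The function name `gdt_ts` and the sample coordinates are hypothetical.

```python
import math

# Distance cutoffs used by GDT_TS, in angstroms.
THRESHOLDS_ANGSTROM = (1.0, 2.0, 4.0, 8.0)


def gdt_ts(predicted, reference):
    """GDT_TS in [0, 100] for two equal-length lists of (x, y, z) positions.

    Assumption: the structures are already superimposed. The full measure
    additionally optimises the superposition for each cutoff.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in THRESHOLDS_ANGSTROM
    ]
    return 100.0 * sum(fractions) / len(THRESHOLDS_ANGSTROM)


# Made-up example: four residues displaced by 0.5, 1.5, 3.0 and 10.0 angstroms.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (1.0, 1.5, 0.0), (2.0, 3.0, 0.0), (3.0, 10.0, 0.0)]
print(gdt_ts(predicted, reference))  # 56.25
```

In the example, the four residues contribute 1, 0.75, 0.5 and 0 points respectively, giving 2.25 out of 4, or 56.25 out of 100.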

In 2018, AlphaFold entered CASP using deep learning, an artificial intelligence technique.[11]

Algorithm

AlphaFold Protein Structure Database

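The AlphaFold Protein Structure Database was launched on 22 July 2021 as a joint effort between AlphaFold and EMBL-EBI (the European Bioinformatics Institute of the European Molecular Biology Laboratory). It provides open access to more than 200 million protein structure predictions in order to accelerate scientific research. At launch, the database contained AlphaFold-predicted structure models for the nearly complete UniProt proteomes of humans and 20 model organisms, totalling more than 365,000 proteins. The database does not include proteins with fewer than 16 or more than 2,700 amino acid residues,[69] but for humans these are available in the whole batch file.[70]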
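The length limits stated above can be expressed as a simple inclusive range check. This is only an illustration of the rule as described in the text; the function name `included_at_launch` is hypothetical and is not part of any AlphaFold tool or API.

```python
# Length limits as stated in the text: proteins with fewer than 16 or more
# than 2,700 residues were excluded, so 16 and 2,700 themselves are allowed.
MIN_RESIDUES = 16
MAX_RESIDUES = 2700


def included_at_launch(sequence: str) -> bool:
    """Return True if a one-letter amino acid sequence falls within the limits."""
    return MIN_RESIDUES <= len(sequence) <= MAX_RESIDUES


for length in (15, 16, 2700, 2701):
    print(length, included_at_launch("A" * length))
```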

AlphaFold aims to cover most of the set of 100 million proteins in UniRef90. As of 15 May 2022, 992,316 predictions were available.[71]

Applications

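AlphaFold has been used to predict the structures of proteins of SARS-CoV-2, the causative agent of COVID-19. The structures of these proteins were still awaiting experimental determination in early 2020.[72] Before the results were released to the wider research community, they were examined by scientists at the Francis Crick Institute in the United Kingdom. The team also confirmed that its prediction was accurate against the experimentally determined SARS-CoV-2 spike protein, which had been shared in the Protein Data Bank, an international open-access database, and then released the computationally determined structures of the under-studied protein molecules.[73]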

See also

References

  1. AlphaFold. Deepmind. Retrieved 2020-11-30. Archived from the original on 2021-01-19.
  2. DeepMind's protein-folding AI has solved a 50-year-old grand challenge of biology. MIT Technology Review. Retrieved 2020-11-30. Archived from the original on 2021-08-28.
  3. DeepMind称AI能精确预测蛋白折叠 将加速药物设计 [DeepMind says AI can accurately predict protein folding and will accelerate drug design]. Yicai (第一財經).
  4. DeepMind宣布能够预测蛋白质结构 [DeepMind announces it can predict protein structures]. FT Chinese (金融時報中文網). Retrieved 2020-12-03. Archived from the original on 2020-12-22.
  5. Shead, Sam. DeepMind solves 50-year-old 'grand challenge' with protein folding A.I. CNBC. 2020-11-30. Retrieved 2020-11-30. Archived from the original on 2021-01-28.
  6. “阿尔法折叠”精准预测蛋白质三维结构 ["AlphaFold" accurately predicts the three-dimensional structure of proteins]. Science and Technology Daily (科技日報). Retrieved 2020-12-03. Archived from the original on 2020-12-05.
  7. DeepMind's protein-folding AI has solved a 50-year-old grand challenge of biology. MIT Technology Review. Retrieved 2020-11-30. Archived from the original on 2021-08-28.
  8. Jumper, John; Evans, Richard; Pritzel, Alexander; Green, Tim; Figurnov, Michael; Ronneberger, Olaf; Tunyasuvunakool, Kathryn; Bates, Russ; Žídek, Augustin; Potapenko, Anna; Bridgland, Alex; Meyer, Clemens; Kohl, Simon A A; Ballard, Andrew J; Cowie, Andrew; Romera-Paredes, Bernardino; Nikolov, Stanislav; Jain, Rishub; Adler, Jonas; Back, Trevor; Petersen, Stig; Reiman, David; Clancy, Ellen; Zielinski, Michal; Steinegger, Martin; Pacholska, Michalina; Berghammer, Tamas; Bodenstein, Sebastian; Silver, David; Vinyals, Oriol; Senior, Andrew W; Kavukcuoglu, Koray; Kohli, Pushmeet; Hassabis, Demis. Highly accurate protein structure prediction with AlphaFold. Nature. 2021-07-15, 596 (7873): 583–589. PMC 8371605. PMID 34265844. doi:10.1038/s41586-021-03819-2.
  9. GitHub - deepmind/alphafold: Open source code for AlphaFold. GitHub. Retrieved 2021-07-24. Archived from the original on 2021-07-23.
  10. AlphaFold Protein Structure Database. alphafold.ebi.ac.uk. Retrieved 2021-07-24. Archived from the original on 2021-07-24.
  11. AlphaFold: Using AI for scientific discovery. Deepmind. Retrieved 2020-11-30. Archived from the original on 2022-03-07.
  12. Ken A. Dill, S. Banu Ozkan, M. Scott Shell, and Thomas R. Weikl. The Protein Folding Problem. Annual Review of Biophysics. 2008, 37: 289–316. PMC 2443096. PMID 18573083. doi:10.1146/annurev.biophys.37.092707.153558.
  13. Robert F. Service, 'The game has changed.' AI triumphs at solving protein structures, Science, 30 November 2020
  14. AlphaFold: a solution to a 50-year-old grand challenge in biology. Deepmind. Retrieved 2020-11-30. Archived from the original on 2020-11-30.
  15. Mohammed AlQuraishi (May 2019), AlphaFold at CASP13, Bioinformatics, 35(22), 4862–4865 doi:10.1093/bioinformatics/btz422. See also Mohammed AlQuraishi (December 9, 2018), AlphaFold @ CASP13: "What just happened?" (blog post).
    Mohammed AlQuraishi (15 January 2020), A watershed moment for protein structure prediction, Nature 577, 627–628 doi:10.1038/d41586-019-03951-0
  16. AlphaFold: Machine learning for protein structure prediction, Foldit, 31 January 2020
  17. Torrisi, Mirko et al. (22 Jan. 2020), Deep learning methods in protein structure prediction. Computational and Structural Biotechnology Journal vol. 18 1301–1310. doi:10.1016/j.csbj.2019.12.011 (CC-BY-4.0)
  18. DeepMind is answering one of biology's biggest challenges. The Economist. 2020-11-30. Retrieved 2020-11-30. ISSN 0013-0613. Archived from the original on 2020-12-03.
  19. Jeremy Kahn, Lessons from DeepMind's breakthrough in protein-folding A.I., Fortune, 1 December 2020
  20. John Jumper et al., conference abstract (December 2020)
  21. See block diagram. Also John Jumper et al. (1 December 2020), AlphaFold 2 presentation, slide 10
  22. The structure module is stated to use a "3-d equivariant transformer architecture" (John Jumper et al. (1 December 2020), AlphaFold 2 presentation, slide 12).
    One design for a transformer network with SE(3)-equivariance was proposed in Fabian Fuchs et al, SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks, NeurIPS 2020; also website. It is not known how similar this may or may not be to what was used in AlphaFold.
    See also the blog post by AlQuraishi on this, or the more detailed post by Fabian Fuchs
  23. John Jumper et al. (1 December 2020), AlphaFold 2 presentation, slides 12 to 20
  24. Callaway, Ewen. What's next for AlphaFold and the AI protein-folding revolution. Nature. 2022-04-13, 604 (7905): 234–238. Retrieved 2022-04-15. doi:10.1038/d41586-022-00997-5. Archived from the original on 2022-07-26.
  25. Group performance based on combined z-scores, CASP 13, December 2018. (AlphaFold = Team 043: A7D)
  26. Sample, Ian. Google's DeepMind predicts 3D shapes of proteins. The Guardian. 2018-12-02. Retrieved 2020-11-30. Archived from the original on 2019-07-18.
  27. AlphaFold: Using AI for scientific discovery. Deepmind. Retrieved 2020-11-30.
  28. Singh, Arunima. Deep learning 3D structures. Nature Methods. 2020, 17 (3): 249. ISSN 1548-7105. PMID 32132733. S2CID 212403708. doi:10.1038/s41592-020-0779-y.
  29. See CASP 13 data tables for 043 A7D, 322 Zhang, and 089 MULTICOM
  30. Wei Zheng et al, Deep-learning contact-map guided protein structure prediction in CASP13, Proteins: Structure, Function, and Bioinformatics, 87(12) 1149–1164 doi:10.1002/prot.25792; and slides
  31. Hou, Jie; Wu, Tianqi; Cao, Renzhi; Cheng, Jianlin. Protein tertiary structure modeling driven by deep learning and contact distance prediction in CASP13. Proteins: Structure, Function, and Bioinformatics (Wiley). 2019-04-25, 87 (12): 1165–1178. ISSN 0887-3585. PMC 6800999. PMID 30985027. bioRxiv 10.1101/552422. doi:10.1002/prot.25697.
  32. DeepMind Breakthrough Helps to Solve How Diseases Invade Cells. Bloomberg.com. 2020-11-30. Retrieved 2020-11-30. Archived from the original on 2022-04-05.
  33. deepmind/deepmind-research. GitHub. Retrieved 2020-11-30. Archived from the original on 2022-02-01.
  34. DeepMind's protein-folding AI has solved a 50-year-old grand challenge of biology. MIT Technology Review. Retrieved 2020-11-30. Archived from the original on 2021-08-28.
  35. Mohammed AlQuraishi, CASP14 scores just came out and they’re astounding, Twitter, 30 November 2020.
  36. For the GDT_TS measure used, each atom in the prediction scores a quarter of a point if it is within 8 Å (0.80 nm) of the experimental position; half a point if it is within 4 Å, three-quarters of a point if it is within 2 Å, and a whole point if it is within 1 Å.
  37. To achieve a GDT_TS score of 92.5, mathematically at least 70% of the structure must be accurate to within 1 Å, and at least 85% must be accurate to within 2 Å.
  38. Callaway, Ewen. 'It will change everything': DeepMind's AI makes gigantic leap in solving protein structures. Nature. 2020-11-30, 588 (7837): 203–204. Bibcode:2020Natur.588..203C. PMID 33257889. doi:10.1038/d41586-020-03348-4.
  39. Artificial intelligence solution to a 50-year-old science challenge could 'revolutionise' medical research (press release), CASP organising committee, 30 November 2020
  40. Brigitte Nerlich, Protein folding and science communication: Between hype and humility, University of Nottingham blog, 4 December 2020
  41. Michael Le Page, DeepMind's AI biologist can decipher secrets of the machinery of life, New Scientist, 30 November 2020
  42. The predictions of DeepMind’s latest AI could revolutionise medicine, New Scientist, 2 December 2020
  43. Cade Metz, London A.I. Lab Claims Breakthrough That Could Accelerate Drug Discovery, New York Times, 30 November 2020
  44. Ian Sample, DeepMind AI cracks 50-year-old problem of protein folding, The Guardian, 30 November 2020
  45. Lizzie Roberts, 'Once in a generation advance' as Google AI researchers crack 50-year-old biological challenge. Daily Telegraph, 30 November 2020
  46. Nuño Dominguez, La inteligencia artificial arrasa en uno de los problemas más importantes de la biología (Artificial intelligence takes out one of the most important problems in biology), El País, 2 December 2020
  47. Jeremy Kahn, In a major scientific breakthrough, A.I. predicts the exact shape of proteins, Fortune, 30 November 2020
  48. Julia Merlot, Forscher hoffen auf Durchbruch für die Medikamentenforschung (Researchers hope for a breakthrough for drug research), Der Spiegel, 2 December 2020
  49. Bissan Al-Lazikani, The solving of a biological mystery, The Spectator, 1 December 2020
  50. Tom Whipple, "Deepmind computer solves new puzzle: life", The Times, 1 December 2020. Front page image, via Twitter.
  51. Tom Whipple, Deepmind finds biology’s 'holy grail' with answer to protein problem, The Times (online), 30 November 2020.
    In all, science editor Tom Whipple wrote six articles on the subject for The Times on the day the news broke (thread).
  52. Tim Hubbard, The secret of life, part 2: the solution of the protein folding problem, medium.com, 30 November 2020
  53. Christian Stöcker, Google greift nach dem Leben selbst (Google is reaching for life itself), Der Spiegel, 6 December 2020
  54. John Jumper et al. (1 December 2020), AlphaFold 2. Presentation given at CASP 14.
  55. Carlos Outeiral, CASP14: what Google DeepMind’s AlphaFold 2 really achieved, and what it means for protein folding, biology and bioinformatics, Oxford Protein Informatics Group. (3 December)
  56. Aled Edwards, The AlphaFold2 success: It took a village, via medium.com, 5 December 2020
  57. David Briggs, If Google’s Alphafold2 really has solved the protein folding problem, they need to show their working, The Skeptic, 4 December 2020
  58. The Guardian view on DeepMind’s brain: the shape of things to come, The Guardian, 6 December 2020
  59. Demis Hassabis, "Brief update on some exciting progress on #AlphaFold!" (tweet), via Twitter, 18 June 2021
  60. Tom Ireland, How will AlphaFold change bioscience research?, The Biologist, 4 December 2020
  61. Stephen Curry, No, DeepMind has not solved protein folding, Reciprocal Space (blog), 2 December 2020
  62. Derek Lowe, In the Pipeline: What’s Crucial And What Isn’t, Science Translational Medicine, 25 September 2019
  63. Philip Ball, Behind the Screens of AlphaFold, Chemistry World, 9 December 2020. See also tweets, 1 December
  64. Derek Lowe, In the Pipeline: The Big Problems, Science Translational Medicine, 1 December 2020
  65. Bagdonas, Haroldas; Fogarty, Carl A.; Fadda, Elisa; Agirre, Jon. The case for post-predictional modifications in the AlphaFold Protein Structure Database. Nature Structural & Molecular Biology. 2021-10-29, 28 (11): 869–870. Retrieved 2022-07-29. ISSN 1545-9985. PMID 34716446. S2CID 240228913. doi:10.1038/s41594-021-00680-9. Archived from the original on 2022-06-23.
  66. e.g. Greg Bowman, Protein folding and related problems remain unsolved despite AlphaFold's advance, Folding@home blog, 8 December 2020
  67. Cristina Sáez, El último avance fundamental de la biología se basa en la investigación de un científico español, La Vanguardia, 2 December 2020. (Alfonso Valencia overall view)
  68. Zero Gravitas and Jacky Liang, DeepMind’s AlphaFold 2—An Impressive Advance With Hyperbolic Coverage, Skynet today (blog), Stanford, 9 December 2020
  69. AlphaFold Protein Structure Database. alphafold.ebi.ac.uk. Retrieved 2021-07-29. Archived from the original on 2022-07-29.
  70. AlphaFold Protein Structure Database. alphafold.ebi.ac.uk. Retrieved 2021-07-27. Archived from the original on 2022-07-29.
  71. AlphaFold Protein Structure Database. www.alphafold.ebi.ac.uk. Retrieved 2022-07-29. Archived from the original on 2022-08-02.
  72. AI Can Help Scientists Find a Covid-19 Vaccine. Wired. Retrieved 2020-12-01. ISSN 1059-1028. Archived from the original on 2022-04-23.
  73. Computational predictions of protein structures associated with COVID-19. Deepmind. Retrieved 2020-12-01. Archived from the original on 2022-03-25.

External links

AlphaFold (2018)

AlphaFold 2 (2020)