Scientists Tijana Radivojevic (left) and Hector Garcia Martin working on mechanistic and statistical modeling, data visualizations, and metabolic maps at the Agile BioFoundry last year.
Source: Thor Swift/Berkeley Lab

Machine learning takes on synthetic biology


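If you have eaten a vegan burger that tastes like meat or used synthetic collagen in your beauty routine (both products that are "grown" in the lab), then you have benefited from synthetic biology. It is a field rife with potential, because it allows scientists to design biological systems to specification, such as engineering a microbe to produce a cancer-fighting agent. Yet conventional methods of bioengineering are slow and laborious, with trial and error being the main approach.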

Now scientists at the Department of Energy's Lawrence Berkeley National Laboratory (Berkeley Lab) have developed a new tool that adapts machine learning algorithms to the needs of synthetic biology to guide development systematically. The innovation means scientists will not have to spend years developing a meticulous understanding of each part of a cell and what it does in order to manipulate it; instead, with a limited set of training data, the algorithms are able to predict how changes in a cell's DNA or biochemistry will affect its behavior, then make recommendations for the next engineering cycle along with probabilistic predictions for attaining the desired goal.

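"The possibilities are revolutionary," said Hector Garcia Martin, a researcher in Berkeley Lab's Biological Systems and Engineering (BSE) Division who led the research. "Right now, bioengineering is a very slow process. It took 150 person-years to create the anti-malarial drug artemisinin. If you're able to create new cells to specification in a couple of weeks or months instead of years, you could really revolutionize what you can do with bioengineering."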

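Working with BSE data scientist Tijana Radivojevic and an international group of researchers, Garcia Martin developed and demonstrated a patent-pending algorithm called the Automated Recommendation Tool (ART). Machine learning allows computers to make predictions after "learning" from substantial amounts of available "training" data.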


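In "Combining mechanistic and machine learning models for predictive engineering and optimization of tryptophan metabolism," the team used ART to guide the metabolic engineering process to increase the production of tryptophan, an amino acid with various uses, by a species of yeast called Saccharomyces cerevisiae, or baker's yeast. The project was led by Jie Zhang of the Novo Nordisk Foundation Center for Biosustainability at the Technical University of Denmark, in collaboration with scientists at Berkeley Lab and Teselagen, a San Francisco-based startup company.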


Then, using statistical inference, the tool was able to extrapolate how each of the remaining 7,000-plus combinations would affect tryptophan production. The design it ultimately recommended increased tryptophan production by 106% over the state-of-the-art reference strain and by 17% over the best designs used for training the model.
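To make the idea of extrapolating from a few tested designs more concrete, here is a minimal Python sketch. It is not ART's actual implementation: it swaps in a simple bootstrap ensemble of additive (main-effects) models as a stand-in, and every name in it (`build_toy_data`, `fit_additive`, `recommend`) is hypothetical. The data are synthetic, and the six options per gene and the 200 training designs are arbitrary assumptions chosen only to give a design space of a few thousand combinations.

```python
import itertools
import random
import statistics

N_GENES = 5    # the article mentions five genes
N_LEVELS = 6   # hypothetical number of expression options per gene


def build_toy_data(n_train=200, seed=1):
    """Create synthetic (design, yield) pairs; none of this is real measurement."""
    rng = random.Random(seed)
    # Hypothetical "true" contribution of each option of each gene.
    effect = {(g, lvl): rng.gauss(0.0, 1.0)
              for g in range(N_GENES) for lvl in range(N_LEVELS)}
    designs = list(itertools.product(range(N_LEVELS), repeat=N_GENES))
    tested = rng.sample(designs, n_train)
    train = [(d, 10.0 + sum(effect[(g, lvl)] for g, lvl in enumerate(d))
              + rng.gauss(0.0, 0.3)) for d in tested]
    tested_set = set(tested)
    untested = [d for d in designs if d not in tested_set]
    return train, untested


def fit_additive(samples):
    """Fit overall mean plus one offset per (gene, option) pair."""
    overall = statistics.fmean(y for _, y in samples)
    offsets = {}
    for gene in range(N_GENES):
        for level in range(N_LEVELS):
            ys = [y for design, y in samples if design[gene] == level]
            # Options missing from this resample get a zero offset.
            offsets[(gene, level)] = (statistics.fmean(ys) - overall) if ys else 0.0
    return overall, offsets


def predict(model, design):
    overall, offsets = model
    return overall + sum(offsets[(g, lvl)] for g, lvl in enumerate(design))


def recommend(train, candidates, n_models=30, top_k=5, seed=0):
    """Rank untested designs by ensemble mean; also report ensemble spread."""
    rng = random.Random(seed)
    # Each model is fitted on a bootstrap resample of the training data.
    models = [fit_additive(rng.choices(train, k=len(train)))
              for _ in range(n_models)]
    scored = []
    for design in candidates:
        preds = [predict(m, design) for m in models]
        scored.append((statistics.fmean(preds), statistics.stdev(preds), design))
    scored.sort(reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    train, untested = build_toy_data()
    for mean, spread, design in recommend(train, untested):
        print(design, f"predicted {mean:.2f} +/- {spread:.2f}")
```

The spread across ensemble members is a crude stand-in for the probabilistic predictions the article describes; a real tool would likely use a richer model and a principled uncertainty estimate.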

"This is a clear demonstration that bioengineering led by machine learning is feasible, and disruptive if scalable. We did it for five genes, but we believe it could be done for the full genome," said Garcia Martin, who is a member of the Agile BioFoundry and also the Director of the Quantitative Metabolic Modeling team at the Joint BioEnergy Institute (JBEI), a DOE Bioenergy Research Center; both supported a portion of this work. "This is just the beginning. With this, we've shown that there's an alternative way of doing metabolic engineering. Algorithms can automatically perform the routine parts of research while you devote your time to the more creative parts of the scientific endeavor: deciding on the important questions, designing the experiments, and consolidating the obtained knowledge."

More data needed

The researchers say they were surprised by how little data was needed to obtain results. Yet to truly realize synthetic biology's potential, they say the algorithms will need to be trained with much more data. Garcia Martin describes synthetic biology as being only in its infancy—the equivalent of where the Industrial Revolution was in the 1790s. "It's only by investing in automation and high-throughput technologies that you'll be able to leverage the data needed to really revolutionize bioengineering," he said.

Radivojevic added: "We provided the methodology and a demonstration on a small dataset; potential applications might be revolutionary given access to large amounts of data."

The unique capabilities of national labs

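Besides the dearth of experimental data, Garcia Martin says the other limitation is human capital, namely machine learning experts. Given the explosion of data in our world today, many fields and companies are competing for a limited number of experts in machine learning and artificial intelligence.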

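Garcia Martin notes that knowledge of biology is not an absolute prerequisite if one is surrounded by the team environment provided by the national labs. Radivojevic, for example, has a doctorate in applied mathematics and no background in biology. "In two years here, she was able to work productively with our multidisciplinary team of biologists, engineers, and computer scientists and make a difference in the synthetic biology field," he said. "In the traditional ways of doing metabolic engineering, she would have had to spend five or six years just learning the needed biological knowledge before even starting her own independent experiments."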

"The national labs provide the environment where specialization and standardization can prosper and combine in the large multidisciplinary teams that are their hallmark," Garcia Martin said.


"If we could automate metabolic engineering, we could strive for more audacious goals. We could engineer microbiomes for therapeutic or bioremediation purposes. We could engineer microbiomes in our gut to produce drugs to treat autism, for example, or microbiomes in the environment that convert waste to biofuels," Garcia Martin said. "The combination of machine learning and CRISPR-based gene editing enables much more efficient convergence to desired specifications."

The research was published in the journal Nature Communications.

