Speaker: Dr. Jules SCHLEINITZ

Institution: École Normale Supérieure PSL, Paris

Hosted By: Professor Haibin SU

Co-Host: Professor Zhenyang LIN

Zoom Link: https://hkust.zoom.us/j/95150435343?pwd=Y2g0L3BpRmZWeStVZGVjOHUxalBFUT09

 

Abstract

Synthetic yield prediction using machine learning is intensively studied. Previous work has focused on two categories of data sets: high-throughput experimentation data, as an ideal case study, and data sets extracted from proprietary databases, which are known to have a strong reporting bias toward high yields. However, predicting yields using published reaction data remains elusive. To fill the gap, we built a data set on nickel-catalyzed cross-couplings extracted from organic reaction publications, including scope and optimization information. We demonstrate the importance of including optimization data as a source of failed experiments and emphasize how publication constraints shape the exploration of the chemical space by the synthetic community. While machine learning models still fail to perform out-of-sample predictions, this work shows that adding chemical knowledge enables fair predictions in a low-data regime. Eventually, we hope that this unique public database will foster further improvements of machine learning methods for reaction yield prediction in a more realistic context.

 

About the speaker

Jules Schleinitz completed a bachelor's degree in chemistry and physics, followed by a master's degree in theoretical chemistry, at the École Normale Supérieure in Paris. He was then recruited for a three-year Ph.D. at the École Normale Supérieure under a teaching contract. He will defend his Ph.D. thesis, entitled "Mechanistic Analysis and Machine Learning", in October. In November, he will start a postdoc for Computer Assisted Synthesis in Sarah E. Reisman's group at Caltech.

September 5
2:00pm - 3:30pm
Venue
Online
Organizer
Department of Chemistry
Audience
PG students, Faculty and staff
Language
English