Recent publications


Machine Learning for Molecules and Materials NeurIPS 2018 Workshop

DEFactor: Differentiable Edge Factorization-based Probabilistic Graph Generation

Rim Assouel, Mohamed Ahmed, Marwin H Segler and Amir Saffari (BenevolentAI), Yoshua Bengio (MILA)


Generating novel molecules with optimal properties is a crucial step in many industries, such as drug discovery. Recently, deep generative models have shown promise for de novo molecular design. Although graph generative models are available, they either have a number of parameters that depends on graph size, which limits their use to very small graphs, or are formulated as a sequence of discrete actions needed to construct a graph, which makes the output graph non-differentiable with respect to the model parameters and therefore prevents their use in scenarios such as conditional graph generation. In this work we propose a model for conditional graph generation that is computationally efficient and enables direct optimisation of the graph. We demonstrate favourable performance of our model on prototype-based molecular graph conditional generation tasks.

Read more

Machine Learning in Health Workshop, NeurIPS 2018

Adjusting for Confounding in Unsupervised Latent Representations of Images

Craig A. Glastonbury (BenevolentAI), Michael Ferlaino, Christoffer Nellåker and Cecilia M. Lindgren (Big Data Institute, University of Oxford)

Biological imaging data are often partially confounded or contain unwanted variability. Examples of such phenomena include variable lighting across microscopy image captures, stain intensity variation in histological slides, and batch effects for high throughput drug screening assays. Therefore, to develop "fair" models which generalise well to unseen examples, it is crucial to learn data representations that are insensitive to nuisance factors of variation. In this paper, we present a strategy based on adversarial training, capable of learning unsupervised representations invariant to confounders. As an empirical validation of our method, we use deep convolutional autoencoders to learn unbiased cellular representations from microscopy imaging.

Read more

Machine Learning in Health Workshop, NeurIPS 2018

Interpretable Graph Convolutional Neural Networks for Inference on Noisy Knowledge Graphs

Daniel Neil, Joss Briody, Alix Lacoste, Aaron Sim, Paidi Creed, Amir Saffari

In this work, we provide a new formulation for Graph Convolutional Neural Networks (GCNNs) for link prediction on graph data that addresses common challenges for biomedical knowledge graphs (KGs). We introduce a regularized attention mechanism to GCNNs that not only improves performance on clean datasets, but also favorably accommodates noise in KGs, a pervasive issue in real-world applications. Further, we explore new visualization methods for interpretable modelling and illustrate how the learned representation can be exploited to automate dataset denoising. The results are demonstrated on a synthetic dataset, the common benchmark dataset FB15k-237, and a large biomedical knowledge graph derived from a combination of noisy and clean data sources. Using these improvements, we visualize a learned model's representation of the disease cystic fibrosis and demonstrate how to interrogate a neural network to show the potential of PPARG as a candidate therapeutic target for rheumatoid arthritis.

Read more



Future Medicinal Chemistry, 13 Aug 2018

Artificial intelligence in drug discovery

Matthew A Sellwood, Mohamed Ahmed, Marwin HS Segler & Nathan Brown

There has been a great deal of hype surrounding the resurgence of Artificial Intelligence and Machine Learning. This commentary, published in Future Medicinal Chemistry, gives a brief overview of the AI and ML domains and their relevance to different aspects of drug discovery and, importantly, reflects on managing expectations from different quarters. The key themes covered are molecular design approaches (including our recent paper on de novo design models), predictive modelling, synthesis planning, and closing the feedback loop to learn from our decisions.

Read more


British Medical Journal, 7 June 2018

Clinical trial design and dissemination: comprehensive analysis of clinicaltrials.gov and PubMed data since 2005

Magdalena Zwierzyna, Mark Davies, Aroon D Hingorani, Jackie Hunter

Magdalena Zwierzyna: Lessons on trial design and transparency from ClinicalTrials.gov. ClinicalTrials.gov is the world’s largest primary registry of clinical studies. For almost two decades it has been helping physicians, patients, and regulators identify relevant trials and collect evidence. It also offers a unique opportunity to explore, examine, and monitor the clinical research landscape. In our recent research paper, we used the registry data to conduct a comprehensive large-scale analysis of registered clinical trials and investigate trends in their design and transparency.

Read more  |  BMJ Opinion


Progress in Medicinal Chemistry, Volume 57, Elsevier, 10 April 2018

Chapter Five - Big Data in Drug Discovery

Nathan Brown, Jean Cambruzzi, Peter J. Cox, Mark Davies, James Dunbar, Dean Plumbley, Matthew A. Sellwood, Aaron Sim, Bryn I. Williams-Jones, Magdalena Zwierzyna, David W. Sheppard

Modern scientific discovery is driven by data and by learning from those data. This book chapter offers an overview of the data sources relevant to drug discovery and of how they inform our research and predictions, supporting better-informed decisions and changing how we approach discovery research so that drugs progress to the clinic more rapidly.

Read more



Nature Chemistry, 4 April 2018

Organic synthesis provides opportunities to transform drug discovery

Ian Churcher et al

Ian Churcher, VP Drug Discovery, recently published a paper in Nature Chemistry highlighting how organic synthesis could represent an opportunity for the pharmaceutical industry to improve drug development. He presents the current challenges that the industry needs to overcome and explains how new technologies and industry-academia collaborations are essential to progress.

Read more  |  Blog



Nature, 28 March 2018

Planning chemical syntheses with deep neural networks and symbolic AI

Marwin Segler et al

The AI technology developed by Marwin uses deep neural networks to learn from essentially all published organic chemistry reactions (12.4 million of them). Combined with modern tree search algorithms, this makes it possible to plan the synthesis of novel molecules. The technology helps chemists make molecules faster, and increases the success rate of synthetic chemistry and the speed and efficiency of drug development in general.

Read more


OpenReview, ICLR 2018, 27 March 2018

Exploring deep recurrent models with reinforcement learning for molecule design

Daniel Neil, Marwin Segler, Laura Guasch, Mohamed Ahmed, Dean Plumbley, Matthew Sellwood, Nathan Brown

The essence of molecular design is to effectively fulfill a molecular property profile that is desirable in a drug. In this paper we consider a number of different generative models for the design of new molecular structures that satisfy the multiple objectives desirable for a particular drug discovery project. In addition to evaluating multiple generative models, we also present a benchmarking dataset to the community, with the aim of providing an objective set on which other new de novo molecular design models can be evaluated appropriately.

Read more



ChemMedChem, 20 March 2018

Special Issue: Cheminformatics in Drug Discovery

Andreas Bender, Nathan Brown

BenevolentAI's Head of Cheminformatics, Nathan Brown, guest edited a special issue of ChemMedChem in early 2018 in collaboration with Andreas Bender at the University of Cambridge. The special issue consisted of twenty original research papers from leading names in the field and opened with a guest editorial by Nathan and Andreas. It covered a broad range of topics in cheminformatics, from recent work on machine learning in drug discovery to large-scale data analyses of protein structures and ligand binding.

Read more