Using Machine Learning To Make Drug Discovery Better

New drugs typically take 12-14 years to make it to market, with a 2014 report finding that the average cost of getting a new drug to market had ballooned to a whopping $2.6 billion.
This post was published on the now-closed HuffPost Contributor platform.

It's a topic I've covered before, with a study published earlier this year highlighting how automation could be used to reduce the cost of drug discovery by approximately 70%.

It's an approach that a number of companies are taking to market. For instance, London-based start-up Benevolent.AI uses AI to look for patterns in the scientific literature.

The company has already identified two potential drug targets for Alzheimer's that have attracted the attention of pharmaceutical companies.

Automating drug discovery

A nice example of what could be possible is provided by a recent study published in Cell Chemical Biology. The study describes a big-data-based approach to detecting, before a drug reaches the expensive clinical trial stage, the toxic side effects that would prohibit it from being used on humans.

The approach is appealing because, rather than looking solely at a drug's molecular structure to test its viability, the researchers look instead at a number of features related to how the drug binds to molecules.

"We looked more broadly at drug molecule features that drug developers thought were unimportant in predicting drug safety in the past. Then we let the data speak for itself," the authors say.

The approach is known as PrOCTOR, and it took its inspiration from the Moneyball method popularized in baseball. PrOCTOR analyzes each drug using 48 different features to gauge its safety for clinical use, and it does all of this automatically using machine learning.

Training the machine

The algorithm was trained using hundreds of drugs that had already been approved by the FDA, together with those that had failed clinical trials due to some sort of toxicity problem.

This training allowed the researchers to develop a so-called PrOCTOR score that distinguishes the drugs that were approved by the FDA from those that failed the toxicity test.
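To make the idea of a learned safety score concrete, here is a minimal sketch of the general pattern: fit a classifier on labeled examples, then read its output probability as a score. It is not the researchers' method. The article does not say which model they used, so a simple logistic regression stands in for it, the two features and all of the numbers are made up for illustration, and the names train_logistic and safety_score are hypothetical.

```python
import math


def train_logistic(rows, labels, epochs=500, lr=0.1):
    """Fit a logistic regression with plain stochastic gradient descent.

    rows   -- list of feature vectors (one per drug)
    labels -- 1 for "approved", 0 for "failed for toxicity"
    """
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of approval
            err = p - y                     # gradient of the log loss w.r.t. z
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def safety_score(model, x):
    """Return a score between 0 and 1; higher means more 'approved-like'."""
    weights, bias = model
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


# Made-up toy data with two illustrative features per drug.
# The real study used 48 features; these values describe no real drug.
rows = [
    [0.9, 0.2], [0.8, 0.1], [0.7, 0.3],  # resemble approved drugs
    [0.2, 0.9], [0.1, 0.8], [0.3, 0.7],  # resemble toxicity failures
]
labels = [1, 1, 1, 0, 0, 0]

model = train_logistic(rows, labels)

print(safety_score(model, [0.85, 0.15]))  # a new "approved-like" candidate
print(safety_score(model, [0.15, 0.85]))  # a new "failure-like" candidate
```

A candidate that resembles the approved examples gets a score above 0.5, and one that resembles the failures gets a score below it. A real system would need far more data, held-out validation and a model chosen to suit the features.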

"We were able to find several features that led to a very predictive model," the team say. "Hopefully this approach could be used to determine whether it's worth pursuing a drug prior to starting human trials."

They also hope that their method will be used for post-approval surveillance of drugs that have already received FDA approval but still carry a risk of toxicity. For instance, PrOCTOR flagged one diabetes drug that was on the market, and further investigation revealed that it had indeed been withdrawn from the market in Europe.

This automated approach has a huge amount of potential, both to improve our drug discovery process and to make it cheaper and more effective. With something like 90% of drugs failing to make it through the trial process, anything that can improve matters has to be welcomed.

Central to such methods, however, is having the data available to run the algorithms. Something like 50% of clinical trial results remain unpublished, and this puts a significant hurdle in the way of big-data-based approaches like those by PrOCTOR and

Give them the data and the sky really is the limit.
