
AI drug discovery: Entering a new age

By Yelena Lavrentyeva, Innovation Analyst

AI drug discovery is exploding.

Overhyped or not, investments in AI drug discovery jumped from $450 million in 2014 to a whopping $58 billion in 2021. All pharma giants, including Bayer, AstraZeneca, Takeda, Sanofi, Merck, and Pfizer, have stepped up spending in the hope of creating new-age AI solutions that will bring cost efficiency, speed, and precision to the process.

Traditional drug discovery has long been notoriously difficult. It takes at least 10 years and costs $1.3 billion to bring a new drug to the market. And this is only the case for drugs that succeed in clinical trials (only one in ten does).

Hence the interest in finding new ways to discover and design drugs.

AI has already helped identify promising candidate therapeutics, and it took months or even days rather than years.

In this article, we will explore how AI drug discovery is changing the industry. We will look at success stories, AI benefits, and limitations. Let’s go.

How drugs are discovered

The drug discovery process typically starts with scientists identifying a target in the body, such as a specific protein or hormone, that is involved in the disease. Then they use different methods to find a possible solution, a drug candidate, including:

  1. Screening existing compounds: Scientists can screen libraries of compounds (natural products or chemicals) they made before, to check if any of them have the desired activity or interaction with the target.

  2. De novo drug design: They can use computer modeling and simulation to develop novel chemical compounds that can do the job. This approach is used to create small molecule drugs, which are chemically synthesized compounds less than 1,500 daltons in size.

  3. Biologics: Researchers can also generate biological molecules like antibodies, enzymes, or proteins to act as drugs. This involves isolating or synthesizing molecules from living organisms that can interact with the target. Compared with small molecules, such molecules are typically larger and more complex.

  4. Repurposing: Scientists can take a look at compounds that were developed for something else and see if they have therapeutic potential for the disease in question.

Once a potential drug candidate (called a lead compound) is found, it is tested in cells or animals before moving on to clinical trials. These include three phases, starting with small groups of healthy volunteers and then proceeding to larger groups of patients suffering from the specific condition.

How AI is applied

Artificial Intelligence covers various technologies and approaches that involve using sophisticated computational methods to mimic elements of human intelligence such as visual perception, speech recognition, decision-making, and language understanding.

AI began back in the 1950s as a simple series of “if-then” rules and made its way into healthcare two decades later, after more complex algorithms were developed. Since the advent of deep learning in the 2000s, AI applications in healthcare have expanded.

A few AI technologies are empowering drug design.

Machine Learning

Machine learning (ML) focuses on training computer algorithms to learn from data and improve their performance, without being explicitly programmed.

ML comes in several branches, each with its own methods: supervised learning, unsupervised learning, and reinforcement learning. Within each branch, different algorithmic techniques, such as linear regression, neural networks, and support vector machines, serve specific goals. AI drug discovery is one of many application areas, and here ML enables the following:

  • Virtual screening of compounds to identify potential drug candidates

  • Predictive modeling of drug efficacy and toxicity

  • Identification of new targets for drug development

  • Analysis of large-scale genomic and proteomic data collected from living organisms (DNA sequences, gene expression levels, protein structures, etc.)

  • Optimization of drug dosing and treatment regimens

  • Predictive modeling of patient responses to treatment
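To make virtual screening a bit more tangible, here is a minimal sketch of one simple supervised approach: score each library compound by how many of its most similar known compounds were active (a k-nearest-neighbour vote using Tanimoto similarity). Everything in it is an assumption made for illustration: the fingerprints are toy sets of bit positions, the compound names and labels are hypothetical, and real pipelines would use chemistry toolkits and far larger datasets.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def knn_activity_score(query: Set[int],
                       training: List[Tuple[Set[int], int]],
                       k: int = 3) -> float:
    """Fraction of actives among the k training compounds most similar to the query."""
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]), reverse=True)
    nearest = ranked[:k]
    return sum(label for _, label in nearest) / len(nearest)


def screen(library: Dict[str, Set[int]],
           training: List[Tuple[Set[int], int]],
           k: int = 3) -> List[Tuple[str, float]]:
    """Rank library compounds from most to least likely to be active."""
    scores = [(name, knn_activity_score(fp, training, k)) for name, fp in library.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: (fingerprint bits, label) where label 1 means "active".
TRAINING = [
    ({1, 2, 3, 4}, 1),
    ({1, 2, 3, 5}, 1),
    ({2, 3, 4, 6}, 1),
    ({10, 11, 12}, 0),
    ({10, 12, 13}, 0),
    ({11, 13, 14}, 0),
]
# Hypothetical, unlabeled screening library.
LIBRARY = {
    "cmpd_A": {1, 2, 3, 6},
    "cmpd_B": {10, 11, 13},
    "cmpd_C": {1, 10, 20},
}

if __name__ == "__main__":
    for name, score in screen(LIBRARY, TRAINING):
        print(f"{name}: {score:.2f}")
```

On this toy data, cmpd_A comes out on top because all three of its nearest neighbours are active, while cmpd_B lands at the bottom. The point is the workflow (learn from labeled compounds, then rank unlabeled ones), not the particular algorithm.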

Deep Learning

Deep Learning (DL) is a subset of ML based on artificial neural networks (ANNs). ANNs are made up of interconnected nodes, or “neurons,” that are connected by pathways called “synapses.” As in the human brain, these neurons work together to process information and make predictions or decisions. The more layers of interconnected neurons a neural network has, the “deeper” it is.

Unlike traditional ML algorithms, which typically need structured data and hand-crafted features, DL models can process vast volumes of unstructured data and make more advanced predictions with little human supervision.
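For readers who like to see the mechanics, the layered structure of a neural network can be sketched in a few lines. This is only an illustration of a forward pass through stacked layers: the weights below are hypothetical and hand-picked, whereas a real model learns them from data and is built with a DL framework.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Layer = Tuple[Matrix, Vector]  # (weights, biases)


def dense(inputs: Vector, weights: Matrix, biases: Vector) -> Vector:
    """One fully connected layer: each neuron sums its weighted inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values: Vector) -> Vector:
    """Non-linearity applied between layers."""
    return [max(0.0, v) for v in values]


def sigmoid(value: float) -> float:
    """Squashes the final score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))


def forward(inputs: Vector, layers: List[Layer]) -> float:
    """Pass inputs through every hidden layer, then through a single output neuron."""
    activations = inputs
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return sigmoid(dense(activations, weights, biases)[0])


# Hypothetical, hand-picked parameters: one hidden layer of two neurons, one output neuron.
LAYERS: List[Layer] = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, -1.0]], [0.0]),
]

if __name__ == "__main__":
    print(forward([2.0, 1.0], LAYERS))
```

Adding more `(weights, biases)` pairs to `LAYERS` makes the network deeper; nothing else in the code has to change.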

In AI drug discovery, DL is used for:

  • Improved virtual screening of compound libraries to identify hits with a higher probability of binding to a target

  • Image-based profiling to understand disease-associated phenotypes, disease mechanisms, or a drug’s toxicity

  • More accurate prediction of how a drug will be absorbed, distributed, metabolized, and excreted from the body (pharmacokinetic properties)

  • Prediction of drug-target interactions and binding affinity

  • Prediction of the structure of proteins that account for most of the currently identified drug targets

  • Generation of novel drug-like compounds with the desired physical, chemical, and bioactivity properties

  • Automation of clinical trial processes and protocol design

Natural Language Processing (NLP)

NLP relies on a combination of techniques from linguistics, mathematics, and computer science, including DL models, to analyze, understand, and generate human language. AI drug discovery research often uses NLP to extract information from both structured and unstructured data to accomplish the following:

  1. Text mining of scientific literature to identify associations between chemical/drug entities, their targets, and novel disease-related pathways

  2. Extracting structured information from unstructured electronic health records (EHRs), such as patient demographics, diagnoses, and medications

  3. Identifying adverse drug events by analyzing text data from social media, news articles, and other sources

  4. Determining clinical trial eligibility criteria based on protocols and matching patients to trials

  5. Summarizing drug information
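The first item on this list, text mining for drug-target associations, can be sketched in its simplest form as counting how often a drug name and a target name appear in the same sentence. The snippet below is an assumption-laden toy: the vocabularies, entity names, and example sentences are all hypothetical placeholders, and production systems rely on curated dictionaries and trained language models instead of plain string matching.

```python
import re
from collections import Counter
from itertools import product
from typing import Iterable, List, Set

# Hypothetical placeholder vocabularies (lower-cased entity names).
DRUGS = {"drugx", "drugy"}
TARGETS = {"prot1", "prot2"}


def split_sentences(text: str) -> List[str]:
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_terms(sentence: str, vocabulary: Set[str]) -> Set[str]:
    """Return the vocabulary entries that occur as whole tokens in the sentence."""
    tokens = set(re.findall(r"[a-z0-9]+", sentence.lower()))
    return tokens & vocabulary


def cooccurrences(documents: Iterable[str]) -> Counter:
    """Count (drug, target) pairs mentioned together in the same sentence."""
    counts: Counter = Counter()
    for doc in documents:
        for sentence in split_sentences(doc):
            drugs = find_terms(sentence, DRUGS)
            targets = find_terms(sentence, TARGETS)
            counts.update(product(sorted(drugs), sorted(targets)))
    return counts


# Made-up example sentences, not taken from any real publication.
DOCS = [
    "DrugX inhibits Prot1 and Prot2. DrugY showed no effect on Prot1.",
    "In a second assay, DrugX again bound Prot1.",
]

if __name__ == "__main__":
    for pair, count in cooccurrences(DOCS).most_common():
        print(pair, count)
```

Note that the pair (drugy, prot1) is counted even though the sentence reports no effect. Co-occurrence only flags candidate associations; telling positive findings from negative ones is exactly where more capable NLP models come in.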

Why AI drug discovery is the talk of the town now

In the last couple of years, companies across the pharmaceutical sector have taken steps to incorporate AI into their research methods. This includes building in-house AI teams, hiring AI healthcare professionals and data analysts, backing startups with an AI focus, and teaming up with technology firms or research centers.

A combination of factors is driving this trend.

The increasing power of computers and new AI developments

Recent tech advances have shifted the traditional focus of AI drug discovery research.

While the majority of companies in the sector (around 150 in 2022, according to the BiopharmaTrend AI Report) are still busy designing small molecules, which are easy to represent computationally and compare at scale, there is also a growing interest in new applications of AI in drug discovery.

Many companies are beginning to embrace AI for designing biologics (77 companies) and discovering biomarkers that indicate the presence or progression of a disease (59). Others are focused on building all-embracing AI drug discovery platforms, identifying new targets, or creating ontologies — structured representations of relationships between different entities such as chemical compounds, proteins, and diseases.
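One common way to hold such structured relationships is as a set of subject-relation-object triples that can be searched like a graph. The sketch below is a hypothetical illustration only: the entity and relation names are invented, and real ontologies and knowledge graphs are far richer and usually live in dedicated graph databases.

```python
from collections import defaultdict, deque
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


class KnowledgeGraph:
    """Tiny in-memory store of directed, labeled relationships."""

    def __init__(self) -> None:
        self._edges: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def add(self, subject: str, relation: str, obj: str) -> None:
        self._edges[subject].append((relation, obj))

    def neighbours(self, subject: str) -> List[Tuple[str, str]]:
        return list(self._edges.get(subject, []))

    def find_path(self, start: str, goal: str) -> Optional[List[Triple]]:
        """Breadth-first search; returns the shortest chain of triples, or None."""
        queue = deque([(start, [])])
        seen = {start}
        while queue:
            node, path = queue.popleft()
            if node == goal:
                return path
            for relation, nxt in self._edges.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [(node, relation, nxt)]))
        return None


# Hypothetical entities and relations, for illustration only.
graph = KnowledgeGraph()
graph.add("compound_1", "inhibits", "protein_A")
graph.add("protein_A", "participates_in", "pathway_P")
graph.add("pathway_P", "implicated_in", "disease_D")
graph.add("compound_2", "inhibits", "protein_B")

if __name__ == "__main__":
    for hop in graph.find_path("compound_1", "disease_D") or []:
        print(" -> ".join(hop))
```

Here a query from compound_1 to disease_D returns a three-hop chain through a protein and a pathway, while compound_2 has no known route to the disease. Chains like this are the kind of indirect link that graph-based approaches are meant to surface.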

Widening access to AI tools

Although the shortage of AI talent shows no sign of abating, the entry barriers to AI drug discovery have actually fallen. Tech vendors and pharma giants are releasing increasingly sophisticated AI platforms, including ready-to-use no-code and drag-and-drop systems that enable non-AI experts to integrate artificial intelligence into their research. These developments are playing a major role in the accelerated adoption of AI by the industry.

AI-enabled success stories

AI drug discovery projects pursued in academia and the industry have already produced the first successful results across the value chain of drug discovery. Examples include:

  • DeepMind has built the AI system AlphaFold, which can predict a protein’s 3D structure from its one-dimensional amino acid sequence in seconds rather than the months or years it would normally take. The system was used to predict over 200 million protein structures belonging to animals, plants, bacteria, fungi, and other organisms.

  • University of Washington researchers have developed a deep learning model that uses gaming computers to calculate protein structures within 10 minutes.

  • Deep Genomics has used AI technologies to screen more than 2,400 diseases and 100,000 mutations, predict the exact disease-causing mechanism of a Wilson disease mutation, and create the drug candidate DG12P1 in 18 months.

  • Aladdin has released a proprietary AI drug discovery platform for commercial use in virtual screening, hit-to-lead, lead optimization, and the preclinical phase. This platform helped Aladdin identify a number of drug compounds for a potential treatment of age-related diseases.

  • IBM has developed the Watson system with cognitive computing capabilities that is used by the pharmaceutical industry for matching patients to the right-fit clinical trials for their condition. In a clinical trial for breast cancer, the platform demonstrated an increase of 80% in enrollment and a reduction in trial matching time.

  • AbCellera took less than three months to develop a monoclonal antibody for neutralizing viral variants of COVID-19 and obtain approval from the US Food and Drug Administration (FDA).

  • BenevolentAI has combined its knowledge graph with AI tools to uncover baricitinib as a potential COVID-19 treatment in several days.

  • BioXcel Therapeutics has accelerated the discovery of dexmedetomidine as a sedative for patients with schizophrenia and bipolar disorders. The company obtained FDA approval for its proprietary sublingual film of dexmedetomidine (Igalmi™) in less than four years after its first-in-human trials.

  • Using AI, Exscientia has designed three small molecules that entered clinical trials over the span of two years (for Alzheimer’s disease psychosis, obsessive-compulsive disorder, and immuno-oncology).

  • In early 2023, Insilico reported positive topline results in a Phase 1 clinical trial of the first AI-designed novel molecule for an AI-discovered novel target to treat idiopathic pulmonary fibrosis (IPF).

  • In 2021, 13 AI-derived biologics reached the clinical stage, with their therapy areas including COVID-19, oncology, and neurology.

Benefits and challenges in AI drug discovery

AI is a powerful tool that holds the promise of revolutionizing the pharmaceutical industry. With its ability to analyze vast amounts of data and make predictions, artificial intelligence can help researchers overcome the obstacles that have long hindered the drug discovery process by enabling:

  • Reduced timelines for discovery and preclinical stages

  • More accurate predictions on the efficacy and safety of drugs

  • New, unanticipated insights into drug effects and diseases

  • New research lines and new R&D strategies

  • Cost savings through quicker analysis and automation

According to Insider Intelligence, AI can save the pharmaceutical industry up to 70% of drug discovery costs. The potential of AI in drug discovery is truly exciting, but there are a few roadblocks that need to be tackled first to exploit it to the fullest.


Data

When it comes to AI, it always comes down to input data. Data silos and legacy systems that prevent data consolidation are big hurdles to AI research in any domain. In the pharmaceutical industry, the problem may be even more pronounced.

Pharmaceutical companies have traditionally been bad at sharing data, be it results from clinical studies or de-identified patient information, while the troves of data they have may provide answers to questions that the original researcher never considered.

When data is ultimately shared, it is often incomplete, inconsistent, or biased, as is the case with datasets used for predicting protein-ligand binding affinities that are crucial for drug discovery. In some cases, the data may not even be reflective of the entire population, and the AI model may fall short in real-world scenarios.
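A quick audit before training can surface some of these problems early. The sketch below assumes a hypothetical record layout (`protein`, `ligand`, `affinity_nm`) and made-up values; it simply counts missing fields, repeated protein-ligand pairs, and how much of the dataset a single protein accounts for, which is one rough signal of bias.

```python
from collections import Counter
from typing import Dict, List, Optional, Union

Record = Dict[str, Optional[Union[str, float]]]

# Hypothetical field names for a binding-affinity dataset.
REQUIRED = ("protein", "ligand", "affinity_nm")


def audit(records: List[Record]) -> Dict[str, object]:
    """Summarize missing values, duplicate pairs, and dominance of one protein."""
    missing: Counter = Counter()
    per_protein: Counter = Counter()
    seen = set()
    duplicates = 0
    for rec in records:
        for field in REQUIRED:
            if rec.get(field) is None:
                missing[field] += 1
        key = (rec.get("protein"), rec.get("ligand"))
        if key in seen:
            duplicates += 1  # same protein-ligand pair measured more than once
        seen.add(key)
        if rec.get("protein") is not None:
            per_protein[rec["protein"]] += 1
    total = len(records)
    share = max(per_protein.values()) / total if per_protein else 0.0
    return {
        "total": total,
        "missing": dict(missing),
        "duplicates": duplicates,
        "largest_protein_share": share,
    }


# Made-up records for illustration.
RECORDS: List[Record] = [
    {"protein": "P1", "ligand": "L1", "affinity_nm": 12.0},
    {"protein": "P1", "ligand": "L2", "affinity_nm": None},
    {"protein": "P1", "ligand": "L1", "affinity_nm": 15.0},
    {"protein": "P2", "ligand": "L3", "affinity_nm": 300.0},
]

if __name__ == "__main__":
    print(audit(RECORDS))
```

In this toy set, one measurement is missing, one pair appears twice with conflicting values, and a single protein makes up three quarters of the records. A model trained on data like this may look accurate while mostly having learned one target.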


Complexity

The sheer complexity of biological systems makes it hard for AI to analyze and predict how their behavior changes over time and space.

There is a vast number of complex and dynamic interactions within biological systems where each element such as proteins, genes, and cells can have multiple functions and be affected by multiple factors, including genetic variations, environmental conditions, and disease states.

Interactions between different elements can also be non-linear, meaning that small changes in one element can have a great impact on the entire system. For instance, a single gene that controls cell division can be responsible for the growth of a tumor, or interactions between multiple proteins can lead to the development of highly specific and complex structures such as the cytoskeleton of a cell.

Another challenge is a lack of qualified staff to handle AI drug discovery tools.


Interpretability

The use of neural networks in AI drug discovery has pushed the boundaries of what is possible, but their lack of interpretability poses a significant challenge. Often referred to as black boxes, such AI models might produce highly accurate predictions, yet even their engineers can’t explain the reasoning behind them. This is particularly challenging in deep learning, where understanding the output of each layer gets harder as the number of layers grows.

This lack of transparency can lead to flawed solutions and reduce trust in AI among researchers, medical professionals, and regulatory bodies. To address this challenge, there is a growing need for the development of explainable, trustworthy AI.
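One widely used, model-agnostic way to peek inside a black box is permutation importance: shuffle one input feature at a time and see how much the model’s accuracy drops. The sketch below is a simplified illustration under stated assumptions: `black_box` is a hypothetical stand-in for a trained model, the data is made up, and established ML libraries offer more robust implementations.

```python
import random
from typing import Callable, List, Sequence

Row = List[float]


def accuracy(model: Callable[[Row], int], rows: Sequence[Row], labels: Sequence[int]) -> float:
    """Share of rows for which the model's prediction matches the label."""
    correct = sum(1 for row, y in zip(rows, labels) if model(row) == y)
    return correct / len(rows)


def permutation_importance(model: Callable[[Row], int],
                           rows: Sequence[Row],
                           labels: Sequence[int],
                           n_repeats: int = 20,
                           seed: int = 0) -> List[float]:
    """Average drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[col] for row in rows]
            rng.shuffle(column)
            shuffled = [row[:col] + [value] + row[col + 1:]
                        for row, value in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


def black_box(row: Row) -> int:
    """Hypothetical stand-in for an opaque model; it secretly uses only feature 0."""
    return int(row[0] > 0.5)


# Made-up data: feature 0 drives the label, feature 1 is noise.
ROWS: List[Row] = [
    [0.9, 0.1], [0.8, 0.7], [0.7, 0.3], [0.6, 0.9],
    [0.4, 0.2], [0.3, 0.8], [0.2, 0.4], [0.1, 0.6],
]
LABELS = [black_box(row) for row in ROWS]

if __name__ == "__main__":
    print(permutation_importance(black_box, ROWS, LABELS))
```

Shuffling feature 1 never changes a prediction, so its importance is zero, while shuffling feature 0 hurts accuracy. That tells a researcher which inputs the model leans on, though not why, so it is a starting point for explainability and not a full answer.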

Wrapping up

New drugs that are changing the game for patients continue to emerge.

Just 15 years after HIV was identified as the cause of AIDS in the 1980s, the pharmaceutical industry developed a multi-drug therapy that allows people affected by the virus to live a normal life span. Novartis’ Gleevec prolongs the lives of leukemia patients. Incivek from Vertex Pharmaceuticals has doubled hepatitis C cure rates. Keytruda from Merck reduces the risk of cancer coming back by 35% after patients have surgery to remove melanoma.

But not all new drugs are created equal.

A recent analysis of over 200 new medicines conducted in Germany has revealed that only 25% provided significant advantages over existing treatments. The remaining drugs yielded either minimal or no benefits, or their impact was uncertain.

Given the costly and time-consuming nature of drug discovery, it’s clear the pharmaceutical industry needs major changes. And that’s where AI drug discovery could play a role. There is every chance that artificial intelligence can make a transformational contribution going beyond accelerating time-to-clinic.


Thinking about your own AI drug discovery project? Drop us a line. With years of experience in creating AI solutions for healthcare, we are the right partner for you.