Abstract
Pharmaceutical formulations comprise active pharmaceutical ingredients (APIs) and excipients, and excipients can significantly influence drug stability. Compatibility studies between drugs and excipients are therefore essential during the pre-formulation stage. To address the time and cost of compatibility experiments, this study extended the PharmDE expert system into a machine learning-based AI platform for compatibility prediction. We first established a comprehensive database containing 1,105 entries of compatibility data, comprising 579 compatible and 526 incompatible entries. A machine learning model trained on this database achieved an accuracy of 0.75, precision of 0.75, recall of 0.75, F1 score of 0.74, Matthews correlation coefficient (MCC) of 0.50, and area under the curve (AUC) of 0.82. Finally, the machine learning model was integrated with the expert system to create FormulationDE, an interpretable platform for predicting drug-excipient compatibility. The platform provides rapid compatibility predictions together with risk assessments for potentially interacting groups, based on SHAP force plots that highlight high-contribution functional groups. We believe that FormulationDE has potential as a valuable tool in the pharmaceutical industry by reducing the time and cost of experimental studies in drug development.
Introduction
Excipients are defined as all compounds used in pharmaceutical formulations other than the active pharmaceutical ingredient (API) (Bharate et al., 2016; Chaudhari and Patil 2012). These compounds can significantly influence drug absorption by modifying disintegration and dissolution processes or by directly affecting gastrointestinal physiological functions (Charman et al., 1997; Martinez and Amidon 2002). Although traditionally regarded as inert substances (Pifferi and Restani 2003), excipients may, under certain conditions, engage in physical or chemical interactions with APIs. Such interactions can lead to diminished drug efficacy, altered release profiles, or even the generation of toxic products (Handa et al., 2022; Narang et al., 2015). Failure to identify and manage potential incompatibilities during the early phases of drug development may result in variable outcomes. Consequently, regulatory bodies such as the U.S. Food and Drug Administration (FDA) and China’s National Medical Products Administration (NMPA) have instituted rigorous guidelines and standards for evaluating compatibility between excipients and APIs. These regulations mandate comprehensive testing and documentation of drug-excipient interactions throughout the drug development process, underscoring the essential role that excipient-API compatibility plays in ensuring pharmaceutical quality and safety.
Conventionally, drug-excipient compatibility assessments have relied primarily on experimental methods, including Fourier transform infrared spectroscopy (FTIR) (Dutta 2017), differential scanning calorimetry (DSC) (Freire 1995), and high-performance liquid chromatography (HPLC) (Brittain 2008; Chadha and Bhandari 2014). Although these methods are effective in identifying potential incompatibilities, they are often resource-intensive. Consequently, despite the extensive practical knowledge accumulated through traditional experimental methods, their inherent limitations highlight the need for more efficient and reliable tools and methodologies.
The rapid advancement of artificial intelligence (AI) has invigorated pharmaceutical development (Wu 2025; Wang et al., 2023), giving rise to the emerging discipline of computational pharmaceutics (Wang 2024) and the new Quality by Computational Design (QbCD) paradigm (Wang et al., 2025), which cover key stages of pharmaceutical development from preformulation to drug formulation design (Wu et al., 2024; Wang et al., 2025; Xiong et al., 2025). Both expert systems and machine learning (ML) applications are currently available for predicting drug-excipient compatibility. Wang et al. (2021) developed a comprehensive database to store incompatibility data and created 60 drug-excipient interaction rules to assess the risk of excipients interacting with drugs. They integrated database retrieval with rule-based incompatibility risk prediction to establish PharmDE, an expert system platform. Two platforms that use ML methods to predict drug-excipient compatibility have since been introduced. Patel et al. (2023) developed a platform named DE-INTERACT based on artificial neural network models. Subsequently, Nguyen et al. (2024) employed a stacking ensemble approach to train an ML model that can rapidly predict incompatibilities between drugs and excipients. The datasets employed in these two studies were predominantly sourced from FDA-approved drug labels; however, instances of incompatibility between APIs and excipients have been reported even in marketed pharmaceutical products (Akers 2002). Moreover, these ML methods reduced interpretability to some extent, complicating in-depth exploration of the mechanisms underlying drug-excipient compatibility or incompatibility.
This study proposes a novel platform that integrates high-quality datasets with advanced machine learning techniques for predicting drug-excipient compatibility. By using experimentally validated data and interpretable machine learning approaches, the proposed platform addresses issues of data quality and model interpretability. The platform not only enables rapid prediction of drug-excipient compatibility probabilities, but also emphasizes the structural characteristics of drugs and excipients, offering more reliable and transparent risk assessments and valuable insights into potential interactions.
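For readers less familiar with the evaluation metrics reported in the abstract, the following is a minimal Python sketch of how accuracy, precision, recall, F1 score, and MCC are computed from the confusion matrix of a binary compatible/incompatible classifier. The function name `classification_metrics` and the example counts are hypothetical and chosen for illustration only; they are not taken from the FormulationDE dataset or code. AUC is omitted because it requires predicted probabilities rather than counts.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp/fp/fn/tn are the numbers of true positives, false positives,
    false negatives, and true negatives (hypothetical inputs).
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    # Matthews correlation coefficient: ranges from -1 to 1, where 0 is chance level.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts for a 100-sample test split (NOT from the paper).
metrics = classification_metrics(tp=40, fp=12, fn=13, tn=35)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because MCC uses all four cells of the confusion matrix, it is often reported alongside accuracy as a check that performance does not depend on a single class.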
Jia, S., Wang, N., Yang, R. et al. FormulationDE: an updated artificial intelligence system for drug-excipient compatibility prediction. AAPS Open 12, 20 (2026). https://doi.org/10.1186/s41120-026-00156-4