Pharma Excipients

A structured oral formulation database for machine learning: Uncovering data-informed design strategies to facilitate effective formulation development

18 May 2026

Abstract

Oral dosage form development still relies heavily on empirical trial-and-error, while the high prevalence of poorly soluble drug candidates increases the need for structured data support. To address this limitation, we constructed the Computational Pharmaceutics Intelligent Manufacturing Database (CPIMD), an ML-oriented database that integrates physicochemical properties of active pharmaceutical ingredients, qualitative excipient compositions, release categories, and in vitro dissolution data from 683 marketed oral dosage forms approved by the PMDA of Japan. A standardized workflow for data cleaning, feature encoding, and dissolution profile digitization was used to transform raw information into a machine-learning-ready dataset. Using CPIMD, we applied unsupervised clustering to characterize four major formulation pattern clusters defined by drug properties, release categories, and associated excipient combinations. Together, these clusters summarize a formulation pattern matrix linking API physicochemical properties, release objectives, and associated functional excipients within the current dataset. A proof-of-concept random forest model further showed that binary excipient features could predict release type with 97.1% test accuracy, supporting the utility of CPIMD for downstream ML applications. CPIMD provides a structured data foundation for predictive modeling, preliminary excipient screening, and data-informed oral formulation development.
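
The "binary excipient features" mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each formulation is described only by the set of excipients it contains, which is what a qualitative composition implies. The formulation names, the excipient lists, and the helper functions `build_vocabulary` and `encode_binary` are hypothetical and are not taken from CPIMD or from the authors' workflow.

```python
from typing import Dict, List, Set


def build_vocabulary(formulations: Dict[str, Set[str]]) -> List[str]:
    """Return a sorted list of every excipient seen in any formulation."""
    return sorted(set().union(*formulations.values()))


def encode_binary(excipients: Set[str], vocabulary: List[str]) -> List[int]:
    """Encode one formulation as 1 (excipient present) or 0 (absent) per vocabulary entry."""
    return [1 if name in excipients else 0 for name in vocabulary]


# Hypothetical qualitative compositions; not actual CPIMD records.
formulations = {
    "tablet_A": {"lactose", "magnesium stearate", "croscarmellose sodium"},
    "tablet_B": {"hypromellose", "magnesium stearate"},
}

vocabulary = build_vocabulary(formulations)
features = {
    name: encode_binary(excipients, vocabulary)
    for name, excipients in formulations.items()
}
```

Each row of `features` has one column per excipient in the vocabulary. Vectors of this shape are the kind of input a tree-based classifier or a clustering method could consume, although the encoding actually used for CPIMD may differ.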

Highlights

  • CPIMD: an oral formulation database structured for ML-oriented drug development.
  • Unsupervised clustering on CPIMD characterized four major formulation pattern clusters associated with drug properties, release categories, and excipient usage.
  • Formulation design templates linking drug properties, release goals, and functional excipients were proposed.
  • CPIMD enables data-informed formulation development and model training.

Introduction

Oral administration remains one of the most common drug delivery routes due to its convenience, high patient compliance, and suitability for large-scale manufacturing (Alqahtani et al., 2021, Bannigan et al., 2020). As a core component of the pharmaceutical industry, the development of oral formulations plays a crucial role in transforming candidate compounds into safe, effective, and quality-controlled medicines. However, this process has long relied on empirical knowledge and iterative trial-and-error experimentation, resulting in prolonged development cycles, high costs, and limited success rates (Gao et al., 2021b, Treherne and Langley, 2021, Yang et al., 2019). Under such an experience-driven paradigm, drug development typically requires 10–15 years and investments of several billion US dollars (DiMasi et al., 2016, Kuentz et al., 2016, Schlander et al., 2021).

Approximately 40% of approved drugs and nearly 90% of drug candidates exhibit poor water solubility (Loftsson and Brewster, 2010, Xie et al., 2024). For BCS class II drugs, dissolution and solubilization are often major determinants of oral absorption, whereas for BCS class IV drugs systemic exposure is influenced by both solubilization-related and permeability-related processes. In this context, dissolution behavior represents an important, but not exclusive, factor affecting the oral performance of poorly soluble compounds, adding complexity to oral formulation development (Tsume et al., 2020, ANayak et al., 2025). In vitro dissolution behavior therefore provides an important formulation-level readout for evaluating release characteristics and potential in vivo performance, while also reflecting the combined influence of drug physicochemical properties, excipient selection, and experimental conditions (Patel et al., 2025, ANayak et al., 2025). However, because dissolution behavior is shaped by multiple interdependent factors, its prediction and control remain major challenges in oral formulation development.

Machine learning (ML) refers to algorithms that learn patterns from data to make predictions or decisions. This capability is particularly valuable for deciphering the complex multivariate relationships inherent in pharmaceutical formulations (Bannigan et al., 2021, Bao et al., 2023, Yang et al., 2019). For example, one study employed multiple ML models on a dataset of nearly 2000 tablet formulations to predict disintegration time, using molecular, physical, and compositional features as inputs; the Sparse Bayesian Learning model achieved a test R² of 0.96 (Ghazwani and Hani, 2025). In another investigation, ML models were trained on 377 direct compression formulations comprising 20 APIs and 80 excipients to predict entire drug release profiles under dynamic dissolution conditions, with random forest achieving a five-fold cross-validation R² of 0.635 (Protopapa et al., 2025). A further study combined design of experiments with artificial neural networks to predict the dissolution kinetics of extended-release tablets, using formulation and process parameters as inputs to model a first-order release constant, achieving a root mean square error of prediction of 0.0011 s⁻¹ (Lourenço et al., 2025). These studies illustrate that ML can support concrete tasks such as formulation classification and performance-oriented prediction, but their broader application still depends on the availability of structured, standardized datasets.
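
To make the quantities quoted above concrete, the following sketch shows what a first-order release constant, a coefficient of determination, and a root mean square error are. It assumes the common first-order model F(t) = 100 · (1 − exp(−k·t)) for percent released and fits k by linearizing to ln(1 − F/100) = −k·t. This is not the method of the cited studies, which used ML models. The data are synthetic and noise-free, the time unit (minutes) is an arbitrary choice, and all function names are hypothetical.

```python
import math
from typing import Sequence


def first_order_release(t: float, k: float) -> float:
    """Percent released at time t for a first-order rate constant k (1/time)."""
    return 100.0 * (1.0 - math.exp(-k * t))


def fit_rate_constant(times: Sequence[float], released_pct: Sequence[float]) -> float:
    """Least-squares fit through the origin of ln(1 - F/100) = -k * t."""
    num = 0.0
    den = 0.0
    for t, f in zip(times, released_pct):
        if t <= 0 or not 0.0 <= f < 100.0:
            continue  # the log transform is undefined or uninformative here
        num += t * math.log(1.0 - f / 100.0)
        den += t * t
    if den == 0.0:
        raise ValueError("no usable data points")
    return -num / den


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Synthetic, noise-free profile generated with k = 0.05 per minute (assumed value).
times_min = [5.0, 10.0, 15.0, 30.0, 45.0, 60.0]
observed = [first_order_release(t, 0.05) for t in times_min]

k_hat = fit_rate_constant(times_min, observed)
predicted = [first_order_release(t, k_hat) for t in times_min]
fit_r2 = r_squared(observed, predicted)
fit_rmse = rmse(observed, predicted)
```

Because the data contain no noise, the fit recovers the generating constant almost exactly. Real dissolution data would give a lower R² and a nonzero error, and a different kinetic model may describe them better.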

One of the major bottlenecks in pharmaceutics, however, lies in the lack of high-quality, standardized, and reusable data resources (Bannigan et al., 2021, Bao et al., 2023). For formulation machine learning, robust datasets should ideally contain not only API descriptors and formulation composition, but also quantitative excipient levels, process parameters, batch-level variability, records of unsuccessful formulations, and independent external validation data. In practice, however, much of the currently available information—particularly dissolution profiles and formulation compositions—remains scattered in unstructured formats across regulatory documents, scientific publications, and patents, making it difficult to directly use for model training and validation (Dong et al., 2023, Gao et al., 2021a, Yanes et al., 2025). This fragmentation, together with the absence of unified digitalization standards, severely constrains the application of ML to formulation and performance prediction.

To address this challenge, we established the CPIMD, a structured resource integrating oral dosage forms approved by the PMDA of Japan with active pharmaceutical ingredient physicochemical properties, qualitative excipient compositions, release categories, and in vitro dissolution profiles. The current version encompasses 683 marketed oral formulations, covering 83 active pharmaceutical ingredients, 127 excipients, and 3,419 digitized dissolution profiles measured under various experimental conditions. By organizing heterogeneous regulatory information into standardized formulation-level and profile-level records, CPIMD provides a computable basis for systematic formulation analysis and proof-of-concept machine learning applications. The present work focuses on the construction and characterization of this database, with the clustering analysis and predictive modeling serving as illustrative examples to demonstrate its potential utility. CPIMD thus provides a useful data foundation for future studies in oral formulation analysis, predictive modeling, and effective formulation development.
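
One way to picture the "formulation-level and profile-level records" described above is as two linked record types, where a formulation can have several dissolution profiles measured under different conditions. The sketch below is an assumption about how such records might be laid out. The field names, the example values, and the helper `profiles_by_formulation` are hypothetical and do not reflect the actual CPIMD schema.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class FormulationRecord:
    """Formulation-level record (hypothetical fields)."""
    formulation_id: str
    api_name: str
    release_category: str
    excipients: Tuple[str, ...]  # qualitative composition: names only, no amounts


@dataclass(frozen=True)
class DissolutionProfile:
    """Profile-level record: one digitized curve under one test condition."""
    formulation_id: str
    medium: str
    time_min: Tuple[float, ...]
    released_pct: Tuple[float, ...]

    def __post_init__(self) -> None:
        # A digitized curve needs one release value per time point.
        if len(self.time_min) != len(self.released_pct):
            raise ValueError("time_min and released_pct must have equal length")


def profiles_by_formulation(
    profiles: Iterable[DissolutionProfile],
) -> Dict[str, List[DissolutionProfile]]:
    """Group profile-level records under their parent formulation ID."""
    grouped: Dict[str, List[DissolutionProfile]] = {}
    for profile in profiles:
        grouped.setdefault(profile.formulation_id, []).append(profile)
    return grouped


# Hypothetical example records; the values are illustrative only.
formulation = FormulationRecord(
    formulation_id="F001",
    api_name="example_api",
    release_category="immediate release",
    excipients=("lactose", "magnesium stearate"),
)
profiles = [
    DissolutionProfile("F001", "pH 1.2", (5.0, 15.0, 30.0), (40.0, 80.0, 95.0)),
    DissolutionProfile("F001", "pH 6.8", (5.0, 15.0, 30.0), (35.0, 75.0, 92.0)),
]
grouped = profiles_by_formulation(profiles)
```

Keeping the two levels separate means that one formulation row is stored once while its dissolution curves can grow independently, which is a common way to organize one-to-many data of this kind.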

Continue reading here

Jie Zhou, Conghui Li, Peng Zan, Zengming Wang, Baoqing Wang, Yanpeng Zhao, Xiuli Gao, Aiping Zheng, A structured oral formulation database for machine learning: Uncovering data-informed design strategies to facilitate effective formulation development, International Journal of Pharmaceutics, 2026, 126957, ISSN 0378-5173, https://doi.org/10.1016/j.ijpharm.2026.126957.


Read more articles on machine learning:

  • Artificial intelligence-assisted design of Chinese herbal medicine based hydrogels
  • FormulationDE: an updated artificial intelligence system for drug-excipient compatibility prediction
  • Predicting disintegration time in fast-disintegrating tablets using machine learning: a data-driven framework based on functional excipient representation


Tags: excipients, formulation

Copyright: PharmaExcipients AG
