A dataset of formulation compositions for self-emulsifying drug delivery systems


Self-emulsifying drug delivery systems (SEDDS) are a well-established formulation strategy for improving the oral bioavailability of poorly water-soluble drugs. Traditional development of these formulations relies heavily on empirical observation to assess drug and excipient compatibility, as well as to select and optimize the formulation compositions. The aim of this work was to leverage SEDDS previously reported in the literature to construct a comprehensive SEDDS dataset that can be used to gain insights and advance data-driven approaches to formulation development. A dataset comprising 668 unique SEDDS formulations encompassing 20 poorly water-soluble drugs was curated. While there are still opportunities to enhance the quality and quantity of data on SEDDS, this research lays the groundwork to potentially simplify the SEDDS formulation development process.

Background & Summary

Poor aqueous solubility and permeability are recognized as major contributors to limited oral drug bioavailability. Indeed, these are integral considerations of theoretical frameworks such as Lipinski’s Rule of Five, the Biopharmaceutics Classification System (BCS), and the expanded Developability Classification System (DCS), which provide ways to differentiate promising drugs for oral administration. Over time, it has been reported that a growing number of small molecule drug candidates exhibit properties that may hinder oral absorption. In fact, in the 20 years since the Rule of Five was first proposed, new chemical entities approved by the FDA have been shown to increase in molecular weight and calculated octanol-water partition coefficient (clogP). In general, the successful clinical approval of less traditionally drug-like molecules underscores the critical role of pharmaceutical formulations.
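To make the Rule of Five concrete, the following is a minimal sketch of how its four commonly stated criteria could be checked for a drug candidate. The class name, function name, and example descriptor values are hypothetical and are not taken from the dataset; the thresholds are the ones usually quoted for the rule (molecular weight above 500, clogP above 5, more than 5 hydrogen bond donors, more than 10 hydrogen bond acceptors).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrugDescriptors:
    """Hypothetical container for the four Rule of Five descriptors."""
    name: str
    molecular_weight: float  # g/mol
    clogp: float             # calculated octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(drug: DrugDescriptors) -> list[str]:
    """Return the names of the Rule of Five criteria that the drug violates."""
    checks = {
        "molecular_weight > 500": drug.molecular_weight > 500,
        "clogP > 5": drug.clogp > 5,
        "h_bond_donors > 5": drug.h_bond_donors > 5,
        "h_bond_acceptors > 10": drug.h_bond_acceptors > 10,
    }
    return [criterion for criterion, violated in checks.items() if violated]


# Illustrative values only (not measured data for any real compound).
small_molecule = DrugDescriptors("example_small", 350.0, 3.0, 2, 5)
large_lipophilic = DrugDescriptors("example_large", 1200.0, 6.0, 5, 12)

print(rule_of_five_violations(small_molecule))
print(rule_of_five_violations(large_lipophilic))
```

A molecule with several violations, like the second illustrative example, is the kind of less traditionally drug-like candidate for which an enabling formulation becomes important.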

Advanced lipid-based formulation strategies have enhanced the oral absorption of drugs with poor water solubility and/or low intestinal permeability (i.e., BCS II and IV drugs). One such example is self-emulsifying drug delivery systems (SEDDS), a combination of oils, surfactants, and/or cosolvents that spontaneously emulsify in the aqueous environment of the gastrointestinal tract [6]. The ability of SEDDS formulations to improve oral bioavailability has been attributed to a number of mechanisms, notably increased apparent solubility of highly lipophilic drugs, as well as reduced metabolism or efflux. As a result, several clinically approved drugs rely on delivery in SEDDS formulations, including cyclosporine A (e.g., Sandimmune, Neoral), tipranavir (e.g., Aptivus), and fenofibrate (e.g., Lipofen), among others.

Despite the relative simplicity of SEDDS in principle, the path to designing such formulations remains non-trivial. The traditional approach to SEDDS development is an empirical process relying on iterative trial-and-error to screen, optimize, and evaluate the formulations. One of the most pertinent questions is the selection of appropriate excipients and mixtures thereof. Typically, this begins with quantification of the drug solubility in excipients, followed by screening excipient mixtures based on their emulsification properties through visual assessment. Given the range of possible excipients for SEDDS (i.e., oils, surfactants, and cosolvents, all of which may differ in terms of hydrophilicity/lipophilicity, purity, etc.), selection is often narrowed based on generally recognized as safe (GRAS) status. An established tool to facilitate the process of formulation development is the Lipid-based Formulation Classification System (LFCS). The LFCS defines four categories of oral lipid-based formulations according to their compositions, which essentially range from a pure mixture of oils to a combination of exclusively surfactants and cosolvents [6]. While the LFCS relates these compositional ranges to typical properties, it does not eliminate the need to develop bespoke formulations by exploring various excipient combinations. Nonetheless, methods to shift away from the traditional development of SEDDS have emerged, largely employing data-driven tools.
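As a rough illustration of composition-based classification, the sketch below distinguishes only the two endpoints of the LFCS described above: formulations made purely of oils (Type I) and formulations containing only surfactants and cosolvents (Type IV). The intermediate types depend on compositional thresholds that are not given in this text, so they are deliberately left unresolved. The function name, argument names, and tolerance are assumptions made for this example.

```python
def lfcs_endpoint(oil_pct: float, surfactant_pct: float, cosolvent_pct: float,
                  tol: float = 1e-6) -> str:
    """Classify a composition (in % w/w) at the LFCS endpoints only.

    Hypothetical helper: intermediate LFCS types are not resolved here
    because their thresholds are not stated in the surrounding text.
    """
    parts = (oil_pct, surfactant_pct, cosolvent_pct)
    if any(p < 0 for p in parts):
        raise ValueError("percentages must be non-negative")
    if abs(sum(parts) - 100.0) > tol:
        raise ValueError("percentages must sum to 100")

    if surfactant_pct <= tol and cosolvent_pct <= tol:
        return "Type I"  # oils only
    if oil_pct <= tol:
        return "Type IV"  # surfactants and cosolvents only, no oil
    return "Type II/III (intermediate)"


print(lfcs_endpoint(100, 0, 0))
print(lfcs_endpoint(0, 60, 40))
print(lfcs_endpoint(40, 40, 20))
```

Even with a full set of thresholds, such a classifier would only label a composition; as noted above, it would not replace screening of specific excipient combinations.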

In recent years, there has been significant interest in the integration of artificial intelligence (AI) and machine learning (ML) in pharmaceutical sciences, including drug formulation. These tools have been used in a variety of advanced applications, from the expedited design of polymeric long-acting injectables to engineering peptides for sustained delivery to the eye, and the development of ionizable lipids for lipid nanoparticle delivery of mRNA [12,13,14]. In the context of oral lipid-based formulations, ML and computational techniques have played a role in early-stage development, notably based on small molecule drug solubility screening [15]. Preliminary ML modeling has been used to predict drug supersaturation in lipid-based formulations and increases in the apparent solubility of drug upon dispersion of SEDDS. In these cases, a limited number of formulation compositions (i.e., two representative examples) were explored. Few studies have performed extensive investigations relating to SEDDS compositions. One example is an approach integrating ML and molecular dynamics to predict self-emulsification regions for SEDDS formulations, which also reported the distribution of excipients in its dataset [18]. However, this study did not identify the drugs contained in the formulations in the dataset.

Thus, although SEDDS are a well-established formulation strategy, there are currently no open-access SEDDS datasets with a focus on formulation composition. Here, we present a literature-mined SEDDS dataset containing 668 unique formulations, with drug, excipient, and formulation features that may be used to better understand composition patterns or relationships and predict formulation properties (Fig. 1). Our dataset contributes to the development of SEDDS formulations by providing a resource with documented formulations and related information that may serve as a starting point for excipient selection and screening.


Zaslavsky, J., Allen, C. A dataset of formulation compositions for self-emulsifying drug delivery systems. Sci Data 10, 914 (2023). https://doi.org/10.1038/s41597-023-02812-w
