Data-Driven Modeling of the Spray Drying Process. Process Monitoring and Prediction of the Particle Size in Pharmaceutical Production


Spray drying is used in the pharmaceutical industry for particle engineering of amorphous solid dispersions (ASDs). The particle size of the spray-dried (SD) powders is one of their key attributes due to its impact on the downstream processes and the drug product’s functional properties. Offline and inline laser diffraction methods can be used to estimate the product’s particle size; however, the final release of these ASDs is based on offline instruments. This paper presents a novel data-driven modeling approach for predicting the particle size of SD products. The model-based characterization of the process and the product’s particle size, as a critical quality attribute, follows the quality by design principles. The resulting model can be used for online process monitoring, reducing the risk of out-of-specification products and supporting their real-time release. A Tucker3 model is trained to capture and factorize the deterministic variability of the process. Subsequently, a partial least-squares regression model is calibrated to model the impact that variability in the input material properties, the process parameters, and the spray nozzle has on the products’ particle size. This strategy has been calibrated and validated using large scale production data for two intermediate drug products under high sparsity of particle size data. Despite the challenges, high accuracy was obtained in predicting the median particle size (dv50) for release. The 99% confidence interval corresponds to a maximum error of 2.5 μm, which is less than 10% of the allowed range of variation.


Sound scientific understanding of the pharmaceutical manufacturing processes and, specifically, the impact that the variability in the material attributes (MAs) and process parameters (PPs) has on the critical quality attributes (CQAs) is a core element of the quality by design (QbD) framework. (1) Achieving this goal comes with challenges as well as opportunities, some of them specific to large scale pharmaceutical manufacturing. One of these challenges is the use of mathematical models to characterize the large scale production processes, including sources of variation that were not present during the product and process development. (1,2) However, this challenge also poses an opportunity; the model obtained, once it is validated, can aid the manufacturing activities. Thus, applications such as real time release (RTR) and multivariate statistical process control can be developed and deployed. As identified in the QbD framework, these models and their applications have a strong potential for impact at the later stages of the product’s life cycle, i.e., in defining the control strategies and in continual improvement. (3) Moreover, if the models developed are informative and interpretable, a feedback loop toward previous stages, i.e., the risk assessment and the design space definition, becomes possible.

Spray drying is a technology that is widely used in the pharmaceutical industry. It is a process used in the late stages in the production of active pharmaceutical ingredients (APIs) or during the manufacture of intermediate and final drug products (DP). The production of poorly water-soluble drugs, delivered in the form of amorphous solid dispersions (ASDs) or nanocomposites, is an example of a process where spray drying can be a key step. (4−7) Broadly speaking, spray drying is a (semi)continuous process used in the production of nano- to micro-sized particles with a reasonably narrow size distribution. (8) In any case, the spray drying process requires tight control over the product’s particle properties. Among these properties, particle size distribution (PSD) is of primary interest; it can be a CQA and/or it can impact the functional properties of the product. Additionally, the particle size can impact the performance during the downstream processing, e.g., powder compressibility. (9)

Despite the long-standing presence of the spray drying process in industry, research related to this process is still very active. Research efforts are focused on the development of novel formulations with advanced physical, biological, and/or chemical functional properties, as well as on advances in process understanding, modeling, (10−17) scale-up, and control. Novel and specialized spray-dried (SD) products such as microparticles containing plasmid nanocomplex, (18) microparticles of ciprofloxacin hydrochloride for pulmonary delivery, (19) and hollow spherical aggregates of silica nanoparticles (20) are a strong focus of research. These works and many others have also demonstrated the importance of understanding the impact that variations in the process conditions and input materials can have on the CQAs of the products. In this regard, the QbD concept provides a framework to formulate these challenges. During the past decade, the QbD approach has served to bridge the product-process development and the large scale manufacturing. (21,22)

The sustained increase of computational power and the scientific progress in the field of modeling of particulate solid processes in the pharmaceutical industry are paving the way toward models that can support the QbD objectives. For spray drying, models based on computational fluid dynamics, Lagrangian–Eulerian modeling, discrete element method, and population balance modeling have been investigated to develop mechanistic models of the process. (10−13) These modeling strategies provide a representation of some of the physical phenomena involved in the process, making them good for process understanding and development. However, they are not yet the best solution for process monitoring, prediction, and control. These methods are highly computationally demanding, which hinders their use in online applications, and they normally require detailed data, which are not always available, to be calibrated for a specific process. (13,15) Other mechanistically inspired models, based on approximations and formulated as simpler one-dimensional ordinary differential equations, (14−17) might be more suitable for efficient process development and for online applications. However, these models include several assumptions to approximate the real process, while many sources of variability have to be handled as sources of uncertainty. As Sloth et al. (23) have shown, the spray drying process combines two phenomena occurring simultaneously, i.e., the morphology formation and the drying. Sturm et al. (24) developed a model that describes these phenomena; however, this model is only demonstrated in the spray drying of hypromellose acetate succinate (HPMCAS), and it is valid only until the glass transition point of the material is reached. These are a few examples that show how mechanistic and semiempirical models still have limited applicability to support the large scale spray drying process.

Data-driven modeling approaches can offer a favorable trade-off between complexity and accuracy for process modeling, especially regarding online applications. Although these black box modeling approaches are specific to the product/process and more generally to the data used for training, the outcome of their application can be highly valuable regarding process understanding and model accuracy. For instance, Gil-Chávez et al. (25) use a response surface for the optimization of the spray drying process of aquasolv lignin, and Milanesi et al. (13) use machine learning to extend a thermodynamic balance model of the spray drying process and to estimate the outlet temperature accurately. In recent years, multivariate statistical process monitoring (MSPM) and machine learning have become more standard approaches. These methods aid in the modeling of large scale batch processes that are subject to disturbances that were not foreseen or could not be modeled based on mechanistic or semiempirical models. (26−29) In recent reviews, Ramos et al. (30) discuss some methods commonly used in applications dealing with batch processes, while Ebadi et al. (31) focus on specific methods that target the covariance matrix of the process. Despite the diversity of existing and novel methods, the black box nature of these modeling strategies results in one common limitation, i.e., their limited interpretability. Improvement in this regard is one of the main drivers for research in this field.

A novel hierarchical strategy to model the spray drying process and predict the particle size of the product is presented. This strategy is based on MSPM methods and follows QbD precepts. It is intended to exploit data that are commonly available in the large scale pharmaceutical manufacturing environment, with no need for additional experimental data. It is a data-driven approach that integrates a recently developed tensor decomposition training method and a linear regression model. Given the black box nature of the models obtained, the contribution of this work is the modeling strategy presented and not the specific calibrated models. These models are valid only for the products and production conditions included in the calibration data. The goal of this novel modeling strategy, applied to the spray drying process, is to (i) build a better understanding of the impact that uncontrolled process variability has on the particle size of the product, (ii) provide a strategy for interpretable and reliable process monitoring, and (iii) accurately predict the particle size of the products.

At the core, the proposed modeling strategy uses a Tucker3 model, which is calibrated using an algorithm for simultaneous data scaling and training. (32) The tensor decomposition via Tucker3 results in a multilinear rank approximation of the process variability. This method is well suited to factorize batch process data because it preserves the three-dimensional tensor structure of the data, where each mode is one direction of the process variability, i.e., batches, variables, and time. Fanaee-T and Gama (33) discuss empirical evidence that shows the advantages of using tensor methods for anomaly detection in batch process monitoring compared with matrix-based methods, such as principal component analysis, which require tensor unfolding. Better interpretability of the score plots, higher classification accuracy, lower approximation error, better identification of the variance, and a lower risk of overfitting are some of the identified advantages. Multilinear partial least-squares (PLS) for batch processes has also shown a reduction in noise propagation and a higher accuracy in the predictions, compared to traditional PLS. (27) More recently, Sun and Braatz (34) have highlighted the need for more systematic and in-depth research focused on the improvement, use, and applications of tensorial data analytics in chemical and biological manufacturing. Thus, the Tucker3 method is used in this work in conjunction with a recently developed strategy for simultaneous data scaling and training. As shown by Muñoz et al., (32) using this calibration algorithm with a Tucker3 model results in a better factorization of the deterministic variability of the process, which is more interpretable, and provides insights into the correlations of the variables and the dynamic behavior of the process.
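
To make the three-mode structure concrete, the following is a minimal sketch of a Tucker3 decomposition of a batches × variables × time array. It uses a truncated higher-order SVD, which is a common generic way to obtain a Tucker3 model and is not the simultaneous scaling and training algorithm of ref 32. The array sizes, the ranks, the function names, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def unfold(tensor, mode):
    """Matricize a 3-way array so that rows index the chosen mode."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def tucker3_hosvd(X, ranks):
    """Truncated HOSVD: orthonormal loadings per mode plus a core array.

    Generic stand-in, NOT the calibration algorithm used in the paper.
    """
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(X, mode), full_matrices=False)
        factors.append(U[:, :r])
    A, B, C = factors  # batch, variable, and time loadings
    core = np.einsum("ijk,ip,jq,kr->pqr", X, A, B, C)
    return core, A, B, C

def reconstruct(core, A, B, C):
    """Rebuild the multilinear rank approximation from core and loadings."""
    return np.einsum("pqr,ip,jq,kr->ijk", core, A, B, C)

# Synthetic data (hypothetical sizes): 30 batches, 6 variables, 50 time points,
# generated with multilinear rank (2, 2, 2) plus a small amount of noise.
rng = np.random.default_rng(0)
G0 = rng.normal(size=(2, 2, 2))
A0 = rng.normal(size=(30, 2))
B0 = rng.normal(size=(6, 2))
C0 = rng.normal(size=(50, 2))
X = np.einsum("pqr,ip,jq,kr->ijk", G0, A0, B0, C0)
X_noisy = X + 0.01 * rng.normal(size=X.shape)

core, A, B, C = tucker3_hosvd(X_noisy, ranks=(2, 2, 2))
X_hat = reconstruct(core, A, B, C)
rel_err = np.linalg.norm(X_hat - X_noisy) / np.linalg.norm(X_noisy)
print(f"relative approximation error: {rel_err:.4f}")
```

In this sketch, each row of `A` plays the role of a per-batch score vector, while `B` and `C` describe the variable and time directions without unfolding the data into a single matrix.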

A PLS model is integrated into the proposed hierarchical model structure to predict the particle size of the SD material. The PLS model takes the time-invariant data and the scores of the Tucker3 model as inputs to predict the median particle size of the product. The scores of the Tucker3 model serve as the fingerprints for each batch produced in the spray dryer. Thus, the PLS regression is calibrated to model the impact that the variability in the input material properties, the PPs, and the spray nozzle has on the products’ particle size. PLS is a common technique used in chemometrics to develop bilinear regression models exploiting advanced characterization techniques. (35) PLS models have also been used in some applications of process modeling such as tableting (36) and granulation. (37,38) However, the direct use of PLS was discarded due to the three-dimensional nature of the process data and the need to combine these inputs with time-invariant inputs such as the critical material attributes (CMAs).
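
The second level of the hierarchy can be sketched as a single-response PLS regression whose input block concatenates per-batch scores with time-invariant inputs. The sketch below uses a textbook NIPALS implementation of PLS1 written in NumPy; the number of components, the column counts, the coefficient values, and the synthetic dv50 values are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Single-response PLS via NIPALS; returns (beta, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f
        w /= np.linalg.norm(w)   # weight vector
        t = E @ w                # latent score for this component
        tt = t @ t
        p = E.T @ t / tt         # X loading
        c = (f @ t) / tt         # y loading
        E = E - np.outer(t, p)   # deflate X
        f = f - c * t            # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)
    return beta, x_mean, y_mean

def pls1_predict(X, beta, x_mean, y_mean):
    return (X - x_mean) @ beta + y_mean

rng = np.random.default_rng(1)
n_batches = 60
# Hypothetical stand-ins: 3 batch-mode scores and 2 time-invariant inputs.
batch_scores = rng.normal(size=(n_batches, 3))
time_invariant = rng.normal(size=(n_batches, 2))
X = np.hstack([batch_scores, time_invariant])
Z = (X - X.mean(axis=0)) / X.std(axis=0)   # autoscale the combined block

# Synthetic response, for illustration only (not real dv50 data).
true_beta = np.array([1.5, -1.0, 0.5, 2.0, 0.0])
dv50 = 20.0 + Z @ true_beta + 0.1 * rng.normal(size=n_batches)

beta, x_mean, y_mean = pls1_fit(Z, dv50, n_components=2)
dv50_hat = pls1_predict(Z, beta, x_mean, y_mean)
r2 = 1.0 - np.sum((dv50 - dv50_hat) ** 2) / np.sum((dv50 - dv50.mean()) ** 2)
print(f"calibration R^2 with 2 latent variables: {r2:.3f}")
```

Autoscaling the combined block before fitting is a design choice of this sketch: it keeps the batch scores and the time-invariant inputs on comparable scales so that neither group dominates the latent variables.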

Two major challenges were addressed during the development of the proposed modeling strategy. These challenges arise from the incomplete data found in real industrial scenarios. First, only a subset of the independent input variables (e.g., feed flow rate, temperature, and density) as well as the response to the feed flow condition are measured in industrial spray dryers. However, the sensitivity analysis of the spray drying process has shown that the particle size is mostly affected by variations in the viscosity of the feed solution and the feed flow condition through the pressure nozzle. (39) To overcome this limitation, the proposed regression model uses a set of empirical factors that serve to decorrelate the sources of variation in the flow through the nozzle. These empirical factors are derived from the empirical equation formulated for the flow through swirl nozzles. (40) The second challenge is the high sparsity of the particle size data. Although inline particle size analyzers are available in the market, their use in manufacturing environments is limited due to high cost/benefit ratios, low reliability, and difficulty of validation. In this work, an iterative training strategy is used to address data sparsity. This training strategy allows missing values to be inferred, based on the knowledge captured by the model about the process and the input material variability.
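
The idea of inferring missing particle size values iteratively can be illustrated with a generic low-rank imputation loop. This is only a stand-in: the text above does not specify the authors' iterative training algorithm, so the sketch below swaps in a standard iterative SVD imputation on a joint matrix of inputs and response. The function name, the rank, the fraction of missing values, and the synthetic data are all assumptions.

```python
import numpy as np

def iterative_lowrank_impute(M, rank, n_iter=200, tol=1e-8):
    """Fill NaNs by repeatedly fitting a rank-limited SVD to the filled matrix.

    Generic iterative imputation, NOT the training strategy of the paper.
    """
    M = np.asarray(M, dtype=float)
    missing = np.isnan(M)
    filled = np.where(missing, np.nanmean(M, axis=0), M)  # start from column means
    for _ in range(n_iter):
        mu = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mu, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank] + mu
        new = np.where(missing, approx, M)  # observed entries are never changed
        converged = np.max(np.abs(new - filled)) < tol
        filled = new
        if converged:
            break
    return filled

rng = np.random.default_rng(2)
n_batches = 60
# Synthetic data: two latent factors drive five inputs and the response.
T = rng.normal(size=(n_batches, 2))
Z = T @ rng.normal(size=(2, 5)) + 0.05 * rng.normal(size=(n_batches, 5))
y_true = 20.0 + T @ np.array([2.0, -1.0]) + 0.1 * rng.normal(size=n_batches)

# Keep the response for only 18 of 60 batches (hypothetical sparsity level).
obs_idx = rng.choice(n_batches, size=18, replace=False)
y_obs = np.full(n_batches, np.nan)
y_obs[obs_idx] = y_true[obs_idx]

filled = iterative_lowrank_impute(np.column_stack([Z, y_obs]), rank=2)
missing_rows = np.isnan(y_obs)
rmse_imputed = np.sqrt(np.mean((filled[missing_rows, -1] - y_true[missing_rows]) ** 2))
rmse_mean_fill = np.sqrt(np.mean((np.nanmean(y_obs) - y_true[missing_rows]) ** 2))
print(f"RMSE imputed: {rmse_imputed:.3f}, RMSE mean fill: {rmse_mean_fill:.3f}")
```

The comparison against a simple mean fill shows the intended effect in this synthetic setting: the structure learned from the fully observed inputs is what allows the unmeasured responses to be estimated.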

The paper first discusses specific aspects of the materials and methods used in this work, i.e., the products used to develop and validate the modeling strategy, the spray drying unit, the data sets available, and the modeling methods implemented. The second part of the paper presents the results obtained at every step of the modeling effort and discusses the most relevant aspects regarding the process understanding and the validity of the proposed method. Finally, the conclusions drawn from the main findings of this research are summarized.


Data-Driven Modeling of the Spray Drying Process. Process Monitoring and Prediction of the Particle Size in Pharmaceutical Production. Carlos André Muñoz López, Kristin Peeters, and Jan Van Impe. ACS Omega 2024. Publication Date: June 7, 2024. © 2024 The Authors. Published by American Chemical Society. This publication is licensed under CC-BY-NC-ND 4.0.
