
Fast, Inexpensive Machine Learning for Manufacturing Processes via Transfer Learning from Physics-Based Models

Transfer Learning

3 main points
✔️ Developing qualitatively accurate physics-based models for new manufacturing processes carries an inherent and significant cost
✔️ To address this, the authors propose a transfer-learning approach in which ML models are trained on a large amount of computationally inexpensive data from a physics-based process model and then fine-tuned on a small amount of costly experimental data
✔️ The proposed approach reduces model development costs by several years, experimental costs by 56-76%, computational costs by an order of magnitude, and prediction errors by 16-24%!

Accelerated and Inexpensive Machine Learning for Manufacturing Processes with Incomplete Mechanistic Knowledge
written by Jeremy Cleeman, Kian Agrawala, Rajiv Malhotra
[Submitted on 29 Apr 2023]
Comments: Manufacturing Letters, 2023
Subjects: 
  Machine Learning (cs.LG); Materials Science (cond-mat.mtrl-sci)

The images used in this article are from the paper, the introductory slides, or were created based on them.

First of all

Machine learning (ML) is attracting growing interest for modeling parametric effects in manufacturing processes. However, state-of-the-art approaches focus on reducing the experimental and computational costs of generating training data, while ignoring the inherent and significant cost of developing qualitatively accurate physics-based models for new processes.

To address this problem, this paper proposes a transfer-learning approach in which ML models are trained on a large amount of computationally inexpensive data from a physics-based process model (the source) and then fine-tuned on a small amount of costly experimental data (the target). The novelty lies in pushing the limits of how little qualitative accuracy the source model requires.

The approach is evaluated by modeling printed line width in fused filament fabrication. Despite extreme functional and quantitative inaccuracy in the source, it reduces model development costs by several years, experimental costs by 56-76%, computational costs by an order of magnitude, and prediction errors by 16-24%.

Introduction

This paper proposes a multi-fidelity learning approach that reduces model development cost (CD) despite limited mechanistic knowledge of the process physics. The method is demonstrated by modeling the printed line width W in fused filament fabrication (FFF) as a function of filament feed rate F and extruder speed S, a problem involving complex physical phenomena such as non-Newtonian flow, friction, cooling, wetting, and compressibility.

Machine learning (ML) models are popular for modeling parametric effects in manufacturing processes, but generating the necessary training data from experiments consumes time and resources (experimental cost CE). Generating training data from a physics-based process model instead incurs computational cost CC (CPU time) to run the simulation, plus the time and human effort required to create the constitutive laws and numerical methods through intuition and trial and error (model development cost CD). The root cause of high CD in new processes is often a lack of qualitative knowledge of the underlying physics.

In multi-fidelity learning, ML models are trained on large amounts of inexpensive, inaccurate data (the source) and fine-tuned on small amounts of expensive but accurate data (the target). By using a computational process model as the source and experimental data as the target, CE is reduced compared to training on experimental data alone, CC is reduced compared to training on computational data alone, and the experimental ground truth is captured. However, prior work assumes that the source must be qualitatively consistent with the target, so CD remains high because a qualitatively accurate physics-based source is still required. Using an analytical process model as the source can further reduce CC, but not CD.

Using experimental data as the source is not possible for new processes because of their inherent novelty. This paper addresses this gap with a novel multi-fidelity learning approach that reduces CD even with limited mechanistic knowledge of the process physics.

Method

This approach uses transfer-based multi-fidelity learning, in which the source is a physics-based process model and the target is an experiment. With this approach, the final machine learning (ML) model reflects the underlying experimental ground truth and reduces experimental cost (CE). The process model must (a) include one or more conservation laws so that natural laws are respected, (b) guess the form of the constitutive laws without experimental calibration or validation, to reduce model development cost (CD), and (c) avoid or minimize spatiotemporal discretization, to minimize computational cost (CC).

In this study, ε-support vector regression (ε-SVR) with a Gaussian (RBF) kernel was used as the ML model and fine-tuned with TrAdaBoostR2 instance-based transfer learning. The ε-SVR hyperparameters were chosen by brute-force search, and TrAdaBoostR2 used 30 boosting iterations. The source model used only the law of conservation of mass, W = FA/(Sh), where h is the nozzle-to-platen distance and A is the filament cross-sectional area. This model ignores almost all of the complexity of extrusion physics and makes the erroneous but simplifying assumption that the line (layer) height equals h. Generating the 624 source samples took approximately 10^-6 CPU hours.
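As a rough illustration, the source data set can be generated directly from the conservation-of-mass model and fed to an ε-SVR. The filament diameter, nozzle height, grid shape, and SVR hyperparameters below are illustrative assumptions, not values reported in the paper:

```python
import numpy as np
from sklearn.svm import SVR

# Assumed constants: a 1.75 mm filament and one nozzle height from the experiments
A = np.pi * (1.75 / 2) ** 2   # filament cross-sectional area, mm^2
h = 0.7                       # nozzle-to-platen distance, mm

# Dense "source" grid over feed rate F and extruder speed S (mm/min);
# a 26 x 24 grid happens to give the 624 source samples mentioned above
F, S = np.meshgrid(np.linspace(153, 729, 26), np.linspace(350, 725, 24))
X_src = np.column_stack([F.ravel(), S.ravel()])
y_src = X_src[:, 0] * A / (X_src[:, 1] * h)   # mass conservation: W = F*A/(S*h)

# epsilon-SVR with a Gaussian (RBF) kernel, as in the paper;
# C and epsilon here are placeholders, not the paper's tuned values
svr = SVR(kernel="rbf", C=10.0, epsilon=0.01).fit(X_src, y_src)
```

Evaluating this closed-form source model is essentially free, which is why the entire 624-sample source data set costs on the order of 10^-6 CPU hours.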

Experiments were performed on a home-built FFF machine, printing PLA lines over 16 equal intervals of S (350-725 mm/min) and F (153-729 mm/min) at h = 0.7, 0.85, and 1.2 mm. W was measured with calipers and averaged over three measurements; unstable printing regimes were excluded. First, an SVR was trained directly on the experimental target data alone, using progressively more training points until the root mean squared error (RMSE) on the test data (the remainder of the data set) stopped decreasing. This training and testing was repeated 1000 times with random sampling to obtain the average minimum error of direct learning, RMSEdirect, and the corresponding number of samples, ndirect. Transfer learning was then performed with source data of the same size as ndirect, iteratively increasing the amount of target data to identify the minimum target data set nt for which the transfer-learning error RMSEt was less than or equal to RMSEdirect. This ensured that prediction accuracy was not sacrificed in the effort to reduce costs.
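The direct-learning baseline described above (growing the training set until the test RMSE stops improving) can be sketched as follows. The data here is a synthetic stand-in for the paper's measurements, and a single run is shown where the paper averages 1000 random repetitions per training-set size:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

# Synthetic stand-in for the experimental target data (nonlinear in F and S)
X = rng.uniform([153, 350], [729, 725], size=(200, 2))
y = X[:, 0] / X[:, 1] + 0.05 * np.sin(X[:, 0] / 100.0)

best_rmse, n_direct = np.inf, None
for n in range(20, 160, 10):          # progressively larger training sets
    idx = rng.permutation(len(X))
    train, test = idx[:n], idx[n:]    # the rest of the data set is the test set
    model = SVR(kernel="rbf").fit(X[train], y[train])
    rmse = mean_squared_error(y[test], model.predict(X[test])) ** 0.5
    if rmse < best_rmse:              # track the size with the lowest test RMSE
        best_rmse, n_direct = rmse, n
```

The resulting `best_rmse` and `n_direct` play the roles of RMSEdirect and ndirect against which transfer learning is compared.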

The final SVR obtained after transfer learning was tested on data drawn randomly from the portion of the experimental data not used for training. This test set was the same size as ndirect, to avoid an extreme bias in the training-to-test ratio and to compare direct and transfer learning fairly. This randomized test was repeated 30 times, yielding an average RMSEt.
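TrAdaBoostR2 itself is available in transfer-learning libraries such as `adapt`; its core idea, progressively downweighting source samples that disagree with the target across boosting rounds, can be sketched in simplified form. This is not the full algorithm (which, for example, predicts with a weighted median over the later estimators), only the instance-reweighting logic:

```python
import numpy as np
from sklearn.svm import SVR

def tradaboost_r2_sketch(Xs, ys, Xt, yt, rounds=30):
    """Simplified TrAdaBoostR2-style reweighting: source samples whose
    predictions disagree with the data lose weight each round, while
    poorly fit target samples gain weight."""
    X, y = np.vstack([Xs, Xt]), np.concatenate([ys, yt])
    ns = len(Xs)
    w = np.ones(len(X)) / len(X)
    # Fixed source shrinkage factor from the original TrAdaBoost formulation
    beta_src = 1.0 / (1.0 + np.sqrt(2.0 * np.log(ns) / rounds))
    model = None
    for _ in range(rounds):
        model = SVR(kernel="rbf").fit(X, y, sample_weight=w * len(w))
        err = np.abs(model.predict(X) - y)
        err = err / (err.max() + 1e-12)                  # normalize to [0, 1]
        e_t = (w[ns:] * err[ns:]).sum() / w[ns:].sum()   # weighted target error
        e_t = min(max(e_t, 1e-3), 0.499)                 # keep beta_t well-defined
        beta_t = e_t / (1.0 - e_t)
        w[:ns] *= beta_src ** err[:ns]    # shrink weights of misfit source samples
        w[ns:] *= beta_t ** (-err[ns:])   # grow weights of misfit target samples
        w /= w.sum()
    return model
```

In the paper's setup, `Xs, ys` would be the 624-sample mass-conservation source data and `Xt, yt` the small experimental target set of size nt.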

Result

Figure 1 shows the functional discrepancy between the source model and the experimental target in a 3D plot and a representative 2D plot. The actual effects of filament feed rate F and extruder speed S on line width W are clearly nonlinear compared to the linear assumptions of the source, especially at low nozzle height h.

Figures 2a-c show how the test RMSE of direct learning on experimental data alone varies with the number of training points, yielding RMSEdirect and ndirect (a constant 150 for all h). Figures 2d-f compare this RMSEdirect to the errors from transfer learning for different amounts of experimental data (i.e., combinations of F and S). In multiple cases, transfer learning achieves RMSEt ≤ RMSEdirect with nt < ndirect.

Qualitatively, Figure 3 shows that despite the qualitatively and quantitatively imprecise mechanistic knowledge built into the source model, the transfer-learned SVR captures the nonlinearities of the experimental data. This suggests that transfer learning can accurately model complex relationships in experimental data despite the source model's simplifying assumptions.

Figure 1: Comparison of source and target at h = (a) 0.7 mm (b) 0.85 mm (c) 1.2 mm. Feed rate F and stage speed S are in mm/min.

Figure 2: RMSE from direct learning as a function of the number of training points for h = (a) 0.7 mm (b) 0.85 mm (c) 1.2 mm, and comparison of RMSEdirect with the errors obtained from transfer learning using different experimental amounts of F and S for h = (d) 0.7 mm (e) 0.85 mm (f) 1.2 mm. Feed rate F and stage speed S are in mm/min.

Figure 3: Comparison between the transfer-trained model and the target for h = (a-d) 0.7 mm (e-h) 0.85 mm (i-l) 1.2 mm.

The proposed approach reduces experimental cost (CE) by 56-76% and error by 16-24% compared to direct learning on experimental data (Table 1). Computational or analytical process models that have already been developed could instead be used as the source or for direct learning, and such models do agree qualitatively and quantitatively with the experimental data. However, reaching that level took considerable time and effort: from 2000 to 2019 for the analytical equations and from 2002 to 2018 for the computational simulations. This indicates that at least 15 person-years of model development cost (CD) could have been saved if Smart-ML had been used in 2000, when the source models were first reported in the literature.

Table 1: Minimum RMSE and corresponding number of training samples for direct and transfer learning.

Overall, the proposed approach reduces the CD of new processes by reducing the need for qualitatively accurate, human-created, physics-based process models. In addition, using a high-fidelity computational model to generate even a single training sample for FFF requires orders of magnitude more CPU time than Smart-ML (approximately 10^-6 CPU hours in total). Thus, Smart-ML reduces not only CE but also computational cost (CC) and CD.

Conclusion

State-of-the-art approaches to ML models of parametric effects in manufacturing processes focus on reducing the experimental and computational costs of training-data generation. This paper pushes this paradigm forward by examining the often overlooked, but important, potential to also reduce the cost of process model development.

This is accomplished by testing the limits of the similarity required between the source process model and the target experimental data in transfer learning, using uncalibrated guesses for the functional form of the constitutive law to avoid the cost of iterative model development. The approach overcomes significant functional discrepancies between source and target, contrary to the assumptions of prior work.

Masayuki Tomoyasu (友安 昌幸)
JDLA G Certificate 2020#2, E Certificate 2021#1; Japan Society of Data Scientists, DS Certificate; Japan Society for Innovation Fusion, DX Certification Expert; CEO, Amiko Consulting LLC
