GMS: Revolutionizing Manufacturing With ChatGPT And Diffusion Models

Manufacturing 28/08/2024

3 main points

✔️ Generative AI revolutionizes manufacturing processes with a new approach that significantly improves efficiency and flexibility
✔️ GMS implementation dramatically enhances system resilience and responsiveness to uncertainty
✔️ Leveraging ChatGPT and diffusion models to facilitate human-centric decision making Innovative manufacturing systems

Generative manufacturing systems using diffusion models and ChatGPT
written by Xingyu Li, Fei Tao, Wei Ye, Aydin Nassehi, John W. Sutherland
[Submitted on 2 May 2024]
Comments: Accepted by arXiv
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Systems and Control (eess.SY)
code：

The images used in this article are from the paper, the introductory slides, or were created based on them.

Summary

The study introduces Generative Manufacturing Systems (GMS), which effectively manages and coordinates autonomous manufacturing assets to improve responsiveness and flexibility to various production goals and human preferences.

Unlike traditional explicit modeling, GMS uses generative AI (including diffusion models and ChatGPT) to implicitly learn from future visions, moving from model optimization to decision making through training and sampling. With the integration of generative AI, GMS enables complex decision making through human interaction, allowing manufacturing assets to generate multiple high-quality global decisions that can be iteratively improved based on human feedback.

Empirical results show that GMS significantly improves system resilience and responsiveness to uncertainty, reducing decision-making time from seconds to milliseconds. The study highlights the creativity and diversity of the solutions generated and reveals that it facilitates human-centered decision making through smooth and continuous human-machine interaction.

Introduction

Manufacturing systems face persistent uncertainties that take many forms, urgencies, and impacts. First, the advent of mass personalization and changes in regulations and standards add complexity to production requirements, requiring systems to skillfully navigate evolving demands and obligations.

Second, interruptions in production due to natural disasters, pandemics, financial crises, and geopolitical conflicts cause resource shortages and changes in consumer behavior. Following a major interruption, 20-30% of companies and businesses are forced to close. Finally, new manufacturing initiatives driven by sustainability, social, and environmental goals will require a reassessment of production goals and a rethinking of existing systems.

Future manufacturing systems must be flexible enough to adapt quickly to uncertainties and balance constraints with new initiatives. Flexibility was first introduced into manufacturing systems in the 1960s with the birth of flexible manufacturing systems. Despite efforts to improve hardware and software flexibility, the NP difficulties of centralized control with increasing asset and planning horizons have hampered system responsiveness.

Increased autonomy of manufacturing assets such as robots, vehicles, and mobile manipulators offers an opportunity to address this challenge and potentially increase responsiveness by delegating decision-making authority to each asset. Manufacturers such as Audi are shifting from fixed line production to split workstations with autonomous assets, and assets suited to specific manufacturing tasks (such as Little Helper, OMRON MoMa, and KMR IIWA) have shown effectiveness in the automotive and aerospace industries. These assets enable adaptable layout and scheduling through strategic task assignment and routing, and are expected to improve worker utilization and production by up to 30%. Emerging manufacturing systems such as agent-based manufacturing, matrix production systems, and anarchic manufacturing incorporate asset autonomy through decentralized or distributed control.

However, these control approaches also face challenges as more assets become more complex and flexible due to open interfaces and universal standards. Each asset often lacks a comprehensive awareness of the overall system and its constraints, making it difficult to coordinate individual plans and preventing optimal solutions from being achieved.

More importantly, the optimal solution depends on effectively balancing diverse objectives and stakeholder preferences, which may not be fully and explicitly modeled. To maximize the benefits of asset autonomy, a revolutionary approach is needed to efficiently manage diverse assets, accommodate different production goals, and deal with uncertainty while ensuring human-centered decision making.

Generative models offer a transformative opportunity to address these challenges through unique generative capabilities, probabilistic modeling, and interactive decision making. We propose a GMS that represents a fundamental shift from current explicit models to future tacit knowledge. Inspired by the vision of a dreaming factory, our approach explores a combination of diverse decisions and uncertainties and generates a number of potential futures from future experiences. By leveraging generative models (including diffusion models and ChatGPT), GMS skillfully captures the patterns and distributions underlying decisions and facilitates creative decision making even in scenarios beyond the initial scope of exploration.

Generative Manufacturing Systems (GMS)

The authors propose the synergistic integration of stationary machines, autonomous assets, and a diverse human workforce in future manufacturing systems. Given the increasing autonomy and mobility of assets, the authors suggest that autonomous assets and humans could dynamically move between various workstations and organize themselves to improve manufacturing operations and streamline the flow of goods. GMS is designed to skillfully adjust configuration and scheduling to meet uncertainties and production goals under human supervision.

Figure 1: Schematic of GMS

Figure 1 shows a schematic of the GMS, depicting the assets that receive human queries (left), the process by which the trained GMS model samples new decisions from future exploration (center), and the GMS that provides a variety of configuration and schedule options in response to human queries (right). The GMS is a GMS that is used by ChatGPT to make decisions. GMS leverages large-scale language models such as ChatGPT, XLNet, and Turning-NLP to translate human queries into machine language.

It then employs image generation models such as diffusion models, BigGAN, and DALL-E to generate system configurations (human and asset placement at each station) in response to human queries. In addition, it determines detailed operational schedules and task assignments, distributes tasks among stations and between humans and robots, and considers material and process constraints. Unlike approaches that rely on existing explicit models to find optimal decisions (model optimization), GMS employs a training and sampling approach.

Through extensive exploration of future scenarios, GMS implicitly learns probabilistic distributions of good decisions, assembles these distributions according to human desires and production goals, and samples decisions. This transition from model optimization to a training and sampling approach not only addresses the computational challenges of existing manufacturing systems, but also provides the following benefits

Creativity: Incorporating noise when sampling expands the range of potential decisions. Generative models also create new decisions through the combination of learned distributions and are a key element in responding to new human queries and unexpected scenarios.

Resilience: training and sampling make the system more responsive in the face of uncertainty, sampling decisions are more efficient than optimization convergence, and provide a wide range of solutions for diverse scenarios.

Human-centricity: GMS tacit knowledge is seamlessly integrated with human inquiry, knowledge, and expertise, allowing humans to access subtle insights in the generative model. This synergy allows for more integrated and effective collaboration between humans and autonomous assets, enabling humans to leverage GMS capabilities to enhance decision making and gain a sense of ownership and job satisfaction.

Generative Models

This section describes two generative models for dynamic asset management in GMS: 1) ChatGPT is used to extract system requirements from human queries, and 2) a diffusion model is used to generate configurations for those requirements.

ChatGPT

Using OpenAI's ChatGPT API in Python and the gpt-3.5-turbo model, we created a named entity recognition task to generate key requirements from a human query. For example, a query that says "I need a production line with a capacity of at least 240 parts per hour and using no more than 9 machines" would return class c = '(240, None, 9)' as a response. The 'None' serves as a placeholder for human skills that are not explicitly stated.

Diffusion model

Diffusion models are used to generate new samples by learning the underlying patterns, features, and distributions of configurations from training data. Diffusion models differ from other machine learning models in that they generate new samples by incrementally refining noise-infused data. This process involves two processes, as shown in Figure 2.

Forward process: add noise _𝜖𝑡 until data _𝑥0 is destroyed at each step.

Backward process: step by step remove the estimated noise and sample a new _𝑥0.

Figure 2: Forward and backward processes of the diffusion model In the forward process, Gaussian noise 𝜖 ∼ 𝑁 ( 0 , 𝐼 ) is introduced at each step 𝑡 ∈ 𝑇 in the input data 𝑥0 with weights determined by the variance 𝛽𝑡 of the forward process. The backward process uses the learning model ℎ𝜃 to estimate the noise 𝜖𝑡 as a function of 𝑧𝑡, current step 𝑡, and class label 𝑐.

Learning model

Learning Model _ℎ𝜃 utilizes a U-Net structure for efficient noise estimation; the U-Net is used to facilitate information flow between pooling and inverted convolution paths. Residual convolution blocks are tuned to enhance hierarchical feature extraction and pattern recognition for matrix-format data. The introduction of skip connections seamlessly integrates learned features and contextual information across different levels of U-Net, preserving spatial features across the network.

Figure 3: U-Net architecture for noise estimation using residual convolutional blocks.

Each block has two consecutive convolution layers with batch normalization, GELU activation, and residual connections that add inputs to the output tensor, ensuring that the network learns the residual mapping.

Dreaming Process

The study introduces a "dreaming process" that uses meta-heuristics to explore potential decisions. This process generates random future scenarios for demand, human, and asset capabilities and makes corresponding configuration and scheduling decisions. It integrates genetic algorithm-inspired selection, crossover, and mutation operations to accelerate data accumulation and facilitate the generation of diverse and appropriate configurations. The dreaming process terminates after a predefined number of iterations, rather than model convergence, ensuring a balanced data set.

Result

This section describes the GMS implementation and simulation results. In this study, GMS was implemented and simulated in an industrial parts processing use case. The system assumes 9 different asset types and operational/operational settings, distributed across 7 stations to facilitate flexible cooperation. Human skill levels were randomized to high/medium/low (120/60/0 parts/hour).

Figure 4: Sampling process for configurations according to target capacity

The dreaming process randomized worker skills over 25 generations to include 40 potential configurations in each generation; Cplex was used to obtain a mapping of configurations and optimal schedules. The simulation spanned 120 runtime units and generated 120,000 data points over 15 hours for training purposes. The diffusion process and training model were implemented using Python and PyTorch. Based on optimal tuning results, the process variancewas set to_𝛽0=^10-4and 𝛽𝑇=0.02, total steps T = 400, and guidance intensity w = 2.

The diffusion model was trained through a sampling process to generate reasonable configurations for a given target capacity. As the number of steps is reduced, the sampled configurations become more rational and produce a clear layout. Rational generation relies on the skillful accumulation of tacit knowledge of key features and patterns. For example, a configuration with a capacity of 0 will show predominantly light colors in the latter part of the matrix, indicating that a particular type of asset is used only minimally. As capacity increases, a variety of asset types (darker colors) are included, increasing parallel production and operational efficiency.

Table 1: Comparison of decision-making time with other algorithms

The diffusion model resulted in decision-making times ranging from 9 ms to 16 ms across the entire specified capacity. This consistent efficiency represents a quantitative improvement over other algorithms, which typically exceed 10 seconds or in some cases 300 seconds before reaching their target capacity. The consistent efficiency of the diffusion model represents an improvement in the algorithmic efficiency of the train-sample approach and greatly enhances the responsiveness and resilience of the GMS to uncertainty.

Table 2: Model Performance with and without Guidance

To comprehensively assess the quality of the generated samples, we randomly sampled 1000 configurations and evaluated them on the following three metrics

Accuracy: Accuracy of agreement with required capacity (Accu) and Mean Square Error (MSE)

Diversity: Duplication rate (DR) of generative configurations present in the training data

Fidelity: Frechet Inception Distance (FID), which measures the perceived quality and fidelity of the generated samples compared to the distribution of the training data

The performance of the diffusion model with and without guidance is shown below. The model with guidance produced decisions with better accuracy, lower MSE, and higher diversity for the required requirements; FID scores are much lower at extreme capacities and higher at medium capacities due to the high similarity of the corresponding configurations. Overall, these higher accuracy, higher fidelity, and more diverse decisions demonstrate GMS's resilience and creativity in dealing with uncertainty and diverse goals.

Conclusion

This study introduces Generative Manufacturing Systems (GMS) that leverage the autonomy of manufacturing assets to address uncertainty, human aspirations, and emerging production goals GMS represents a paradigm shift in decision making from model optimization to training-sampling . Empirical results in industrial use cases highlight the resilience and creativity of GMS, which consistently outperforms existing approaches in terms of decision-making time, diversity, and quality.

The GMS will skillfully adjust its configuration and schedule in response to human inquiries and additional goals to facilitate human-centered decision making, allowing for collaborative exploration and continuous improvement. Future research could incorporate more complex human inquiries through embedding rather than fixed classes, exploring diverse scenarios such as diagnostics and quality control, and performance indicators such as carbon emissions and human well-being.

Categories related to this article

友安昌幸 (Masayuki Tomoyasu): JDLA G certificate 2020#2, E certificate2021#1 Japan Society of Data Scientists, DS Certificate Japan Society for Innovation Fusion, DX Certification Expert Amiko Consulting LLC, CEO