A Data-Driven Approach for Metal Casting Defects Prediction Using Feature Redistribution and Cost-Sensitive Convolutional Neural Networks

In the context of the Fourth Industrial Revolution, characterized by informatization and intelligent manufacturing, the manufacturing industry must accelerate digital transformation to overcome challenges and achieve quality improvements. Metal casting, particularly sand casting, is a core process in producing components for construction machinery, involving complex steps such as molding, core-making, melting, pouring, and cooling. The numerous process parameters and the couplings among them make it difficult to identify intrinsic relationships between process variables and quality outcomes. Moreover, datasets from enterprises are often imbalanced, with defect samples scarce compared to non-defect ones, which hinders effective analysis of how casting defects evolve. This study addresses these issues by proposing a convolutional neural network-based defect prediction method that integrates feature redistribution and cost-sensitive learning, offering a novel approach for process optimization in industrial settings.

Data-driven prediction of metal casting defects has been a focal point for researchers worldwide. For instance, multilayer perceptron networks have been employed to enhance quality prediction systems, while artificial neural networks have analyzed parameters such as melt composition and pouring speed to forecast defect probabilities in continuous casting. Error-weighted deep neural networks have also been developed to reduce the cost losses arising from prediction errors. Other studies have used backpropagation neural networks to identify key factors behind defects such as broken cores, and transfer learning with ResNet has improved fault diagnosis accuracy. These efforts highlight the potential of machine learning for tackling metal casting defects, but challenges remain in handling imbalanced data and capturing weak feature correlations.

This research focuses on predicting common metal casting defects—such as sand holes, blowholes, shrinkage cavities, and cold shuts—in engineering components like steering axles, revolving frames, and axle housings made of QT450-10 material. Despite controlled process parameters, defects persist due to multi-factor couplings, complicating quality management. We collected key process data from sand mixing, molding, metal melting, and pouring stages, encompassing 18 parameters. After data cleaning, a dataset of 6,267 samples was obtained, with parameter ranges summarized in Table 1.

Table 1: Process Parameter Variation Ranges

| Parameter | Lower Bound | Upper Bound |
|---|---|---|
| Compaction Rate (%) | 35.07 | 48.82 |
| Shear Strength (kPa) | 2 | 6 |
| Old Sand Temperature (°C) | 33.4 | 48.8 |
| Old Sand Moisture (%) | 1.38 | 2.38 |
| Bentonite (%) | 19.9 | 33.2 |
| Mixed Soil (%) | 9.8 | 19.7 |
| New Sand (%) | 0 | 40 |
| C (%) | 3.61 | 3.85 |
| Si (%) | 2.6 | 2.92 |
| Mn (%) | 0.38 | 0.66 |
| P (%) | 0.013 | 0.047 |
| S (%) | 0.006 | 0.018 |
| Mg (%) | 0.034 | 0.057 |
| Al (%) | 0.017 | 0.054 |
| Pouring Temperature (°C) | 1385 | 1415 |
| Pouring Weight (kg) | 128 | 145 |
| Pouring Time (s) | 11.9 | 30.5 |
| Inoculation Amount (g) | 24 | 92 |

To model the prediction of metal casting defects, we utilized a one-dimensional convolutional neural network (1D-CNN). The arrangement of features in the input vector, however, significantly affects the convolution kernel's ability to extract information: simply concatenating features end-to-end may cause strongly correlated features to be overemphasized while weak correlations are overlooked. We therefore implemented feature redistribution to optimize the spatial arrangement of features, placing weakly correlated parameters adjacent to one another to enhance feature learning. The redistribution process calculates the Pearson correlation coefficient matrix for the 18 parameters and converts it into a weight matrix in which higher correlations correspond to lower weights. The 153 pairwise parameter combinations are sorted by weight, and a greedy algorithm iteratively adjusts the one-dimensional distances between parameters to approach a globally favorable distribution. This approach mitigates the risk of missing subtle interactions that contribute to metal casting defects.
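The redistribution step can be sketched as follows. This is a minimal illustration, assuming the cleaned process data sits in a pandas DataFrame with one column per parameter; the paper sorts all 153 pairs by weight, whereas this sketch applies a simpler greedy rule that chains parameters by repeatedly appending the least-correlated remaining one, so it should be read as an approximation of the described procedure rather than the exact algorithm.

```python
import pandas as pd

def redistribute_features(df: pd.DataFrame) -> list:
    """Greedy reordering sketch: place weakly correlated parameters next to each other."""
    corr = df.corr(method="pearson").abs()   # Pearson correlation matrix (18 x 18)
    weight = 1.0 - corr                      # higher correlation -> lower weight
    remaining = set(df.columns)

    # Start from the parameter with the largest total weight, i.e. the one that is
    # on average least correlated with the others.
    current = weight.sum(axis=1).idxmax()
    order = [current]
    remaining.remove(current)

    # Repeatedly append the remaining parameter with the highest weight relative to
    # the one just placed, so weakly correlated features end up adjacent.
    while remaining:
        nxt = max(remaining, key=lambda col: weight.loc[current, col])
        order.append(nxt)
        remaining.remove(nxt)
        current = nxt
    return order

# Hypothetical usage: reorder the columns before feeding the 1D-CNN.
# df = pd.read_csv("casting_process_data.csv")            # placeholder file name
# column_order = redistribute_features(df[PROCESS_COLUMNS])
# X = df[column_order].to_numpy().reshape(-1, 1, 18)       # (N, 1, 18) input tensor
```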

The dataset imbalance, with 5,111 non-defect samples and only 1,156 defect samples across four types, poses a significant challenge. Minority class features are easily ignored by models, leading to biased predictions. To address this, we incorporated cost-sensitive learning by modifying the loss function with a regularization term that assigns different costs based on class distribution. The revised loss function is defined as:

$$ L_\theta = -\frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{C} y_{ic} \log(\hat{y}_{ic}) + \lambda \sum_{i=1}^{N} \sum_{c=1}^{C} M_{\text{index}_i,\,c} \, \hat{y}_{ic} $$

where \( N \) is the number of samples, \( C \) is the number of classes, \( y_{ic} \) is an indicator (1 if sample \( i \)'s true label is \( c \), 0 otherwise), \( \hat{y}_{ic} \) is the softmax probability the network assigns to class \( c \) for sample \( i \), \( \lambda \) is a regularization coefficient (typically set to 10), and \( M_{\text{index}_i,\,c} \) is the cost incurred by predicting class \( c \) when the true class of sample \( i \) is \( \text{index}_i \). The cost matrix is designed from the inverse of class frequencies so that misclassifications of minority classes are penalized more heavily. For example, misclassifying a shrinkage cavity as a sand hole incurs a cost proportional to the ratio of the two class sample counts. This adjustment shifts the decision criterion from merely selecting the maximum probability to weighing cost-adjusted outcomes, thereby improving sensitivity to metal casting defects.
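A PyTorch sketch of this loss is given below. The construction of `cost_matrix` from inverse class frequencies with a zero diagonal, and the per-sample averaging of the penalty term, are our assumptions for illustration; the formula above sums over samples.

```python
import torch
import torch.nn.functional as F

def cost_sensitive_loss(logits, targets, cost_matrix, lam=10.0):
    """Cross-entropy plus a cost-weighted penalty (sketch of the loss above).

    logits:      (N, C) raw network outputs
    targets:     (N,)   integer true-class labels
    cost_matrix: (C, C) tensor where cost_matrix[true, pred] is the cost of
                 predicting `pred` when the true class is `true` (zero diagonal)
    lam:         regularization coefficient lambda
    """
    ce = F.cross_entropy(logits, targets)      # -1/N * sum_i sum_c y_ic * log(y_hat_ic)
    probs = F.softmax(logits, dim=1)           # y_hat_ic
    # Expected misclassification cost: each predicted probability weighted by the
    # cost of that prediction given the sample's true class (averaged per sample
    # here for scale; the formula above sums over samples).
    penalty = (cost_matrix[targets] * probs).sum(dim=1).mean()
    return ce + lam * penalty

# Illustrative cost matrix from inverse class frequencies (symbolic counts):
# counts = torch.tensor([n_shrink, n_sand, n_blow, n_cold, n_nondefect], dtype=torch.float)
# cost_matrix = (1.0 / counts).repeat(5, 1)   # cost of predicting class c ~ 1 / frequency of c
# cost_matrix.fill_diagonal_(0.0)             # correct predictions carry no cost
```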

Integrating feature redistribution and cost-sensitive learning, we constructed the Feature Redistribution Cost-Sensitive Convolutional Neural Network (FR-CS-CNN) for predicting metal casting defects. The model architecture, depicted in Figure 4, comprises input, convolutional, pooling, fully connected, and output layers. Unlike traditional multilayer perceptrons, the 1D-CNN performs local feature fusion through convolution kernels, avoiding the full connectivity that can obscure weak correlations. The network includes multiple convolutional layers with ReLU activation, max-pooling for dimensionality reduction, and a softmax output layer for five-class classification (four defect types and non-defect). Training used the Adam optimizer with early stopping to prevent overfitting.
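A minimal PyTorch sketch of such a 1D-CNN is shown below. The channel widths, kernel sizes, and layer counts are illustrative assumptions rather than the exact configuration of Figure 4.

```python
import torch.nn as nn

class FRCSCNN(nn.Module):
    """Sketch of a 1D-CNN for five-class defect prediction on 18 redistributed features."""

    def __init__(self, n_features: int = 18, n_classes: int = 5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool1d(2),                                  # length 18 -> 9
            nn.Conv1d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool1d(2),                                  # length 9 -> 4
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * (n_features // 4), 64), nn.ReLU(),
            nn.Linear(64, n_classes),                         # logits; softmax applied in the loss
        )

    def forward(self, x):        # x: (N, 1, n_features), columns in redistributed order
        return self.classifier(self.features(x))
```

In training, the logits of this network would be passed to the cost-sensitive loss sketched above and optimized with Adam under early stopping, as described in the text.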

We compared FR-CS-CNN with baseline models, a multilayer perceptron (MLP) and a standard convolutional neural network (CNN), using accuracy, loss, confusion matrices, precision, recall, and F1 scores. The MLP achieved 90.56% training accuracy and 86.10% testing accuracy, while the CNN reached 93.89% and 90.71%, respectively. The FR-CS-CNN model outperformed both, with 96.53% training accuracy and 93.67% testing accuracy, an improvement of 2.96 percentage points over the CNN and 7.57 percentage points over the MLP. Although the loss value for FR-CS-CNN was higher because of the cost-sensitive regularization term, the model effectively reduced the prediction risk associated with metal casting defects. The confusion matrices in Figure 6 show that FR-CS-CNN achieved higher recall for defect classes, minimizing false negatives that could lead to overlooked defects in production. Detailed evaluation metrics are provided in Table 2.

Table 2: Model Evaluation Metrics on Test Set

| Model | Metric | Shrinkage Cavity | Sand Hole | Blowhole | Cold Shut | Non-Defect |
|---|---|---|---|---|---|---|
| MLP | Precision | 0.6512 | 0.7108 | 0.6118 | 0.5067 | 0.9720 |
| MLP | Recall | 0.8235 | 0.7867 | 0.6933 | 0.7917 | 0.9187 |
| MLP | F1 Score | 0.7273 | 0.7468 | 0.6500 | 0.6179 | 0.9446 |
| CNN | Precision | 0.7391 | 0.8214 | 0.8442 | 0.7931 | 0.9641 |
| CNN | Recall | 0.8462 | 0.9000 | 0.9231 | 0.8667 | 0.9448 |
| CNN | F1 Score | 0.7891 | 0.8591 | 0.8821 | 0.8284 | 0.9544 |
| FR-CS-CNN | Precision | 0.5789 | 0.7826 | 0.8372 | 0.7879 | 0.9792 |
| FR-CS-CNN | Recall | 0.7391 | 0.7419 | 0.8667 | 0.8214 | 0.9660 |
| FR-CS-CNN | F1 Score | 0.6491 | 0.7618 | 0.8518 | 0.8043 | 0.9725 |
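Per-class figures of the kind reported in Table 2 can be produced with scikit-learn; the sketch below assumes a trained PyTorch model and held-out test tensors, which are placeholders rather than artifacts released with the paper.

```python
import torch
from sklearn.metrics import classification_report, confusion_matrix

CLASS_NAMES = ["Shrinkage Cavity", "Sand Hole", "Blowhole", "Cold Shut", "Non-Defect"]

def evaluate(model, X_test, y_test):
    """Print the confusion matrix and per-class precision/recall/F1 (cf. Table 2).

    model:  trained FR-CS-CNN producing logits of shape (N, 5)
    X_test: float tensor of shape (N, 1, 18) in the redistributed column order
    y_test: integer labels 0-4 matching CLASS_NAMES
    """
    model.eval()
    with torch.no_grad():
        y_pred = model(X_test).argmax(dim=1).cpu().numpy()
    print(confusion_matrix(y_test, y_pred))
    print(classification_report(y_test, y_pred, target_names=CLASS_NAMES, digits=4))
```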

To analyze the evolution of metal casting defects, we conducted single-factor and dual-factor analyses using the FR-CS-CNN model. For single factors, we examined shear strength, carbon content, and pouring temperature, reading the probability of each defect from the softmax output as the parameter was varied. Low shear strength increased the probability of sand holes; keeping shear strength above roughly 2.7 kPa is recommended for prevention. High carbon content elevated the risk of cold shuts, suggesting that carbon be kept below 3.83%. Pouring temperatures below 1390°C raised the likelihood of cold shuts and blowholes, emphasizing the need for timely pouring after melt treatment.
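A sketch of such a single-factor sweep follows, assuming a trained model, a baseline parameter vector (for example the column medians, scaled the same way as the training data), and the redistributed feature order; `SHEAR_IDX` in the usage comment is a hypothetical index.

```python
import numpy as np
import torch

def single_factor_sweep(model, baseline, feature_idx, values):
    """Class probabilities as one parameter varies while the rest stay fixed.

    baseline:    (18,) array of reference values, preprocessed like the training data
    feature_idx: position of the swept parameter in the redistributed order
    values:      1D array of candidate values for that parameter
    Returns an array of shape (len(values), 5) of softmax probabilities.
    """
    samples = np.tile(baseline, (len(values), 1))
    samples[:, feature_idx] = values
    x = torch.tensor(samples, dtype=torch.float32).unsqueeze(1)   # (M, 1, 18)
    model.eval()
    with torch.no_grad():
        probs = torch.softmax(model(x), dim=1)
    return probs.numpy()

# Hypothetical usage: sweep shear strength over its range and read the sand-hole column.
# probs = single_factor_sweep(model, baseline, SHEAR_IDX, np.linspace(2.0, 6.0, 50))
```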

Dual-factor analyses explored interactions between parameters. For shear strength and compaction rate, the probability of blowholes increased significantly in specific ranges: compaction rate between 45.2% and 48% with shear strength between 2 kPa and 3.1 kPa, and compaction rate between 39.3% and 42.9% with shear strength between 2.8 kPa and 4.6 kPa. Avoiding these ranges can reduce defect risks. For carbon and magnesium content, cold shuts were prominent at high carbon levels, while shrinkage cavities rose with magnesium above 0.05%, indicating minimal coupling effects. For pouring temperature and inoculation amount, defects were minimized within inoculation amounts of 20–70 g and pouring temperatures above 1390°C. Outside this, cold shuts dominated at low temperatures, and blowholes at excessive inoculation.
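The dual-factor analysis extends the same idea to a two-dimensional grid of two parameters. The sketch below reuses the assumptions of the single-factor version (trained model, scaled baseline vector, redistributed feature order).

```python
import numpy as np
import torch

def dual_factor_sweep(model, baseline, idx_a, idx_b, vals_a, vals_b):
    """Class probabilities over a 2D grid of two parameters, others held at baseline."""
    grid_a, grid_b = np.meshgrid(vals_a, vals_b)
    samples = np.tile(baseline, (grid_a.size, 1))
    samples[:, idx_a] = grid_a.ravel()
    samples[:, idx_b] = grid_b.ravel()
    x = torch.tensor(samples, dtype=torch.float32).unsqueeze(1)   # (M, 1, 18)
    model.eval()
    with torch.no_grad():
        probs = torch.softmax(model(x), dim=1)
    # Reshape to (len(vals_b), len(vals_a), 5) for contour plots of each defect class.
    return probs.numpy().reshape(grid_a.shape + (5,))
```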

In conclusion, this study addresses the challenges of predicting metal casting defects in sand casting processes by proposing a convolutional neural network enhanced with feature redistribution and cost-sensitive learning. The FR-CS-CNN model achieved a prediction accuracy of 93.67%, outperforming baseline models by significant margins. The feature redistribution ensured effective capture of weak correlations, while cost-sensitive learning mitigated data imbalance issues. Analysis of process parameters provided actionable insights for defect prevention, contributing to improved quality control in metal casting industries. Future work could explore real-time applications and integration with digital twin technologies for dynamic process optimization.
