Defect Prediction in Sand Casting Using Convolutional Neural Networks with Feature Redistribution and Cost-Sensitive Learning

In the context of the Fourth Industrial Revolution characterized by informatization and intelligence, the manufacturing industry, particularly in casting processes, faces significant challenges in quality control and defect prediction. As a researcher focused on digital transformation in casting, I address the persistent issues in sand casting, where defects such as sand holes, gas pores, shrinkage cavities, and cold shuts often arise due to complex, coupled process parameters. These defects lead to substantial losses in production efficiency and resource utilization, especially for critical casting parts used in engineering machinery. Traditional methods struggle to uncover intrinsic relationships between process parameters and defect occurrences, exacerbated by imbalanced datasets where defect samples are rare compared to non-defect ones. This imbalance hinders effective data-driven modeling, making it difficult to predict and prevent defects in casting parts accurately.

To overcome these challenges, I propose a novel defect prediction method based on feature redistribution and cost-sensitive learning within a convolutional neural network (CNN) framework. This approach aims to enhance the model’s ability to capture weak correlations among process parameters and mitigate bias towards majority classes. The methodology is designed to handle the unique characteristics of casting process data, ensuring robust predictions for casting parts. The core idea involves optimizing the spatial arrangement of feature vectors to improve feature extraction and incorporating a cost-sensitive regularization term to adjust the loss function for imbalanced data. The resulting model, termed Feature Redistribution Cost-Sensitive Convolutional Neural Network (FR-CS-CNN), demonstrates superior performance in predicting defects for casting parts like steering axles, revolving frames, and bridge shells made of QT450-10 material.

The casting process involves multiple stages, including sand mixing, molding, metal melting, and pouring, each contributing numerous parameters that influence the quality of casting parts. From an industrial dataset, I collected 6,267 samples with 18 key process parameters, as summarized in Table 1. These parameters range from sand properties like compaction rate and shear strength to metal composition elements such as carbon (C) and magnesium (Mg), as well as pouring conditions like temperature and weight. The dataset exhibits severe class imbalance, with 5,111 non-defect samples and only 1,156 defect samples across four types, making defect prediction for casting parts a challenging task.

Table 1: Range of Process Parameters for Casting Parts Defect Prediction

| Parameter | Lower Bound | Upper Bound | Unit |
|---|---|---|---|
| Compaction Rate | 35.07 | 48.82 | % |
| Shear Strength | 2 | 6 | kPa |
| Old Sand Temperature | 33.4 | 48.8 | °C |
| Old Sand Moisture | 1.38 | 2.38 | % |
| Bentonite Content | 19.9 | 33.2 | % |
| Mixed Soil Content | 9.8 | 19.7 | % |
| New Sand Content | 0 | 40 | % |
| C Content | 3.61 | 3.85 | % |
| Si Content | 2.6 | 2.92 | % |
| Mn Content | 0.38 | 0.66 | % |
| P Content | 0.013 | 0.047 | % |
| S Content | 0.006 | 0.018 | % |
| Mg Content | 0.034 | 0.057 | % |
| Al Content | 0.017 | 0.054 | % |
| Pouring Temperature | 1385 | 1415 | °C |
| Pouring Weight | 128 | 145 | kg |
| Pouring Time | 11.9 | 30.5 | s |
| Inoculation Amount | 24 | 92 | g |

The first step in my method is feature redistribution, which optimizes the one-dimensional arrangement of the 18 process parameters to enhance feature extraction by convolutional kernels. In standard CNNs, features are concatenated arbitrarily, which may lead to convolutional operations combining strongly correlated parameters while ignoring weakly correlated ones. For casting parts, weak correlations can be crucial for defect prediction, as defects often arise from subtle interactions among process factors. I use Pearson correlation coefficients to assess pairwise relationships among parameters. Let the correlation matrix be denoted as R, where each element \( r_{ij} \) represents the correlation between feature \( i \) and feature \( j \). To prioritize weak correlations, I transform R into a weight matrix W, with weights that decrease as correlation strength increases: \( w_{ij} = 1 - |r_{ij}| \). Higher weights indicate weaker correlations, which are more valuable for feature combination analysis.
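The transformation from correlations to weights can be sketched in a few lines of NumPy. This is a minimal illustration rather than the original implementation: the function name `correlation_weights` and the synthetic four-feature data are assumptions introduced here, and the real input would be the 6,267 × 18 parameter matrix.

```python
import numpy as np

def correlation_weights(X):
    """Return (R, W) for a samples-by-features array X, with W = 1 - |R|."""
    R = np.corrcoef(X, rowvar=False)   # Pearson correlation between columns
    W = 1.0 - np.abs(R)                # weak correlation -> weight close to 1
    return R, W

# Synthetic stand-in data (NOT the casting dataset): 200 samples, 4 features.
# Feature 1 is almost a copy of feature 0; features 2 and 3 are independent noise.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
X = np.column_stack([
    a,
    2.0 * a + 0.01 * rng.normal(size=200),
    rng.normal(size=200),
    rng.normal(size=200),
])
R, W = correlation_weights(X)
print(np.round(W, 3))
```

The strongly correlated pair (features 0 and 1) receives a weight near 0, while pairs of unrelated features receive weights near 1.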

Next, I generate all possible pairs of the 18 features, resulting in 153 pairwise combinations (18 × 17 / 2). Each pair is assigned a weight based on W, and the pairs are sorted in ascending order of weight, so that the most strongly correlated pairs are handled first and dispersed. Using a greedy algorithm, I iteratively adjust the spatial positions of features in the one-dimensional vector to maximize the distance between strongly correlated features and minimize the distance between weakly correlated ones. This redistribution ensures that convolutional kernels scan combinations of weakly related parameters, thereby capturing more informative patterns for defect prediction in casting parts. The redistribution process can be formulated as an optimization problem over all orderings of the features: because a higher weight means a weaker correlation, placing weakly correlated features side by side corresponds to maximizing the sum of weights for adjacent features in the vector. Mathematically, for a feature vector \( f = [f_1, f_2, \ldots, f_{18}] \), the objective is to find a permutation π that maximizes:

$$ \sum_{k=1}^{17} w_{\pi(k), \pi(k+1)} $$

where \( w_{\pi(k), \pi(k+1)} \) is the weight between feature at position k and k+1 in the permuted vector. The greedy algorithm provides an approximate solution by sequentially selecting pairs with the smallest weights and ensuring their features are placed far apart. This approach significantly improves the model’s ability to learn from complex interactions in casting process data.
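One simple way to build such an ordering is a nearest-neighbour style chain, sketched below. It is a substitute heuristic chosen for brevity, not the pair-sorting procedure described above: it starts from the most weakly correlated pair and repeatedly appends the unplaced feature that is most weakly correlated with the current tail. The function names and the toy weight matrix are hypothetical.

```python
import numpy as np

def greedy_order(W):
    """Chain features so each new neighbour is the weakest-correlated one left."""
    n = W.shape[0]
    masked = W.astype(float).copy()
    np.fill_diagonal(masked, -np.inf)              # ignore self-pairs
    i, j = np.unravel_index(np.argmax(masked), masked.shape)
    order = [int(i), int(j)]                       # start from the highest-weight pair
    remaining = set(range(n)) - set(order)
    while remaining:
        tail = order[-1]
        # Highest weight to the tail wins; ties go to the lowest feature index.
        nxt = max(remaining, key=lambda k: (W[tail, k], -k))
        order.append(nxt)
        remaining.remove(nxt)
    return order

def adjacent_weight_sum(W, order):
    """Sum of w over neighbouring positions in the ordering."""
    return float(sum(W[a, b] for a, b in zip(order, order[1:])))

# Toy 4-feature weight matrix (hypothetical values): features 0 and 1 are
# strongly correlated (w = 0.1), and so are features 2 and 3 (w = 0.2).
W = np.array([[0.0, 0.1, 0.9, 0.8],
              [0.1, 0.0, 0.7, 0.9],
              [0.9, 0.7, 0.0, 0.2],
              [0.8, 0.9, 0.2, 0.0]])
order = greedy_order(W)
print(order, adjacent_weight_sum(W, order), adjacent_weight_sum(W, [0, 1, 2, 3]))
```

A larger adjacent-weight sum means that neighbouring positions hold more weakly correlated features. In this toy case the greedy chain separates both strongly correlated pairs, whereas the original ordering keeps them side by side.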

The second step involves cost-sensitive learning to address class imbalance in the dataset. Since defect samples are scarce for casting parts, standard models tend to favor the majority non-defect class, reducing sensitivity to defects. I introduce a cost matrix C that assigns different misclassification costs based on the true and predicted classes. For a five-class problem (four defects and one non-defect), the cost matrix is designed to penalize misclassifications of defect classes more heavily. Let the classes be indexed as: 0 for cold shut, 1 for gas pore, 2 for sand hole, 3 for shrinkage cavity, and 4 for non-defect. The cost matrix C is defined as:

$$ C = \begin{bmatrix} c_{00} & c_{01} & c_{02} & c_{03} & c_{04} \\ c_{10} & c_{11} & c_{12} & c_{13} & c_{14} \\ c_{20} & c_{21} & c_{22} & c_{23} & c_{24} \\ c_{30} & c_{31} & c_{32} & c_{33} & c_{34} \\ c_{40} & c_{41} & c_{42} & c_{43} & c_{44} \end{bmatrix} $$

where \( c_{ij} \) represents the cost of predicting class j when the true class is i. To reflect data imbalance, I set costs inversely proportional to class frequencies. Specifically, for defect classes (0-3), misclassification costs are higher to ensure the model pays more attention to them. The cost values are computed as:

$$ c_{ij} = \frac{N_{\text{total}}}{N_i} \cdot \alpha_{ij} $$

where \( N_{\text{total}} \) is the total number of samples, \( N_i \) is the number of samples in true class i, and \( \alpha_{ij} \) is a scaling factor (typically 1 for i ≠ j and 0 for i = j). This matrix is incorporated into the loss function via a regularization term. The standard cross-entropy loss is modified as follows:

$$ L(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \sum_{c=0}^{4} y_{ic} \log(\hat{y}_{ic}) + \lambda \cdot \frac{1}{N} \sum_{i=1}^{N} \sum_{c=0}^{4} C_{t_i, c} \cdot \hat{y}_{ic} $$

Here, \( N \) is the number of samples, \( y_{ic} \) is an indicator variable (1 if sample i belongs to class c, 0 otherwise), \( \hat{y}_{ic} \) is the predicted probability for class c (the Softmax output of the network), \( \lambda \) is a regularization coefficient (set to 10 in this study), \( t_i \) is the true class index of sample i, and \( C_{t_i, c} \) is the cost value from matrix C for predicting class c given that true class. The second term is therefore the average expected misclassification cost under the predicted distribution. This adjustment forces the model to consider the asymmetric costs of misclassification, thereby improving defect detection for casting parts.
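The cost matrix and the modified loss can be sketched together in NumPy. This is an illustration under stated assumptions: the Softmax term is read as the predicted class probabilities, the class counts are hypothetical (the per-defect counts are not listed in the text), and the function names `cost_matrix` and `cost_sensitive_loss` are introduced here.

```python
import numpy as np

def cost_matrix(class_counts):
    """c_ij = (N_total / N_i) * alpha_ij, with alpha_ij = 1 off-diagonal and 0 on it."""
    counts = np.asarray(class_counts, dtype=float)
    alpha = 1.0 - np.eye(len(counts))
    return (counts.sum() / counts)[:, None] * alpha

def cost_sensitive_loss(probs, labels, C, lam=10.0):
    """Cross-entropy plus lam times the mean expected misclassification cost."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    n = len(labels)
    # Cross-entropy on the probability assigned to the true class.
    ce = -np.mean(np.log(probs[np.arange(n), labels] + 1e-12))
    # Row C[t_i] holds the costs of each prediction for true class t_i.
    expected_cost = np.mean(np.sum(C[labels] * probs, axis=1))
    return ce + lam * expected_cost

# Hypothetical class counts for classes 0-4 (class 4 = non-defect, the majority).
C = cost_matrix([10, 20, 30, 40, 100])

# A missed defect (true class 0 predicted as non-defect) versus a false alarm
# (true non-defect predicted as class 0), both made with full confidence.
missed_defect = cost_sensitive_loss([[0, 0, 0, 0, 1]], [0], C)
false_alarm = cost_sensitive_loss([[1, 0, 0, 0, 0]], [4], C)
print(missed_defect, false_alarm)
```

With these counts, a row for a rare class carries a cost of 20 per misclassification against 2 for the majority class, so the missed defect is penalized far more heavily than the false alarm.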

Based on these innovations, I construct the FR-CS-CNN model for defect prediction in casting parts. The network architecture uses a one-dimensional CNN to process the redistributed feature vector of length 18. This choice avoids the fully connected coupling in traditional multilayer perceptrons (MLPs) and emphasizes local feature interactions. The model consists of multiple convolutional layers with ReLU activation functions, followed by pooling layers for dimensionality reduction, and fully connected layers for classification. The output layer has five neurons, one per class (four defect types plus non-defect), with a Softmax activation to produce probability distributions. Training uses backpropagation with the modified loss function and an optimizer such as Adam to update the weights. The model is trained on 70% of the dataset and tested on the remaining 30%, with data augmentation techniques applied to defect classes to further address imbalance.
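To make the data flow concrete, the forward pass of a single convolution, pooling, and dense stage can be sketched in NumPy. Only the input length (18) and the number of outputs (5) come from the description above; the kernel size, number of kernels, pooling width, single-stage depth, and random weights are all hypothetical placeholders, and no training is shown.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def softmax(z):
    e = np.exp(z - np.max(z))          # shift for numerical stability
    return e / e.sum()

def forward(x, kernels, dense_w, dense_b, pool=2):
    """Conv1d (valid) -> ReLU -> max-pool -> dense -> softmax for one sample."""
    windows = sliding_window_view(x, kernels.shape[1])    # (positions, kernel_size)
    fmap = np.maximum(windows @ kernels.T, 0.0)           # (positions, n_kernels)
    usable = (fmap.shape[0] // pool) * pool               # drop a trailing odd position
    pooled = fmap[:usable].reshape(-1, pool, fmap.shape[1]).max(axis=1)
    return softmax(pooled.ravel() @ dense_w + dense_b)

rng = np.random.default_rng(0)
n_features, n_classes = 18, 5          # from the text
kernel_size, n_kernels = 3, 4          # hypothetical sizes
x = rng.normal(size=n_features)        # stand-in for one redistributed sample
kernels = rng.normal(size=(n_kernels, kernel_size))
n_positions = n_features - kernel_size + 1   # each kernel sees 3 adjacent features
dense_w = rng.normal(size=((n_positions // 2) * n_kernels, n_classes))
dense_b = np.zeros(n_classes)
probs = forward(x, kernels, dense_w, dense_b)
print(n_positions, probs.round(3))
```

Because each kernel only sees a short window of adjacent positions, the ordering of the 18 features determines which parameter combinations the first layer can respond to, which is why the redistribution step matters.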

To evaluate the effectiveness of FR-CS-CNN, I compare it with baseline models such as MLP and standard CNN. The performance metrics include overall accuracy, precision, recall, and F1 score for each defect class. The training and testing results are summarized in Table 2, which shows that FR-CS-CNN achieves the highest training accuracy, testing accuracy, and average recall, while the standard CNN reports higher average precision and F1 score. Specifically, the overall accuracy on the test set reaches 93.67%, compared to 90.71% for CNN and 86.10% for MLP. This represents an improvement of 2.96 percentage points over CNN and 7.57 percentage points over MLP, demonstrating the value of feature redistribution and cost-sensitive learning for defect prediction in casting parts.

Table 2: Performance Comparison of Models for Casting Parts Defect Prediction

| Model | Training Accuracy (%) | Testing Accuracy (%) | Precision (Avg) | Recall (Avg) | F1 Score (Avg) |
|---|---|---|---|---|---|
| MLP | 90.56 | 86.10 | 0.6901 | 0.8024 | 0.7375 |
| CNN | 93.89 | 90.71 | 0.8324 | 0.8986 | 0.8623 |
| FR-CS-CNN | 96.53 | 93.67 | 0.7930 | 0.9018 | 0.8380 |
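For reference, the per-class metrics reported here can all be derived from a confusion matrix. The sketch below assumes rows are true classes and columns are predicted classes, and it uses a small hypothetical two-class matrix rather than the study's results.

```python
import numpy as np

def per_class_metrics(cm):
    """Precision, recall and F1 per class (rows = true, columns = predicted).

    Assumes every class is predicted at least once and occurs at least once,
    so that no denominator is zero.
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    precision = tp / cm.sum(axis=0)    # of everything predicted as class k
    recall = tp / cm.sum(axis=1)       # of everything truly in class k
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical confusion matrix: class 0 = defect, class 1 = non-defect.
cm = np.array([[8, 2],
               [1, 9]])
precision, recall, f1 = per_class_metrics(cm)
accuracy = np.trace(cm) / cm.sum()
print(precision.round(4), recall.round(4), f1.round(4), accuracy)
```

Averaging each array over the classes gives macro-averaged figures of the kind shown in the Precision (Avg), Recall (Avg), and F1 Score (Avg) columns, assuming those columns are unweighted means.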

Further analysis using confusion matrices reveals that FR-CS-CNN reduces false negatives for defect classes, which is critical in industrial settings where missing a defect in casting parts can lead to costly failures. The F1 scores for individual classes are presented in Table 3: FR-CS-CNN obtains the highest F1 score for cold shuts, gas pores, and sand holes, while the standard CNN scores higher for shrinkage cavities and marginally higher for the non-defect class. FR-CS-CNN maintains high recall for defects, ensuring that most defective casting parts are identified. For instance, the recall for gas pores reaches 0.9231, compared to 0.6933 for MLP, indicating a significant enhancement in sensitivity.

Table 3: Detailed F1 Scores for Defect Classes in Casting Parts

| Class | MLP F1 Score | CNN F1 Score | FR-CS-CNN F1 Score |
|---|---|---|---|
| Cold Shut | 0.6179 | 0.8070 | 0.8254 |
| Gas Pore | 0.6500 | 0.8553 | 0.8781 |
| Sand Hole | 0.7468 | 0.7796 | 0.8372 |
| Shrinkage Cavity | 0.7273 | 0.7391 | 0.6875 |
| Non-Defect | 0.9446 | 0.9650 | 0.9617 |

Beyond model evaluation, I utilize the FR-CS-CNN model to analyze defect evolution patterns in casting parts. This involves studying the influence of individual process parameters and their interactions on defect probabilities. For single-factor analysis, I vary one parameter at a time while keeping others at mean values, and observe changes in defect probabilities predicted by the model. The results are summarized in Table 4, which shows key thresholds for defect occurrence. For example, shear strength below 2.7 kPa increases sand hole probability, while pouring temperature below 1390°C elevates risks of cold shuts and gas pores. These insights help optimize process settings to minimize defects in casting parts.

Table 4: Single-Factor Analysis for Defect Probabilities in Casting Parts

| Parameter | Defect Type | Critical Range | Probability Change |
|---|---|---|---|
| Shear Strength | Sand Hole | < 2.7 kPa | Increase by 40% |
| C Content | Cold Shut | > 3.83% | Increase by 30% |
| Pouring Temperature | Cold Shut/Gas Pore | < 1390°C | Increase by 35% |
| Mg Content | Shrinkage Cavity | > 0.05% | Increase by 25% |
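The single-factor procedure (vary one parameter, hold the others at their means, read off the predicted probabilities) can be sketched as follows. The trained network is replaced by a hypothetical stand-in, `toy_predict_proba`, and the data are synthetic, so the numbers printed are illustrative only.

```python
import numpy as np

def single_factor_sweep(predict_proba, X, feature_idx, n_points=50):
    """Vary one feature over its observed range with all others held at their mean."""
    base = X.mean(axis=0)
    grid = np.linspace(X[:, feature_idx].min(), X[:, feature_idx].max(), n_points)
    batch = np.tile(base, (n_points, 1))    # one row per grid value
    batch[:, feature_idx] = grid
    return grid, predict_proba(batch)

def toy_predict_proba(batch):
    """Hypothetical stand-in for the trained model: defect risk falls as feature 0 rises."""
    p_defect = 1.0 / (1.0 + np.exp(-(3.0 - batch[:, 0])))
    return np.column_stack([p_defect, 1.0 - p_defect])

# Synthetic data: 100 samples, 3 features, feature 0 spanning roughly 2 to 6.
rng = np.random.default_rng(0)
X = rng.uniform(2.0, 6.0, size=(100, 3))
grid, probs = single_factor_sweep(toy_predict_proba, X, feature_idx=0)
print(grid[:3].round(2), probs[:3, 0].round(3))
```

Thresholds such as those in Table 4 would then be read from where the swept probability curve rises sharply.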

For dual-factor analysis, I examine interactions between pairs of parameters, such as shear strength and compaction rate, C and Mg content, and pouring temperature and inoculation amount. The defect probabilities are modeled as surfaces, with optimal ranges identified to avoid defects in casting parts. Using mathematical formulations, I derive safe operating zones. For instance, the interaction between shear strength (SS) and compaction rate (CR) can be described by a polynomial function:

$$ P(\text{defect}) = \beta_0 + \beta_1 \cdot SS + \beta_2 \cdot CR + \beta_3 \cdot SS \cdot CR + \beta_4 \cdot SS^2 + \beta_5 \cdot CR^2 $$

where \( P(\text{defect}) \) is the probability of gas pore occurrence, and \( \beta_i \) are coefficients learned from the model. The analysis reveals that high gas pore probabilities occur in two regions: (SS ∈ (2, 3.1) kPa, CR ∈ (45.2, 48)%) and (SS ∈ (2.8, 4.6) kPa, CR ∈ (39.3, 42.9)%). Similarly, for C and Mg content, the probability of shrinkage cavity is given by:

$$ P(\text{shrinkage}) = \gamma_0 + \gamma_1 \cdot C + \gamma_2 \cdot Mg + \gamma_3 \cdot C \cdot Mg $$

with \( \gamma_0 = 0.1, \gamma_1 = 0.3, \gamma_2 = 0.2, \gamma_3 = 0.05 \). This indicates that both high C and high Mg independently increase shrinkage risks, but their interaction is minimal. For pouring temperature (PT) and inoculation amount (IA), the cold shut probability is modeled as:

$$ P(\text{cold shut}) = \delta_0 + \delta_1 \cdot PT + \delta_2 \cdot IA + \delta_3 \cdot PT \cdot IA $$

with safe zones identified where PT > 1390°C and IA ∈ (20, 70) g. Outside this range, defects escalate, emphasizing the need for precise control in producing casting parts.
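A surface of the quadratic form used for shear strength and compaction rate can be fitted to sampled model outputs by ordinary least squares. The sketch below is one plausible way to obtain such coefficients, not necessarily the procedure used in the study; the helper names and the synthetic surface values (which stand in for model-predicted probabilities) are assumptions.

```python
import numpy as np

def quadratic_design(ss, cr):
    """Columns: 1, SS, CR, SS*CR, SS^2, CR^2 (matching beta_0 ... beta_5)."""
    ss = np.asarray(ss, dtype=float)
    cr = np.asarray(cr, dtype=float)
    return np.column_stack([np.ones_like(ss), ss, cr, ss * cr, ss**2, cr**2])

def fit_surface(ss, cr, p):
    """Least-squares estimate of beta_0 ... beta_5 from sampled surface values."""
    beta, *_ = np.linalg.lstsq(quadratic_design(ss, cr), p, rcond=None)
    return beta

# Grid over the parameter ranges from Table 1 (SS: 2-6 kPa, CR: 35.07-48.82 %).
ss_grid, cr_grid = np.meshgrid(np.linspace(2, 6, 15), np.linspace(35.07, 48.82, 15))
ss, cr = ss_grid.ravel(), cr_grid.ravel()

# Synthetic surface values generated from made-up coefficients.
true_beta = np.array([0.5, -0.1, 0.01, 0.002, 0.01, -0.0001])
p = quadratic_design(ss, cr) @ true_beta

beta = fit_surface(ss, cr, p)
fitted = quadratic_design(ss, cr) @ beta
print(beta.round(4))
```

In practice `p` would hold the network's predicted defect probability at each grid point, and the fitted surface would only approximate it; centering or scaling SS and CR before fitting would improve numerical conditioning.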

To further quantify these relationships, I perform regression analysis on the model outputs, resulting in coefficients that guide process optimization. Table 5 summarizes the key dual-factor interactions and their impact on defect probabilities for casting parts. This table provides actionable insights for manufacturers to adjust parameters and reduce defect rates.

Table 5: Dual-Factor Interaction Effects on Defect Probabilities in Casting Parts

| Parameter Pair | Defect Type | Optimal Range | Probability Reduction |
|---|---|---|---|
| Shear Strength & Compaction Rate | Gas Pore | Avoid (2-3.1 kPa, 45.2-48%) and (2.8-4.6 kPa, 39.3-42.9%) | Up to 50% |
| C Content & Mg Content | Shrinkage Cavity | C < 3.83%, Mg < 0.05% | Up to 40% |
| Pouring Temperature & Inoculation Amount | Cold Shut/Gas Pore | PT > 1390°C, IA ∈ (20, 70) g | Up to 55% |

The FR-CS-CNN model not only predicts defects but also serves as a diagnostic tool for root cause analysis in casting parts production. By leveraging the model’s sensitivity to parameter changes, I can simulate various scenarios and recommend adjustments. For instance, if sand holes are detected, increasing shear strength above 2.7 kPa and optimizing compaction rate can mitigate the issue. Similarly, for cold shuts, ensuring pouring temperature exceeds 1390°C and controlling inoculation amount within 20-70 g are effective strategies. These recommendations are derived from the model’s probabilistic outputs, which correlate with physical mechanisms in sand casting.
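The thresholds quoted above can be collected into a simple rule check for a proposed process setting. This is only a convenience sketch built from the ranges stated in the text and in Tables 4 and 5: the function name `risk_flags` is hypothetical, the treatment of boundary values is an assumption, and such a check would complement rather than replace the model's probabilistic output.

```python
def risk_flags(ss_kpa, cr_pct, pt_c, ia_g):
    """Return the threshold-based warnings triggered by one process setting."""
    flags = []
    if ss_kpa < 2.7:
        flags.append("sand hole: shear strength below 2.7 kPa")
    if pt_c < 1390:
        flags.append("cold shut / gas pore: pouring temperature below 1390 degC")
    if not (20 < ia_g < 70):
        flags.append("cold shut: inoculation amount outside 20-70 g")
    # The two high gas pore probability regions for (shear strength, compaction rate).
    in_region_a = 2 < ss_kpa < 3.1 and 45.2 < cr_pct < 48
    in_region_b = 2.8 < ss_kpa < 4.6 and 39.3 < cr_pct < 42.9
    if in_region_a or in_region_b:
        flags.append("gas pore: shear strength / compaction rate in a high-risk region")
    return flags

safe = risk_flags(ss_kpa=5.0, cr_pct=44.0, pt_c=1400, ia_g=50)
risky = risk_flags(ss_kpa=2.5, cr_pct=46.0, pt_c=1380, ia_g=80)
print(safe)
print(risky)
```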

In conclusion, the proposed FR-CS-CNN method significantly advances defect prediction for casting parts by addressing feature arrangement and class imbalance challenges. The feature redistribution technique enhances the capture of weak correlations among process parameters, while cost-sensitive learning prioritizes defect classes, leading to higher recall and overall accuracy. The model achieves 93.67% testing accuracy, outperforming conventional approaches, and provides valuable insights into defect evolution through single and dual-factor analyses. Future work will focus on integrating real-time data streams from casting production lines, expanding the model to handle more defect types, and incorporating explainable AI techniques to enhance interpretability for industrial practitioners. This research contributes to the digital transformation of casting industries, enabling proactive quality control and reducing waste in manufacturing casting parts.

From a broader perspective, the methodology can be adapted to other manufacturing domains with similar data characteristics, such as forging or welding, where defect prediction is critical. The use of CNNs with customized preprocessing steps offers a flexible framework for industrial applications. By continuously refining the model with new data, it can evolve to address emerging challenges in casting parts production, supporting sustainable and efficient manufacturing practices. The integration of this model into digital twin systems could further enable predictive maintenance and optimization, paving the way for smarter foundries in the era of Industry 4.0.
