Quality Prediction in Sand Casting Using Multi-Source Heterogeneous Data

In sand casting, both the process parameters and the three-dimensional structure of the product strongly influence casting quality; unfavorable combinations lead to defects such as cold shuts, porosity, sand inclusions, and shrinkage cavities. However, complex three-dimensional geometries are difficult to characterize quantitatively, which makes it hard to relate morphology to casting quality. Traditional data-driven approaches in sand casting often neglect morphological analysis because they cannot readily handle unstructured 3D data. To address this, we use multi-source heterogeneous data, combining structured process parameters with 3D structural information, to predict defects in complex sand castings. This study aims to develop a robust quality-prediction framework that integrates feature extraction from 3D geometry with machine learning models, with the goal of improving the reliability of sand casting processes.

Sand casting remains a dominant manufacturing method, accounting for approximately 70% of casting production worldwide, due to its flexibility, short production cycles, and applicability across industries like aerospace, automotive, and machinery. The process involves several stages: sand mixing and molding, core making, metal melting, and pouring. Each stage generates structured data, such as sand properties (e.g., moisture, compactability, permeability, and green strength), melting parameters (e.g., temperature, composition), and pouring conditions (e.g., time, temperature). Additionally, the three-dimensional structure of castings, represented as voxel grids or point clouds, serves as a critical but often underutilized data source. Defects in complex castings like steering axles, swing frames, and axle housings frequently arise from interactions between process variables and geometric features, necessitating a holistic approach to quality prediction.

[Figure: examples of sand castings with intricate geometries that are prone to defects]

We begin by analyzing the sand casting process and data collection methods. The production workflow for sand casting, such as on a KW high-pressure molding line, includes:
– Sand mixing: Combining materials like coal dust, clay, fine powder, new sand, and recycled sand to form green sand, with periodic testing of properties.
– Molding: Using molds to shape sand into cavities via air pre-compaction and head compaction.
– Core making: Creating sand cores based on internal cavity designs.
– Melting: Controlling furnace charge composition to influence final properties.
– Pouring: Recording real-time data like pouring temperature and time for each ladle.

Common defects in sand casting are summarized in Table 1, highlighting their causes and impacts. This table integrates insights from process analysis and defect studies, emphasizing the need for comprehensive data integration.

Table 1: Common Defects in Sand Casting and Their Characteristics
Defect Type	Description	Primary Causes
Cold Shut	Incomplete fusion of metal streams	Low pouring temperature, improper gating
Porosity	Gas pockets within the casting	High moisture, inadequate venting
Sand Inclusion	Embedded sand particles	Poor sand strength, erosion
Shrinkage Cavity	Voids from solidification shrinkage	Inadequate feeding, design issues

For data acquisition, we implement a single-piece traceability solution that links process data to individual castings. This involves correlating records from modules like core setting, spectral analysis, spheroidization, and internal/external defect logging. Key fields such as production date, casting ID, and heat number are used to uniquely trace melting and pouring parameters. For instance, the relationship between core setting time and spectral analysis time enables precise mapping of process conditions to each casting, facilitating the collection of heterogeneous data sources. The data preprocessing step includes normalization to scale values between 0 and 1, as shown in Equation 1, where $ x $ is the original value, $ x_{\min} $ and $ x_{\max} $ are the minimum and maximum values, and $ x_{\text{norm}} $ is the normalized result:

$$ x_{\text{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} $$
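As a minimal sketch of Equation 1, the function below scales one column of process data to the range [0, 1]. The function name, the handling of a constant column, and the pouring temperatures are illustrative assumptions, not values from the study.

```python
def min_max_normalize(values):
    """Scale a list of numbers to [0, 1] following Equation 1."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        # Constant column: Equation 1 is undefined here, so this sketch
        # maps every value to 0.0 (an assumption, not the paper's rule).
        return [0.0 for _ in values]
    return [(x - x_min) / span for x in values]


# Hypothetical pouring temperatures in degrees Celsius (illustrative only).
pouring_temps = [1380.0, 1400.0, 1420.0, 1390.0]
normalized = min_max_normalize(pouring_temps)
print(normalized)  # [0.0, 0.5, 1.0, 0.25]
```

In practice the minimum and maximum would normally be taken from the training set only and reused for the test set, so that test data does not influence the scaling.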

Additionally, categorical defect labels are encoded using one-hot encoding, converting them into numerical vectors for model input.
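The traceability and label-encoding steps described above can be sketched as a keyed lookup followed by one-hot encoding. Everything concrete here is hypothetical: the field names, the record values, and the class list (taken from the four defect types in Table 1) are assumptions made for illustration; the actual label set and database schema may differ.

```python
# Defect classes taken from Table 1; the real label set may differ.
DEFECT_CLASSES = ["cold_shut", "porosity", "sand_inclusion", "shrinkage_cavity"]


def one_hot(label, classes=DEFECT_CLASSES):
    """Return a 0/1 vector with a single 1 at the position of `label`."""
    if label not in classes:
        raise ValueError(f"unknown defect label: {label!r}")
    return [1 if c == label else 0 for c in classes]


def trace_castings(castings, heats):
    """Attach melting/pouring data to each casting via (date, heat number)."""
    traced, unmatched = [], []
    for casting in castings:
        key = (casting["production_date"], casting["heat_number"])
        process = heats.get(key)
        if process is None:
            # No matching heat record: keep the ID so it can be reviewed.
            unmatched.append(casting["casting_id"])
            continue
        traced.append({
            "casting_id": casting["casting_id"],
            **process,
            "label": one_hot(casting["defect"]),
        })
    return traced, unmatched


# Hypothetical records, for illustration only.
heat_records = {
    ("2023-05-01", "H001"): {"pouring_temp_c": 1400.0, "pouring_time_s": 12.5},
    ("2023-05-01", "H002"): {"pouring_temp_c": 1385.0, "pouring_time_s": 14.0},
}
casting_records = [
    {"casting_id": "C-0001", "production_date": "2023-05-01",
     "heat_number": "H001", "defect": "porosity"},
    {"casting_id": "C-0002", "production_date": "2023-05-01",
     "heat_number": "H002", "defect": "cold_shut"},
    {"casting_id": "C-0003", "production_date": "2023-05-01",
     "heat_number": "H009", "defect": "porosity"},
]

traced, unmatched = trace_castings(casting_records, heat_records)
for row in traced:
    print(row)
print("unmatched:", unmatched)
```

Collecting unmatched castings separately, rather than dropping them silently, makes gaps in the traceability chain visible before model training.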

Central to our approach is the extraction of features from three-dimensional casting structures. We employ a 3D Deep Convolutional Autoencoder (3D-DCAE) to reduce the dimensionality of voxelized 3D models and capture essential morphological characteristics. Autoencoders are unsupervised models that reconstruct input data through an encoder-decoder architecture, learning compact representations in the hidden layer. The encoder maps input data to a latent space, while the decoder reconstructs it, with the loss function measuring the reconstruction error. For a 3D voxel grid input $ \mathbf{X} $, the encoder produces a latent representation $ \mathbf{h} $, and the decoder outputs a reconstruction $ \hat{\mathbf{X}} $. The mean squared error loss is given by Equation 2:

$$ L_{\text{recon}} = \frac{1}{N} \sum_{i=1}^{N} \| \mathbf{X}_i - \hat{\mathbf{X}}_i \|^2 $$

where $ N $ is the number of samples. Compared to linear methods like PCA, 3D-DCAE handles nonlinear relationships through convolutional layers, which scan the 3D space with kernels to preserve spatial hierarchies. Our 3D-DCAE model consists of multiple 3D convolutional layers, pooling layers, and deconvolutional layers, as outlined in Table 2, which summarizes the model architecture and parameters.

Table 2: 3D-DCAE Model Architecture and Parameters
Layer Type	Parameters	Output Shape
Input	Voxel grid (e.g., 64x64x64)	–
3D Convolution	Kernel: 3x3x3, Filters: 32	64x64x64x32
3D Max Pooling	Pool size: 2x2x2	32x32x32x32
3D Convolution	Kernel: 3x3x3, Filters: 64	32x32x32x64
3D Max Pooling	Pool size: 2x2x2	16x16x16x64
Latent Space	Fully connected	512
3D Deconvolution	Kernel: 3x3x3, Filters: 64	16x16x16x64
3D Upsampling	Size: 2x2x2	32x32x32x64
3D Deconvolution	Kernel: 3x3x3, Filters: 32	32x32x32x32
Output	3D Convolution	64x64x64x1
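The encoder rows of Table 2 can be checked with simple shape arithmetic. The sketch below assumes stride-1 convolutions with "same" padding (which is what the unchanged spatial sizes in the table imply) and non-overlapping 2x2x2 pooling; the helper names are hypothetical.

```python
def conv3d_same(shape, filters):
    """Stride-1 3D convolution with 'same' padding: only channels change."""
    depth, height, width, _ = shape
    return (depth, height, width, filters)


def max_pool3d(shape, size=2):
    """Non-overlapping pooling: each spatial dimension is divided by `size`."""
    depth, height, width, channels = shape
    return (depth // size, height // size, width // size, channels)


def num_elements(shape):
    """Total number of values in a tensor of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count


encoder = [
    ("3D Convolution, 32 filters", lambda s: conv3d_same(s, 32)),
    ("3D Max Pooling, 2x2x2", max_pool3d),
    ("3D Convolution, 64 filters", lambda s: conv3d_same(s, 64)),
    ("3D Max Pooling, 2x2x2", max_pool3d),
]

input_shape = (64, 64, 64, 1)  # single-channel voxel grid
shape = input_shape
trace = [("Input", shape)]
for name, layer in encoder:
    shape = layer(shape)
    trace.append((name, shape))

for name, s in trace:
    print(f"{name:30s} {s}")

latent_dim = 512
input_voxels = num_elements(input_shape)   # 64 * 64 * 64 = 262144
feature_values = num_elements(shape)       # 16 * 16 * 16 * 64 = 262144
compression = input_voxels // latent_dim   # 512
print("voxels per latent value:", compression)
```

Under these assumptions the last pooled feature map holds as many values as the input grid, so the reduction to 512 dimensions happens entirely in the fully connected latent layer, where each latent value stands in for 512 input voxels.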

To evaluate the feature extraction capability, we compare 3D-DCAE with a 2D convolutional autoencoder (2D-DCAE), which treats depth as channels and may therefore lose spatial continuity along that axis. The reconstruction accuracy of 3D-DCAE reaches 99.76%, outperforming 2D-DCAE, whose reconstruction loss remains higher over the training iterations. Training uses stochastic gradient descent with a learning rate of 0.01 and a mini-batch size of 64. We attribute the advantage of 3D-DCAE to its ability to capture correlations along all three spatial dimensions, whereas 2D-DCAE overlooks depth-wise relationships. This makes 3D-DCAE well suited to sand casting applications, where complex geometries such as the holes and curved surfaces of swing frames and steering axles require precise representation.
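To make the two evaluation quantities concrete, the sketch below computes the reconstruction loss of Equation 2 on flattened voxel grids, together with a voxel-wise accuracy. How reconstruction accuracy is defined is not stated here, so the thresholded voxel match used below is an assumption, as are the toy 4-voxel samples.

```python
def reconstruction_loss(originals, reconstructions):
    """Equation 2: mean over samples of the squared L2 norm of the error."""
    total = 0.0
    for x, x_hat in zip(originals, reconstructions):
        total += sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return total / len(originals)


def voxel_accuracy(originals, reconstructions, threshold=0.5):
    """Fraction of voxels whose thresholded occupancy matches the original.

    This definition is an assumption made for illustration.
    """
    matches = 0
    count = 0
    for x, x_hat in zip(originals, reconstructions):
        for a, b in zip(x, x_hat):
            matches += int((a >= threshold) == (b >= threshold))
            count += 1
    return matches / count


# Two toy samples with four voxels each (1 = occupied, 0 = empty).
originals = [[1, 0, 1, 0], [0, 0, 1, 1]]
reconstructions = [[0.9, 0.1, 0.8, 0.4], [0.2, 0.6, 0.7, 0.9]]

loss = reconstruction_loss(originals, reconstructions)
accuracy = voxel_accuracy(originals, reconstructions)
print(f"reconstruction loss: {loss:.2f}")  # 0.36
print(f"voxel accuracy: {accuracy:.3f}")   # 0.875
```

The two numbers answer different questions: the loss penalizes every deviation in proportion to its square, while the accuracy only counts voxels that end up on the wrong side of the threshold.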

Next, we develop a defect prediction model that integrates the extracted 3D features with structured process data. The model is a convolutional neural network (CNN) combined with fully connected layers, designed to handle heterogeneous inputs. The topology includes:
– Input layer: Normalized process parameters and 3D features.
– Convolutional layers: Two 1D convolutional layers with kernel size 3 to scan local regions and expand feature channels.
– Pooling layer: Max pooling to reduce dimensionality and retain salient features.
– Flattening layer: Conversion to a 1D vector.
– Output layer: Fully connected layer with five units corresponding to defect types (e.g., cold shut, porosity).
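The first stages of this topology can be illustrated in plain Python: the normalized process parameters and the 3D features are concatenated into one vector, scanned by a size-3 kernel, and then max-pooled. This is a single-channel sketch under assumed values; the kernel weights, the short vectors standing in for the 512-dimensional latent features, and the pooling size of 2 are all hypothetical, and the real layers learn many kernels to expand the feature channels.

```python
def conv1d_valid(signal, kernel, bias=0.0):
    """Slide `kernel` over `signal` without padding (cross-correlation)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(signal) - k + 1)
    ]


def max_pool1d(signal, size=2):
    """Keep the largest value in each non-overlapping window of `size`."""
    return [
        max(signal[i:i + size])
        for i in range(0, len(signal) - size + 1, size)
    ]


# Hypothetical inputs, already scaled to [0, 1].
process_params = [0.5, 0.25, 1.0]    # e.g. moisture, pouring temp, pouring time
shape_features = [0.0, 0.75, 0.5]    # stand-in for the 3D-DCAE latent vector

fused = process_params + shape_features   # input layer: simple concatenation
kernel = [1.0, 0.0, -1.0]                 # illustrative weights, not learned

conv_out = conv1d_valid(fused, kernel)
pooled = max_pool1d(conv_out, size=2)

print(conv_out)  # [-0.5, 0.25, 0.25, -0.5]
print(pooled)    # [0.25, 0.25]
```

Because the kernel spans three neighboring entries, the window centered on the boundary between the two sources mixes process parameters with shape features, which is one way local interactions between the two data types can enter the model.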

The model employs a cost-sensitive cross-entropy loss function to address class imbalance, as defined in Equation 3:

$$ L = -\sum_{c=1}^{C} w_c y_c \log(\hat{y}_c) + \lambda \| \mathbf{W} \|^2 $$

where $ C $ is the number of classes, $ y_c $ is the true label, $ \hat{y}_c $ is the predicted probability, $ w_c $ is the class weight, $ \lambda $ is the regularization parameter, and $ \mathbf{W} $ represents model weights. Training uses mini-batch gradient descent with parameters detailed in Table 3.

Table 3: Defect Prediction Model Training Parameters
Parameter	Value
Training set ratio	0.8
Test set ratio	0.2
Initial learning rate	0.01
Epochs	80
Batch size	64
Loss function	Cost-sensitive cross-entropy with L2 regularization
Optimization algorithm	Mini-batch gradient descent
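Equation 3 can be evaluated for a single sample as follows. This is a sketch rather than the training code: the inverse-frequency rule for choosing the class weights $ w_c $ is an assumption (the weighting scheme is not specified above), and the class counts and probabilities are made up for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted for numerical stability)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def inverse_frequency_weights(class_counts):
    """Assumed weighting rule: rarer classes receive larger weights."""
    total = sum(class_counts)
    n_classes = len(class_counts)
    return [total / (n_classes * count) for count in class_counts]


def cost_sensitive_loss(y_true, y_prob, class_weights, model_weights=(), lam=0.0):
    """Equation 3: weighted cross-entropy plus lam * ||W||^2."""
    eps = 1e-12  # guards against log(0)
    cross_entropy = -sum(
        w * y * math.log(max(p, eps))
        for w, y, p in zip(class_weights, y_true, y_prob)
    )
    l2_penalty = lam * sum(v * v for v in model_weights)
    return cross_entropy + l2_penalty


# Hypothetical class counts: one frequent class and two rare ones.
weights = inverse_frequency_weights([900, 50, 50])
probs = softmax([2.0, 1.0, 0.5])

loss_common = cost_sensitive_loss([1, 0, 0], [0.5, 0.25, 0.25], weights)
loss_rare = cost_sensitive_loss([0, 1, 0], [0.25, 0.5, 0.25], weights)
print(f"class weights: {[round(w, 3) for w in weights]}")
print(f"loss when a common class gets p=0.5: {loss_common:.3f}")
print(f"loss when a rare class gets p=0.5:   {loss_rare:.3f}")
```

With these weights, the same predicted probability costs far more on a rare class than on the frequent one, which is how the loss pushes the model to pay attention to under-represented defects.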

We compare our model, termed FR-CS-CNN (Feature-Enhanced Cost-Sensitive CNN), with traditional multilayer perceptron (MLP) and standard CNN models. Performance metrics on training and test sets are summarized in Table 4. Our model achieves higher accuracy, demonstrating its efficacy in leveraging multi-source data for sand casting quality prediction.

Table 4: Model Performance Comparison for Defect Prediction
Model	Training Accuracy (%)	Test Accuracy (%)
MLP	92.6	86.1
Standard CNN	93.9	90.7
FR-CS-CNN (Ours)	96.5	93.7
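Two quantities follow directly from Table 4: the gap between training and test accuracy for each model, and the test-accuracy gain over the MLP baseline. The short script below only redoes that arithmetic on the reported values; the variable names are illustrative.

```python
# (training accuracy %, test accuracy %) as reported in Table 4.
results = {
    "MLP": (92.6, 86.1),
    "Standard CNN": (93.9, 90.7),
    "FR-CS-CNN": (96.5, 93.7),
}

baseline_test = results["MLP"][1]
summary = {}
for model, (train_acc, test_acc) in results.items():
    summary[model] = {
        # Difference between training and test accuracy, in percentage points.
        "gap": round(train_acc - test_acc, 1),
        # Improvement in test accuracy over the MLP, in percentage points.
        "gain_over_mlp": round(test_acc - baseline_test, 1),
    }

for model, stats in summary.items():
    print(f"{model:13s} gap={stats['gap']:.1f}  gain={stats['gain_over_mlp']:.1f}")
```

FR-CS-CNN has both the highest test accuracy (7.6 points above the MLP) and the smallest train-test gap (2.8 points, against 6.5 for the MLP and 3.2 for the standard CNN).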

The integration of 3D features and process parameters enables our model to capture complex interactions, such as how geometric hotspots influence defect formation in sand casting. For example, in steering axles, regions of high curvature may be prone to shrinkage cavities because of their solidification patterns, and the model can learn such correlations from the combined inputs. This approach offers a scalable route to industrial use, where real-time quality prediction can reduce scrap rates and improve the efficiency of sand casting processes.

In conclusion, we address the challenges of characterizing complex three-dimensional structures in sand casting by developing a multi-source heterogeneous data-driven framework. Key contributions include:
– Analyzing sand casting processes and collecting structured and 3D data.
– Constructing a 3D-DCAE for effective feature extraction from casting geometries.
– Building a defect prediction model that integrates process and morphological data for accurate quality assessment.

This research underscores the importance of leveraging advanced data analytics in sand casting to enhance product quality and operational reliability. Future work could explore real-time implementation and expansion to other casting methods, further solidifying the role of data-driven approaches in modern manufacturing.