Intelligent Inspection of Casting Part Surface Defects: A Deep Learning and Machine Vision Approach

The quality assurance of casting parts is a cornerstone of modern manufacturing, directly impacting the structural integrity, performance, and safety of final products across industries from automotive to aerospace. Traditionally, the inspection of surface defects on casting parts has relied heavily on manual visual examination. This method, while historically necessary, is fraught with significant limitations that compromise quality control standards. Human inspectors are subject to visual fatigue, leading to inconsistent judgments, high rates of missed defects (false negatives) and false alarms (false positives), and a lack of standardized, quantifiable criteria. The subjective nature of this process makes it difficult to achieve repeatable and traceable results, which are essential for advanced manufacturing and supply chain accountability. This paper addresses these critical challenges by presenting the design and application of an intelligent inspection system for casting part surface defects, leveraging the synergistic power of machine vision and deep learning. The system aims to automate the detection process, thereby enhancing precision, efficiency, and robustness in industrial settings.

To establish a technical foundation, it is essential to understand the common anomalies that afflict casting parts. Surface defects in a casting part arise from complex interactions during the solidification process, including thermal stress, chemical reactions, gas entrapment, and mold interactions. These defects can be systematically categorized based on their formation mechanism. A clear classification aids in developing targeted detection algorithms, as different defects possess distinct visual signatures in terms of morphology, contrast, and texture. The following table summarizes the primary types of surface defects encountered in casting parts.

Table 1: Common Surface Defect Types in Casting Parts
| Defect Category | Specific Defects | Primary Causes | Typical Morphology & Impact |
|---|---|---|---|
| Solidification Defects | Cracks, Cold Shuts, Misruns | Thermal stress, improper filling, low metal fluidity | Linear or irregular fissures; incomplete filling. Severely reduces mechanical strength. |
| Shrinkage Defects | Shrinkage Porosity, Shrinkage Cavities | Uncompensated volumetric shrinkage during solidification | Irregular, often internal cavities that may open to the surface. Creates stress concentrators. |
| Gas-Related Defects | Gas Porosity, Pinholes | Entrapment of dissolved gases (e.g., hydrogen, nitrogen) released during solidification | Spherical or elongated smooth-walled voids. Reduces effective load-bearing area. |
| Mold-Related Defects | Sand Inclusions (Sand Holes) | Erosion or breakdown of the mold/core material | Irregular cavities filled with sand. Creates localized weakness and surface roughness. |
| Inclusion Defects | Slag Inclusions | Entrapment of non-metallic oxides or fluxes from the melting process | Irregular, dark-colored inclusions within or just below the surface. |

The core technological solution proposed here rests on image recognition. The first and often critical step in any machine vision pipeline is image preprocessing, specifically noise reduction. Raw images captured in an industrial environment are invariably corrupted by noise from uneven lighting, sensor imperfections, and electromagnetic interference. This noise can obscure subtle defect features and drastically reduce the performance of subsequent segmentation and classification algorithms. Therefore, selecting an appropriate denoising filter is paramount. We conducted a comparative analysis of three fundamental spatial filters: Mean, Median, and Gaussian. Their mathematical formulations and characteristics are as follows:

1. Mean Filter: This linear filter operates by replacing the intensity of each pixel with the average (mean) intensity of its neighbors within a defined kernel window of size \( n \times n \).

$$
g(x, y) = \frac{1}{n^2} \sum_{i=-k}^{k} \sum_{j=-k}^{k} f(x+i, y+j)
$$

Where \( g(x, y) \) is the filtered image, \( f(x, y) \) is the original noisy image, and \( k = \frac{n-1}{2} \). While computationally simple and effective for Gaussian noise, it tends to blur edges, which is detrimental for preserving the sharp boundaries of cracks or inclusions in a casting part.

2. Median Filter: A non-linear filter that replaces a pixel’s value with the median value from its neighborhood. This is highly effective against “salt-and-pepper” noise, common in digital imaging, while preserving edge sharpness.

$$
g(x, y) = \text{median} \{ f(x+i, y+j) \}, \quad i,j \in [-k, k]
$$

Its edge-preserving property is crucial for maintaining the integrity of defect contours during later analysis stages.

3. Gaussian Filter: Another linear filter that uses a weighted average based on a Gaussian (bell-shaped) kernel. Pixels closer to the center contribute more to the final value.

$$
G(x, y) = \frac{1}{2\pi\sigma^2} e^{-\frac{x^2+y^2}{2\sigma^2}}
$$

$$
g(x, y) = G(x, y) * f(x, y)
$$

Where \( \sigma \) is the standard deviation controlling the spread (blur extent). It is optimal for suppressing Gaussian noise but, like the mean filter, causes edge smoothing.
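For reference, the discrete kernel implied by \( G(x, y) \) can be built and normalized in a few lines. This is a minimal NumPy sketch; the 5×5 size and \( \sigma = 1 \) are illustrative choices, not values from the study.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """Discrete, normalized Gaussian kernel sampled from G(x, y)."""
    k = size // 2
    x, y = np.meshgrid(np.arange(-k, k + 1), np.arange(-k, k + 1))
    g = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return g / g.sum()  # normalize so the weights sum to 1

kernel = gaussian_kernel(5, 1.0)
# Filtering is then a 2-D convolution of the image with this kernel;
# the center weight is largest and weights fall off radially.
```

Normalizing the sampled kernel keeps the overall image brightness unchanged after filtering.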

For the specific context of casting part inspection, where impulse noise is prevalent and edge definition is critical, the median filter was selected as the optimal preprocessing method. Its ability to suppress noise without significantly degrading defect boundaries provides a superior foundation for the subsequent, more complex image segmentation task. The comparative performance is summarized below.
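The behavioral difference is easy to demonstrate. The following NumPy sketch applies a 3×3 median and mean filter to a single "salt" pixel; it is a minimal illustration of the selection rationale, not the production implementation.

```python
import numpy as np

def _neighborhood_stack(img):
    """Stack the nine shifted views that form each pixel's 3x3 neighborhood,
    using edge replication at the borders."""
    padded = np.pad(img, 1, mode="edge")
    return np.stack([padded[i:i + img.shape[0], j:j + img.shape[1]]
                     for i in range(3) for j in range(3)])

def median_filter3(img):
    """3x3 median filter (non-linear order statistic)."""
    return np.median(_neighborhood_stack(img), axis=0)

def mean_filter3(img):
    """3x3 mean filter (linear averaging)."""
    return _neighborhood_stack(img).mean(axis=0)

# Uniform grey patch corrupted by one impulse ("salt") pixel.
img = np.full((5, 5), 100.0)
img[2, 2] = 255.0

med = median_filter3(img)
avg = mean_filter3(img)
# The median filter removes the impulse entirely (median of 8x100 and
# 1x255 is 100), while the mean filter smears it into the neighborhood.
```

The same asymmetry holds along a crack edge: the median follows the majority of the neighborhood, so the boundary stays sharp, whereas the mean blends the two sides.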

Table 2: Comparative Analysis of Denoising Filters for Casting Part Images
| Filter Type | Mathematical Principle | Advantages | Disadvantages for Casting Inspection | Suitability |
|---|---|---|---|---|
| Mean Filter | Linear averaging | Simple, fast, good for Gaussian noise | Blurs edges excessively, reduces defect contrast | Low |
| Median Filter | Non-linear order statistics | Excellent at removing salt-and-pepper noise; preserves edges | Less effective against Gaussian noise | High (selected) |
| Gaussian Filter | Weighted averaging (Gaussian kernel) | Optimal for Gaussian noise, tunable with σ | Causes edge blurring; may suppress fine defect details | Medium |

Following preprocessing, the most critical step is image segmentation—isolating potential defect regions from the background of the casting part surface. This is a challenging task due to factors like low contrast, textured backgrounds, and varying defect shapes. We investigated and compared three distinct algorithmic approaches, ranging from classical computer vision to modern deep learning.

1. Adaptive Threshold Segmentation: This classical method calculates a local threshold for each pixel based on the mean intensity of its surrounding region, making it adaptable to uneven illumination. The threshold \( T(x, y) \) for a pixel is often computed as:

$$
T(x, y) = \mu(x, y) + k \cdot \sigma(x, y) + C
$$

Where \( \mu(x, y) \) and \( \sigma(x, y) \) are the local mean and standard deviation, \( k \) is a weighting constant, and \( C \) is a global bias. While fast and parameter-light, it often struggles with the complex, low-contrast textures of defects like subtle cracks, resulting in fragmented or noisy segmentation.
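A direct, brute-force implementation of this thresholding rule is sketched below in NumPy. The window size and the values of \( k \) and \( C \) are illustrative, not from the study, and the sketch assumes defects appear darker than the surrounding surface (so defect pixels fall below the local threshold); clarity is favored over speed.

```python
import numpy as np

def adaptive_threshold(img, win=15, k=-0.5, C=-5.0):
    """Pixel-wise threshold T = mu + k*sigma + C from local statistics.

    win, k, and C are illustrative parameters; with k and C negative,
    pixels darker than their neighborhood (dark defects on a brighter
    casting surface) fall below T and are flagged.
    """
    pad = win // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    H, W = img.shape
    mu = np.empty((H, W))
    sigma = np.empty((H, W))
    # Brute-force sliding window: clear, not fast (use box filters in practice).
    for y in range(H):
        for x in range(W):
            patch = padded[y:y + win, x:x + win]
            mu[y, x] = patch.mean()
            sigma[y, x] = patch.std()
    T = mu + k * sigma + C
    return img < T  # binary defect mask
```

On a synthetic bright surface with a dark crack line, the mask picks out the crack pixels while leaving uniform background untouched; on real low-contrast textures this is exactly where the method starts to fragment, as noted above.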

2. Canny Edge Detector with Region Growing: This hybrid method first uses the Canny algorithm to detect strong edges. The Canny operator involves gradient calculation using Sobel kernels:

$$
G_x = \begin{bmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{bmatrix} * I, \quad
G_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ +1 & +2 & +1 \end{bmatrix} * I
$$

Gradient magnitude and direction are then \( G = \sqrt{G_x^2 + G_y^2} \) and \( \theta = \operatorname{atan2}(G_y, G_x) \). After non-maximum suppression and hysteresis thresholding, a clean edge map is produced. These edges then serve as seeds or boundaries for a region-growing algorithm, which aggregates connected pixels satisfying a homogeneity criterion (e.g., intensity within a range \( T \) of the region mean \( \mu_R \)): \( |I(p) - \mu_R| < T \). This method works well for defects with strong, closed contours but is sensitive to noise in the initial edge detection and requires careful tuning of multiple parameters (Gaussian blur \( \sigma \), the Canny thresholds, and the growing tolerance \( T \)).
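The region-growing half of the hybrid can be sketched compactly. In a full pipeline the seed points would come from the Canny edge map (e.g., OpenCV's `cv2.Canny`); here the seed, tolerance, and 4-connectivity are illustrative choices, and the criterion \( |I(p) - \mu_R| < T \) is checked against the running region mean.

```python
import numpy as np
from collections import deque

def region_grow(img, seed, tol):
    """Grow a region from `seed` by breadth-first search, adding
    4-connected pixels whose intensity stays within `tol` of the
    running region mean (the homogeneity criterion |I(p) - mu_R| < tol)."""
    H, W = img.shape
    mask = np.zeros((H, W), dtype=bool)
    mask[seed] = True
    total, count = float(img[seed]), 1
    frontier = deque([seed])
    while frontier:
        y, x = frontier.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < H and 0 <= nx < W and not mask[ny, nx]:
                if abs(img[ny, nx] - total / count) < tol:
                    mask[ny, nx] = True
                    total += img[ny, nx]
                    count += 1
                    frontier.append((ny, nx))
    return mask

# Bright surface with a dark 3x3 defect blob; seed inside the blob.
img = np.full((10, 10), 200.0)
img[3:6, 3:6] = 50.0
mask = region_grow(img, (4, 4), tol=30.0)
# The region stops at the 200-valued background, recovering the 9 blob pixels.
```

The tolerance `tol` plays the role of \( T \) above; too small and the region fragments on texture, too large and it leaks into the background, which is the parameter sensitivity the comparison flags.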

3. Improved U-Net Network (Deep Learning): This approach employs a convolutional neural network (CNN) architecture specifically designed for biomedical and industrial image segmentation. The standard U-Net features a symmetric encoder-decoder path with skip connections. Our improved version incorporates additional mechanisms like residual connections and attention gates within the skip connections to better focus on relevant defect features and suppress irrelevant background information from the casting part surface. The network is trained end-to-end by minimizing a loss function, typically a combination of cross-entropy and Dice loss, which measures the overlap between the predicted segmentation map \( Y_{pred} \) and the ground truth \( Y_{true} \):

$$
\mathcal{L} = -\frac{1}{N} \sum_{i=1}^{N} [Y_{true}^{(i)} \log(Y_{pred}^{(i)}) + (1 - Y_{true}^{(i)}) \log(1 - Y_{pred}^{(i)})] + \lambda \left(1 - \frac{2 \sum Y_{true} \cdot Y_{pred}}{\sum Y_{true} + \sum Y_{pred}}\right)
$$

Once trained, the network can directly infer a pixel-wise segmentation mask from an input image. It demonstrates superior capability in handling the diverse and complex appearance of defects, offering high completeness and clarity in the extracted regions. The comparative strengths and weaknesses of these segmentation methods are detailed in the following table.
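The combined loss is straightforward to express numerically. The NumPy sketch below implements the cross-entropy plus Dice formulation given above on flattened prediction and label arrays; in training this would be written in the deep-learning framework's tensor ops, and the `eps` smoothing term is an added numerical-stability convention, not part of the paper's formula.

```python
import numpy as np

def bce_dice_loss(y_pred, y_true, lam=1.0, eps=1e-7):
    """Binary cross-entropy plus lambda-weighted Dice loss.

    y_pred: predicted foreground probabilities in [0, 1].
    y_true: binary ground-truth mask.
    eps guards against log(0) and division by zero (an implementation
    convention added here for stability).
    """
    p = np.clip(y_pred, eps, 1.0 - eps)
    bce = -np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))
    dice = 1.0 - 2.0 * np.sum(y_true * y_pred) / (
        np.sum(y_true) + np.sum(y_pred) + eps)
    return bce + lam * dice

# A near-perfect prediction yields a loss close to zero; an inverted
# prediction is penalized by both terms.
y = np.array([0.0, 1.0, 1.0, 0.0])
low = bce_dice_loss(np.array([0.01, 0.99, 0.99, 0.01]), y)
high = bce_dice_loss(np.array([0.90, 0.10, 0.10, 0.90]), y)
```

The Dice term directly rewards region overlap, which counteracts the class imbalance typical of defect masks (few foreground pixels against a large background).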

Table 3: Comparison of Segmentation Algorithms for Casting Part Defects
| Algorithm | Core Principle | Advantages | Disadvantages | Performance on Casting Parts |
|---|---|---|---|---|
| Adaptive Threshold | Local statistical thresholding | Very fast, simple to implement, adaptive to lighting | Poor on low-contrast/textured defects; high noise sensitivity | Fragmented output, many false positives |
| Canny + Region Growing | Edge detection followed by region aggregation | Good for sharp-edged defects; provides closed contours | Parameter-sensitive; fails on blurry or faint edges; computationally heavier | Disconnected edges, incomplete growth, noise-prone |
| Improved U-Net | Deep convolutional encoder-decoder network | Automatic feature learning; high accuracy and completeness; robust to noise and variation | Requires a large labeled dataset; high computational cost for training | Excellent: clear, complete, continuous defect segmentation with minimal background noise |

Based on the comparative analysis, the improved U-Net network was selected as the core segmentation engine due to its unmatched performance in target integrity and edge clarity for casting part defect inspection. The complete system architecture integrates this choice into a cohesive, four-stage pipeline designed for industrial deployment.

The intelligent inspection system for casting part defects follows a modular workflow: Image Acquisition, Preprocessing, Segmentation, and Defect Recognition & Classification. In the acquisition stage, high-resolution industrial area-scan or line-scan cameras capture detailed images of the casting part surface under controlled, uniform lighting. The preprocessing stage converts the image to grayscale and applies the selected median filter to suppress noise. The segmentation stage employs the trained, improved U-Net model to generate a binary mask highlighting all potential defect pixels. Finally, in the recognition stage, connected-component analysis labels individual defect blobs in the mask. For each blob, morphological and geometric features are extracted, such as area, perimeter, bounding-box dimensions, eccentricity, and Hu moments. These feature vectors are then fed into a classifier (a simpler machine-learning model such as an SVM, or additional CNN layers) to categorize the defect as, for example, 'Gas Pore', 'Crack', or 'Sand Inclusion'. The system's output includes annotated images and structured reports containing defect type, location, size, and severity for each inspected casting part.
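The recognition stage's blob labeling and feature extraction can be sketched as follows. In practice one would use `cv2.connectedComponentsWithStats` and `cv2.HuMoments`; this pure-NumPy stand-in labels 4-connected components and computes a reduced feature set (area, bounding box, aspect ratio) for illustration only.

```python
import numpy as np
from collections import deque

def blob_features(mask):
    """Label 4-connected components in a binary mask and return simple
    geometric features per blob (a reduced stand-in for the area,
    perimeter, eccentricity, and Hu-moment features described above)."""
    H, W = mask.shape
    labels = np.zeros((H, W), dtype=int)
    feats = []
    for sy in range(H):
        for sx in range(W):
            if mask[sy, sx] and labels[sy, sx] == 0:
                lbl = len(feats) + 1
                labels[sy, sx] = lbl
                q = deque([(sy, sx)])
                ys, xs = [sy], [sx]
                while q:  # breadth-first flood fill of one component
                    y, x = q.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < H and 0 <= nx < W
                                and mask[ny, nx] and labels[ny, nx] == 0):
                            labels[ny, nx] = lbl
                            ys.append(ny)
                            xs.append(nx)
                            q.append((ny, nx))
                h = max(ys) - min(ys) + 1
                w = max(xs) - min(xs) + 1
                feats.append({"area": len(ys),
                              "bbox": (min(ys), min(xs), h, w),
                              "aspect": max(h, w) / min(h, w)})
    return feats

# Elongated crack-like blob vs. compact pore-like blob: the aspect ratio
# alone already separates them, which is why such features feed the classifier.
mask = np.zeros((8, 8), dtype=bool)
mask[1, 1:7] = True
mask[5:7, 2:4] = True
feats = blob_features(mask)
```

A high aspect ratio suggests a crack, a near-unity ratio a gas pore; the full system augments these with scale-invariant descriptors such as Hu moments before classification.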

The practical validation of this system in an industrial setting confirmed its significant advantages over manual inspection. The following table quantifies the system’s key performance metrics based on extensive testing with a diverse set of casting parts.

Table 4: Industrial Performance Metrics of the Proposed Inspection System
| Performance Metric | System Result | Traditional Manual Benchmark | Improvement Factor / Notes |
|---|---|---|---|
| Overall detection accuracy | 96.2% | ~80% (estimated, highly variable) | Significant and consistent enhancement in reliability |
| Average processing time per part | 0.8 s | ~4-5 s (visual scan + decision) | >5x faster, enabling inline inspection |
| Defect-wise detection rate (recall) | Gas pores: 97.3%; cracks: 95.8%; sand inclusions: 96.5%; slag inclusions: 95.2% | Highly inconsistent, prone to misses, especially for small or low-contrast defects | High and stable recall for all major defect types |
| False positive rate | 1.7% | Variable, can be high due to overcautious inspection | Low and controlled, reducing unnecessary rework |
| False negative (miss) rate | ≤ 2.1% | Can exceed 10-15% under fatigue | Drastically reduced, enhancing quality assurance |
| System uptime / robustness | > 99.5%; accuracy drift < 1% over a 24 h run | Limited by worker shifts and concentration | Enables continuous, unattended operation |

In conclusion, the developed intelligent inspection system, centered on an improved U-Net segmentation network and robust median filter preprocessing, effectively addresses the long-standing challenges in casting part quality control. It transitions the inspection process from a subjective, variable manual task to an objective, quantifiable, and highly efficient automated operation. The system’s demonstrated high accuracy, speed, and robustness provide a concrete and practical reference for the intelligent upgrading of the foundry industry. Future work will focus on further optimizing the deep learning model’s efficiency for deployment on edge computing devices, expanding the defect database to cover even rarer anomalies, and integrating the system with robotic arms for automated rejection or marking of defective casting parts, thereby closing the loop on a fully intelligent production cell.
