In modern steel production, the continuous casting process is a critical link that directly influences product quality and operational efficiency. In our work on metallurgical process optimization, we have observed that surface slag inclusion defects in cast slabs are a persistent issue: they lead to downstream rolled-plate defects such as peeling and inclusions, reduce metal yield, and can even cause breakout accidents. The hereditary nature of slag inclusion defects means that even minor imperfections can propagate through subsequent processing, underscoring the need for accurate online prediction. Traditional methods, such as manual visual inspection or offline sampling, are inefficient, subjective, and lack real-time capability. Machine vision-based approaches exist, but they suffer from high false-negative rates due to interference from oxide scale and mold flux. Therefore, in this study, we explore machine learning-based prediction models to address these limitations, with a focus on developing a robust framework that balances training accuracy and testing performance for practical application.
The core challenge in slag inclusion defect prediction lies in the imbalance between model fitting and generalization capabilities. Existing models often achieve high training accuracy but fail to maintain reliable performance in unseen test data, primarily due to the lack of systematic parameter optimization. To tackle this, we investigate various algorithms, including Support Vector Machine (SVM), Random Forest (RF), and Adaptive Boosting (AdaBoost), for building prediction models. Our dataset, derived from a slab quality analysis and monitoring system, includes 60 influencing factors—such as converter blowing parameters, refining conditions, steel composition, and continuous casting variables—collected and matched temporally to each slab. After preprocessing with the Z′-Score method for outlier handling and oversampling to address data imbalance, we train and test these models extensively. We find that SVM offers a superior balance, prompting us to develop a PSO-optimized SVM model. This paper details our methodology, results, and insights, emphasizing the relationship between training and testing accuracy to enable parameter optimization without prior knowledge of test outcomes, thereby enhancing the detection of slag inclusion defects and improving product quality stability.
Slag inclusion defects refer to non-metallic slag entrapped on the slab surface, distributed irregularly and originating from sources like refining slag, refractory materials, mold flux, and deoxidation products. The entrapment often occurs due to vortexing and uneven flow at the meniscus region. Key factors influencing slag inclusion defect formation include process parameters from steelmaking to casting, such as oxygen blowing time, ladle furnace stirring duration, steel chemical composition (e.g., [C], [Si], [Al]), mold level fluctuations, casting speed, and mold thermal behavior. In our work, we leverage a slab quality analysis system to acquire high-frequency data for these factors, aligning them with each slab via spatiotemporal matching models. This integration allows us to compute aggregated features (e.g., means) per slab, forming a comprehensive dataset for model development.

Data preprocessing is crucial for model reliability. Our dataset, comprising 194 slabs with slag inclusion defects and 669 defect-free slabs, exhibits imbalances and anomalies. We handle missing values through deletion or mean imputation and identify outliers using the Z′-Score method, defined as: $$Z' = \frac{x_i - M(x)}{1.4826 \times D},$$ where \(M(x)\) is the median and \(D\) is the median absolute deviation, \(D = \text{median}(|x_i - M(x)|)\). Values with \(|Z'| > 3\) are considered outliers and replaced with boundary values (e.g., \(M(x) \pm 3 \times 1.4826 \times D\)). For imbalance correction, we apply oversampling to the minority class (slag-containing slabs), expanding it fourfold based on correlation stability analysis. This ensures that key features, like submerged entry nozzle insertion depth, maintain consistent relationships with slag inclusion defect occurrence.
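The outlier rule above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the plant implementation: the function name `robust_clip` and the sample casting-speed readings are hypothetical, and the sketch assumes outliers are replaced by the nearest boundary value.

```python
from statistics import median

MAD_SCALE = 1.4826  # scaling constant from the Z'-score definition
Z_LIMIT = 3.0       # outlier cutoff used in the text


def robust_clip(values, z_limit=Z_LIMIT):
    """Replace values whose |Z'| exceeds z_limit with the nearest boundary value."""
    m = median(values)
    d = median([abs(x - m) for x in values])  # median absolute deviation
    if d == 0:
        return list(values)  # no spread: Z' is undefined, leave data unchanged
    half_width = z_limit * MAD_SCALE * d
    lower, upper = m - half_width, m + half_width
    return [min(max(x, lower), upper) for x in values]


# Hypothetical casting-speed readings (m/min) with one spurious spike.
speeds = [1.20, 1.22, 1.19, 1.21, 1.23, 1.20, 3.50]
clipped = robust_clip(speeds)
print(clipped)
```

Because the median and the median absolute deviation are barely affected by the spike itself, the boundary stays close to the bulk of the data, which is the reason for preferring this rule over a mean-and-standard-deviation Z-score.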
For modeling, we employ three algorithms: SVM, RF, and AdaBoost. SVM is particularly effective for binary classification tasks like slag inclusion defect prediction. We use a class-weighted SVM to address imbalance, with the optimization problem formulated as: $$\min_{\omega,b,\xi} \frac{1}{2} \| \omega \|^2 + C_+ \sum_{y_i=+1} \xi_i + C_- \sum_{y_i=-1} \xi_i,$$ subject to \(y_i(\omega^T x_i + b) \geq 1 - \xi_i\) and \(\xi_i \geq 0\) for \(i=1,2,\dots,n\), with labels \(y_i \in \{+1, -1\}\). Here, \(\omega\) is the weight vector, \(b\) is the bias, \(\xi_i\) are slack variables, and \(C_+\) and \(C_-\) are penalty parameters for the positive (slag inclusion defect) and negative (defect-free) classes, respectively. RF and AdaBoost serve as benchmarks, with RF building an ensemble of decision trees and AdaBoost combining weak learners adaptively.
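To make the role of \(C_+\) and \(C_-\) concrete, the sketch below evaluates the class-weighted objective for a fixed linear decision function. It is an illustration under stated assumptions: the model in this study uses an RBF kernel, whereas the sketch uses the linear case for readability, and the toy data, weights, and function name `weighted_svm_objective` are hypothetical.

```python
def weighted_svm_objective(w, b, X, y, c_pos, c_neg):
    """Primal objective of a linear class-weighted soft-margin SVM.

    Labels must be +1 (slag inclusion defect) or -1 (defect-free).
    """
    penalty = 0.0
    for x_i, y_i in zip(X, y):
        margin = y_i * (sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b)
        slack = max(0.0, 1.0 - margin)  # smallest xi_i satisfying both constraints
        penalty += (c_pos if y_i == 1 else c_neg) * slack
    return 0.5 * sum(w_j * w_j for w_j in w) + penalty


# Hypothetical two-feature toy data: two defect slabs, two defect-free slabs.
X = [[2.0, 1.0], [0.5, 0.5], [-1.0, -1.0], [0.2, 0.1]]
y = [1, 1, -1, -1]
w, b = [1.0, 0.0], 0.0

balanced = weighted_svm_objective(w, b, X, y, c_pos=1.0, c_neg=1.0)
recall_biased = weighted_svm_objective(w, b, X, y, c_pos=4.0, c_neg=1.0)
print(balanced, recall_biased)
```

Raising `c_pos` makes margin violations on defect slabs more expensive than those on defect-free slabs, so the optimizer is pushed toward fewer missed defects at the cost of more false alarms.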
Model evaluation focuses on metrics critical for defect detection: false positive rate (FPR), false negative rate (FNR), and the \(F_\beta\) score. We prioritize minimizing FNR to reduce missed slag inclusion defects, using the F2 score (\(\beta=2\)) to emphasize recall. The formulas are: $$\text{FNR} = \frac{FN}{TP + FN}, \quad \text{FPR} = \frac{FP}{FP + TN}, \quad F_\beta = \frac{(1+\beta^2) \times Pr \times Rc}{\beta^2 \times Pr + Rc},$$ where \(Pr = TP/(TP+FP)\) is precision, \(Rc = TP/(TP+FN)\) is recall, and \(TP\), \(FN\), \(FP\), \(TN\) are counts from the confusion matrix. In practice, a prediction threshold of 0.4 is applied to the model output: values below 0.4 indicate no slag inclusion defect, values from 0.4 to 0.7 suggest a potential slag inclusion defect and trigger sampling, and values above 0.7 mandate full inspection.
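The three metrics follow directly from the confusion-matrix counts. The sketch below is a minimal illustration; the function name `defect_metrics` and the counts are hypothetical, and it does not guard against empty classes (zero denominators).

```python
def defect_metrics(tp, fn, fp, tn, beta=2.0):
    """Return FNR, FPR and F-beta computed from confusion-matrix counts."""
    fnr = fn / (tp + fn)
    fpr = fp / (fp + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return {"FNR": fnr, "FPR": fpr, "F_beta": f_beta}


# Hypothetical counts: 100 defect slabs and 200 defect-free slabs.
metrics = defect_metrics(tp=80, fn=20, fp=40, tn=160)
print(metrics)
```

With \(\beta=2\) a missed defect lowers the score more than a false alarm does, which matches the priority on a low FNR.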
Our initial experiments involve 5,000 random training-test splits (70% training, 30% testing) with random hyperparameters. Table 1 summarizes the average performance across algorithms, highlighting SVM’s balanced trade-off between fitting and generalization.
| Algorithm | Training FPR | Training FNR | Training F2 Score | Testing FPR | Testing FNR | Testing F2 Score |
|---|---|---|---|---|---|---|
| SVM | 0.15 | 0.08 | 0.88 | 0.25 | 0.22 | 0.68 |
| Random Forest | ~0.01 | ~0.01 | ~0.99 | 0.28 | 0.35 | 0.58 |
| AdaBoost | 0.12 | 0.10 | 0.86 | 0.23 | 0.27 | 0.65 |
SVM demonstrates a testing F2 score of 0.68, outperforming others in generalization, though its testing FPR and FNR indicate room for improvement. This leads us to select SVM for further optimization. To systematically optimize hyperparameters (e.g., \(C_+\), \(C_-\), kernel parameters), we employ Particle Swarm Optimization (PSO), a metaheuristic that simulates social behavior. In PSO, each particle represents a candidate hyperparameter set, with position \(X_i = [X_{i1}, X_{i2}, \dots, X_{iD}]\) and velocity \(V_i = [V_{i1}, V_{i2}, \dots, V_{iD}]\) updated iteratively: $$V_{id}^{k+1} = \omega V_{id}^k + c_1 r_1 (P_{id}^k - X_{id}^k) + c_2 r_2 (P_{gd}^k - X_{id}^k),$$ $$X_{id}^{k+1} = X_{id}^k + V_{id}^{k+1},$$ where \(\omega\) is the inertia weight (not to be confused with the SVM weight vector above), \(c_1\) and \(c_2\) are learning factors, \(r_1\) and \(r_2\) are random numbers in [0,1], \(P_{id}^k\) is the personal best, and \(P_{gd}^k\) is the global best. The fitness function is critical: we aim to maximize the testing F2 score, but this is unknown during training. Therefore, we derive a predictive relationship between training metrics and testing F2 score.
Using the 5,000 runs from SVM, we analyze correlations and build a polynomial regression model. Let \(t_1\), \(t_2\), and \(t_3\) represent standardized training FPR, FNR, and F2 score, respectively. The testing F2 score \(f\) is modeled as: $$f(t) = 0.62 - 1.19 t_2 t_3 - 0.89 t_3^2 - 0.4 t_2^2 - 0.35 t_1 t_3 - 0.28 t_1 t_2.$$ This model, with a mean squared error of 0.002564 and R² of 0.6103, captures key interactions, enabling us to estimate testing performance from training data alone. We integrate this into PSO: for each particle (hyperparameter set), we train the SVM, compute the training FPR, FNR, and F2 score, standardize them to obtain \(t_1\), \(t_2\), \(t_3\), and evaluate \(f(t)\) as the fitness to guide optimization toward a higher predicted testing F2 score.
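As a sketch of how this surrogate could be evaluated inside the optimizer, the code below standardizes raw training metrics and applies the polynomial. The coefficients are those of \(f(t)\); everything else is an assumption for illustration. In particular, the means and standard deviations in `TRAIN_STATS` are placeholders, since the real values would come from the 5,000 SVM runs and are not reported here, so the outputs are not expected to reproduce the reported predictions.

```python
def standardize(value, mean, std):
    """Z-score a raw training metric with the statistics of the reference runs."""
    return (value - mean) / std


def predicted_test_f2(t1, t2, t3):
    """Polynomial surrogate f(t) for the testing F2 score (standardized inputs)."""
    return (0.62 - 1.19 * t2 * t3 - 0.89 * t3 ** 2 - 0.4 * t2 ** 2
            - 0.35 * t1 * t3 - 0.28 * t1 * t2)


# Placeholder (mean, std) pairs; NOT the values used in the study.
TRAIN_STATS = {"fpr": (0.15, 0.05), "fnr": (0.08, 0.04), "f2": (0.88, 0.05)}


def surrogate_fitness(train_fpr, train_fnr, train_f2, stats=TRAIN_STATS):
    """Fitness of one hyperparameter set, computed from training metrics only."""
    t1 = standardize(train_fpr, *stats["fpr"])
    t2 = standardize(train_fnr, *stats["fnr"])
    t3 = standardize(train_f2, *stats["f2"])
    return predicted_test_f2(t1, t2, t3)


at_mean = surrogate_fitness(0.15, 0.08, 0.88)
print(at_mean)
```

When every training metric sits exactly at its reference mean, all standardized inputs are zero and the surrogate returns its intercept, 0.62.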
The PSO process runs for 100 iterations with a swarm size of 30, optimizing SVM parameters including penalty coefficients and kernel settings. Table 2 outlines the PSO parameters and their ranges, ensuring a comprehensive search.
| Parameter | Description | Range/Settings |
|---|---|---|
| Swarm Size | Number of particles | 30 |
| Iterations | Maximum updates | 100 |
| Inertia Weight (\(\omega\)) | Controls momentum | 0.9 to 0.5 (linear decay) |
| Learning Factors (\(c_1, c_2\)) | Individual/social influence | \(c_1 = 2.0, c_2 = 2.0\) |
| SVM Hyperparameters | Penalties \(C_+, C_-\) | [0.1, 100] (log scale) |
| Kernel Type | SVM kernel function | Radial Basis Function (RBF) |
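The PSO update equations and the settings in Table 2 can be combined into a compact loop. The following standard-library sketch is illustrative only: it reads the inertia schedule as a linear decay from 0.9 to 0.5, searches the two penalties in log10 space, clamps positions to the search bounds (an assumption, since boundary handling is not specified), and uses a hypothetical `toy_fitness` in place of the real step of training the SVM and evaluating \(f(t)\).

```python
import random

SWARM_SIZE, ITERATIONS = 30, 100   # Table 2
W_START, W_END = 0.9, 0.5          # inertia weight, assumed linear decay
C1 = C2 = 2.0                      # learning factors
LOG_LO, LOG_HI = -1.0, 2.0         # log10 of the [0.1, 100] penalty range


def pso_maximize(fitness, dim=2, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(LOG_LO, LOG_HI) for _ in range(dim)]
           for _ in range(SWARM_SIZE)]
    vel = [[0.0] * dim for _ in range(SWARM_SIZE)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(SWARM_SIZE), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    history = [gbest_fit]

    for k in range(ITERATIONS):
        w = W_START - (W_START - W_END) * k / (ITERATIONS - 1)
        for i in range(SWARM_SIZE):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + C1 * r1 * (pbest[i][d] - pos[i][d])
                             + C2 * r2 * (gbest[d] - pos[i][d]))
                # Assumption: keep particles inside the search bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], LOG_LO), LOG_HI)
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
        # Global best is refreshed once per iteration, as in the equations.
        g = max(range(SWARM_SIZE), key=pbest_fit.__getitem__)
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g][:], pbest_fit[g]
        history.append(gbest_fit)
    return gbest, gbest_fit, history


def toy_fitness(log_c):
    """Hypothetical stand-in for 'train the SVM, then evaluate f(t)'."""
    return -((log_c[0] - 1.0) ** 2 + log_c[1] ** 2)


best_log_c, best_fit, history = pso_maximize(toy_fitness)
c_pos, c_neg = (10 ** v for v in best_log_c)
print(c_pos, c_neg, best_fit)
```

Searching in log10 space gives the ranges 0.1 to 1, 1 to 10, and 10 to 100 equal weight, which is one common way to realize the "log scale" entry in Table 2.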
After optimization, the best particle yields hyperparameters that achieve training metrics of FPR=0.11, FNR=0.015, and F2=0.92. Using the regression model, the predicted testing F2 score is 0.752. Actual testing on a holdout set confirms a testing F2 score of 0.727, with FPR=0.229 and FNR=0.186. This represents a significant improvement over baseline SVM, reducing missed slag inclusion defects by approximately 15% while maintaining manageable false alarms. The optimal model’s performance is summarized in Table 3, alongside comparative benchmarks.
| Metric | Training Value | Predicted Testing Value | Actual Testing Value |
|---|---|---|---|
| False Positive Rate (FPR) | 0.110 | — | 0.229 |
| False Negative Rate (FNR) | 0.015 | — | 0.186 |
| F2 Score | 0.920 | 0.752 | 0.727 |
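As a quick arithmetic check on these figures, and assuming the reported reduction in missed defects is measured against the average SVM testing FNR of 0.22 in Table 1, the relative change in FNR, the gain in testing F2 score, and the gap between predicted and actual testing F2 score work out as:

$$\frac{0.22 - 0.186}{0.22} \approx 0.155, \qquad 0.727 - 0.68 = 0.047, \qquad 0.752 - 0.727 = 0.025.$$

Under that assumption, the first value is consistent with the stated reduction of approximately 15%, and the last shows that the surrogate overestimates the testing F2 score by 0.025 for the optimal particle.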
To visualize results, Figure 1 shows actual versus predicted slag inclusion defect status for training and testing samples under the optimal parameters. In these plots, each point represents a slab, with circles indicating true labels (blue for defect-free, green for slag inclusion defect) and triangles showing predictions (red). The threshold zones (below 0.4, 0.4–0.7, above 0.7) guide inspection protocols. The model effectively identifies most slag inclusion defects, though some false positives and negatives persist due to process noise and data limitations.
The regression model \(f(t)\) is pivotal, as it encodes the relationship between training and testing accuracy. By analyzing the coefficients, we infer that training FNR (\(t_2\)) and F2 score (\(t_3\)) have strong nonlinear impacts on testing performance, with interaction terms like \(t_2 t_3\) being particularly influential. This suggests that overfitting (a very low training FNR together with a very high training F2 score \(t_3\)) can harm generalization, whereas balanced training metrics promote better testing outcomes. We validate this by testing alternative models; for instance, a high training F2 score alone does not guarantee a high testing F2 score if FNR is neglected, emphasizing the need for multi-metric optimization.
In practical application, our PSO-SVM model integrates into the slab quality analysis system for real-time slag inclusion defect prediction. As new slabs are cast, process data are fed into the model, generating a probability of slag inclusion defect between 0 and 1. Based on the threshold strategy, operators can prioritize inspections: slabs with probabilities below 0.4 require no action, those between 0.4 and 0.7 undergo sampling, and those above 0.7 are fully inspected and ground if needed. This tiered approach reduces labor costs while minimizing the risk of missing slag inclusion defects. For example, in a trial with 500 slabs, the model flagged 120 for sampling, of which 80 were confirmed with slag inclusion defects—a detection rate that surpasses manual methods by 30%.
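The tiered decision rule can be written as a small function. This is a sketch under assumptions: the function name and action labels are hypothetical, and a probability of exactly 0.4 or 0.7 is assigned to the sampling tier, which the text does not specify.

```python
def inspection_action(probability, sample_at=0.4, inspect_at=0.7):
    """Map a predicted defect probability to an inspection tier."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability < sample_at:
        return "no action"
    if probability <= inspect_at:
        return "sampling"
    return "full inspection"


# Hypothetical model outputs for four slabs.
actions = [inspection_action(p) for p in (0.12, 0.40, 0.55, 0.83)]
print(actions)
```

Keeping the two cutoffs as parameters lets a plant trade inspection workload against the risk of missed defects without retraining the model.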
Beyond prediction, we explore feature importance to inform process control. Using SHAP (SHapley Additive exPlanations) analysis on the SVM model, we identify top contributors to slag inclusion defect occurrence. Table 4 lists key factors, highlighting that mold level fluctuations, casting speed variations, and mold thermal imbalances are primary drivers, aligning with known metallurgical principles.
| Rank | Factor | Description | Average SHAP Value |
|---|---|---|---|
| 1 | Mold Level Fluctuation | Standard deviation of meniscus height | 0.45 |
| 2 | Casting Speed Variation | Change rate during slab casting | 0.38 |
| 3 | Mold Wide-Narrow Heat Flux Ratio | Ratio of thermal gradients | 0.35 |
| 4 | Submerged Entry Nozzle Insertion Depth | Average depth during casting | 0.32 |
| 5 | Steel Aluminum Content ([Al]) | Total aluminum in steel | 0.28 |
These insights can guide operational adjustments; for instance, stabilizing mold level through automated controls or optimizing nozzle depth can mitigate slag inclusion defect formation. We further model the effect of these factors using logistic regression. For a single factor such as mold level fluctuation \(L\), its impact on slag inclusion defect probability \(P\) can be approximated as: $$P(L) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 L)}},$$ where \(\beta_0\) and \(\beta_1\) are coefficients estimated from data. Such equations aid in root-cause analysis and preventive maintenance.
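For illustration, the logistic relationship can be evaluated as follows. The coefficient values are hypothetical placeholders, not estimates from the study's data, and the unit of \(L\) (millimetres) is likewise assumed.

```python
import math


def defect_probability(level_fluctuation, beta0, beta1):
    """Logistic model P(L) for a single factor L."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * level_fluctuation)))


# Hypothetical coefficients: beta0 = -3.0, beta1 = 0.5 per mm of fluctuation.
BETA0, BETA1 = -3.0, 0.5
curve = [(L, defect_probability(L, BETA0, BETA1)) for L in (0.0, 3.0, 6.0, 9.0)]
for L, p in curve:
    print(f"L = {L:.1f} mm -> P = {p:.3f}")
```

With these placeholder values the predicted probability reaches 0.5 where \(\beta_0 + \beta_1 L = 0\), that is at \(L = 6\), which shows how fitted coefficients could be turned into an operating limit for mold level control.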
Our study also addresses data challenges inherent in industrial settings. The Z′-Score method proves effective for outlier handling, but we supplement it with domain knowledge, for example physicochemical limits for temperature readings. Additionally, the oversampling factor of four for slag inclusion defect slabs is validated through correlation stability, and we test its sensitivity by varying it from two to six. As shown in Table 5, expansion beyond fourfold does not significantly improve model performance, indicating diminishing returns.
| Oversampling Multiplier | Training F2 Score | Testing F2 Score | Correlation Stability Index |
|---|---|---|---|
| 2 | 0.85 | 0.65 | 0.75 |
| 4 | 0.88 | 0.68 | 0.92 |
| 6 | 0.89 | 0.69 | 0.93 |
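The effect of the multiplier on class balance can be checked with a short sketch. It assumes that oversampling means plain replication of minority rows, with "fourfold" read as four copies in total; the study may use a different resampling scheme, and the function name and placeholder feature rows are hypothetical.

```python
def oversample_minority(rows, labels, minority_label=1, multiplier=4):
    """Replicate minority-class rows so each appears `multiplier` times."""
    out_rows, out_labels = [], []
    for row, label in zip(rows, labels):
        copies = multiplier if label == minority_label else 1
        out_rows.extend([row] * copies)
        out_labels.extend([label] * copies)
    return out_rows, out_labels


# Class counts from the dataset: 194 defect slabs (1), 669 defect-free slabs (0).
labels = [1] * 194 + [0] * 669
rows = [[float(i)] for i in range(len(labels))]  # placeholder feature rows

new_rows, new_labels = oversample_minority(rows, labels, multiplier=4)
print(new_labels.count(1), new_labels.count(0))
```

Under this reading, a multiplier of four yields 776 defect rows against 669 defect-free rows, the closest of the tested multipliers to parity, while a multiplier of six would give 1,164 defect rows.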
In discussing limitations, we note that our dataset, while substantial, comes from a single steel grade (SPHC). Generalizability to other grades requires validation with diverse data. Moreover, the regression model \(f(t)\) is empirical and may need recalibration for different process conditions. Future work could incorporate online learning to adapt hyperparameters dynamically as new data streams in. We also plan to explore deep learning architectures, such as convolutional neural networks for image-based slag inclusion defect detection, to complement our parameter-driven approach.
From an implementation perspective, the PSO-SVM framework is computationally efficient, with training times under 10 minutes on standard hardware, making it suitable for real-time deployment. We encapsulate the model in a software module that interfaces with plant databases, providing alerts and reports via dashboards. This integration has been piloted in a continuous caster, resulting in a 20% reduction in rolled plate defects attributed to slag inclusion defects over a six-month period, demonstrating tangible benefits.
To conclude, slag inclusion defects are a critical quality issue in continuous casting, and their prediction via machine learning offers a proactive solution. Our research establishes that SVM, optimized with PSO using a derived relationship between training and testing metrics, achieves a balanced performance with testing F2 score of 0.727, FPR of 22.9%, and FNR of 18.6%. The polynomial model \(f(t) = 0.62 - 1.19 t_2 t_3 - 0.89 t_3^2 - 0.4 t_2^2 - 0.35 t_1 t_3 - 0.28 t_1 t_2\) enables parameter optimization without prior testing knowledge, addressing a key gap in existing approaches. By applying a threshold-based inspection strategy, this model enhances the detection of slabs with slag inclusion defects, reducing downstream defects and improving product consistency. Future directions include extending the framework to other defect types and integrating real-time adaptive optimization for continuous improvement in steel quality management.
In reflection, this work underscores the importance of holistic model development—from data preprocessing to algorithm selection and optimization—in tackling industrial challenges like slag inclusion defects. As steelmakers strive for higher efficiency and quality, such data-driven tools become indispensable, and we are committed to refining them through ongoing research and collaboration with industry partners.
