Optimization of Ball Mill Running State with Improved NFP-Growth Algorithm in Spark Framework

Green mining drives modern mineral processing, where production safety and energy efficiency are critical. Ball mills account for roughly 50% of concentrator operating costs, so they are pivotal for both grinding quality and energy conservation. Traditional optimization methods, however, struggle with massive mining datasets. This work proposes BLPNFP-growth, a parallel frequent itemset mining (FIM) algorithm on Spark, to optimize ball mill parameters from historical operational data. The approach resolves load imbalance during distributed mining and extracts actionable rules for real-time control. In the reported experiments, discharge fineness increased by 4.2–5.1% and throughput by 1.5–1.8 t/h.

1. Introduction

Grinding determines final mineral liberation and separation efficiency. As the core grinding equipment, ball mills strongly influence production quality and energy consumption. Current optimization strategies have three limitations: 1) poor scalability to big data; 2) neglect of ore characteristics (hardness, size distribution); 3) insufficient adaptation to interactions between dynamic control loops. As mining data accumulates rapidly, distributed data mining becomes essential. We parallelize the NFP-growth algorithm on Spark to address these gaps, enabling efficient extraction of ball mill optimization rules from large volumes of operational records.


[Figure: Ball mill structural diagram]

2. Related Work

Existing ball mill optimization approaches include:

| Method | Contribution | Limitation |
| --- | --- | --- |
| Two-layer control [6] | Handles MIMO grinding processes | Ignores ore property variations |
| CBR-Reinforcement Learning [7] | Adapts to diverse operating conditions | Computationally expensive for big data |
| PSO-CBR optimization [8] | Improves product fineness control | Fails to model parameter interactions |
| Fuzzy expert control [9] | Supervises semi-autogenous grinding | Requires extensive domain knowledge |

Distributed FIM algorithms like PFP-growth on Spark show promise but suffer from:

  • Memory overhead from redundant header tables
  • Load imbalance during parallel conditional tree construction

Our BLPNFP-growth algorithm overcomes these by introducing a compact tree structure and dynamic workload partitioning.

3. The Improved NFP-Growth Algorithm

3.1 NFP-Growth Fundamentals

NFP-growth improves on FP-growth by eliminating the header table and scanning the database only once. Consider a transaction database TD such as:

| Tid | Itemset |
| --- | --- |
| 101 | I2, I1, I5 |
| 102 | I2, I4 |

Procedure:

  1. Scan TD to build the T-tree, remove infrequent items, and generate the list L of frequent 1-itemsets
  2. Construct NFP-tree with root null and node table Node_T
  3. Merge identical nodes in Node_T
  4. Generate frequent itemsets per item
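
The first step of the procedure can be illustrated with a minimal sketch. It covers only support counting, filtering, and reordering of transactions by support; the T-tree and NFP-tree structures themselves are not reproduced here. The function names and the absolute threshold `min_count` are assumptions made for illustration, not part of the original algorithm description.

```python
from collections import Counter

def build_f_list(transactions, min_count):
    """Count item supports in one pass and keep items meeting min_count."""
    counts = Counter(item for t in transactions for item in set(t))
    frequent = [(item, c) for item, c in counts.items() if c >= min_count]
    # Sort by descending support; ties are broken by item name for determinism.
    frequent.sort(key=lambda pair: (-pair[1], pair[0]))
    return frequent

def reorder(transaction, f_list):
    """Drop infrequent items and order the rest by their F-list rank."""
    rank = {item: i for i, (item, _) in enumerate(f_list)}
    return sorted((i for i in set(transaction) if i in rank), key=rank.get)

# The two transactions shown in the example database TD.
td = [["I2", "I1", "I5"], ["I2", "I4"]]
f_list = build_f_list(td, min_count=2)      # hypothetical threshold
ordered = [reorder(t, f_list) for t in td]
```

With a threshold of 2, only I2 survives filtering, so both transactions reduce to a single-item path.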

3.2 Parallelization on Spark (PNFP-Growth)

Four-stage parallel workflow:

  1. Data preprocessing:
    $$ \text{RDD} \xrightarrow{\text{flatMap}} \text{List} \xrightarrow{\text{filter}} F\text{-list} $$
    Partition transactions into P groups via Hash Partition
  2. Pattern decomposition: Split transactions into path sequences
  3. Parallel mining: Build local FP-trees using mapPartitions
  4. Result aggregation: Merge outputs to HDFS
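
The grouping and decomposition stages can be sketched in plain Python, without Spark, to show the data flow. The sketch assumes the group-dependent sharding used by PFP-growth-style algorithms, in which each group receives the longest prefix of a transaction that ends in one of its items; the source does not confirm that PNFP-growth decomposes transactions exactly this way. The function names, the modulo-based group assignment (a stand-in for hash partitioning), and the F-list order are all illustrative assumptions.

```python
from collections import defaultdict

def assign_groups(f_list, num_groups):
    """Map each frequent item to a group id (stand-in for hash partitioning)."""
    return {item: pos % num_groups for pos, item in enumerate(f_list)}

def decompose(transaction, f_list, group_of):
    """Return (group_id, prefix) pairs, one longest prefix per group."""
    rank = {item: i for i, item in enumerate(f_list)}
    ordered = sorted((i for i in set(transaction) if i in rank), key=rank.get)
    emitted = {}
    # Walk from the tail so each group keeps only its longest prefix.
    for end in range(len(ordered) - 1, -1, -1):
        gid = group_of[ordered[end]]
        if gid not in emitted:
            emitted[gid] = ordered[: end + 1]
    return sorted(emitted.items())

def shard(transactions, f_list, num_groups):
    """Collect the prefixes destined for each group."""
    group_of = assign_groups(f_list, num_groups)
    shards = defaultdict(list)
    for t in transactions:
        for gid, prefix in decompose(t, f_list, group_of):
            shards[gid].append(prefix)
    return dict(shards)

# Hypothetical F-list order for the two example transactions.
f_list = ["I2", "I1", "I4", "I5"]
td = [["I2", "I1", "I5"], ["I2", "I4"]]
shards = shard(td, f_list, num_groups=2)
```

In a Spark job, `decompose` would typically run inside a `flatMap` over the transaction RDD, and each group's prefixes would then be mined independently, for example inside `mapPartitions` as stage 3 describes.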

3.3 Load Balancing Optimization (BLPNFP-Growth)

Original PNFP-growth partitions items equally, causing computational imbalance. We propose a conditional FP-tree size model:

Workload estimation:
Item position in F-list:
$$ \text{item\_loc} = L(\text{item}, F\text{-list}) $$
Computational weight:
$$ \text{Calculation} = \log(\text{item\_loc}) $$
Tree scale:
$$ \text{Tree\_Size} = \text{item\_sup} \times (\text{item\_loc} + 1) $$
By these formulas, Calculation grows with an item's position in the F-list, while Tree_Size grows with both its support (item_sup) and its position. Partitions are balanced using these metrics.
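
A minimal sketch of how these estimates might drive partitioning follows. The two formulas come from the text; everything else is an assumption made for illustration: positions are taken as 1-based so that the logarithm is defined, Tree_Size alone is used as the balancing weight, and items are assigned greedily (heaviest first, to the currently lightest partition). The source does not state which assignment strategy BLPNFP-growth actually uses.

```python
import heapq
import math

def estimate_load(f_list):
    """Return {item: (calculation, tree_size)} using the workload formulas.

    f_list holds (item, support) pairs sorted by descending support.
    Positions are 1-based here (an assumption) so that log() is defined.
    """
    loads = {}
    for loc, (item, sup) in enumerate(f_list, start=1):
        calculation = math.log(loc)
        tree_size = sup * (loc + 1)
        loads[item] = (calculation, tree_size)
    return loads

def balance(f_list, num_partitions):
    """Greedy assignment: heaviest item first, to the lightest partition."""
    loads = estimate_load(f_list)
    # Using Tree_Size as the single weight is an illustrative choice.
    items = sorted(loads, key=lambda it: loads[it][1], reverse=True)
    heap = [(0, pid) for pid in range(num_partitions)]
    heapq.heapify(heap)
    assignment = {pid: [] for pid in range(num_partitions)}
    for item in items:
        total, pid = heapq.heappop(heap)
        assignment[pid].append(item)
        heapq.heappush(heap, (total + loads[item][1], pid))
    return assignment

# Hypothetical supports, for illustration only.
f_list = [("a", 10), ("b", 8), ("c", 6), ("d", 3)]
assignment = balance(f_list, num_partitions=2)
```

For these made-up supports the estimated tree sizes are 20, 24, 24, and 15, and the greedy pass yields partitions with total weights 44 and 39.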

4. Algorithm Performance Analysis

Cluster configuration:

| Component | Specification |
| --- | --- |
| CPU | Intel i5-6200U 2.30GHz |
| RAM | 8GB |
| Spark | v2.1.3 |
| Dataset | Webdocs (1.48GB, 1.69M records) |

4.1 Scalability Tests

Execution time (min_sup=0.6):

| Data scale (×10⁴ records) | PFP-growth (s) | PNFP-growth (s) | BLPNFP-growth (s) |
| --- | --- | --- | --- |
| 40 | 2,850 | 2,150 | 1,780 |
| 80 | 3,120 | 2,410 | 1,920 |
| 120 | 3,450 | 2,680 | 2,050 |
| 160 | 3,820 | 2,950 | 2,210 |

4.2 Support Threshold Impact

Runtime on full Webdocs dataset:

| min_sup | PFP-growth (s) | PNFP-growth (s) | BLPNFP-growth (s) |
| --- | --- | --- | --- |
| 0.2 | 3,480 | 2,720 | 2,180 |
| 0.4 | 2,860 | 2,250 | 1,760 |
| 0.6 | 2,210 | 1,780 | 1,410 |
| 0.8 | 1,320 | 1,050 | 830 |

4.3 Node Scaling Efficiency

Runtime with varying cluster size (min_sup=0.6):

| Nodes | PFP-growth (s) | PNFP-growth (s) | BLPNFP-growth (s) |
| --- | --- | --- | --- |
| 1 | 11,200 | 8,950 | 7,210 |
| 2 | 6,810 | 5,420 | 4,180 |
| 3 | 4,930 | 3,910 | 2,950 |
| 4 | 3,740 | 2,980 | 2,230 |
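
Node scaling is often summarized as speedup, S(n) = T(1) / T(n), and parallel efficiency, E(n) = S(n) / n. The sketch below applies these standard definitions to the runtimes in the node-scaling table; the metrics are an addition for illustration and are not reported in the original.

```python
# Runtimes in seconds from the node-scaling table (min_sup = 0.6).
runtimes = {
    "PFP-growth": {1: 11200, 2: 6810, 3: 4930, 4: 3740},
    "PNFP-growth": {1: 8950, 2: 5420, 3: 3910, 4: 2980},
    "BLPNFP-growth": {1: 7210, 2: 4180, 3: 2950, 4: 2230},
}

def speedup(times):
    """Speedup relative to the single-node runtime: S(n) = T(1) / T(n)."""
    base = times[1]
    return {n: base / t for n, t in times.items()}

def efficiency(times):
    """Parallel efficiency: E(n) = S(n) / n."""
    return {n: s / n for n, s in speedup(times).items()}

for name, times in runtimes.items():
    s = speedup(times)
    e = efficiency(times)
    print(name,
          {n: round(v, 2) for n, v in s.items()},
          {n: round(v, 2) for n, v in e.items()})
```

On four nodes this gives a speedup of about 2.99 for PFP-growth and about 3.23 for BLPNFP-growth, relative to each algorithm's own single-node runtime.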

Across these tests, BLPNFP-growth reduces runtime by roughly 36–42% relative to PFP-growth and 17–25% relative to PNFP-growth.

5. Ball Mill Optimization Experiments

5.1 Data Preparation

75,000 records sampled every 5 minutes from an operational ball mill:

| Parameter | Controllable | Unit |
| --- | --- | --- |
| Ore feed rate | Yes | t/h |
| Water feed rate | Yes | m³/h |
| Motor current | Yes | A |
| Discharge fineness | Yes | % |

5.2 Key Parameter Selection

Discharge fineness is the optimization target. Pearson correlation identifies critical controllable parameters:

$$ \rho = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}} $$

| Parameter | Absolute ρ vs. fineness | Rank |
| --- | --- | --- |
| Ore feed rate | 0.8261 | 1 |
| Water feed rate | 0.8134 | 2 |
| Motor current | 0.8115 | 3 |
| Grinding aid rate | 0.8062 | 4 |
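
The correlation formula translates directly into code. The sketch below is a plain-Python version with a small helper that ranks parameters by absolute correlation with the target. The sample values are made up for illustration and are not taken from the ball mill dataset; the function and column names are likewise assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)

def rank_by_abs_correlation(columns, target):
    """Sort parameter names by |rho| against the target, largest first."""
    scores = {name: abs(pearson(values, target))
              for name, values in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Made-up sample values, for illustration only.
feed = [84.0, 86.0, 88.0, 90.0]
water = [80.0, 83.0, 82.0, 85.0]
fineness = [30.0, 29.0, 28.0, 27.0]
ranking = rank_by_abs_correlation(
    {"ore_feed": feed, "water_feed": water}, fineness)
```

Ranking by the absolute value treats strong negative and strong positive relationships alike, which matches the use of |ρ| in the table.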

5.3 Operational Regime Partitioning

Stable ball mill operation is defined by motor current ∈ [72A, 85A]. Ore hardness and size distribution are considered constant within regimes. Data is discretized into clusters:

| Parameter | Discretization intervals |
| --- | --- |
| Motor current (A) | [72.45,73.75], [73.76,76.29], …, [84.57,86.81] |
| Ore feed rate (t/h) | [84.08,85.34], [85.35,85.87], …, [93.93,95.65] |
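
Frequent itemset mining works on discrete items, so each continuous reading must be mapped to an interval label. A minimal sketch of that mapping follows. It uses only the motor-current intervals printed above; the elided intervals are left out rather than guessed, so readings that fall in them map to `None`. The label format and function names are hypothetical.

```python
# Only the motor-current intervals printed in the table are listed; the
# elided ones ("…") are deliberately left out rather than guessed.
MOTOR_CURRENT_BINS = [(72.45, 73.75), (73.76, 76.29), (84.57, 86.81)]

def to_item(name, value, bins):
    """Return a label like 'motor_current=[72.45,73.75]', or None if no
    listed interval contains the value."""
    for low, high in bins:
        if low <= value <= high:
            return f"{name}=[{low},{high}]"
    return None

def to_transaction(record, bins_by_param):
    """Turn one record of continuous readings into a list of item labels."""
    items = (to_item(p, v, bins_by_param[p])
             for p, v in record.items() if p in bins_by_param)
    return [i for i in items if i is not None]

record = {"motor_current": 73.0}  # hypothetical reading
transaction = to_transaction(record, {"motor_current": MOTOR_CURRENT_BINS})
```

Each resulting transaction is then a set of labels such as `motor_current=[72.45,73.75]`, which is the form the mining stage expects.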

5.4 Optimization Results

BLPNFP-growth extracts strong association rules from discretized data. Optimization targets vs. actual values:

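| Parameter | Regime 1 (78.76A) actual | Regime 1 (78.76A) target | Regime 5 (84.16A) actual | Regime 5 (84.16A) target |
| --- | --- | --- | --- | --- |
| Ore feed rate (t/h) | 90.35 | 91.71 | 95.82 | 97.17 |
| Water feed rate (m³/h) | 82.93 | 81.46 | 87.91 | 85.95 |
| Discharge fineness (%) | 26.89 | 28.23 | 30.87 | 32.04 |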

Key improvements: 1) Fineness increased by 4.2–5.1%; 2) Throughput elevated by 1.5–1.8 t/h; 3) Water/grinding aid consumption reduced by 1.8–2.1%.
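
In association rule mining, a rule is conventionally called strong when it meets both a minimum support and a minimum confidence threshold. The sketch below shows how such rules could be filtered from discretized records. The records, item labels, and thresholds are hypothetical; the source does not state the thresholds or the rule format used in the experiments.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / base

def strong_rules(transactions, candidates, min_sup, min_conf):
    """Keep candidate rules that meet both thresholds."""
    kept = []
    for antecedent, consequent in candidates:
        sup = support(transactions, set(antecedent) | set(consequent))
        conf = confidence(transactions, antecedent, consequent)
        if sup >= min_sup and conf >= min_conf:
            kept.append((antecedent, consequent, sup, conf))
    return kept

# Hypothetical discretized records, for illustration only.
records = [
    ["feed=high", "water=low", "fineness=high"],
    ["feed=high", "water=low", "fineness=high"],
    ["feed=high", "water=high", "fineness=low"],
    ["feed=low", "water=low", "fineness=low"],
]
rule = (("feed=high", "water=low"), ("fineness=high",))
rules = strong_rules(records, [rule], min_sup=0.5, min_conf=0.8)
```

In this toy data the rule holds in two of four records (support 0.5), and in every record where its antecedent appears (confidence 1.0), so it passes both thresholds.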

6. Conclusion

BLPNFP-growth enables efficient ball mill optimization using Spark-based distributed mining. Our contributions: 1) Parallel NFP-growth implementation; 2) Conditional FP-tree workload balancing; 3) Operational parameter optimization framework. Experimental results confirm 18–25% faster execution versus benchmarks and significant production improvements. Future work will integrate real-time streaming for adaptive ball mill control under dynamic ore conditions.
