Optimization of Ball Mill Running State with Improved NFP-Growth Algorithm in Spark Framework

Abstract

Green mine construction is a primary task in mining development, requiring solutions for safe production, energy conservation, and emission reduction. In the era of big data, mining enterprises struggle to make full use of the data they store. This paper proposes an improved parallel predictive control strategy for high-order process control. First, the NFP-growth (New FP-growth) algorithm is parallelized on the Spark distributed computing framework. Second, a computational model based on the conditional FP-tree is introduced to address load imbalance across groups. Finally, the algorithm is applied to optimize the operational state of the ball mill. Experimental results validate the feasibility of the algorithm and its superior performance over other control methods, effectively optimizing ball mill parameters and enhancing intelligent control systems.

Keywords: Data mining, Spark, Predictive control, Ball mill


1. Introduction

Grinding is an indispensable step in mineral processing, accounting for approximately 50% of operational costs in concentrators. The ball mill is critical for improving grinding quality and achieving energy efficiency. Optimizing its operational state holds significant importance for mining enterprises.

Previous studies, such as Zhao Dayong et al. [3], proposed a two-layer optimization control method for multi-input multi-output grinding processes. Dai Chuan et al. [4] integrated case-based reasoning with theoretical learning to enhance ball mill adaptability under varying conditions. However, existing strategies fail to address massive datasets or external constraints like ore composition, particle size distribution, and hardness.

This paper leverages data mining techniques and the Spark framework to overcome these limitations. We propose the BLPNFP-growth (Balanced Load Parallel NFP-growth) algorithm, which improves computational efficiency and load balancing. The algorithm is tested on ball mill operational data, demonstrating enhanced performance.


2. Research and Improvement of the Algorithm

2.1 Description of NFP-growth Algorithm

The NFP-growth algorithm improves upon FP-growth by reducing memory overhead and traversal time. Key steps include:

  Input: Transaction database (Table 1).
  Step 1: Scan data to build a temporary T-tree and filter infrequent items.
  Step 2: Construct an NFP-tree with a node table.
  Step 3: Merge nodes and generate frequent itemsets.

Table 1: Sample Transaction Database

Tid   Itemset
101   12, 11, 15
102   12, 14
103   12, 13
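
The first scan in Step 1 (counting support and filtering infrequent items) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `build_flist` and `prune_and_sort` and the threshold `min_sup` are hypothetical, and the T-tree and NFP-tree structures themselves are not modeled. The transactions are those of Table 1.

```python
from collections import Counter

def build_flist(transactions, min_sup):
    """Count item support and return frequent (item, support) pairs, most frequent first."""
    counts = Counter(item for t in transactions for item in set(t))
    frequent = [(item, c) for item, c in counts.items() if c >= min_sup]
    # Sort by descending support; break ties by item label for determinism.
    frequent.sort(key=lambda pair: (-pair[1], pair[0]))
    return frequent

def prune_and_sort(transaction, flist):
    """Drop infrequent items and order the rest by their position in flist."""
    rank = {item: i for i, (item, _) in enumerate(flist)}
    return sorted((i for i in set(transaction) if i in rank), key=rank.get)

# Transactions from Table 1 (Tid 101, 102, 103).
table1 = [["12", "11", "15"], ["12", "14"], ["12", "13"]]
flist = build_flist(table1, min_sup=1)  # min_sup is an assumed threshold
```

With `min_sup=2`, only item 12 survives the filter, since it is the only item that appears in more than one transaction of Table 1.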

2.2 Parallelization of NFP-growth Algorithm Based on Spark

The PNFP-growth algorithm parallelizes NFP-growth into four stages:

  Stage 1: Convert raw data into RDDs, filter infrequent items, and partition transactions.
  Stage 2: Decompose transactions into suffix-pattern paths.
  Stage 3: Build local FP-trees using mapPartitions.
  Stage 4: Aggregate results and save to HDFS.
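
The decomposition in Stage 2 can be illustrated without a cluster. The sketch below assumes the usual parallel FP-growth convention: each transaction is already ordered by descending support, and every item except the first is emitted together with the prefix that precedes it. The function names and the toy transactions are hypothetical. In Spark, the same logic would typically be expressed as a `flatMap` followed by a grouping by key, with `mapPartitions` building the local trees in Stage 3.

```python
from collections import defaultdict

def suffix_pattern_paths(sorted_transaction):
    """Yield (suffix_item, prefix_path) pairs for one support-ordered transaction."""
    for pos in range(len(sorted_transaction) - 1, 0, -1):
        yield sorted_transaction[pos], tuple(sorted_transaction[:pos])

def group_by_suffix(sorted_transactions):
    """Collect the prefix paths of every suffix item, as a shuffle step would."""
    groups = defaultdict(list)
    for t in sorted_transactions:
        for item, path in suffix_pattern_paths(t):
            groups[item].append(path)
    return dict(groups)

# Hypothetical support-ordered transactions (a is the most frequent item).
transactions = [["a", "b", "c"], ["a", "c", "d"], ["a", "b"]]
groups = group_by_suffix(transactions)
# groups["c"] holds the two prefix paths that end in c: ("a", "b") and ("a",).
```

The paths collected for one suffix item are exactly the input needed to build that item's conditional FP-tree, which is why the work can be split by item.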

2.3 Load Balancing Strategy Optimization

Traditional grouping strategies cause uneven workload distribution. The proposed BLPNFP-growth algorithm estimates computational load using conditional FP-tree dimensions:

  • Tree Depth: Path length from root to node.
  • Tree Width: Number of suffix-pattern paths.

Formulas for load estimation:

$$\text{item\_loc} = L(\text{item}, \text{Flist})$$

$$\text{Calculation} = \log(\text{item\_loc})$$

$$\text{Tree\_Size} = \text{item\_sup} \times (\text{item\_loc} + 1)$$

Higher support (item_sup) increases the estimated computational load of an item. Including it in the estimate allows tasks to be allocated more evenly across groups.
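
As a worked illustration of the load estimates, the following sketch evaluates Calculation and Tree_Size for a small frequent-item list and then distributes the items over groups. Several points are assumptions rather than details from the paper: item_loc is taken to be the 1-based position of the item in Flist, the support counts are invented, and the greedy "heaviest item to lightest group" assignment is just one plausible way to use the estimates.

```python
import heapq
import math

def estimate_loads(flist):
    """flist: list of (item, item_sup), most frequent first.
    Returns {item: (calculation, tree_size)} with 1-based item_loc (assumed)."""
    loads = {}
    for item_loc, (item, item_sup) in enumerate(flist, start=1):
        calculation = math.log(item_loc)       # Calculation = log(item_loc)
        tree_size = item_sup * (item_loc + 1)  # Tree_Size = item_sup * (item_loc + 1)
        loads[item] = (calculation, tree_size)
    return loads

def balance_groups(loads, num_groups):
    """Greedy assignment (an assumption): heaviest item goes to the lightest group."""
    heap = [(0, g) for g in range(num_groups)]  # (total estimated load, group id)
    heapq.heapify(heap)
    groups = {g: [] for g in range(num_groups)}
    for item, (_, size) in sorted(loads.items(), key=lambda kv: -kv[1][1]):
        total, g = heapq.heappop(heap)
        groups[g].append(item)
        heapq.heappush(heap, (total + size, g))
    return groups

# Hypothetical Flist with invented support counts.
flist = [("a", 8), ("b", 6), ("c", 5), ("d", 2)]
loads = estimate_loads(flist)
groups = balance_groups(loads, num_groups=2)
```

Here the Tree_Size values are 16, 18, 20, and 10 for a, b, c, and d, and the two groups end up with estimated loads of 30 and 34.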


3. Performance Analysis of the Algorithm

3.1 Experimental Environment

Tests were conducted on a Spark cluster with the configuration in Table 2.

Table 2: Cluster Node Configuration

Component        Specification
CPU              Intel i5-6200U 2.30 GHz
RAM              8 GB
Spark Version    2.1.3
Hadoop Version   3.2.0

3.2 Case Study Analysis

The Webdocs dataset (Table 3) was used to evaluate performance under varying data scales, support thresholds, and node counts.

Table 3: Webdocs Dataset Characteristics

Dataset       Size (GB)   Records     Attributes   Max Transaction Length
Webdocs.dat   1.486       1,692,082   5,267,656    71,472

4. Ball Mill Performance Optimization Experiment

4.1 Data Sources

Historical data from a concentrator ball mill (75,000 records) was used. Key parameters include feed rate, water flow, and discharge fineness (Table 4).

Table 4: Ball Mill Operational Parameters

Parameter            Controllable   Unit
Feed Rate            Yes            t/h
Water Flow           Yes            m³/h
Mill Current         Yes            A
Discharge Fineness   Yes            %

4.2 Determination of Optimization Parameters

Pearson correlation analysis (Formula 5) identified critical parameters influencing discharge fineness:

$$\rho = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$$

Table 5: Correlation Between Parameters and Discharge Fineness

Parameter           |ρ|      Rank
Feed Rate           0.8261   1
Water Flow          0.8134   2
Grinding Aid Flow   0.8062   3
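
The ranking in Table 5 follows directly from the Pearson coefficient. The sketch below computes ρ for each candidate parameter against discharge fineness and sorts by absolute value. The readings are hypothetical toy values chosen so the arithmetic is easy to check; they are not taken from the ball mill data set, and the function names are illustrative.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def rank_by_abs_correlation(columns, target):
    """Return (name, |rho|) pairs sorted from strongest to weakest."""
    scores = [(name, abs(pearson(values, target))) for name, values in columns.items()]
    return sorted(scores, key=lambda pair: -pair[1])

# Hypothetical toy readings, not taken from the ball mill data set.
fineness = [26.0, 27.0, 28.0, 29.0]
columns = {
    "feed_rate": [90.0, 91.0, 92.0, 93.0],   # perfectly linear with fineness
    "water_flow": [75.0, 74.0, 77.0, 76.0],  # weaker relationship
}
ranking = rank_by_abs_correlation(columns, fineness)
```

For these toy values the coefficient is 1.0 for feed_rate and 0.6 for water_flow, so feed_rate ranks first.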

4.3 Mining Results

Discretized parameters (Table 6) were analyzed using BLPNFP-growth to extract association rules (Table 7).

Table 6: Discretization of Parameters

Parameter           Intervals
Feed Rate (t/h)     [84.08, 85.34], [85.35, 85.87], …
Water Flow (m³/h)   [73.68, 77.16], [77.17, 79.03], …
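
Discretization turns each continuous reading into a categorical item that a frequent-pattern miner can count. A minimal sketch, assuming closed intervals as printed in Table 6: only the first two Feed Rate intervals are used because the remaining ones are elided in the table, and the item label format `name=index` is a hypothetical encoding.

```python
def discretize(value, intervals):
    """Return the index of the closed interval [lo, hi] containing value, else None."""
    for index, (lo, hi) in enumerate(intervals):
        if lo <= value <= hi:
            return index
    return None

def to_item(name, value, intervals):
    """Encode a reading as a transaction item such as 'feed_rate=0' (assumed format)."""
    index = discretize(value, intervals)
    return None if index is None else f"{name}={index}"

# The first two Feed Rate intervals listed in Table 6; the rest are elided there.
feed_rate_intervals = [(84.08, 85.34), (85.35, 85.87)]
item = to_item("feed_rate", 85.50, feed_rate_intervals)
```

A reading of 85.50 t/h falls in the second interval and becomes the item `feed_rate=1`. Readings outside the listed intervals return `None` in this sketch.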

Table 7: Key Association Rules

Parameter          Target Interval
Mill Current (A)   [71.34, 72.74], [72.74, 74.11], …
Feed Rate (t/h)    [84.21, 85.36], [85.36, 86.61], …
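
Association rules are usually scored by support and confidence. The sketch below applies those standard definitions to a handful of discretized records. The records and labels such as `feed=1` are invented for illustration, and the paper's own thresholds and rule-selection criteria are not reproduced here.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Support of antecedent and consequent together, divided by support of antecedent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base

# Hypothetical discretized records; the labels are illustrative only.
records = [
    {"feed=1", "current=2", "fineness=high"},
    {"feed=1", "current=2", "fineness=high"},
    {"feed=1", "current=1", "fineness=low"},
    {"feed=0", "current=2", "fineness=low"},
]
rule_conf = confidence(records, {"feed=1", "current=2"}, {"fineness=high"})
```

In this toy data the antecedent {feed=1, current=2} appears in half of the records, and every one of those records also has fineness=high, so the rule has confidence 1.0.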

4.4 Optimization Results

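Comparative analysis (Table 8) shows improved discharge fineness and operational efficiency. In the cases listed, the optimized settings raise discharge fineness by 1.22 to 1.94 percentage points while feed rate increases by 1.29 to 2.33 t/h.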

Table 8: Actual vs. Optimized Parameter Values

Parameter                Case 1        Case 2        Case 3        Case 4   Case 5
Feed Rate (t/h)          90.35→91.71   91.79→93.08   92.18→94.51
Discharge Fineness (%)   26.89→28.23   27.01→28.23   27.60→29.54

5. Conclusion

This study proposes the BLPNFP-growth algorithm, which enhances load balancing and computational efficiency in Spark. Applied to ball mill optimization, it successfully identifies optimal operational parameters, improving production metrics like discharge fineness. Future work will integrate real-time data streams for dynamic control.
