Extraction methods for uncertain inference rules by ant colony optimization
 Ling Chen^{1},
 Yun Sun^{1} and
 Yuanguo Zhu^{1}
DOI: 10.1186/s40467-015-0033-9
© Chen et al.; licensee Springer. 2015
Received: 14 December 2014
Accepted: 20 April 2015
Published: 14 May 2015
Abstract
In recent years, research on data mining methods has received increasing attention. In this paper, we design an uncertain system that uses extracted uncertain inference rules to solve classification problems in data mining. Two extraction methods integrated with ant colony optimization are then proposed for generating the uncertain inference rules. Finally, two applications are given to verify the effectiveness and superiority of the proposed methods.
Keywords
Uncertain inference rule; Uncertain system; Ant colony optimization algorithm; Rules extraction; Data classification
Introduction
Nowadays, databases and computer networks, coupled with advanced automated data generation and collection tools, are widely used in fields such as finance, e-commerce, and logistics. As a result, the amount of data that people have to deal with is increasing dramatically. People hope to carry out scientific research, business decisions, and business management on the basis of analysis of the existing data. However, current data analysis tools have difficulty processing the data in depth. Data mining techniques were developed to compensate for this deficiency. Data mining is the computational process of discovering interesting, potentially useful patterns in large data sets. Those patterns can be concepts, rules, laws, and modes. The overall goal of data mining is to extract information from a data set and transform it into an understandable structure for further use. There are many successful examples [1] of data mining in business and scientific research. For instance, data mining is widely used in financial data analysis, telecommunication, retail, and biomedical research. Therefore, the study of data mining technology has important practical significance.
The main tasks of data mining are data description, data classification, data dependency, data compartment analysis, data regression, data aggregation, and data prediction. Data classification finds a set of models or functions that accurately describe the characteristics of the data sets, so that the categories of previously unknown data can be identified. After obtaining the models or functions from a set of training data with data mining algorithms, the output can be described in several forms, such as classification rules (if-then), decision trees, mathematical formulas, and neural networks.
There are a variety of approaches in data mining, and many specialized methods have been invented for mining objects in different fields. The approaches commonly used are statistical methods, machine learning methods, and modern intelligent optimization methods. Statistical methods have been effective from the start, and many other data mining methods are built on them. For classification problems, Bayesian classification and Bayesian belief networks are important methods based on statistical principles. Machine learning methods are mainly used to solve conceptual learning, pattern classification, and pattern clustering problems. The core of machine learning is inductive learning, and a number of mature techniques already exist, such as the decision tree method, one of the most popular classification methods. An early decision tree algorithm is ID3. Later, many algorithms based on ID3, such as C4.5 [2], were proposed. There are also variants of the decision tree algorithm, including the incremental tree structures ID4 and ID5 and the scalable tree structure SLIQ for massive data sets.
In recent years, intelligent optimization algorithms have been widely applied to data mining. A neural network is a simulation model for complex systems with nonlinear relations, and it is well suited to handling complex nonlinear relations in spatial data. Researchers have proposed different network models to realize clustering, classification, regression, and pattern recognition of data. Furthermore, many optimization algorithms such as the simulated annealing algorithm have been introduced into neural network algorithms as optimization strategies. The genetic algorithm is a global search algorithm that simulates biological evolution and genetic mechanisms. It plays an important role in optimization and in classification-oriented machine learning. Hybrids of the genetic algorithm with other algorithms, such as decision trees and neural networks, have been applied to data mining. The ant colony optimization algorithm is a bionic optimization algorithm that simulates the behavior of ants. Based on it, the data mining technique Ant-Miner [3] was invented, and Casillas, Cordón, and Herrera [4] applied ant colony optimization to fuzzy rule learning. However, the ant colony optimization algorithm has some weaknesses, such as slow convergence and random initial solutions. For this reason, some improved ant colony optimization algorithms have been proposed. Zhu proposed an improved ant colony optimization algorithm (ACOA) [5] and a mutation ant colony optimization algorithm (MACO) [6] to speed up the algorithm and keep solutions from getting stuck in local optima. Hybrid genetic ant colony optimization [7] and the hybrid particle swarm ant colony optimization algorithm [8] significantly improve the performance of the original ant colony optimization algorithm.
The real world is so complex that human beings may face different types of indeterminacy every day. To get a better understanding of the real world, many mathematical tools have been created. One of them is probability theory, which is used to model indeterminacy from samples. However, in many cases, no samples are available to estimate a probability distribution. In this situation, we have no choice but to invite domain experts to evaluate the belief degree that each event may occur. We cannot use probability theory to deal with belief degrees, since human beings usually overweight unlikely events, which makes the belief degrees deviate far from the frequency. In view of this, Liu [9] founded uncertainty theory based on the normality axiom, duality axiom, subadditivity axiom, and product measure axiom. It has become a powerful mathematical tool for dealing with indeterminacy. Many researchers have done theoretical work related to uncertainty theory. In 2008, Liu [10] presented the uncertain differential equation. Later, the existence and uniqueness theorem was given [11], and the stability of uncertain differential equations was discussed [12,13]. Some analytic and numerical methods for solving uncertain differential equations were also proposed. With an uncertain differential equation describing the evolution of a system, we may solve some practical problems. Peng and Yao [14] studied an option pricing model for stocks. Zhu [15] proposed an uncertain optimal control model in 2010.
In [16,17], Liu proposed and studied uncertain systems based on the concepts of uncertain sets, membership functions, and uncertain inference rules. An uncertain system is a function from its inputs to its outputs based on the uncertain inference rule. Usually, an uncertain system consists of five parts: inputs, a rule base, an uncertain inference rule, an expected value operator, and outputs. Following that, Gao et al. [18] generalized uncertain inference rules and described uncertain systems with them. Peng and Chen [19] proved that uncertain systems are universal approximators and then demonstrated that the uncertain controller is a reasonable tool. Gao [20] designed an uncertain inference controller that successfully balanced an inverted pendulum with 5×5 if-then rules. More importantly, this uncertain inference controller is robust.
On the basis of uncertainty theory, we consider two extraction methods for uncertain inference rules based on the ant colony optimization algorithm. In the next section, we review the ant colony optimization algorithm and give some basic concepts about uncertain sets. Then, we formulate a model to extract inference rules from a data set. Next, we propose an extraction method for uncertain inference rules that uses the ant colony optimization algorithm with a mutation operation. After that, we combine the ant colony optimization algorithm with the simulated annealing algorithm to speed up the extraction method. In the last section, we apply our results to two typical classification problems in data mining.
Preliminary
In this section, we review the ant colony optimization algorithm and then give some basic concepts on uncertain sets.
Ant colony optimization algorithm
The ant colony optimization algorithm, initiated by Dorigo, is a heuristic optimization approach. It simulates the behavior of real ants foraging for food, which relies on pheromone communication. In the ant colony optimization algorithm, each path walked by an artificial ant from the food source to the nest is a candidate solution to the problem. When walking on a path, the ants release pheromone, which evaporates over time. The artificial ants lay down more pheromone on the path corresponding to a better solution. When an ant has several paths to choose from, it makes a choice according to the amount of pheromone on the paths. The more pheromone there is on a path, the better the solution is. As a result, bad paths disappear since the pheromone evaporates over time, and good paths are preserved since ants walking on them increase the pheromone levels. Finally, one path used by most of the ants is left, and the optimal solution to the problem is obtained.
Consider the optimization problem
\[\min f(x)\quad \text{subject to}\quad g(x)\leq 0,\ x\in D \tag{1}\]
where x is the decision variable in the domain D, f(x) is the objective function, and g(x) is the constraint function.
We can use the ant colony optimization algorithm to obtain the optimal solution to problem (1). The parameters of the algorithm are the initial pheromone τ_{0}, the ant transfer probability p, the number of ants M, the pheromone evaporation rate ρ, and the number of iterations T. The procedure is as follows.
Step 1 Randomly generate a feasible solution x_{0} and set the optimal solution s=x_{0}. Initialize all pheromone trails with the same pheromone level τ_{0}. Set k←0.
Step 2 The artificial ant generates a walking path x with probability p according to the pheromone trails. If x∈D, go to Step 3; otherwise, repeat Step 2 until x∈D.
Step 3 Repeat Step 2 for each ant to generate M feasible solutions. Let s_{k} be the best solution in this iteration.
Step 4 If f(s_{k})<f(s), then s←s_{k} and update the pheromone trails according to the optimal solution in the current iteration.
Step 5 If k<T, then k←k+1 and go to Step 2; otherwise, terminate.
Step 6 Report the optimal solution.
Uncertain set
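Let Γ be a nonempty set and \(\mathcal{L}\) be a σ-algebra over Γ. Each \(\Lambda \in \mathcal{L}\) is called an event. A set function \(\mathcal{M}\) defined on \(\mathcal{L}\), which assigns a number \(\mathcal{M}\{\Lambda\}\in[0,1]\) to each event Λ, is called an uncertain measure if it satisfies the following three axioms: (normality) \(\mathcal{M}\{\Gamma\}=1\); (duality) \(\mathcal{M}\{\Lambda\}+\mathcal{M}\{\Lambda^{c}\}=1\) for any \(\Lambda\in\mathcal{L}\); (subadditivity) \(\mathcal{M}\left\{\bigcup_{i=1}^{\infty}\Lambda_{i}\right\}\leq\sum_{i=1}^{\infty}\mathcal{M}\{\Lambda_{i}\}\) for all \(\Lambda_{1}, \Lambda_{2},\cdots \in \mathcal{L}\). Then, the triplet \((\Gamma,\mathcal{L},\mathcal{M})\) is called an uncertainty space [9]. The product uncertain measure \(\mathcal{M}\) is an uncertain measure satisfying \(\mathcal{M}\left\{\prod_{k=1}^{\infty}\Lambda_{k}\right\}=\bigwedge_{k=1}^{\infty}\mathcal{M}_{k}\{\Lambda_{k}\}\), where Λ_{k} are arbitrarily chosen events from \(\mathcal{L}_{k}\) for k=1,2,⋯, respectively.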
Definition 1.
[16] An uncertain set is a function ξ from an uncertainty space \((\Gamma,\mathcal{L},\mathcal{M})\) to a collection of sets of real numbers such that both {B⊂ξ} and {ξ⊂B} are events for any Borel set B.
Example 1.
Definition 2.
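[16] The uncertain sets ξ_{1},ξ_{2},ξ_{3},⋯,ξ_{n} are said to be independent if for any Borel sets B_{1},B_{2},B_{3},⋯,B_{n}, we have
\[\mathcal{M}\left\{\bigcap_{i=1}^{n}\left(\xi_{i}^{*}\subset B_{i}\right)\right\}=\bigwedge_{i=1}^{n}\mathcal{M}\left\{\xi_{i}^{*}\subset B_{i}\right\}\]
and
\[\mathcal{M}\left\{\bigcup_{i=1}^{n}\left(\xi_{i}^{*}\subset B_{i}\right)\right\}=\bigvee_{i=1}^{n}\mathcal{M}\left\{\xi_{i}^{*}\subset B_{i}\right\}\]
where \(\xi _{i}^{*}\) are arbitrarily chosen from \(\left \{\xi _{i}, {\xi _{i}^{c}}\right \}\), i=1,2,⋯,n, respectively.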
Definition 3.
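[21] An uncertain set ξ is said to have a membership function μ if for any Borel set B of real numbers, we have
\[\mathcal{M}\{B\subset\xi\}=\inf_{x\in B}\mu(x),\]
\[\mathcal{M}\{\xi\subset B\}=1-\sup_{x\in B^{c}}\mu(x).\]
The above equations are called the measure inversion formulas.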
Remark 1.
When an uncertain set ξ does have a membership function μ, it follows from the first measure inversion formula that \(\mu(x)=\mathcal{M}\{x\in\xi\}\).
Example 2.
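By a triangular uncertain set we mean an uncertain set ξ with the membership function
\[\mu(x)=\begin{cases}\dfrac{x-a}{b-a}, & a\leq x\leq b\\[2mm] \dfrac{x-c}{b-c}, & b\leq x\leq c\\[2mm] 0, & \text{otherwise}\end{cases}\]
denoted by (a,b,c), where a, b, c are real numbers with a<b<c.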
Definition 4.
[ 21 ] A membership function μ is said to be regular if there exists a point x _{0} such that μ(x _{0})=1, and μ(x) is unimodal about the mode x _{0}. That is, μ(x) is increasing on (−∞,x _{0}] and decreasing on [x _{0},+∞).
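As a concrete illustration of a regular membership function, the following minimal sketch evaluates a triangular shape whose mode is b, so that x_{0}=b in the sense of Definition 4. The triangular form and the function name `triangular_membership` are assumptions made for illustration; only the triplet notation (a,b,c) with a<b<c is taken from the text.

```python
def triangular_membership(x: float, a: float, b: float, c: float) -> float:
    """Membership degree of x for an assumed triangular shape (a, b, c), a < b < c."""
    if not a < b < c:
        raise ValueError("expected a < b < c")
    if a <= x <= b:
        return (x - a) / (b - a)  # increasing branch, equals 1 at the mode b
    if b < x <= c:
        return (c - x) / (c - b)  # decreasing branch, equals 0 at c
    return 0.0                    # outside the support [a, c]
```

The function is increasing up to b and decreasing afterwards, which is the unimodality that Definition 4 requires.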
Definition 5.
[ 16 ] Let ξ be an uncertain set. Then, the expected value of ξ is defined by
provided that at least one of the two integrals is finite and
Theorem 1.
where x _{0} is a point such that μ(x _{0})=1.
Example 3.
Uncertain inference rule
Here, we introduce the concepts of uncertain inference and uncertain systems. Inference rules are the key components of inference systems. In fuzzy systems, the CRI approach [22], Mamdani inference rules [23], and Takagi-Sugeno inference rules [24] are the most commonly used inference rules. Fuzzy if-then inference rules use fuzzy sets to describe the antecedents and the consequents. Unlike fuzzy inference, both antecedents and consequents in uncertain inference are characterized by uncertain sets. Uncertain inference [16] is a process of deriving consequences from human knowledge via uncertain set theory. First, we introduce the following inference rule.
Inference Rule 1.

Rule: If \(\mathbb{X}\) is ξ, then \(\mathbb{Y}\) is η
From: \(\mathbb{X}\) is a constant a
Infer: \(\mathbb{Y}\) is \(\eta^{*}=\eta|_{a\in\xi}\)
Theorem 2.
Based on Inference Rule 1, Gao et al. [18] proposed the multi-input, multi-if-then-rule inference rule.
Inference Rule 2.
where the coefficients are determined by

Rule 1: If \(\mathbb{X}_{1}\) is ξ_{11} and ⋯ and \(\mathbb{X}_{m}\) is ξ_{1m}, then \(\mathbb{Y}\) is η_{1}
Rule 2: If \(\mathbb{X}_{1}\) is ξ_{21} and ⋯ and \(\mathbb{X}_{m}\) is ξ_{2m}, then \(\mathbb{Y}\) is η_{2}
⋯
Rule k: If \(\mathbb{X}_{1}\) is ξ_{k1} and ⋯ and \(\mathbb{X}_{m}\) is ξ_{km}, then \(\mathbb{Y}\) is η_{k}
From: \(\mathbb{X}_{1}\) is a_{1} and ⋯ and \(\mathbb{X}_{m}\) is a_{m}
Infer: \(\mathbb{Y}\) is determined by Eq. (4)
Theorem 3.
Uncertain system
An uncertain system, proposed by Liu [16], is a function from its inputs to its outputs based on the uncertain inference rule. Usually, an uncertain system consists of five parts: inputs, which are crisp data fed into the uncertain system; a rule base, which contains a set of if-then rules provided by the experts; an uncertain inference rule, which infers uncertain consequents from the uncertain antecedents; an expected value operator, which converts the uncertain consequents to crisp values; and outputs, which are the crisp data yielded by the expected value operator.

If \(\mathbb{X}_{1}\) is ξ_{11} and ⋯ and \(\mathbb{X}_{m}\) is ξ_{1m}, then \(\mathbb{Y}_{1}\) is η_{11} and \(\mathbb{Y}_{2}\) is η_{12} and ⋯ and \(\mathbb{Y}_{n}\) is η_{1n}
If \(\mathbb{X}_{1}\) is ξ_{21} and ⋯ and \(\mathbb{X}_{m}\) is ξ_{2m}, then \(\mathbb{Y}_{1}\) is η_{21} and \(\mathbb{Y}_{2}\) is η_{22} and ⋯ and \(\mathbb{Y}_{n}\) is η_{2n}
⋯
If \(\mathbb{X}_{1}\) is ξ_{k1} and ⋯ and \(\mathbb{X}_{m}\) is ξ_{km}, then \(\mathbb{Y}_{1}\) is η_{k1} and \(\mathbb{Y}_{2}\) is η_{k2} and ⋯ and \(\mathbb{Y}_{n}\) is η_{kn}
Then, we get an uncertain system f. For the uncertain system we proposed, we have the following theorem.
Theorem 4.
Theorem 5.
Proof.
Given the m input data a _{1},a _{2},⋯,a _{ m }, we can calculate c _{ i } from Equation 7. Then, we can get the membership functions \(\nu _{i}^{*}\) of the consequence uncertain sets \(\eta _{i}^{*}\) according to Equation 6. Next, the computation of the expected value of uncertain consequence breaks into three cases.
Case 3: Assume c _{ i }>0.5. Similarly, we have \(E[\eta _{i}^{*}]=\beta _{i}\). Thus, we have proved the theorem. □
Problem formulation
In this section, we propose an extraction model to obtain uncertain inference rules.
Let X=(x_{1},x_{2},⋯,x_{n}) be the decision vector, which represents a rule base consisting of n rules. Each rule has m antecedents, which are described by Q uncertain sets, and one consequent, which is described by R uncertain sets. Each variable x_{i} represents a sequence x_{i1}x_{i2}⋯x_{im}x_{i,m+1}, where x_{ij}∈{0,1,2,⋯,Q} (i=1,2,⋯,n; j=1,2,⋯,m) represent the antecedents of the inference rule and x_{i,m+1}∈{0,1,2,⋯,R} (i=1,2,⋯,n) represents the consequent. Thus, each variable of the decision vector represents one inference rule. If x_{ij}=0, the j-th antecedent is not included in rule i. If x_{i,m+1}=0, rule i is not included in the rule base. For example, assume that an inference rule consists of 4 antecedents and 1 consequent, each described by 5 uncertain sets that correspond to five descriptions: very low, low, medium, high, and very high. We use 1,2,3,4,5 to denote them. Then the sequence “23045” represents the rule: “if input 1 is low, input 2 is medium, and input 4 is high, then the output is very high”.
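The encoding above can be illustrated with a short decoding sketch. The names `LABELS` and `decode_rule` are hypothetical, and the one-digit-per-position string format assumes Q and R do not exceed 9, as in the five-label example.

```python
from typing import Optional

# Labels for the five descriptions used in the example (1 = very low, ..., 5 = very high).
LABELS = {1: "very low", 2: "low", 3: "medium", 4: "high", 5: "very high"}


def decode_rule(sequence: str) -> Optional[str]:
    """Translate a digit sequence x_i1 ... x_im x_i,m+1 into a readable rule.

    Returns None when the consequent digit is 0, i.e. the rule is excluded
    from the rule base.
    """
    digits = [int(ch) for ch in sequence]
    antecedents, consequent = digits[:-1], digits[-1]
    if consequent == 0:
        return None
    # A zero antecedent means that input does not take part in the rule.
    parts = [f"input {j} is {LABELS[d]}"
             for j, d in enumerate(antecedents, start=1) if d != 0]
    if not parts:
        return f"the output is {LABELS[consequent]}"
    return f"if {' and '.join(parts)}, then the output is {LABELS[consequent]}"


print(decode_rule("23045"))
```

For the sequence “23045”, the third digit is 0, so input 3 is skipped and the decoded rule mentions only inputs 1, 2, and 4.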
Extraction method for uncertain inference rules with mutations
In this section, we propose the extraction method for uncertain inference rules with mutations by ant colony optimization algorithm.
In this way, we obtain a sequence x_{i1}x_{i2}⋯x_{i,m+1}. To speed up the algorithm, we mutate this sequence to get a new candidate sequence. The mutation is made as follows: randomly add 1 to or subtract 1 from each element x_{ij} in the sequence; if the element is 0, the mutated element is 1; if the element is Q, the mutated element is Q−1. Let X′ be the mutated solution. If ΔF=F(X′)−F(X)≤0, then X←X′; otherwise, keep the current solution. If Q is very large, we could repeat this mutation until some termination condition is satisfied.
(3) Pheromone Update: At each iteration t, let \(\hat {X}\) be the optimal solution found so far and X_{t} be the best feasible solution in the current iteration. Assume \(F(\hat {X})\) and F(X_{t}) are the corresponding objective function values.
If \(F(X_{t})<F(\hat {X})\), then \(\hat {X}\leftarrow X_{t}\).
In the pheromone update rule (Equation 13), ρ (0<ρ<1) is the evaporation rate and g(x) (0<g(x)<+∞) is a function such that g(x)≥g(y) if F(x)<F(y). For example, g(x)=L/(F(x)+1) with L>0 satisfies this condition.
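The mutation step and the example function g described above can be sketched as follows. The names `mutate_rule` and `quality` are hypothetical. Bounding the last position (the consequent) by R rather than Q is an assumption made here so that the mutated consequent stays in {0,1,⋯,R}; the text states the bound Q for every element.

```python
import random


def mutate_rule(rule, q_max, r_max, rng=random):
    """Shift every element of one rule by +1 or -1, reflecting at the bounds.

    rule is a list [x_i1, ..., x_im, x_i,m+1]; antecedents lie in 0..q_max and
    the consequent (last position) is assumed to lie in 0..r_max.
    """
    mutated = []
    last = len(rule) - 1
    for pos, value in enumerate(rule):
        upper = r_max if pos == last else q_max
        if value == 0:
            mutated.append(1)              # lower bound: can only move up
        elif value == upper:
            mutated.append(upper - 1)      # upper bound: can only move down
        else:
            mutated.append(value + rng.choice((-1, 1)))
    return mutated


def quality(f_value, scale=1.0):
    """g(x) = L / (F(x) + 1) with L = scale; smaller F gives a larger value."""
    return scale / (f_value + 1.0)
```

Passing a seeded `random.Random` instance as `rng` makes the mutation reproducible, which is convenient when comparing runs.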
Let τ_{0} be the initial value of the pheromone trails, n be the number of decision variables, M be the number of ants, ρ be the evaporation rate, and T be the number of iterations. Now, we summarize the algorithm as follows.
Step 1 Initialize all pheromone trails with the same pheromone level τ_{0}. Randomly generate a feasible solution X_{0}, and set the optimal solution \(\hat {X}=X_{0}\). Set l←0.
Step 2 Move the ant with the probability given by Equation 12. Generate a decision variable x_{i} after m+1 steps.
Step 3 Repeat Step 2 until X=(x_{1},x_{2},⋯,x_{n}) is generated. Mutate every x_{i} to generate a new decision vector \(X^{\prime }=(x_{1}^{\prime },x_{2}^{\prime },\cdots,x_{n}^{\prime })\). If ΔF=F(X′)−F(X)≤0, then X←X′.
Step 4 Repeat Step 2 and Step 3 for all M ants.
Step 5 Calculate the system outputs by Equation 5. Then, calculate the objective function values for the M candidate solutions by Equation 11. Denote the best solution in this iteration by X_{l}.
Step 6 If \(F(X_{l})<F(\hat {X})\), then \(\hat {X}\leftarrow X_{l}\); update the pheromone trails according to Equation 13.
Step 7 l←l+1; if l=T, terminate; otherwise, go to Step 2.
Step 8 Report the optimal solution \(\hat {X}\).
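The steps above can be sketched in code under several loudly stated assumptions, because Equations 5, 11, 12, and 13 are not reproduced here: the objective F is passed in as a non-negative callable `objective`; the transition probability is taken to be proportional to the pheromone level; and the update evaporates every trail by the factor 1−ρ and deposits g(x)=L/(F(x)+1) on the components of the best solution at every iteration. The function name `aco_extract` and all parameter names are hypothetical.

```python
import random


def aco_extract(objective, n, m, q_max, r_max, ants=10, iterations=30,
                tau0=1.0, rho=0.1, scale=1.0, seed=0):
    """Search for a rule base (n rules of m antecedents + 1 consequent)."""
    rng = random.Random(seed)
    sizes = [q_max + 1] * m + [r_max + 1]      # number of choices per position
    tau = [[[tau0] * size for size in sizes] for _ in range(n)]

    def build_rule(i):
        # Assumed transition rule: choose value j at step k with probability
        # proportional to tau[i][k][j].
        return [rng.choices(range(size), weights=tau[i][k])[0]
                for k, size in enumerate(sizes)]

    def mutate(rule):
        out = []
        for k, v in enumerate(rule):
            top = sizes[k] - 1
            if v == 0:
                out.append(1)
            elif v == top:
                out.append(top - 1)
            else:
                out.append(v + rng.choice((-1, 1)))
        return out

    # Step 1: random initial solution.
    best = [[rng.randrange(size) for size in sizes] for _ in range(n)]
    best_f = objective(best)
    for _ in range(iterations):
        iter_best, iter_f = None, float("inf")
        for _ in range(ants):                  # Steps 2-4: one walk per ant
            x = [build_rule(i) for i in range(n)]
            fx = objective(x)
            x_mut = [mutate(rule) for rule in x]
            f_mut = objective(x_mut)
            if f_mut <= fx:                    # keep the mutation if not worse
                x, fx = x_mut, f_mut
            if fx < iter_f:                    # Step 5: best of this iteration
                iter_best, iter_f = x, fx
        if iter_f < best_f:                    # Step 6: update the incumbent
            best, best_f = iter_best, iter_f
        deposit = scale / (best_f + 1.0)       # g(x) = L / (F(x) + 1)
        for i in range(n):
            for k, size in enumerate(sizes):
                for j in range(size):
                    tau[i][k][j] *= 1.0 - rho  # evaporation on every trail
                tau[i][k][best[i][k]] += deposit  # assumed deposit rule
    return best, best_f
```

With a toy objective, such as the number of positions in which a candidate rule base differs from a fixed target, the function returns a rule base together with its objective value. In the setting of this paper, the objective would instead be computed from the outputs of the uncertain system on the training data.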
With the algorithm above, we obtain an uncertain rule base. From it, we design an uncertain system that can be used for classification.
Extraction method for uncertain inference rules with SA
In the previous section, we introduced a mutation operation to speed up the algorithm. Here, we introduce the simulated annealing algorithm as the local search operation.
The simulated annealing algorithm was initiated by Metropolis in 1953 and applied to combinatorial optimization by Kirkpatrick [25] in 1983. The name and inspiration come from annealing in metallurgy, a technique involving heating and controlled cooling of a material to increase the size of its crystals and reduce their defects. The simulated annealing algorithm is good at avoiding getting stuck in local optima. It is robust, general, and easy to implement.
For optimization problem (1), we can use the simulated annealing algorithm to search for the optimal solution. The algorithm is as follows.
Step 1 Randomly generate an initial solution x_{0}; x←x_{0}; k←0; t_{0}←t_{max} (initial temperature).
Step 2 If the inner cycle termination criterion is satisfied at the current temperature, go to Step 3; otherwise, randomly choose a point x′ in the neighborhood N(x) and calculate Δf=f(x′)−f(x). If Δf≤0, then x←x′; otherwise, according to the Metropolis acceptance criterion, if exp(−Δf/t_{k})>random(0,1), then x←x′. Repeat Step 2.
Step 3 t_{k+1}=d(t_{k}) (temperature decrease); k←k+1; if the termination criterion is satisfied, stop and report the optimal solution; otherwise, go to Step 2.
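A minimal sketch of the three steps above is given below. Several details are assumptions, since the text leaves them open: the cooling function d is taken to be geometric (t←cooling·t), the inner cycle ends after a fixed number of trials, the outer loop stops once the temperature falls below `t_min`, and the best point visited is tracked separately from the current point. The function name and all parameter names are hypothetical.

```python
import math
import random


def simulated_annealing(f, x0, neighbor, t_max=1.0, cooling=0.9,
                        inner_steps=20, t_min=1e-3, seed=0):
    """Minimize f starting from x0; neighbor(x, rng) proposes a nearby point."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)                      # Step 1: initial solution
    best, best_f = x, fx
    t = t_max
    while t > t_min:                       # outer termination criterion
        for _ in range(inner_steps):       # Step 2: inner cycle at temperature t
            cand = neighbor(x, rng)
            f_cand = f(cand)
            delta = f_cand - fx
            # Metropolis criterion: accept improvements always, and worse
            # moves with probability exp(-delta / t).
            if delta <= 0 or math.exp(-delta / t) > rng.random():
                x, fx = cand, f_cand
                if fx < best_f:
                    best, best_f = x, fx
        t *= cooling                       # Step 3: temperature decrease
    return best, best_f
```

For example, minimizing f(x)=(x−3)² over the integers can use a neighbor function that moves one unit left or right: `simulated_annealing(lambda x: (x - 3) ** 2, 10, lambda x, rng: x + rng.choice((-1, 1)))`.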
In this section, we combine the ant colony optimization algorithm with the simulated annealing algorithm. In each iteration of the ant colony optimization algorithm, we get a feasible solution. Then, we use it as the initial solution of the simulated annealing algorithm to get a neighbor solution, which is accepted with a certain probability. For each decision vector X=(x_{1},x_{2},⋯,x_{n}) with x_{i}=x_{i1}x_{i2}⋯x_{i,m+1}, we build the neighbor solution as follows: for each x_{i}, for some randomly generated p and q (1≤p<q≤m), reverse the order of the subsequence x_{ip}⋯x_{iq}, i.e., \(x_{i}^{\prime }=x_{i1}\cdots x_{i,p-1}x_{iq}x_{i,q-1}\cdots x_{i,p+1}x_{ip}x_{i,q+1}\cdots x_{i,m+1}\). For example, if x_{i} is 0123456, p=2, and q=6, then the neighbor solution \(x_{i}^{\prime }\) is 0543216. In this way, we obtain a neighbor solution X′. If ΔF=F(X′)−F(X)≤0, then X←X′; otherwise, if exp(−ΔF/t_{k})>random(0,1), then X←X′; otherwise, abandon this neighbor solution. We still denote the pheromone trail by τ_{i;k,j}(t). The procedure is described as follows.
(1) Initialization: Generate a feasible solution X_{0} randomly and set the optimal solution \(\hat {X}=X_{0}\). Set τ_{i;k,j}(0)=τ_{0}, i=1,2,⋯,n, k=1,2,⋯,m+1, j=0,1,2,⋯,Q, where τ_{0} is a fixed parameter.
(2) Ant movement: At each step k, after building the sequence x_{i1}x_{i2}⋯x_{ik}, select the next node with the probability given by Equation 12. In this way, we obtain a sequence x_{i1}x_{i2}⋯x_{i,m+1}. To expand the search range, we use the simulated annealing algorithm to search locally around the solution at this step. Let the neighbor solution be X′. If ΔF=F(X′)−F(X)≤0, then X←X′; otherwise, if exp(−ΔF/t_{k})>random(0,1), where t_{k} is the current temperature and t_{k}→0 as k→∞, then X←X′; otherwise, abandon this neighbor solution and keep the original feasible solution.
(3) Pheromone Update: Let \(\hat {X}\) be the optimal solution found so far and X_{t} be the best feasible solution in the current iteration t. Assume \(F(\hat {X})\) and F(X_{t}) are the corresponding objective function values. To prevent the optimal solution \(\hat {X}\) from getting stuck in a local optimum, we also use an acceptance function here.
If \(F(X_{t})< F(\hat {X})\), then \(\hat {X}\leftarrow X_{t}\).
Build a neighbor solution \(\hat {X}^{\prime }\) and let \(\Delta \hat {F}=F(\hat {X}^{\prime })-F(\hat {X})\).
If \(F(\hat {X}^{\prime })\leq F(\hat {X})\), then \(\hat {X}\leftarrow \hat {X}^{\prime }\);
If \(F(\hat {X}^{\prime })>F(\hat {X})\), check the Metropolis acceptance criterion, i.e., if \(\exp (-\Delta \hat {F}/T_{t})>random(0,1)\), where T_{t}→0 as t→∞, then \(X^{*}\leftarrow \hat {X}^{\prime }\).
In the pheromone update rule (Equation 14), ρ (0<ρ<1) is the evaporation rate, and g(x) (0<g(x)<+∞) is a function such that g(x)≥g(y) if F(x)<F(y). For example, g(x)=L/(F(x)+1) is a suitable function if L>0.
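The neighbor construction and the Metropolis acceptance test used in this procedure can be sketched as follows. The function names `reverse_segment` and `metropolis_accept` are hypothetical, and rules are represented as digit strings for readability. Positions are 1-based and satisfy 1≤p<q≤m, so the consequent in the last position is never moved.

```python
import math
import random


def reverse_segment(rule: str, p: int, q: int) -> str:
    """Reverse positions p..q (1-based, inclusive) of one rule sequence."""
    m = len(rule) - 1                      # the last position is the consequent
    if not 1 <= p < q <= m:
        raise ValueError("expected 1 <= p < q <= m")
    return rule[:p - 1] + rule[p - 1:q][::-1] + rule[q:]


def metropolis_accept(delta: float, temperature: float, rng=random) -> bool:
    """Accept improvements always; accept worse moves with prob. exp(-delta/T)."""
    if delta <= 0:
        return True
    return math.exp(-delta / temperature) > rng.random()


print(reverse_segment("0123456", 2, 6))  # 0543216, the example from the text
```

Since exp(−ΔF/T) tends to 0 as the temperature falls, worse neighbor solutions are accepted less and less often in later iterations.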
Now, we summarize the algorithm as follows.
Step 1 Initialize all pheromone trails with the same pheromone level τ_{0}. Randomly generate a feasible solution X_{0}, and set the optimal solution \(\hat {X}=X_{0}\). Set t←0.
Step 2 Move the ant with the probability given by Equation 12. Generate a decision variable x_{i} after m+1 steps.
Step 3 Repeat Step 2 until the decision vector X=(x_{1},x_{2},⋯,x_{n}) is generated. Build the neighbor solution X′. If ΔF=F(X′)−F(X)≤0, then X←X′; otherwise, if exp(−ΔF/t_{k})>random(0,1), where t_{k} is the current temperature and t_{k}→0 as k→∞, then X←X′.
Step 4 Repeat Step 2 and Step 3 until all ants finish their walks, generating M candidate solutions.
Step 5 Calculate the system outputs by Equation 5. Then, calculate the objective function values for the M candidate solutions by Equation 11. Denote the best solution in this iteration by X_{t}.
Step 6 If \(F(X_{t})<F(\hat {X})\), then \(\hat {X}\leftarrow X_{t}\). Build the neighbor solution of \(\hat {X}\), denoted by \(\hat {X}^{\prime }\). If \(\Delta \hat {F}=F(\hat {X}^{\prime })-F(\hat {X})\leq 0\), then \(\hat {X}\leftarrow \hat {X}^{\prime }\); otherwise, if the Metropolis acceptance criterion is satisfied, i.e., if \(\exp (-\Delta \hat {F}/T_{t})>random(0,1)\), where T_{t}→0 as t→∞, then \(X^{*}\leftarrow \hat {X}^{\prime }\).
Step 7 Update the pheromone trails according to Equation 14.
Step 8 t←t+1; if t=T, terminate; otherwise, go to Step 2.
Step 9 Report the optimal solution \(\hat {X}\).
Experiments
In this section, we use our two extraction methods to extract uncertain inference rules and then use the resulting uncertain systems to solve some classification problems. We applied our methods to the IRIS [26] classification problem and the Wisconsin Breast Cancer (WBC) [27] classification problem.
IRIS classification
Parameters

| | a_{p} | b_{p} | c_{p} |
|---|---|---|---|
| p=1 | 0.5 | 1.01 | 1.52 |
| p=2 | 1.7 | 2.74 | 4.48 |
| p=3 | 5 | 6.07 | 7.14 |
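IRIS classification rules extracted by method A

| SL (IF) | SW (IF) | PL (IF) | PW (IF) | Class (THEN) |
|---|---|---|---|---|
| 1 | 3 | 1 | 3 | 1 |
| 1 | 0 | 1 | 1 | 1 |
| 1 | 2 | 3 | 2 | 1 |
| 1 | 1 | 2 | 1 | 2 |
| 2 | 1 | 0 | 3 | 2 |
| 3 | 2 | 0 | 2 | 3 |
| 1 | 1 | 3 | 3 | 3 |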
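IRIS classification rules extracted by method B

| SL (IF) | SW (IF) | PL (IF) | PW (IF) | Class (THEN) |
|---|---|---|---|---|
| 3 | 2 | 3 | 1 | 2 |
| 1 | 1 | 0 | 0 | 2 |
| 0 | 2 | 1 | 1 | 3 |
| 0 | 1 | 1 | 3 | 1 |
| 1 | 1 | 3 | 3 | 2 |
| 1 | 1 | 3 | 1 | 1 |
| 2 | 1 | 1 | 2 | 1 |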
Wisconsin Breast Cancer classification
Parameters

| | a_{p} | b_{p} | c_{p} |
|---|---|---|---|
| p=1 | 0.3 | 1.01 | 1.72 |
| p=2 | 2 | 6.07 | 10.14 |
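WBC classification rules extracted by method A

| CT (IF) | UCS (IF) | UCCS (IF) | MA (IF) | SPCS (IF) | BN (IF) | BC (IF) | NN (IF) | MT (IF) | Class (THEN) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 5 | 0 | 3 | 2 | 0 | 4 | 2 | 1 | 2 |
| 1 | 2 | 1 | 1 | 4 | 4 | 2 | 1 | 0 | 1 |
| 1 | 3 | 5 | 2 | 3 | 2 | 1 | 1 | 4 | 2 |
| 1 | 3 | 4 | 2 | 3 | 1 | 2 | 2 | 1 | 1 |
| 2 | 4 | 4 | 1 | 1 | 2 | 4 | 5 | 1 | 2 |
| 3 | 3 | 4 | 5 | 4 | 3 | 2 | 4 | 4 | 2 |
| 5 | 2 | 4 | 0 | 3 | 0 | 0 | 2 | 1 | 1 |
| 2 | 4 | 4 | 1 | 1 | 2 | 4 | 5 | 1 | 1 |
| 2 | 4 | 2 | 3 | 5 | 3 | 2 | 5 | 5 | 2 |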
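WBC classification rules extracted by method B

| CT (IF) | UCS (IF) | UCCS (IF) | MA (IF) | SPCS (IF) | BN (IF) | BC (IF) | NN (IF) | MT (IF) | Class (THEN) |
|---|---|---|---|---|---|---|---|---|---|
| 5 | 3 | 3 | 3 | 3 | 2 | 3 | 2 | 4 | 2 |
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0 | 1 |
| 4 | 4 | 4 | 4 | 0 | 1 | 1 | 1 | 4 | 2 |
| 1 | 4 | 4 | 1 | 1 | 2 | 1 | 5 | 1 | 2 |
| 1 | 1 | 1 | 2 | 0 | 3 | 5 | 5 | 5 | 2 |
| 0 | 3 | 0 | 0 | 0 | 0 | 0 | 1 | 2 | 1 |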
We apply our two extraction methods to the classification problems of the IRIS data set and the WBC data set. Comparing our results with other researchers’ work, we find that both methods achieve a higher accuracy rate than ACOA and MACO on both classification problems. For the IRIS data set, the accuracy rates of methods A and B are lower than that of HNFQ [28] but higher than that of C4.5. For the WBC data set, their accuracy rates are higher than those of C4.5 and FMM [29].
Conclusions
In this paper, we designed an uncertain system for data classification and proposed two extraction methods for uncertain inference rules based on the ant colony optimization algorithm. We then applied our methods to the IRIS classification problem and the WBC classification problem. Our methods are shown to be superior in accuracy to some existing methods.
Declarations
Acknowledgements
This work is supported by the National Natural Science Foundation of China (No.61273009).
References
1. Kantardzic, M: Data Mining: Concepts, Models, Methods, and Algorithms. 2nd ed. Wiley, Hoboken (2011).
2. Quinlan, JR: Improved use of continuous attributes in C4.5. J. Artif. Intell. Res. 4(1), 77–90 (1996).
3. Parpinelli, RS, Lopes, HS, Freitas, AA: Data mining with an ant colony optimization algorithm. IEEE Trans. Evolut. Comput. 6(4), 321–332 (2002).
4. Casillas, J, Cordón, O, Herrera, F: Learning fuzzy rules using ant colony optimization algorithms. In: Proceedings of the 2nd International Workshop on Ant Algorithms: From Ant Colonies to Artificial Ants, pp. 13–21, Brussels (2000).
5. Zhu, Y: Ant colony optimization-based hybrid intelligent algorithms. World J. Modell. Simul. 2(5), 283–289 (2006).
6. Zhu, Y: An intelligent algorithm: MACO for continuous optimization models. J. Intell. Fuzzy Syst. 24, 31–36 (2013).
7. Lee, Z, Su, S, Chuang, C, Liu, K: Genetic algorithm with ant colony optimization (GA-ACO) for multiple sequence alignment. Appl. Soft Comput. 8(1), 55–78 (2008).
8. Shelokar, PS, Siarry, P, Jayaraman, VK, Kulkarni, BD: Particle swarm and ant colony algorithms hybridized for improved continuous optimization. Appl. Math. Comput. 188(1), 129–142 (2007).
9. Liu, B: Uncertainty Theory. 2nd ed. Springer, Berlin (2007).
10. Liu, B: Fuzzy process, hybrid process and uncertain process. J. Uncertain Syst. 2(1), 3–16 (2008).
11. Chen, X, Liu, B: Existence and uniqueness theorem for uncertain differential equations. Fuzzy Optimization Decis. Mak. 9(1), 69–81 (2010).
12. Liu, B: Some research problems in uncertainty theory. J. Uncertain Syst. 3(1), 3–10 (2009).
13. Liu, B: Uncertainty Theory: A Branch of Mathematics for Modeling Human Uncertainty. Springer, Berlin (2010).
14. Peng, J, Yao, K: A new option pricing model for stocks in uncertainty markets. Int. J. Oper. Res. 8(2), 18–26 (2011).
15. Zhu, Y: Uncertain optimal control with application to a portfolio selection model. Cybern. Syst. 41(7), 535–547 (2010).
16. Liu, B: Uncertain set theory and uncertain inference rule with application to uncertain control. J. Uncertain Syst. 4(2), 83–98 (2010).
17. Liu, B: Uncertain logic for modeling human language. J. Uncertain Syst. 5(1), 3–20 (2011).
18. Gao, X, Gao, Y, Ralescu, DA: On Liu’s inference rule for uncertain systems. Int. J. Uncertain. Fuzz. Knowledge-Based Syst. 18(1), 1–11 (2010).
19. Peng, Z, Chen, X: Uncertain systems are universal approximators. J. Uncertainty Anal. Appl. 2, Article 13 (2014).
20. Gao, Y: Uncertain inference control for balancing inverted pendulum. Fuzzy Optimization Decis. Mak. 11(4), 481–492 (2012).
21. Liu, B: Membership functions and operational law of uncertain sets. Fuzzy Optimization Decis. Mak. 11(4), 387–410 (2012).
22. Zadeh, LA: Outline of a new approach to the analysis of complex systems and decision processes. IEEE Trans. Syst. Man Cybern. 3(1), 28–44 (1973).
23. Mamdani, EH: Applications of fuzzy algorithms for control of a simple dynamic plant. Proc. Institution Electr. Eng. Control Sci. 121(12), 1585–1588 (1974).
24. Takagi, T, Sugeno, M: Fuzzy identification of system and its applications to modeling and control. IEEE Trans. Syst. Man Cybern. 15(1), 116–132 (1985).
25. Kirkpatrick, S, Gelatt, CD, Vecchi, MP: Optimization by simulated annealing. Science 220(4598), 671–680 (1983).
26. Iris dataset (1936). https://archive.ics.uci.edu/ml/datasets/Iris.
27. Wisconsin Breast Cancer Dataset (1992). https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Original).
28. de Souza, FJ, Vellasco, M, Pacheco, MA: Hierarchical neuro-fuzzy quadtree models. Fuzzy Sets Syst. 130(2), 189–205 (2002).
29. Gabrys, B, Bargiela, A: General fuzzy min-max neural network for clustering and classification. IEEE Trans. Neural Networ. 11(3), 769–783 (2000).
Copyright
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.