Analysis of the ID3 algorithm in artificial intelligence

Foreword: Machine learning algorithms in artificial intelligence fall into three main categories: 1) classification; 2) regression; 3) clustering. Today we focus on the ID3 algorithm.

Hunt, Marin, and Stone developed the Concept Learning System (CLS) in 1966. It can learn a single concept and use what it has learned to classify new examples. John Ross Quinlan (University of Sydney) developed the ID3 algorithm in 1983.

The ID3 algorithm is a decision tree learning algorithm. It is based on the Occam's razor principle, that is, doing more with less: a smaller decision tree is preferred over a larger one.

The ID3 algorithm is based on information theory. It uses information entropy and information gain as its measures to carry out inductive classification of data.

ID3 algorithm concept:

ID3 (Iterative Dichotomiser 3), that is, the third-generation iterative dichotomiser, is a greedy algorithm used to construct a decision tree [see Artificial Intelligence (23)]. The ID3 algorithm originated from the Concept Learning System (CLS) and uses the rate at which information entropy decreases as the criterion for selecting test attributes. That is, at each node it selects the attribute with the highest information gain that has not yet been used for a split, and it continues this process until the resulting decision tree classifies the training examples perfectly.

ID3 algorithm core:

The core of the ID3 algorithm is information entropy. ID3 calculates the information gain of each attribute and treats the attribute with the highest information gain as the best one. Each time, the attribute with the highest information gain is selected as the split criterion, and the process repeats until the decision tree classifies the training examples perfectly.
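
To make the notion of information entropy concrete, here is a minimal Python sketch. The function name `entropy` and the sample labels are illustrative assumptions, not part of the original article; the formula is the standard Shannon entropy in bits.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    # Relative frequency of each class that actually occurs.
    probabilities = [count / total for count in Counter(labels).values()]
    # p * log2(1 / p) is the same as -p * log2(p).
    return sum(p * log2(1 / p) for p in probabilities)


print(entropy(["yes", "yes", "no", "no"]))    # evenly mixed set: 1.0
print(entropy(["yes", "yes", "yes", "yes"]))  # pure set: 0.0
```

An evenly mixed two-class set has the maximum entropy of one bit, while a pure set has entropy zero, which is why ID3 looks for splits that lower entropy.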

The essence of ID3 algorithm:

In information theory, the smaller the expected information (the entropy that remains after a split), the greater the information gain and thus the higher the purity of the resulting subsets. The essence of the ID3 algorithm is to measure the choice of attributes by information gain and to split on the attribute that yields the largest information gain. The algorithm uses a top-down greedy search through the space of possible decision trees.

Before splitting each non-leaf node of the decision tree, first calculate the information gain of each attribute and split on the attribute with the largest gain. The greater the information gain, the better the attribute distinguishes the samples and the more representative it is. This is clearly a top-down greedy strategy.
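
As an illustration of how the gain of a single attribute can be computed, consider the following sketch. The tiny data set and the names `information_gain`, `outlook`, and `windy` are hypothetical and chosen only so that the arithmetic is easy to check by hand.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    return sum((n / total) * log2(total / n) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy of the labels minus the expected entropy after the split."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weight each subset's entropy by the fraction of samples it holds.
    expected = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - expected


# Hypothetical samples: the class depends only on "outlook".
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["yes", "yes", "no", "no"]

print(information_gain(rows, labels, "outlook"))  # 1.0
print(information_gain(rows, labels, "windy"))    # 0.0
```

Splitting on `outlook` leaves two pure subsets, so it removes all of the uncertainty, while splitting on `windy` leaves each subset as mixed as before.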

ID3 algorithm steps:

Calculate the information gain of each attribute and use the attribute with the largest gain as the root node:

1) Prior entropy: the average uncertainty about the class before any attribute value is received;

2) Posterior entropy: the uncertainty about the source when the output symbol Vj is received;

3) Conditional entropy: the expectation of the posterior entropy over the output symbol set V, that is, the uncertainty about the source after all symbols have been received;

4) Information gain: the difference between the prior entropy and the conditional entropy, that is, the amount of information obtained by the receiver;

5) Repeat the above steps for the remaining attributes.
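
The four quantities in these steps can be written out as formulas. The notation is an assumption made for illustration: U stands for the class variable (the source) with values u_i, while V and V_j are the output symbol set and output symbol named in the steps.

```latex
\begin{aligned}
\text{Prior entropy:}       \quad & H(U) = -\sum_{i} P(u_i)\,\log_2 P(u_i) \\
\text{Posterior entropy:}   \quad & H(U \mid V_j) = -\sum_{i} P(u_i \mid V_j)\,\log_2 P(u_i \mid V_j) \\
\text{Conditional entropy:} \quad & H(U \mid V) = \sum_{j} P(V_j)\,H(U \mid V_j) \\
\text{Information gain:}    \quad & I(U; V) = H(U) - H(U \mid V)
\end{aligned}
```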

The ID3 algorithm calculates the information gain of each attribute and selects the attribute with the highest gain as the test attribute for the given sample set. It creates a node for the selected test attribute, labels the node with that attribute, and creates a branch for each value of the attribute, dividing the samples accordingly.
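
The selection and branching step described here might look like the sketch below. The helper names `split`, `gain`, and `choose_test_attribute`, as well as the sample data, are hypothetical and serve only to illustrate the idea.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    return sum((n / total) * log2(total / n) for n in Counter(labels).values())


def split(rows, labels, attribute):
    """Create one branch per attribute value, holding its (row, label) pairs."""
    branches = {}
    for row, label in zip(rows, labels):
        branches.setdefault(row[attribute], []).append((row, label))
    return branches


def gain(rows, labels, attribute):
    """Information gain obtained by splitting on the attribute."""
    branches = split(rows, labels, attribute)
    expected = sum(
        len(pairs) / len(labels) * entropy([label for _, label in pairs])
        for pairs in branches.values()
    )
    return entropy(labels) - expected


def choose_test_attribute(rows, labels, attributes):
    """Pick the attribute with the highest information gain."""
    return max(attributes, key=lambda a: gain(rows, labels, a))


# Hypothetical sample set.
rows = [
    {"outlook": "overcast", "windy": "no"},
    {"outlook": "overcast", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
]
labels = ["yes", "yes", "yes", "no", "no", "no"]

best = choose_test_attribute(rows, labels, ["outlook", "windy"])
print(best)  # outlook
for value, pairs in split(rows, labels, best).items():
    print(value, [label for _, label in pairs])
```

Each printed line corresponds to one branch of the new node, together with the class labels of the samples that fall into it.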

The specific algorithm flow is as follows:
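
A minimal Python sketch of this flow, written as a recursive procedure. It is an illustration under stated assumptions rather than a reference implementation: samples are dictionaries of discrete attribute values, the tree is stored as nested dictionaries, and the names `id3`, `classify`, and the sample data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    return sum((n / total) * log2(total / n) for n in Counter(labels).values())


def gain(rows, labels, attribute):
    """Information gain obtained by splitting on the attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    expected = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - expected


def id3(rows, labels, attributes):
    """Build a decision tree; leaves are class labels, nodes are dicts."""
    # All samples share one class: return a leaf.
    if len(set(labels)) == 1:
        return labels[0]
    # No attributes left to test: fall back to the majority class.
    if not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Greedy step: test the attribute with the highest information gain.
    best = max(attributes, key=lambda a: gain(rows, labels, a))
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in sorted({row[best] for row in rows}):
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows = [r for r, _ in pairs]
        sub_labels = [l for _, l in pairs]
        branches[value] = id3(sub_rows, sub_labels, remaining)
    return {"attribute": best, "branches": branches}


def classify(tree, row):
    """Follow the branches matching the row until a leaf is reached."""
    while isinstance(tree, dict):
        tree = tree["branches"][row[tree["attribute"]]]
    return tree


# Hypothetical training set.
rows = [
    {"outlook": "overcast", "windy": "no"},
    {"outlook": "overcast", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
]
labels = ["yes", "yes", "yes", "no", "no", "no"]

tree = id3(rows, labels, ["outlook", "windy"])
print(tree)
print(classify(tree, {"outlook": "rainy", "windy": "yes"}))  # no
```

On this data the root tests `outlook`, and only the `rainy` branch needs a second test on `windy`. Note that `classify` raises a `KeyError` for an attribute value that never appeared in training; handling that case is left out to keep the sketch short.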

ID3 advantages:

1) The algorithm structure is simple;

2) The algorithm is clear and easy to understand;

3) It is flexible and convenient to use;

4) There is no risk of ending up without a solution;

5) Each decision draws on the statistical properties of all the training examples, which helps resist noise.

ID3 disadvantages:

1) Processing large data sets is slow and often runs out of memory;

2) Continuous data cannot be processed directly and must first be converted into discrete data through discretization;

3) It cannot run in parallel and cannot handle numerical data;

4) It applies only to non-incremental data sets, not to incremental ones. It may converge to a locally optimal solution rather than the globally optimal one. When choosing the best split attribute, it tends to favor attributes with many distinct values;

5) The decision tree is not pruned, so overfitting is likely.
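
As an illustration of the discretization mentioned in disadvantage 2), the sketch below maps numeric values to named intervals before they are handed to ID3. The cut points, interval names, and the function name `discretize` are hypothetical; choosing good cut points is a separate problem that this sketch does not address.

```python
from bisect import bisect_right


def discretize(value, cut_points, names):
    """Return the name of the interval that the numeric value falls into."""
    # cut_points must be sorted; len(names) must be len(cut_points) + 1.
    return names[bisect_right(cut_points, value)]


cut_points = [15.0, 25.0]          # hypothetical temperature thresholds
names = ["cool", "mild", "hot"]    # one name per interval
temperatures = [8.5, 15.0, 21.3, 30.2]

print([discretize(t, cut_points, names) for t in temperatures])
# ['cool', 'mild', 'mild', 'hot']
```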

Note: ID3 (parallel) and ID3 (number) address the two problems in disadvantage 3).
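
The preference for attributes with many values, mentioned in disadvantage 4), can be seen in a small experiment. The data set below is hypothetical: it adds an `id` attribute that is unique per sample and therefore carries no predictive value, yet it obtains the highest information gain.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    return sum((n / total) * log2(total / n) for n in Counter(labels).values())


def gain(rows, labels, attribute):
    """Information gain obtained by splitting on the attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    expected = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - expected


# Hypothetical samples; "id" takes a different value for every sample.
rows = [
    {"id": 1, "outlook": "overcast", "windy": "no"},
    {"id": 2, "outlook": "overcast", "windy": "yes"},
    {"id": 3, "outlook": "rainy", "windy": "no"},
    {"id": 4, "outlook": "rainy", "windy": "yes"},
    {"id": 5, "outlook": "sunny", "windy": "no"},
    {"id": 6, "outlook": "sunny", "windy": "yes"},
]
labels = ["yes", "yes", "yes", "no", "no", "no"]

gains = {a: gain(rows, labels, a) for a in ["id", "outlook", "windy"]}
for attribute, value in gains.items():
    print(attribute, round(value, 3))
print(max(gains, key=gains.get))  # id
```

Splitting on `id` puts every sample in its own pure subset, so the conditional entropy is zero and the gain equals the full prior entropy. The resulting tree fits the training set perfectly but cannot classify a sample with an unseen `id`.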

ID3 application scenarios:

The ID3 decision tree algorithm is a very practical algorithm for learning from examples. Its basic theory is clear, the algorithm is relatively simple, and its learning ability is strong. It is suited to large-scale learning problems, is a good example in the very important fields of data mining and knowledge discovery, and laid a theoretical foundation for the optimized algorithms that later scholars proposed. The ID3 algorithm has seen substantial development, especially in machine learning, knowledge discovery, and data mining.

Conclusion:

The ID3 algorithm is a classic, basic algorithm for constructing decision trees. Its structure is simple, its theory is clear and easy to understand, its learning ability is strong, and it is flexible and convenient. Its shortcomings are that it cannot handle continuous data, is not suited to incremental data sets, is slow on large data, and may overfit. The ID3 algorithm is widely known and has received great attention.
