@HaomingJiang 2016-06-25T06:08:42.000000Z · 2499 words · 1607 views

# Chp5 Alternative Classification

Notes on *Introduction to Data Mining* (数据挖掘导论)

## 5.1 Rule-Based Classification

$r:A \rightarrow \gamma$
$Coverage(r)=Support(r)$: the fraction of records whose attributes satisfy the antecedent $A$.
$Accuracy(r)=Confidence(r)$: the fraction of covered records whose class equals $\gamma$.
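As a concrete illustration, coverage and accuracy of a single rule can be computed on a small hypothetical dataset (the records and the rule below are made up for the example):

```python
records = [
    # (refund, marital_status, class)
    ("No",  "Married", "No"),
    ("No",  "Single",  "Yes"),
    ("Yes", "Single",  "No"),
    ("No",  "Married", "No"),
    ("No",  "Single",  "Yes"),
]

# Rule r: (refund == "No") AND (marital_status == "Single") -> "Yes"
def antecedent(rec):
    return rec[0] == "No" and rec[1] == "Single"

covered = [rec for rec in records if antecedent(rec)]
coverage = len(covered) / len(records)                      # |covered| / |D|
accuracy = sum(rec[2] == "Yes" for rec in covered) / len(covered)

print(coverage, accuracy)  # 0.4 1.0
```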

Properties:
Mutually exclusive rules: no record triggers two different rules.
Exhaustive rules: every record is covered by at least one rule.

Dealing with rules that are not mutually exclusive: 1. Ordered rules: apply the triggered rule with the highest priority. 2. Unordered rules: use a voting scheme.

A record may not trigger any rule at all: fall back to a default rule.
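An ordered rule set with a default rule can be sketched as follows (the rules and attribute names are hypothetical):

```python
# Each rule is (antecedent predicate, predicted class); the first rule whose
# antecedent fires decides the class, otherwise the default rule applies.
rules = [
    (lambda x: x["refund"] == "Yes", "No"),
    (lambda x: x["income"] > 80,     "Yes"),
]
DEFAULT = "No"  # default rule for records that trigger nothing

def classify(x):
    for antecedent, label in rules:
        if antecedent(x):
            return label
    return DEFAULT

print(classify({"refund": "Yes", "income": 90}))  # No  (first rule wins)
print(classify({"refund": "No",  "income": 90}))  # Yes
print(classify({"refund": "No",  "income": 50}))  # No  (default rule)
```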

Rule ordering
Rule-based ordering: rank individual rules by some quality metric. However, rules near the bottom of the list become hard to interpret, since each one implicitly assumes all the rules above it have failed.
Class-based ordering: rules predicting the same class appear together; triggering any rule in a group predicts that group's class.

Extracting Rules
Direct methods: RIPPER, CN2, Holte's 1R...

Indirect methods: C4.5... (rules extracted from other models, e.g. decision trees)

### 5.1.1 Sequential Covering

Select one class at a time; for the records of that class:

1. Start from an empty rule
2. Grow a rule using the Learn-One-Rule function
3. Remove training records covered by the rule
4. Repeat Step (2) and (3) until stopping criterion is met
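The loop above can be sketched in a few lines. The `learn_one_rule` here is a deliberately simplified stand-in (it greedily picks the single attribute=value test with the best accuracy), since the notes describe the full function only informally; the data is hypothetical:

```python
def learn_one_rule(records, target):
    """Pick the single attribute=value test with highest accuracy (toy version)."""
    best, best_acc = None, -1.0
    for i in range(len(records[0][0])):               # attribute index
        for v in {rec[0][i] for rec in records}:      # candidate value
            covered = [rec for rec in records if rec[0][i] == v]
            acc = sum(c == target for _, c in covered) / len(covered)
            if acc > best_acc:
                best, best_acc = (i, v), acc
    return best

def sequential_covering(records, target):
    rules, remaining = [], list(records)
    # Stopping criterion: no positive records left to cover.
    while any(c == target for _, c in remaining):
        i, v = learn_one_rule(remaining, target)      # step 2
        rules.append((i, v, target))
        remaining = [r for r in remaining if r[0][i] != v]  # step 3: remove covered
    return rules

data = [(("sunny", "hot"), "No"), (("rain", "mild"), "Yes"),
        (("rain", "cool"), "Yes"), (("sunny", "mild"), "No")]
rules = sequential_covering(data, "Yes")  # every "Yes" record ends up covered
```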

Learn-One-Rule
The rule should cover most of the positive records and few of the negative ones.
It is found greedily: start from an initial rule $r$, keep refining it, and finally prune the result.

Rule growing: general-to-specific or specific-to-general.

Rule Evaluation:
1. Likelihood ratio: $R=2\sum_{i=1}^k f_i \log(f_i/e_i)$ ($f_i$: observed frequency of class $i$ among covered records; $e_i$: expected frequency)
2. Laplace: $\frac{f_++1}{n+k}$
3. m-estimate: $\frac{f_++kp_+}{n+k}$ ($p_+$ is the prior probability of the positive class)
4. FOIL's information gain: let $r_0:A\rightarrow+$ cover $p_0$ positive and $n_0$ negative records, and its refinement $r_1:A \wedge B \rightarrow +$ cover $p_1$ positive and $n_1$ negative records; then $Gain = p_1 \times \left(\log_2\frac{p_1}{p_1+n_1} - \log_2\frac{p_0}{p_0+n_0}\right)$
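These four measures translate directly into code; the sketch below follows the formulas above, with the numeric inputs in the example chosen arbitrarily:

```python
import math

def likelihood_ratio(f, e):
    """R = 2 * sum f_i * log(f_i / e_i); f = observed, e = expected counts."""
    return 2 * sum(fi * math.log(fi / ei) for fi, ei in zip(f, e) if fi > 0)

def laplace(f_pos, n, k):
    """(f+ + 1) / (n + k); k = number of classes, n = records covered."""
    return (f_pos + 1) / (n + k)

def m_estimate(f_pos, n, k, p_pos):
    """(f+ + k*p+) / (n + k); p_pos = prior probability of the positive class."""
    return (f_pos + k * p_pos) / (n + k)

def foil_gain(p0, n0, p1, n1):
    """Gain from refining r0 (covers p0+, n0-) into r1 (covers p1+, n1-)."""
    return p1 * (math.log2(p1 / (p1 + n1)) - math.log2(p0 / (p0 + n0)))

print(laplace(8, 10, 2))          # 0.75
print(m_estimate(8, 10, 2, 0.5))  # 0.75  (coincides with Laplace when p+ = 1/k)
```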

RIPPER
Its time complexity grows roughly linearly with the number of records, and it works well on imbalanced data.
It builds rules for the minority classes first; the largest class becomes the default class.
Rules are grown using FOIL's information gain.
Pruning uses the metric $(p-n)/(p+n)$, where $p$ and $n$ are the numbers of positive and negative records of the validation set covered by the rule; pruning starts from the last conjunct.

Rule Pruning
Remove one of the conjuncts in the rule and compare the error rate on the validation set before and after pruning. If the error improves, prune the conjunct.
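A RIPPER-style pruning pass using the $(p-n)/(p+n)$ metric can be sketched as follows, assuming a rule is a list of conjunct predicates and dropping conjuncts from the last one backwards; the validation records are hypothetical:

```python
def rule_value(conjuncts, validation):
    """(p - n) / (p + n) over validation records covered by all conjuncts."""
    covered = [(x, y) for x, y in validation
               if all(test(x) for test in conjuncts)]
    if not covered:
        return float("-inf")
    p = sum(y == "+" for _, y in covered)
    n = len(covered) - p
    return (p - n) / (p + n)

def prune(conjuncts, validation):
    # Starting from the last conjunct, drop it while the metric does not worsen.
    while len(conjuncts) > 1:
        pruned = conjuncts[:-1]
        if rule_value(pruned, validation) >= rule_value(conjuncts, validation):
            conjuncts = pruned
        else:
            break
    return conjuncts

validation = [({"a": 1, "b": 0}, "+"), ({"a": 1, "b": 0}, "+"),
              ({"a": 1, "b": 1}, "-")]
rule = [lambda x: x["a"] == 1, lambda x: x["b"] == 1]
print(len(prune(rule, validation)))  # 1: the harmful last conjunct is dropped
```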

### 5.1.2 Characteristics

- As highly expressive as decision trees
- Easy to interpret
- Easy to generate
- Can classify new instances rapidly
- Performance comparable to decision trees

## 5.2 Nearest neighbor

The k nearest neighbors can be given equal weight in the vote, or unequal weights (e.g. weighted by distance).

1-nearest neighbor ---- the decision boundary forms a Voronoi diagram
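A minimal k-NN sketch with both voting schemes, using a made-up 2-D training set and inverse-squared-distance weights (one common choice of unequal weighting):

```python
import math
from collections import Counter

def knn_predict(train, x, k=3, weighted=False):
    """train: list of (feature_vector, label); returns the majority/weighted vote."""
    nearest = sorted((math.dist(f, x), y) for f, y in train)[:k]
    votes = Counter()
    for d, y in nearest:
        votes[y] += 1.0 / (d * d + 1e-9) if weighted else 1.0
    return votes.most_common(1)[0][0]

train = [((0, 0), "A"), ((0, 1), "A"), ((1, 0), "A"),
         ((5, 5), "B"), ((5, 6), "B")]
print(knn_predict(train, (1, 1), k=3))                 # A
print(knn_predict(train, (4, 4), k=5, weighted=True))  # B (near B's dominate the vote)
```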

## 5.3 Bayes Classifier

### 5.3.2 Bayesian Belief Networks

A Bayesian belief network consists of a directed acyclic graph and a conditional probability table for each node.
Conditional independence: given its parents, a node is independent of every node that is not its descendant.
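Because of this local independence, the joint probability factors into the per-node tables: $P(X_1,\dots,X_n)=\prod_i P(X_i \mid Parents(X_i))$. A small sketch using the well-known burglary/earthquake/alarm network (the probability values follow the standard textbook example, not these notes):

```python
# Network: Burglary -> Alarm <- Earthquake.
# Joint factors as P(B) * P(E) * P(A | B, E).
P_B = {True: 0.01, False: 0.99}
P_E = {True: 0.02, False: 0.98}
P_A = {  # P(Alarm=True | Burglary, Earthquake)
    (True, True): 0.95, (True, False): 0.94,
    (False, True): 0.29, (False, False): 0.001,
}

def joint(b, e, a):
    pa = P_A[(b, e)]
    return P_B[b] * P_E[e] * (pa if a else 1 - pa)

# Marginal P(Alarm=True) by summing out Burglary and Earthquake:
p_alarm = sum(joint(b, e, True) for b in (True, False) for e in (True, False))
print(round(p_alarm, 4))  # 0.0161
```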
