`@HaomingJiang`

`2016-06-25T14:08:42.000000Z`

`字数 2499`

`阅读 1781`

`数据挖掘导论`

`笔记`

Properties:

*Mutually Exclusive Rule*

One observation will not trigger two different rules

*Exhaustive Rule*

Any observation can be classify with a rule.

Dealing with Rule which is not Mutually Exclusive: 1. Ordered Rule: use rules with higher priority 2. Unorder Rule: use voting schemes.

Rules may not trigger any rules: use a default rule.

**Rule ordering**

Based on rules: using some kind of metric. However, the last rules will be hard for explain.

Based on classes: Rules belong to the same class appear together. Triggering one of them is triggering all in the same group.

**Extracting Rules**

Direct Method:RIPPER,CN2,Holte's 1R...

Indirect Method:C4.5...(from other models)

Select one class a time. For all data in this class.

- Start from an empty rule
- Grow a rule using the Learn-One-Rule function
- Remove training records covered by the rule
- Repeat Step (2) and (3) until stopping criterion is met

**Learn-One-Rule**

it should cover most positive records and slight negative records.

It is found by a greedy method. Starting from a initial rule r, and then keep refining it. Finally, trim that rule.

Rule Growing: General to specific, specific to general.

Rule Evaluation:

1. likelihood ratio:

2. Laplace:

3. m estimator: (is the priori possibility)

4. FOIL's information gain: covers possitive records, negative records. covers possitive records, negative records.

**RIPPER**

The time complexity is linear growth with the amount of records, it is useful in imbalance data.

It first build rules for minority class. The bigest class is the default class.

Use FOIL's information gain.

Use the (p-n)/(p+n) (p,n is the number of possitive and negative reocrds of validation set) to judge whether the rule need to be trimmed. Trimming strat from the last conjuncts.

**Rule Purning**

Remove one of the conjuncts in the rule.Compare error rate on validation set before and after pruning.If error improves, prune the conjunct.

As highly expressive as decision trees

Easy to interpret

Easy to generate

Can classify new instances rapidly

Performance comparable to decision trees

the k nearest neighbor can be given equal weight or inequal weight.

1-nearest neigbhor ---- Voronoi Diagram

It consist of a direct graph and probability list.

**Conditional Independence**: For each node, if its parents is known. It is independent on other node which is not offspring of this one.

添加新批注

在作者公开此批注前，只有你和作者可见。

回复批注