@nanmeng 2016-05-19T08:43:41.000000Z

Probabilistic Graphical Models(Stanford) - 1

notes Probabilistic_Graphical_Models

Pre-Class

Why factors?
* Fundamental building block for defining distributions in high-dimensional spaces
* Set of basic operations for manipulating these probability distributions

Week1 Bayesian Network Fundamentals

1. Semantics & Factorization

An example of a Bayesian network:
PGM1_1

The calculation rule is the chain rule for Bayesian networks: $P(X_1, ..., X_n) = \prod_i P(X_i|Par_G(X_i))$

PGM1_2

An illustration of computing the joint distribution (how to plug in the values):
PGM1_3
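
As a minimal sketch of this chain-rule computation, the snippet below multiplies one CPD entry per node for a full assignment of the Student network. The figures are not reproduced in these notes, so the CPD values (the ones commonly quoted for the course's Student example) and all Python names are assumptions for illustration.

```python
# CPDs of the Student network (assumed values; indices are 0-based,
# so g = 2 stands for grade g^3 and d = 0 stands for d^0).
P_D = [0.6, 0.4]                      # P(D)
P_I = [0.7, 0.3]                      # P(I)
P_G = {                               # P(G | I, D), keyed by (i, d)
    (0, 0): [0.30, 0.40, 0.30],
    (0, 1): [0.05, 0.25, 0.70],
    (1, 0): [0.90, 0.08, 0.02],
    (1, 1): [0.50, 0.30, 0.20],
}
P_S = {0: [0.95, 0.05], 1: [0.20, 0.80]}                   # P(S | I)
P_L = {0: [0.10, 0.90], 1: [0.40, 0.60], 2: [0.99, 0.01]}  # P(L | G)


def joint(d, i, g, s, l):
    """Chain rule for Bayesian networks: one CPD entry per node, multiplied."""
    return P_D[d] * P_I[i] * P_G[(i, d)][g] * P_S[i][s] * P_L[g][l]


# P(d^0, i^1, g^3, s^1, l^1) = 0.6 * 0.3 * 0.02 * 0.8 * 0.01
p = joint(d=0, i=1, g=2, s=1, l=1)
print(p)
```

Because every CPD row sums to 1, summing `joint` over all assignments gives 1, which is a quick sanity check on the tables.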

Bayesian Network:

A directed acyclic graph (DAG) $G$ whose nodes represent the random variables $X_1, ..., X_n$

For each node $X_i$, a CPD $P(X_i|Par_G(X_i))$

A trick in BN:

As shown below, when marginalizing in a Bayesian network, a summation over a variable does not have to range over the whole product: it can be pushed inside and applied only to the factors that mention that variable.
PGM1_4
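
As a worked instance of this trick (a sketch that assumes the Student network's variables $D, I, G, S, L$, since the figure is not reproduced here), summing out $L$ only touches the one factor that mentions $L$:

$$\sum_{L} P(D)\,P(I)\,P(G|I,D)\,P(S|I)\,P(L|G) = P(D)\,P(I)\,P(G|I,D)\,P(S|I) \sum_{L} P(L|G) = P(D)\,P(I)\,P(G|I,D)\,P(S|I)$$

The last step holds because each CPD sums to 1 over its own variable.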

P Factorizes over G:

PGM1_5
An example of Genetic Inheritance
PGM1_6

2. Reasoning Patterns

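Causal reasoning (top down): reasoning from causes to their effects

Evidential reasoning (bottom up): reasoning from observed effects back to their causes

Intercausal reasoning: information flows between two causes of a shared effect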

Intercausal reasoning is the hardest of the three patterns to see intuitively. An illustration:
PGM1_7

PGM1_8
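Observing that the student aces the SAT ($s^1$) raises the probability of high intelligence, $P(i^1|g^3,s^1)$, and, because the poor grade now needs another explanation, also raises the probability that the class is difficult, $P(d^1|g^3,s^1)$.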
PGM1_9
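
The intercausal pattern can be checked numerically by brute-force enumeration. This is a minimal sketch, not the course's code: the CPD values are the ones commonly quoted for the Student example (an assumption, since the figures are not reproduced here), and `prob` is a hypothetical helper.

```python
from itertools import product

# Assumed CPDs of the Student network (0-based indices: g = 2 is g^3).
P_D = [0.6, 0.4]
P_I = [0.7, 0.3]
P_G = {(0, 0): [0.30, 0.40, 0.30], (0, 1): [0.05, 0.25, 0.70],
       (1, 0): [0.90, 0.08, 0.02], (1, 1): [0.50, 0.30, 0.20]}  # keyed by (i, d)
P_S = {0: [0.95, 0.05], 1: [0.20, 0.80]}
P_L = {0: [0.10, 0.90], 1: [0.40, 0.60], 2: [0.99, 0.01]}

DOMAINS = {"d": 2, "i": 2, "g": 3, "s": 2, "l": 2}


def joint(d, i, g, s, l):
    """Joint probability of one full assignment via the chain rule."""
    return P_D[d] * P_I[i] * P_G[(i, d)][g] * P_S[i][s] * P_L[g][l]


def prob(query, evidence=None):
    """P(query | evidence), summing the joint over consistent assignments."""
    evidence = evidence or {}
    names = list(DOMAINS)
    num = den = 0.0
    for values in product(*(range(DOMAINS[n]) for n in names)):
        a = dict(zip(names, values))
        if any(a[k] != v for k, v in evidence.items()):
            continue  # assignment contradicts the evidence
        p = joint(**a)
        den += p
        if all(a[k] == v for k, v in query.items()):
            num += p
    return num / den


p_i1_g3 = prob({"i": 1}, {"g": 2})               # about 0.079
p_i1_g3_d1 = prob({"i": 1}, {"g": 2, "d": 1})    # about 0.109
p_i1_g3_s1 = prob({"i": 1}, {"g": 2, "s": 1})    # about 0.578
p_d1_g3 = prob({"d": 1}, {"g": 2})               # about 0.629
p_d1_g3_s1 = prob({"d": 1}, {"g": 2, "s": 1})    # about 0.760
```

Under these assumed numbers, learning that the class is hard slightly raises the belief in intelligence given the bad grade, and acing the SAT raises both the belief in intelligence and the belief that the class is difficult.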

3. Flow of Probabilistic Influence

An example of active trail:
PGM1_10

When no evidence is observed, a trail $X_1 - ... - X_k$ is active if it has no v-structures $X_{i-1} \rightarrow X_i \leftarrow X_{i+1}$.
Note: a v-structure is a structure in which two nodes point to the same node (like $X_{i-1} \rightarrow X_i \leftarrow X_{i+1}$).
The rules for how observed variables $Z$ affect the flow of influence are shown below.
PGM1_11
(The final row of the table in the picture above, for the v-structure $X \rightarrow W \leftarrow Y$, reads: the trail is blocked if $W$ and all of its descendants are not in $Z$ | it is active if either $W$ or one of its descendants is in $Z$.)
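
The rules above can be sketched as a small trail checker. This is an illustrative implementation under stated assumptions: the graph is given as a hypothetical `parents` dictionary, the trail is assumed to follow existing edges, and the endpoints are assumed not to be observed. The Student network structure used in the example is also an assumption, since the figure is not reproduced.

```python
def descendants(children, node):
    """All nodes reachable from `node` by following directed edges."""
    seen, stack = set(), [node]
    while stack:
        cur = stack.pop()
        for child in children.get(cur, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def is_active_trail(parents, trail, observed):
    """Check whether `trail` (a list of nodes) is active given `observed`."""
    children = {}
    for child, pars in parents.items():
        for par in pars:
            children.setdefault(par, set()).add(child)
    observed = set(observed)
    for prev, mid, nxt in zip(trail, trail[1:], trail[2:]):
        is_v_structure = prev in parents[mid] and nxt in parents[mid]
        if is_v_structure:
            # Active only if mid or one of its descendants is observed.
            if not (({mid} | descendants(children, mid)) & observed):
                return False
        elif mid in observed:
            # Chains and common parents are blocked by an observed middle node.
            return False
    return True


# Assumed structure of the Student network: D -> G <- I, I -> S, G -> L.
STUDENT_PARENTS = {"D": [], "I": [], "G": ["I", "D"], "S": ["I"], "L": ["G"]}

print(is_active_trail(STUDENT_PARENTS, ["D", "G", "I"], set()))   # v-structure, nothing observed
print(is_active_trail(STUDENT_PARENTS, ["D", "G", "I"], {"L"}))   # descendant of G observed
```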

Summary

PGM1_12

Independencies in BNs

Types of three-variable structures: chain (aka causal trail or evidential trail), common parent (aka common cause), v-structure (aka common effect)

Property: A variable $X$ is independent of its non-descendants given its parents

Property: A variable $X$ is independent of all other variables in the network given its Markov blanket, which consists of its parents, its children, and its co-parents
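
A minimal sketch of how the Markov blanket can be read off a DAG, assuming the graph is stored as a hypothetical `parents` dictionary (node to list of parents). The example structure is the Student network, assumed here because the figures are not reproduced.

```python
def markov_blanket(parents, node):
    """Parents, children, and co-parents (other parents of the children)."""
    children = {c for c, pars in parents.items() if node in pars}
    co_parents = {p for c in children for p in parents[c]}
    return (set(parents[node]) | children | co_parents) - {node}


# Assumed structure: D -> G <- I, I -> S, G -> L.
STUDENT_PARENTS = {"D": [], "I": [], "G": ["I", "D"], "S": ["I"], "L": ["G"]}

blanket_g = markov_blanket(STUDENT_PARENTS, "G")  # parents I, D and child L
blanket_i = markov_blanket(STUDENT_PARENTS, "I")  # children G, S and co-parent D
```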

Materials: Probabilistic Graphical Models 10-708 Recitation 1 Handout

4. Conditional Independence

Basics of conditional independence and its notation:
PGM1_13
The definition of conditional independence
PGM1_14
An example of conditional independence:
PGM1_15
If we have not been told whether the coin is fair, seeing heads on the first toss makes heads on the second toss more probable. Once we are told that the coin is fair, the two tosses are independent of each other.
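
A small numeric sketch of this example. The concrete numbers are assumptions for illustration (the figure is not reproduced): a coin is picked with equal probability from a fair one and a biased one that lands heads 90% of the time, then tossed twice.

```python
P_COIN = {"fair": 0.5, "biased": 0.5}    # assumed prior over which coin was picked
P_HEADS = {"fair": 0.5, "biased": 0.9}   # assumed P(heads | coin)

# Marginally, the two tosses are dependent.
p_h1 = sum(P_COIN[c] * P_HEADS[c] for c in P_COIN)            # P(X1 = H) = 0.7
p_h1_h2 = sum(P_COIN[c] * P_HEADS[c] ** 2 for c in P_COIN)    # P(X1 = H, X2 = H) = 0.53
p_h2_given_h1 = p_h1_h2 / p_h1                                # about 0.757, larger than 0.7

# Given the coin, the tosses are independent: the first toss adds nothing.
p_h2_given_h1_fair = P_HEADS["fair"] ** 2 / P_HEADS["fair"]   # equals P(X2 = H | fair) = 0.5
```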

5. Independencies in Bayesian Networks

Recap:
PGM1_16
A new question:
PGM1_17

Theorem: If $P$ factorizes over $G$, and $d\text{-}sep_G(X, Y|Z)$, then $P$ satisfies $(X \perp Y | Z)$

PGM1_18

PGM1_19

Red line: all the non-descendants of Letter.

The descendants of Letter are Job and Happy.

I-maps

PGM1_20
Example:
PGM1_21
G1 is an I-map of P1, while G2 is an I-map of both P1 and P2.

I-Maps
• I-Map: A graph G is an I-map for a distribution P if $I(G) \subseteq I(P)$
• Minimal I-Map: A graph G is a minimal I-map for a distribution P if you cannot remove any edges from G and have it still be an I-map for P
• Perfect I-Map: A graph G is a perfect I-map for a distribution P if $I(G) = I(P)$
• I-Equivalence: Two graphs $G_1$ and $G_2$ are I-equivalent if $I(G_1) = I(G_2)$
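
The I-map definition can be checked on a two-variable example. This is a sketch under assumptions: the joint tables `P1` and `P2` are made-up values (one independent, one not), `G1` is the graph with no edge between $D$ and $I$, and `G2` is the graph with an edge, so $I(G_1) = \{D \perp I\}$ and $I(G_2) = \emptyset$.

```python
def independencies(joint, tol=1e-9):
    """I(P) for a joint table over two binary variables, keyed by (d, i)."""
    p_d = {d: sum(p for (dd, _), p in joint.items() if dd == d) for d in (0, 1)}
    p_i = {i: sum(p for (_, ii), p in joint.items() if ii == i) for i in (0, 1)}
    holds = all(abs(joint[(d, i)] - p_d[d] * p_i[i]) < tol for (d, i) in joint)
    return {"D ⊥ I"} if holds else set()


def is_imap(graph_independencies, joint):
    """G is an I-map for P if I(G) is a subset of I(P)."""
    return graph_independencies <= independencies(joint)


I_G1 = {"D ⊥ I"}   # no edge between D and I
I_G2 = set()       # edge between D and I: no independence asserted

# Hypothetical distributions: P1 is a product of marginals, P2 is not.
P1 = {(0, 0): 0.42, (0, 1): 0.18, (1, 0): 0.28, (1, 1): 0.12}
P2 = {(0, 0): 0.40, (0, 1): 0.30, (1, 0): 0.20, (1, 1): 0.10}

print(is_imap(I_G1, P1), is_imap(I_G1, P2))  # G1 fits only the independent distribution
print(is_imap(I_G2, P1), is_imap(I_G2, P2))  # G2 asserts nothing, so it fits both
```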

PGM1_22

To illustrate one example in the picture: a node is independent of its non-descendants given its parents. Thus $P(S|D,I,G) = P(S|I)$.
Therefore the first equation in the picture is equal to the second equation.
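
A sketch of the kind of comparison this presumably refers to (the picture is not reproduced, so the exact equations and the Student variables $D, I, G, S, L$ are assumptions): the general chain rule, followed by the same product after dropping every conditioning variable that is a non-descendant and not a parent.

$$P(D,I,G,S,L) = P(D)\,P(I|D)\,P(G|D,I)\,P(S|D,I,G)\,P(L|D,I,G,S)$$

$$= P(D)\,P(I)\,P(G|D,I)\,P(S|I)\,P(L|G)$$

Each simplification uses the same property, for example $P(I|D) = P(I)$ because $I$ has no parents and $D$ is a non-descendant of $I$.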

Summary

PGM1_23

6. Naive Bayes

What independence assumption does the Naive Bayes model make?
Given the class variable, each observed variable is independent of the other observed variables.

PGM1_24
Given the class, the observed variables are independent of each other.
PGM1_25
green: prior probabilities of two classes
blue: odds ratio
An example of Bernoulli Naive Bayes for text
PGM1_26
PGM1_27
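
A minimal sketch of the Bernoulli naive Bayes computation for two classes: the posterior ratio is the prior ratio times one odds ratio per word. The class names, vocabulary, and probabilities below are hypothetical values chosen for illustration, not taken from the figures.

```python
def posterior_ratio(prior, word_probs, doc):
    """P(c1 | x) / P(c2 | x) for a two-class Bernoulli naive Bayes model.

    prior:      {"c1": P(c1), "c2": P(c2)}
    word_probs: {"c1": {word: P(word present | c1)}, "c2": {...}}
    doc:        {word: 1 if present else 0} for every word in the vocabulary
    """
    ratio = prior["c1"] / prior["c2"]            # ratio of the class priors
    for word, present in doc.items():
        p1, p2 = word_probs["c1"][word], word_probs["c2"][word]
        # Each word contributes an odds ratio, for presence or for absence.
        ratio *= (p1 / p2) if present else ((1 - p1) / (1 - p2))
    return ratio


# Hypothetical parameters and document.
prior = {"c1": 0.5, "c2": 0.5}
word_probs = {"c1": {"a": 0.8, "b": 0.1}, "c2": {"a": 0.2, "b": 0.5}}
doc = {"a": 1, "b": 0}

ratio = posterior_ratio(prior, word_probs, doc)   # 1 * (0.8 / 0.2) * (0.9 / 0.5) = 7.2
p_c1 = ratio / (1 + ratio)                        # posterior of c1, about 0.878
```

For long documents the product is usually accumulated as a sum of logarithms to avoid underflow; the plain product is kept here to mirror the formula.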

Summary

PGM1_28

7. Medical Diagnosis

PGM1_29

8. Knowledge Engineering Example -SAMIAM

SAMIAM

Related materials

Markov Random Fields

3.1 Independencies in MRFs

Two variables $X$ and $Y$ are independent if there is no active trail between them; a trail is active if it doesn’t contain any observed variables.

Property: A variable $X$ is independent of all other variables in the network given its Markov blanket, which consists of its direct neighbors in the graph.
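
The separation rule for MRFs amounts to a graph search that refuses to pass through observed nodes. A minimal sketch, assuming the undirected graph is given as a hypothetical `neighbors` dictionary and that the two endpoints are not themselves observed:

```python
from collections import deque


def separated(neighbors, x, y, observed):
    """True if every trail from x to y passes through an observed node."""
    observed = set(observed)
    seen, queue = {x}, deque([x])
    while queue:
        cur = queue.popleft()
        for nb in neighbors[cur]:
            if nb == y:
                return False          # reached y along an active trail
            if nb not in seen and nb not in observed:
                seen.add(nb)
                queue.append(nb)
    return True


# Hypothetical chain-shaped MRF: A - B - C.
CHAIN = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}

print(separated(CHAIN, "A", "C", set()))    # no evidence: A and C are connected
print(separated(CHAIN, "A", "C", {"B"}))    # observing B blocks the only trail
```

In this representation the Markov blanket of a node is simply `neighbors[node]`.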

3.2 Parameterization of MRFs

Markov random fields are parameterized by a set of factors defined over cliques in the graph; factors are not distributions as they do not have to sum to 1.

The joint probability distribution of the variables in an MRF can be written in factorized form as a normalized product of factors, i.e. $P(X_1, ..., X_n) = \frac{1}{Z}\prod_i \phi_i(C_i)$ where $C_i$ is the set of variables in the $i$th clique, and $Z$ is the partition function, i.e. the sum of $\prod_i \phi_i(C_i)$ over all joint assignments.
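
A minimal sketch of this parameterization on a three-variable chain $A - B - C$ with two pairwise factors. The factor values are hypothetical; they only need to be non-negative, and they do not sum to 1.

```python
from itertools import product

# Hypothetical factors that favour neighbouring variables agreeing.
phi_ab = {(0, 0): 10, (0, 1): 1, (1, 0): 1, (1, 1): 10}
phi_bc = {(0, 0): 10, (0, 1): 1, (1, 0): 1, (1, 1): 10}


def unnormalized(a, b, c):
    """Product of the clique factors for one joint assignment."""
    return phi_ab[(a, b)] * phi_bc[(b, c)]


# Partition function: sum of the factor product over all assignments.
Z = sum(unnormalized(a, b, c) for a, b, c in product((0, 1), repeat=3))


def prob(a, b, c):
    """Normalized joint probability P(A=a, B=b, C=c)."""
    return unnormalized(a, b, c) / Z


print(Z)              # 242 for these factor values
print(prob(0, 0, 0))  # 100 / 242
```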

SMO(Sequential Minimal Optimization)

Features:
- Memory: The amount of memory required for SMO is linear in the training set size, which allows SMO to handle very large training sets.
- Speed: SMO is fastest for linear SVMs and sparse data sets.
- Relevant materials: http://research.microsoft.com/pubs/69644/tr-98-14.pdf
