@wuqi0616
2018-03-29
Introductory Machine Learning Notes
$p(x;\theta)$: the probability density function of $x$, viewed as a function parameterized by $\theta$. To the left of the semicolon is the random variable; to the right are the model parameters.
$p(z;\theta)$ and $p(x \mid z;\theta)$: the former is the prior density of the latent variable $Z$ with parameter $\theta$; the latter is the conditional probability of $x$ given $Z$. Together they give the marginal $p(x;\theta) = \sum_z p(z;\theta)\, p(x \mid z;\theta)$.
Given observations $x_1, \dots, x_m$, the log-likelihood of the observed (incomplete) data is

$$\ell(\theta) = \sum_{j=1}^{m} \log p(x_j;\theta) = \sum_{j=1}^{m} \log \sum_{z} p(z;\theta)\, p(x_j \mid z;\theta).$$

Choose a suitable distribution $q(z)$ over the latent variable $z$, then look for parameters $\theta$ that maximize the expected likelihood. Differentiating $\ell(\theta)$ directly is still messy because of the sum inside the logarithm, but it can be maximized iteratively by alternating two steps:

(1) E-step: with $\theta$ fixed, set $q(z) = p(z \mid x;\theta)$, the posterior over the latent variable;
(2) M-step: with $q$ fixed, choose $\theta$ to maximize $\sum_j \sum_z q(z) \log p(x_j, z;\theta)$.
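A standard way to see why this alternation works (sketched here for completeness): by Jensen's inequality,

$$\ell(\theta) = \sum_{j}\log \sum_{z} q(z)\,\frac{p(x_j, z;\theta)}{q(z)} \;\ge\; \sum_{j}\sum_{z} q(z)\log \frac{p(x_j, z;\theta)}{q(z)},$$

with equality exactly when $q(z) = p(z \mid x_j;\theta)$. The E-step therefore makes the bound tight at the current $\theta$, and the M-step pushes the bound (and hence $\ell$) upward, so the likelihood never decreases across iterations.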
For a mixture of $k$ Gaussians, the probability density function is

$$p(x) = \sum_{i=1}^{k} \alpha_i\, \mathcal{N}(x \mid \mu_i, \Sigma_i), \qquad \sum_{i=1}^{k} \alpha_i = 1,$$

where $\mathcal{N}(x \mid \mu, \Sigma) = \dfrac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}} \exp\!\left(-\tfrac12 (x-\mu)^{\mathsf T}\Sigma^{-1}(x-\mu)\right)$ for $d$-dimensional $x$.

Let $\gamma_{ji} = p(z_j = i \mid x_j)$ denote the posterior probability that sample $x_j$ was generated by component $i$ (the E-step):

$$\gamma_{ji} = \frac{\alpha_i\, \mathcal{N}(x_j \mid \mu_i, \Sigma_i)}{\sum_{l=1}^{k} \alpha_l\, \mathcal{N}(x_j \mid \mu_l, \Sigma_l)}.$$

Updates (the M-step):

$$\mu_i = \frac{\sum_{j} \gamma_{ji}\, x_j}{\sum_{j} \gamma_{ji}}, \qquad
\Sigma_i = \frac{\sum_{j} \gamma_{ji}\,(x_j-\mu_i)(x_j-\mu_i)^{\mathsf T}}{\sum_{j} \gamma_{ji}}, \qquad
\alpha_i = \frac{1}{m}\sum_{j} \gamma_{ji}.$$
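As a check on the $\mu_i$ update, here is the standard one-line derivation (a sketch, not the original notes' wording): differentiate the expected complete-data log-likelihood with respect to $\mu_i$ and set it to zero,

$$\frac{\partial}{\partial \mu_i}\sum_{j=1}^{m}\gamma_{ji}\log \mathcal{N}(x_j \mid \mu_i, \Sigma_i) = \sum_{j=1}^{m}\gamma_{ji}\,\Sigma_i^{-1}(x_j - \mu_i) = 0 \;\Longrightarrow\; \mu_i = \frac{\sum_{j=1}^{m}\gamma_{ji}\,x_j}{\sum_{j=1}^{m}\gamma_{ji}}.$$

The $\Sigma_i$ and $\alpha_i$ updates follow the same pattern (for $\alpha_i$, with a Lagrange multiplier enforcing $\sum_i \alpha_i = 1$).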
Clustering
MOG-EM Algorithm
Pseudocode:
Input: sample set $D = \{x_1, x_2, \dots, x_m\}$;
number of Gaussian mixture components $k$.
Procedure:
1: initialize the model parameters $\{(\alpha_i, \mu_i, \Sigma_i) \mid 1 \le i \le k\}$
2: repeat
3: for $j = 1, 2, \dots, m$ do
4: compute the posterior probability that $x_j$ was generated by each mixture component, i.e. $\gamma_{ji} = p(z_j = i \mid x_j)$
5: end for
6: for $i = 1, 2, \dots, k$ do
7: compute the new mean vector: $\mu_i' = \frac{\sum_{j=1}^{m} \gamma_{ji}\, x_j}{\sum_{j=1}^{m} \gamma_{ji}}$
8: compute the new covariance matrix: $\Sigma_i' = \frac{\sum_{j=1}^{m} \gamma_{ji}\,(x_j - \mu_i')(x_j - \mu_i')^{\mathsf T}}{\sum_{j=1}^{m} \gamma_{ji}}$
9: compute the new mixing coefficient: $\alpha_i' = \frac{\sum_{j=1}^{m} \gamma_{ji}}{m}$
10: end for
11: update the model parameters to $\{(\alpha_i', \mu_i', \Sigma_i') \mid 1 \le i \le k\}$
12: until the stopping condition is met
13: $C_i = \varnothing \ (1 \le i \le k)$
14: for $j = 1, 2, \dots, m$ do
15: determine the cluster label $\lambda_j = \arg\max_{i} \gamma_{ji}$
16: add $x_j$ to the corresponding cluster: $C_{\lambda_j} = C_{\lambda_j} \cup \{x_j\}$
17: end for
Output: cluster partition $C = \{C_1, C_2, \dots, C_k\}$
%% Example of the Mixture-of-Gaussian EM algorithm
% 2018.3.28
% by WuQi
%% Algorithm Start
clear;clc;
tic; % Timing start
%% Read data
Data = load('data.txt');
%% Initialization parameters
k = 3;                                     % number of mixture components
[m,n] = size(Data);                        % m = 80 samples, n = 2 features
alpha = [1/3 1/3 1/3];                     % initial mixing coefficients
mu = [Data(6,:); Data(22,:); Data(27,:)];  % initial means: samples 6, 22, 27
Sigma(:,:,1)=[0.1,0.0;0.0,0.1]; % parameter sigma = [0.1,0.0;0.0,0.1]
Sigma(:,:,2)=[0.1,0.0;0.0,0.1];
Sigma(:,:,3)=[0.1,0.0;0.0,0.1];
itera = 50;                                % number of EM iterations
count = 0;
Save_mu = zeros(k,n,itera);                % snapshot of mu after each iteration
Save_index = zeros(m,1,itera);             % snapshot of cluster indices
Sample_itera = [5 10 20 50];               % iterations at which to plot
while count < itera                        % run exactly itera iterations
%% E-Step : posterior distribution (responsibilities)
gamma = zeros(m,k);
for j = 1:m
    pj = zeros(1,k);                       % unnormalized responsibilities for sample j
    for i = 1:k
        pj(i) = alpha(i) * Gf(Data(j,:),mu(i,:),Sigma(:,:,i));
    end
    gamma(j,:) = pj / sum(pj);             % normalize so each row sums to 1
end
[max_gamma,index] = max(gamma,[],2); % hard assignment: index of the most probable component
%% M-Step :Update the Parameters
sum3 = sum(gamma);
for i = 1:k
sum4 = 0; sum5 = zeros(2,2);
for j = 1:m
sum4 = sum4 + gamma(j,i) * Data(j,:);
end
mu1(i,:) = sum4 / sum3(i); % update the mu
for j = 1:m
Temp = Data(j,:)-mu1(i,:);
Temp = Temp' * Temp;
sum5 = sum5 + gamma(j,i) * Temp;
end
Sigma1(:,:,i) = sum5 / sum3(i); % update the sigma
alpha1(i) = sum3(i) / m; % update the alpha (mixing coefficient)
end
mu = mu1;
Sigma = Sigma1;
alpha = alpha1;
count = count + 1;
Save_mu(:,:,count) = mu; % save the parameter mu
Save_index(:,:,count) = index; % save the parameter index
end
%% Plot and Classification
shape = ['o' 's' '^'];   % marker shape per cluster
color = ['m' 'g' 'k'];   % marker color per cluster (valid MATLAB color codes)
padding = 'filled';      % draw filled markers
figure(1);
titles = {'(a) after 5 iterations','(b) after 10 iterations', ...
          '(c) after 20 iterations','(d) after 50 iterations'};
for p = 1:4
    subplot(2,2,p); hold on;
    t = Sample_itera(p);
    T_index = Save_index(:,:,t);
    for i = 1:k
        Temp = find(T_index == i);          % samples assigned to cluster i
        scatter(Data(Temp,1),Data(Temp,2),36,color(i),shape(i),padding);
    end
    scatter(Save_mu(:,1,t),Save_mu(:,2,t),60,'r','+'); % component means
    xlabel('density'); ylabel('sugar content');
    title(titles{p});
end
toc; % Timing end
---
%% Likelihood Function (2-D Gaussian density)
function f = Gf(x,u,s)
% Gf  Density of a 2-D Gaussian with mean u and covariance s at row vector x.
sum1 = (-1/2)*(x-u)*(s\(x-u)');   % quadratic form: -(x-u)*inv(s)*(x-u)'/2
sum2 = 1/(2*pi*sqrt(det(s)));     % normalizer (2*pi)^(d/2)*|s|^(1/2) with d = 2
f = sum2*exp(sum1);
end
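Note that Gf hardcodes the two-dimensional normalizing constant $(2\pi)^{d/2}$ with $d = 2$. If the Statistics and Machine Learning Toolbox is available, mvnpdf computes the same density for any dimension; a minimal sanity-check sketch (the test point below is arbitrary, chosen only for illustration):

% Optional sanity check against the toolbox implementation (requires the
% Statistics and Machine Learning Toolbox). Test values are arbitrary.
x = [0.40 0.25]; u = [0.50 0.30]; s = [0.1 0.0; 0.0 0.1];
fprintf('Gf = %.6g, mvnpdf = %.6g\n', Gf(x,u,s), mvnpdf(x,u,s));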
Result plots:
Clustering results of Gaussian mixture clustering ($k = 3$) after different numbers of iterations. Sample points belonging to clusters $C_1$, $C_2$ and $C_3$ are drawn as circles, squares and triangles respectively, and the mean vector of each Gaussian mixture component is marked with "+".
Example (the three-coin model)
Suppose there are three coins, denoted A, B and C, whose probabilities of coming up heads are $\pi$, $p$ and $q$ respectively.
Experiment: first toss coin A; if it comes up heads, toss coin B, otherwise toss coin C. Record the result of the second toss as 1 for heads and 0 for tails. Repeat the trial independently 10 times.
Observed sample:
1,1,0,1,0,0,1,0,1,1
Only the final outcome of each trial is observed, not the process. How can we estimate the probability of heads of each of the three coins, i.e. the model parameters $\theta = (\pi, p, q)$?
Solution:
The three-coin model writes the probability of an observation $x \in \{0,1\}$ as

$$P(x;\theta) = \pi\, p^{x}(1-p)^{1-x} + (1-\pi)\, q^{x}(1-q)^{1-x}.$$

E-step: compute the responsibility $\mu_j$, the posterior probability that observation $x_j$ came from coin B:

$$\mu_j = \frac{\pi\, p^{x_j}(1-p)^{1-x_j}}{\pi\, p^{x_j}(1-p)^{1-x_j} + (1-\pi)\, q^{x_j}(1-q)^{1-x_j}}.$$

M-step: compute the new estimates of the model parameters:

$$\pi' = \frac{1}{m}\sum_{j=1}^{m}\mu_j, \qquad
p' = \frac{\sum_{j=1}^{m}\mu_j x_j}{\sum_{j=1}^{m}\mu_j}, \qquad
q' = \frac{\sum_{j=1}^{m}(1-\mu_j)\, x_j}{\sum_{j=1}^{m}(1-\mu_j)}.$$
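For concreteness, with the initial values used in the code below ($\pi = 0.4$, $p = 0.6$, $q = 0.7$), the first E-step gives the same responsibility for every heads observation and for every tails observation:

$$\mu_j\big|_{x_j=1} = \frac{0.4 \times 0.6}{0.4 \times 0.6 + 0.6 \times 0.7} = \frac{0.24}{0.66} \approx 0.3636, \qquad
\mu_j\big|_{x_j=0} = \frac{0.4 \times 0.4}{0.4 \times 0.4 + 0.6 \times 0.3} = \frac{0.16}{0.34} \approx 0.4706.$$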
NB-EM Algorithm
Pseudocode:
Input: sample set $D = \{x_1, x_2, \dots, x_m\}$, $x_j \in \{0,1\}$;
the hidden state is the outcome of coin A, which selects coin B or coin C.
Procedure:
1: initialize the model parameters $(\pi, p, q)$
2: repeat
3: for $j = 1, 2, \dots, m$ do
4: compute the posterior probability that $x_j$ was generated by coin B or coin C, i.e. $\mu_j$ and $1 - \mu_j$
5: end for
6: for each parameter do
7: compute the new parameter: $\pi' = \frac{1}{m}\sum_{j=1}^{m}\mu_j$
8: compute the new parameters: $p' = \frac{\sum_j \mu_j x_j}{\sum_j \mu_j}$, $q' = \frac{\sum_j (1-\mu_j) x_j}{\sum_j (1-\mu_j)}$
9: end for
10: update the model parameters to $(\pi', p', q')$
11: until the stopping condition is met
12: $C_i = \varnothing \ (i = 1, 2)$
13: for $j = 1, 2, \dots, m$ do
14: determine the cluster label $\lambda_j$ (coin B or coin C, whichever has the larger responsibility)
15: add $x_j$ to the corresponding cluster $C_{\lambda_j}$
16: end for
Output: cluster partition $C = \{C_1, C_2\}$
%% Example of the Naive-Bayes EM algorithm
% 2018.3.29
% by WuQi
%% Algorithm Start
clear;clc;
tic; % Timing start
%% Read data
Data = [1 1 0 1 0 0 1 0 1 1]';
%% Initialization parameters
k = 2;                       % two hidden states: coin B or coin C
[m,n] = size(Data);          % m = 10, n = 1
p_pi = 0.4;                  % initial pi (probability coin A shows heads)
parameter = [0.6 0.7];       % initial [p q] (heads probabilities of B and C)
itera = 50;                    % number of EM iterations
count = 0;
Save_index = zeros(m,1,itera); % snapshot of cluster indices
Sample_itera = [5 10 20 50];   % iterations at which to plot
while count < itera            % run exactly itera iterations
%% E-Step :Posterior Distribution
sum1 = zeros(m,k);
for j = 1:m
    for i = 1:k
        % weight is pi for i = 1 (coin B) and 1-pi for i = 2 (coin C)
        sum1(j,i) = (p_pi^(2-i)) * ((1-p_pi)^(i-1)) * Bf(Data(j),parameter(i));
    end
    gamma(j,:) = sum1(j,:) / sum(sum1(j,:)); % normalized responsibilities
end
[max_gamma,index] = max(gamma,[],2); % hard assignment: index of the most probable coin
%% M-Step :Update the Parameters
sum3 = sum(gamma);
p_pi1 = sum3(1) / m; % update the pi
for i = 1:k
sum4 = 0;
for j = 1:m
sum4 = sum4 + gamma(j,i) * Data(j,:);
end
parameter1(i) = sum4 / sum3(i); % update the parameter
end
p_pi = p_pi1;
parameter = parameter1;
count = count + 1;
Save_index(:,:,count) = index; % save the parameter index
end
%% Plot and Classification
shape = ['o' 's' '^'];   % marker shape per cluster
color = ['m' 'g' 'k'];   % marker color per cluster (valid MATLAB color codes)
padding = 'filled';      % draw filled markers
figure(1);
titles = {'(a) after 5 iterations','(b) after 10 iterations', ...
          '(c) after 20 iterations','(d) after 50 iterations'};
for p = 1:4
    subplot(2,2,p); hold on;
    T_index = Save_index(:,:,Sample_itera(p));
    for i = 1:k
        Temp = find(T_index == i);          % tosses attributed to coin i
        scatter(1:numel(Temp),Data(Temp),36,color(i),shape(i),padding);
    end
    xlabel('toss'); ylabel('outcome');
    title(titles{p});
end
p_pi       % display the final estimate of pi
parameter  % display the final estimates of [p q]
toc; % Timing end
---
%% Example of the Naive-Bayes EM algorithm
% Bernoulli likelihood function
% 2018.3.29
% by WuQi
function f = Bf(x,parameter)
% Bf  Bernoulli probability of outcome x (0 or 1) given heads probability 'parameter'.
f = (parameter^x) * ((1-parameter)^(1-x));
end
In general:
$Y$: the data of the observed random variable;
$Z$: the data of the hidden (latent) random variable;
$Y$ and $Z$ together are called the complete data.
Assume the observed data $Y$ has probability distribution $P(Y;\theta)$, where $\theta$ is the model parameter to be estimated; the likelihood function of the incomplete data $Y$ is then $P(Y;\theta)$, with log-likelihood $L(\theta) = \log P(Y;\theta)$.
Assume the joint probability distribution of $Y$ and $Z$ is $P(Y,Z;\theta)$; then the complete-data log-likelihood is $\log P(Y,Z;\theta)$.
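The stopping rule below refers to the $Q$ function; since its definition is elided above, here is the standard one: the expectation of the complete-data log-likelihood given the observed data and the current parameter estimate,

$$Q(\theta, \theta^{(i)}) = E_Z\!\left[\log P(Y, Z;\theta) \;\middle|\; Y, \theta^{(i)}\right].$$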
Final remarks:
1. The EM algorithm is sensitive to initial values.
2. The usual stopping condition is that successive iterates change little: stop when

$$\left\|\theta^{(i+1)} - \theta^{(i)}\right\| < \varepsilon_1 \quad \text{or} \quad \left|Q(\theta^{(i+1)}, \theta^{(i)}) - Q(\theta^{(i)}, \theta^{(i)})\right| < \varepsilon_2$$

for pre-chosen small positive $\varepsilon_1, \varepsilon_2$.
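To make remark 1 concrete, here is a minimal, self-contained MATLAB sketch (not part of the original notes) that runs the three-coin EM updates from two different starting points; the two runs converge to different estimates:

%% Sensitivity to initial values: a minimal sketch for the three-coin model
x = [1 1 0 1 0 0 1 0 1 1]';              % observed tosses
inits = [0.5 0.5 0.5;                    % starting point 1: [pi p q]
         0.4 0.6 0.7];                   % starting point 2: [pi p q]
for r = 1:size(inits,1)
    p_pi = inits(r,1); p = inits(r,2); q = inits(r,3);
    for t = 1:100                        % plenty of iterations to converge
        % E-step: responsibility that each toss came from coin B
        num = p_pi .* p.^x .* (1-p).^(1-x);
        den = num + (1-p_pi) .* q.^x .* (1-q).^(1-x);
        mu = num ./ den;
        % M-step: closed-form parameter updates
        p_pi = mean(mu);
        p = sum(mu .* x) / sum(mu);
        q = sum((1-mu) .* x) / sum(1-mu);
    end
    fprintf('init %d: pi = %.4f, p = %.4f, q = %.4f\n', r, p_pi, p, q);
end

With the first start every responsibility is identical, so EM stalls at $\pi = 0.5$, $p = q = 0.6$; the second start (the one used in the script above) converges to a different estimate, illustrating the sensitivity.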