@wuqi0616
2018-03-29
Introductory machine learning material
$p(x;\theta)$: the probability density function of $x$, a function with parameter $\theta$. To the left of the semicolon is the random variable; to the right is the model parameter.
$p(z;\theta)$ and $p(x \mid z;\theta)$: the former is the prior probability density of $Z$ with parameter $\theta$; the latter is the conditional probability of $x$ given $Z$. Together they give $p(x;\theta) = \sum_{z} p(z;\theta)\, p(x \mid z;\theta)$.
Choose a suitable distribution $Q(z)$ over the latent variable $z$.
Find the parameter $\theta$ that maximizes the expected log-likelihood.
Maximizing by direct differentiation is still cumbersome, but the maximization can be carried out iteratively:
(1) E-step: by Bayes' rule, take $Q(z) = p(z \mid x;\theta^{(t)})$ under the current estimate $\theta^{(t)}$;
(2) M-step: update $\theta^{(t+1)} = \arg\max_{\theta} \sum_{z} Q(z) \log p(x, z;\theta)$.
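Why the posterior is the right choice for $Q$ (a standard step of the EM derivation, filled in here for completeness): for any distribution $Q(z)$, Jensen's inequality gives

$$\log p(x;\theta) = \log \sum_{z} Q(z)\, \frac{p(x, z;\theta)}{Q(z)} \;\ge\; \sum_{z} Q(z) \log \frac{p(x, z;\theta)}{Q(z)},$$

and the bound is tight exactly when $Q(z) = p(z \mid x;\theta)$. The E-step therefore makes the lower bound touch the log-likelihood at the current $\theta^{(t)}$, and the M-step pushes the bound up, so iterating the two steps can never decrease $\log p(x;\theta)$.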

For the Gaussian mixture model, each mixture component's probability density function is:

$$p(\boldsymbol{x} \mid \boldsymbol{\mu}, \Sigma) = \frac{1}{(2\pi)^{n/2}\, |\Sigma|^{1/2}} \exp\!\left( -\tfrac{1}{2} (\boldsymbol{x} - \boldsymbol{\mu})^{\mathrm{T}} \Sigma^{-1} (\boldsymbol{x} - \boldsymbol{\mu}) \right),$$

and the mixture distribution is $p_M(\boldsymbol{x}) = \sum_{i=1}^{k} \alpha_i\, p(\boldsymbol{x} \mid \boldsymbol{\mu}_i, \Sigma_i)$ with mixing coefficients $\alpha_i > 0$, $\sum_{i=1}^{k} \alpha_i = 1$.
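As a quick sanity check of this density (and of the Gf helper in the program below), one can compare a hand-written evaluation against MATLAB's built-in mvnpdf; this assumes the Statistics and Machine Learning Toolbox is available, and the numeric values are arbitrary examples:

% Sanity check of the Gaussian density (sketch; values are arbitrary)
mu = [0.3 0.4];                          % example mean vector
S  = [0.1 0.0; 0.0 0.1];                 % example covariance matrix
x  = [0.35 0.45];                        % example query point
d  = numel(mu);                          % data dimension
f1 = exp(-0.5*(x-mu)/S*(x-mu)') / ((2*pi)^(d/2)*sqrt(det(S)));
f2 = mvnpdf(x, mu, S);                   % built-in reference (Statistics Toolbox)
fprintf('hand-written: %.6f, mvnpdf: %.6f\n', f1, f2);  % the two should agree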
Let $\gamma_{ji}$ denote the posterior probability that sample $\boldsymbol{x}_j$ was generated by the $i$-th mixture component; by Bayes' rule,

$$\gamma_{ji} = p_M(z_j = i \mid \boldsymbol{x}_j) = \frac{\alpha_i\, p(\boldsymbol{x}_j \mid \boldsymbol{\mu}_i, \Sigma_i)}{\sum_{l=1}^{k} \alpha_l\, p(\boldsymbol{x}_j \mid \boldsymbol{\mu}_l, \Sigma_l)}.$$
Updates:

$$\boldsymbol{\mu}_i = \frac{\sum_{j=1}^{m} \gamma_{ji}\, \boldsymbol{x}_j}{\sum_{j=1}^{m} \gamma_{ji}}, \qquad \Sigma_i = \frac{\sum_{j=1}^{m} \gamma_{ji}\, (\boldsymbol{x}_j - \boldsymbol{\mu}_i)(\boldsymbol{x}_j - \boldsymbol{\mu}_i)^{\mathrm{T}}}{\sum_{j=1}^{m} \gamma_{ji}}, \qquad \alpha_i = \frac{1}{m} \sum_{j=1}^{m} \gamma_{ji}.$$
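For reference, the mean and mixing-coefficient updates vectorize naturally in MATLAB; the sketch below uses toy data (the random Data and gamma are illustrative, not from the example program, and implicit expansion requires R2016b or later):

% Vectorized M-step updates on toy data (sketch)
Data  = rand(6, 2);                      % toy data: 6 samples, 2 features
gamma = rand(6, 3); gamma = gamma ./ sum(gamma, 2);  % toy posteriors, rows sum to 1
Nk    = sum(gamma, 1);                   % effective number of points per component (1 x 3)
mu    = (gamma' * Data) ./ Nk';          % new mean vectors, one row per component (3 x 2)
alpha = Nk / size(Data, 1);              % new mixture coefficients (sum to 1)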
Cluster assignment: after the iterations stop, each sample $\boldsymbol{x}_j$ is put into the cluster with the largest posterior probability, $\lambda_j = \arg\max_{i \in \{1,\ldots,k\}} \gamma_{ji}$.
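Continuing the toy gamma from the sketch above, this hard assignment is a single max over the rows of the posterior matrix:

% Assign each sample to the component with the largest posterior
[~, lambda] = max(gamma, [], 2);         % lambda(j) is the cluster label of sample j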
MOG-EM Algorithm
Pseudocode:
Input: sample set $D = \{\boldsymbol{x}_1, \boldsymbol{x}_2, \ldots, \boldsymbol{x}_m\}$;
number of Gaussian mixture components $k$.
Procedure:
1: initialize the model parameters $\{(\alpha_i, \boldsymbol{\mu}_i, \Sigma_i) \mid 1 \le i \le k\}$
2: repeat
3: for $j = 1, 2, \ldots, m$ do
4: compute the posterior probability that $\boldsymbol{x}_j$ was generated by each mixture component, i.e. $\gamma_{ji} = p_M(z_j = i \mid \boldsymbol{x}_j)$ for $1 \le i \le k$
5: end for
6: for $i = 1, 2, \ldots, k$ do
7: compute the new mean vector: $\boldsymbol{\mu}_i' = \dfrac{\sum_{j=1}^{m} \gamma_{ji}\, \boldsymbol{x}_j}{\sum_{j=1}^{m} \gamma_{ji}}$
8: compute the new covariance matrix: $\Sigma_i' = \dfrac{\sum_{j=1}^{m} \gamma_{ji}\, (\boldsymbol{x}_j - \boldsymbol{\mu}_i')(\boldsymbol{x}_j - \boldsymbol{\mu}_i')^{\mathrm{T}}}{\sum_{j=1}^{m} \gamma_{ji}}$
9: compute the new mixing coefficient: $\alpha_i' = \dfrac{\sum_{j=1}^{m} \gamma_{ji}}{m}$
10: end for
11: update the model parameters to $\{(\alpha_i', \boldsymbol{\mu}_i', \Sigma_i') \mid 1 \le i \le k\}$
12: until the stopping condition is met
13: $C_i = \varnothing\ (1 \le i \le k)$
14: for $j = 1, 2, \ldots, m$ do
15: determine the cluster label $\lambda_j = \arg\max_{i} \gamma_{ji}$
16: add $\boldsymbol{x}_j$ to the corresponding cluster $C_{\lambda_j}$
17: end for
Output: cluster partition $C = \{C_1, C_2, \ldots, C_k\}$
%% Example of the Mixture-of-Gaussian EM algorithm
% 2018.3.28
% by WuQi
%% Algorithm Start
clear; clc;
tic;                                     % Timing start
%% Read data
Data = load('data.txt');
%% Initialization parameters
k = 3;                                   % cluster number is 3
[m, n] = size(Data);                     % m = 80, n = 2
alpha = [1/3 1/3 1/3];                   % mixing coefficients, initialized to 1/3
mu = [Data(6,:); Data(22,:); Data(27,:)];% initial mean vectors taken from the data
Sigma(:,:,1) = [0.1 0.0; 0.0 0.1];       % initial covariance matrices
Sigma(:,:,2) = [0.1 0.0; 0.0 0.1];
Sigma(:,:,3) = [0.1 0.0; 0.0 0.1];
itera = 50;                              % number of iterations
count = 0;
Save_mu = zeros(k, n, itera);            % saved mean vectors per iteration
Save_index = zeros(m, 1, itera);         % saved cluster indices per iteration
Sample_itera = [5 10 20 50];             % iterations at which to plot snapshots
gamma = zeros(m, k);                     % posterior probabilities
while count < itera                      % fixed: <= ran one iteration past the preallocation
    %% E-Step: posterior distribution
    sum1 = zeros(m, k);
    for j = 1:m
        for i = 1:k
            sum1(j,i) = alpha(i) * Gf(Data(j,:), mu(i,:), Sigma(:,:,i));
        end
        gamma(j,:) = sum1(j,:) / sum(sum1(j,:));
    end
    [~, index] = max(gamma, [], 2);      % classification: keep the largest posterior
    %% M-Step: update the parameters
    sum3 = sum(gamma);
    for i = 1:k
        sum4 = 0; sum5 = zeros(2,2);
        for j = 1:m
            sum4 = sum4 + gamma(j,i) * Data(j,:);
        end
        mu1(i,:) = sum4 / sum3(i);       % update mu
        for j = 1:m
            Temp = Data(j,:) - mu1(i,:);
            sum5 = sum5 + gamma(j,i) * (Temp' * Temp);
        end
        Sigma1(:,:,i) = sum5 / sum3(i);  % update Sigma
        alpha1(i) = sum3(i) / m;         % update alpha
    end
    mu = mu1; Sigma = Sigma1; alpha = alpha1;
    count = count + 1;
    Save_mu(:,:,count) = mu;             % save mu
    Save_index(:,:,count) = index;       % save cluster indices
end
%% Plot and Classification
shape = ['o' 's' '^'];                   % marker shapes
color = ['b' 'g' 'k'];                   % marker colors ('p' is not a valid color spec; changed to 'b')
titles = {'(a) After 5 iterations', '(b) After 10 iterations', ...
          '(c) After 20 iterations', '(d) After 50 iterations'};
figure(1);
for s = 1:4
    it = Sample_itera(s);
    subplot(2,2,s); hold on;
    T_index = Save_index(:,:,it);
    for i = 1:k
        Temp = find(T_index == i);
        scatter(Data(Temp,1), Data(Temp,2), 36, color(i), shape(i), 'filled');
    end
    scatter(Save_mu(:,1,it), Save_mu(:,2,it), 60, 'r', '+');  % component means
    xlabel('Density'); ylabel('Sugar content'); title(titles{s});
end
toc;                                     % Timing end
---
%% Likelihood Function (multivariate Gaussian density)
function f = Gf(x, u, s)
d = numel(u);                            % data dimension
sum1 = -0.5 * (x - u) / s * (x - u)';    % Mahalanobis term
sum2 = 1 / ((2*pi)^(d/2) * sqrt(det(s)));% normalization constant
f = sum2 * exp(sum1);
end
Result figure:
Clustering results of Gaussian mixture clustering ($k = 3$) after different numbers of iterations. Sample points of clusters $C_1$, $C_2$, $C_3$ are drawn as circles, squares, and triangles respectively, and the mean vectors of the Gaussian mixture components are marked with "+".
Example (Three-Coin Model)
Suppose there are three coins, labeled A, B, and C, whose probabilities of landing heads are $\pi$, $p$, and $q$ respectively.
Experiment: first toss coin A; if it lands heads, toss coin B, otherwise toss coin C. Record the result of the second toss as 1 for heads and 0 for tails. Repeat the experiment independently 10 times.
The observed sample (1 = heads, 0 = tails):
1,1,0,1,0,0,1,0,1,1
Only the final result of each trial is observed, not the process. How can we estimate the probability of heads for each of the three coins, i.e. the model parameters $\theta = (\pi, p, q)$?
Solution:
The three-coin model:

$$P(y;\theta) = \sum_{z} P(y, z;\theta) = \pi\, p^{y}(1-p)^{1-y} + (1-\pi)\, q^{y}(1-q)^{1-y},$$

where $y \in \{0, 1\}$ is the observed result of one trial and the latent variable $z$ is the (unobserved) result of tossing coin A.
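To make the model concrete, here is a small sketch that evaluates the observed-data log-likelihood of the sample above for a given parameter guess (the values of pi_, p, q are arbitrary illustrations):

% Observed-data log-likelihood of the three-coin model (sketch)
y   = [1 1 0 1 0 0 1 0 1 1];            % the observed sample
pi_ = 0.5; p = 0.5; q = 0.5;            % arbitrary parameter guess
Py  = pi_ .* p.^y .* (1-p).^(1-y) + (1-pi_) .* q.^y .* (1-q).^(1-y);
L   = sum(log(Py));                     % log-likelihood of the whole sample
fprintf('log-likelihood = %.4f\n', L);  % = 10*log(0.5) for this symmetric guess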
E-Step
Compute the posterior probability $\mu_j$ that observation $y_j$ came from coin B (i.e. that coin A landed heads), under the current estimate $\theta^{(t)} = (\pi, p, q)$:

$$\mu_j = \frac{\pi\, p^{y_j}(1-p)^{1-y_j}}{\pi\, p^{y_j}(1-p)^{1-y_j} + (1-\pi)\, q^{y_j}(1-q)^{1-y_j}}.$$

M-Step
Compute the new estimates of the model parameters:

$$\pi^{(t+1)} = \frac{1}{m}\sum_{j=1}^{m}\mu_j, \qquad p^{(t+1)} = \frac{\sum_{j=1}^{m}\mu_j\, y_j}{\sum_{j=1}^{m}\mu_j}, \qquad q^{(t+1)} = \frac{\sum_{j=1}^{m}(1-\mu_j)\, y_j}{\sum_{j=1}^{m}(1-\mu_j)}.$$
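Under the symmetric starting point $\pi = p = q = 0.5$, every $\mu_j$ equals $0.5$, and one E/M round gives $\pi = 0.5$, $p = q = 0.6$ (six heads out of ten). A sketch verifying this numerically:

% One E/M iteration of the three-coin model (sketch)
y   = [1 1 0 1 0 0 1 0 1 1];
pi_ = 0.5; p = 0.5; q = 0.5;            % symmetric initial guess
% E-step: posterior that each observation came from coin B
numB = pi_ .* p.^y .* (1-p).^(1-y);
numC = (1-pi_) .* q.^y .* (1-q).^(1-y);
mu   = numB ./ (numB + numC);           % all 0.5 here, since p == q
% M-step: closed-form parameter updates
pi_  = mean(mu);                        % -> 0.5
p    = sum(mu .* y) / sum(mu);          % -> 0.6 (6 heads out of 10)
q    = sum((1-mu) .* y) / sum(1-mu);    % -> 0.6
fprintf('pi = %.2f, p = %.2f, q = %.2f\n', pi_, p, q);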
NB-EM Algorithm
Pseudocode:
Input: sample set $Y = \{y_1, y_2, \ldots, y_m\}$;
the latent state of coin A is $z \in \{\text{heads (toss B)}, \text{tails (toss C)}\}$.
Procedure:
1: initialize the model parameters $\theta^{(0)} = (\pi, p, q)$
2: repeat
3: for $j = 1, 2, \ldots, m$ do
4: compute the posterior probabilities that $y_j$ was generated by coin B or coin C, i.e. $\gamma_{j1} = \mu_j$ and $\gamma_{j2} = 1 - \mu_j$, with $\mu_j$ as defined in the E-step above
5: end for
6: compute the new parameter: $\pi' = \dfrac{1}{m}\sum_{j=1}^{m} \mu_j$
7: compute the new parameters: $p' = \dfrac{\sum_{j=1}^{m} \mu_j\, y_j}{\sum_{j=1}^{m} \mu_j}$, $q' = \dfrac{\sum_{j=1}^{m} (1-\mu_j)\, y_j}{\sum_{j=1}^{m} (1-\mu_j)}$
8: update the model parameters to $\theta' = (\pi', p', q')$
9: until the stopping condition is met
10: $C_i = \varnothing\ (i = 1, 2)$
11: for $j = 1, 2, \ldots, m$ do
12: determine the cluster label $\lambda_j = \arg\max_{i} \gamma_{ji}$
13: add $y_j$ to the corresponding cluster $C_{\lambda_j}$
14: end for
Output: cluster partition $C = \{C_1, C_2\}$
%% Example of the Naive-Bayes EM algorithm
% 2018.3.29
% by WuQi
%% Algorithm Start
clear; clc;
tic;                                     % Timing start
%% Read data
Data = [1 1 0 1 0 0 1 0 1 1]';
%% Initialization parameters
k = 2;                                   % cluster number is 2 (coin B or coin C)
[m, n] = size(Data);                     % m = 10, n = 1
p_pi = 0.4;                              % parameter pi
parameter = [0.6 0.7];                   % parameters p, q
itera = 50;                              % number of iterations
count = 0;
Save_index = zeros(m, 1, itera);         % saved cluster indices per iteration
Sample_itera = [5 10 20 50];             % iterations at which to plot snapshots
gamma = zeros(m, k);                     % posterior probabilities
while count < itera                      % fixed: <= ran one iteration past the preallocation
    %% E-Step: posterior distribution
    sum1 = zeros(m, k);
    for j = 1:m
        for i = 1:k
            sum1(j,i) = (p_pi^(2-i)) * ((1-p_pi)^(i-1)) * Bf(Data(j), parameter(i));
        end
        gamma(j,:) = sum1(j,:) / sum(sum1(j,:));
    end
    [~, index] = max(gamma, [], 2);      % classification: keep the largest posterior
    %% M-Step: update the parameters
    sum3 = sum(gamma);
    p_pi1 = sum3(1) / m;                 % update pi
    for i = 1:k
        sum4 = 0;
        for j = 1:m
            sum4 = sum4 + gamma(j,i) * Data(j,:);
        end
        parameter1(i) = sum4 / sum3(i);  % update p (i = 1) and q (i = 2)
    end
    p_pi = p_pi1;
    parameter = parameter1;
    count = count + 1;
    Save_index(:,:,count) = index;       % save cluster indices
end
%% Plot and Classification
shape = ['o' 's' '^'];                   % marker shapes
color = ['b' 'g' 'k'];                   % marker colors ('p' is not a valid color spec; changed to 'b')
titles = {'(a) After 5 iterations', '(b) After 10 iterations', ...
          '(c) After 20 iterations', '(d) After 50 iterations'};
figure(1);
for s = 1:4
    subplot(2,2,s); hold on;
    T_index = Save_index(:,:,Sample_itera(s));
    for i = 1:k
        Temp = find(T_index == i);
        a = numel(Temp);
        scatter(1:a, Data(Temp), 36, color(i), shape(i), 'filled');
    end
    xlabel('Trial'); ylabel('Outcome'); title(titles{s});
end
p_pi                                     % display the final parameters
parameter
toc;                                     % Timing end
---
%% Example of the Naive-Bayes EM algorithm
% Likelihood Function (Bernoulli density)
% 2018.3.29
% by WuQi
function f = Bf(x, parameter)
f = (parameter ^ x) * ((1 - parameter) ^ (1 - x));
end
In general:
$Y$: denotes the data of the observed random variable;
$Z$: denotes the data of the latent (hidden) random variable;
$Y$ and $Z$ together are called the complete data (complete-data).
Assume the observed data $Y$ has probability distribution $P(Y;\theta)$, where $\theta$ is the model parameter to be estimated. The likelihood function of the incomplete data $Y$ is then $P(Y;\theta)$, with log-likelihood $L(\theta) = \log P(Y;\theta)$.
Assume the joint probability distribution of $Y$ and $Z$ is $P(Y, Z;\theta)$; the log-likelihood of the complete data is then $\log P(Y, Z;\theta)$.
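The "expected likelihood" maximized in the M-step ties these two quantities together. In standard notation (not written out in the original text, but implied by the steps above), the EM objective is the Q function:

$$Q(\theta, \theta^{(i)}) = \mathbb{E}_{Z}\!\left[\log P(Y, Z;\theta) \,\middle|\, Y, \theta^{(i)}\right] = \sum_{Z} P(Z \mid Y;\theta^{(i)}) \log P(Y, Z;\theta).$$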
Final remarks:
1. The EM algorithm is sensitive to the choice of initial values.
2. The iteration stops when $\|\theta^{(i+1)} - \theta^{(i)}\| < \varepsilon_1$, or when $\|Q(\theta^{(i+1)}, \theta^{(i)}) - Q(\theta^{(i)}, \theta^{(i)})\| < \varepsilon_2$, for small positive thresholds $\varepsilon_1, \varepsilon_2$.
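Here is a sketch of stopping criterion (1) applied to the three-coin example (the programs above simply run a fixed 50 iterations instead; tol is an illustrative threshold, not from the original):

% Stopping-criterion sketch: iterate the three-coin EM until the parameter
% change falls below epsilon_1, instead of using a fixed iteration count
y = [1 1 0 1 0 0 1 0 1 1];
theta = [0.4 0.6 0.7];                   % [pi p q], same start as the program above
tol = 1e-8;                              % illustrative threshold epsilon_1
for t = 1:1000                           % safety cap on the number of iterations
    piv = theta(1); p = theta(2); q = theta(3);
    numB = piv .* p.^y .* (1-p).^(1-y);  % E-step
    numC = (1-piv) .* q.^y .* (1-q).^(1-y);
    mu = numB ./ (numB + numC);
    thetaNew = [mean(mu), sum(mu.*y)/sum(mu), sum((1-mu).*y)/sum(1-mu)];  % M-step
    if norm(thetaNew - theta) < tol      % stopping criterion (1)
        break;
    end
    theta = thetaNew;
end
fprintf('stopped after %d iterations: pi=%.4f p=%.4f q=%.4f\n', t, theta(1), theta(2), theta(3));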