Zheng Jiongbin
StuID: 20303666
Final Report
Kobe Bean Bryant is an American professional basketball player, born on August 23, 1978. He played for the Los Angeles Lakers of the National Basketball Association (NBA) for his entire career, having entered the NBA directly out of high school. Kobe Bryant's successful career as a basketball player has inspired many young people. However, at the beginning of his 20th season with the Lakers, Kobe made a surprising announcement that he would retire at the end of the season.
The aim of this project is to predict whether a shot Kobe is about to take during a game will score. At the moment he takes the shot, we intend to predict as accurately as possible whether the ball will go in or not. In order to achieve this aim, we separated the project into four sub-tasks, as listed below:
In the implementation of the Neural Network, the first step is to determine the network architecture, which includes the number of hidden layers as well as the number of units in each layer.
Since Task 1 aims at solving a binary classification problem, a single output unit is enough, with the value 1 meaning 'Yes' and 0 meaning 'No'.
There are usually two common ways to choose the hidden architecture. One is to use only one hidden layer with many hidden units, and the other is to use multiple hidden layers with the same number of hidden units each. I decided to choose the single hidden layer, considering the fact that it is the more popular and widely used option. I used 25 hidden units and it worked well; usually more hidden units produce better performance in terms of accuracy.
As for the number of input units: I used each image pixel intensity as an input feature, so the number of input units is tied to the image size, and the image size is in turn constrained by the face recognition algorithm. The minimum image size requested by the two most commonly used algorithms is 128*128 for EBGM [1] and 96*84 or 24*24 for LDA [2]. So I finally decided to choose 128*128 as the image size, which gives 16384 input units.
The second step of the Neural Network algorithm is to implement forward propagation to compute the activations $a_j^{(l)}$, where $a_j^{(l)}$ denotes the activation of unit $j$ in layer $l$:

$$a_j^{(l+1)} = g\big(z_j^{(l+1)}\big), \qquad z_j^{(l+1)} = \sum_{k} \Theta_{jk}^{(l)}\, a_k^{(l)}$$

Here I use the logistic sigmoid as the activation function:

$$g(z) = \frac{1}{1 + e^{-z}}$$

$\Theta_{jk}^{(l)}$ denotes the parameter, or weight, between unit $k$ in layer $l$ and unit $j$ in layer $l+1$.
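The sigmoid helper called throughout the training code listed later in this report is not included there; a minimal sketch of what sigmoid.m is assumed to look like:

```matlab
function g = sigmoid(z)
%SIGMOID Logistic sigmoid, applied element-wise to a scalar, vector or matrix.
g = 1.0 ./ (1.0 + exp(-z));
end
```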
The third step is to implement code to compute the cost function $J(\Theta)$. Here I decided to use cross entropy as my error function. In order to prevent overfitting, I also add a regularization term to the cost function, after which the cost function looks like:

$$J(\Theta) = -\frac{1}{N}\sum_{n=1}^{N}\sum_{m=1}^{M}\Big[\, y_m^{(n)} \log\big(a_m^{(L)}\big) + \big(1 - y_m^{(n)}\big)\log\big(1 - a_m^{(L)}\big)\Big] + \frac{\lambda}{2N}\sum_{l=1}^{L-1}\sum_{j=1}^{s_{l+1}}\sum_{k=1}^{s_l}\big(\Theta_{jk}^{(l)}\big)^2$$

Here $N$ denotes the number of training examples and $M$ denotes the dimension of the output layer. $\Theta_{jk}^{(l)}$ denotes the parameter between unit $k$ in layer $l$ and unit $j$ in layer $l+1$. $L$, $s_l$ and $s_{l+1}$ represent the total number of layers, the number of units in layer $l$ and the number of units in layer $l+1$ respectively. It is important to mention that neither $s_l$ nor $s_{l+1}$ includes the bias unit.
The fourth step is to implement back propagation to compute the partial derivatives of the cost function, $\partial J(\Theta) / \partial \Theta_{jk}^{(l)}$.

First we have to calculate the error term $\delta_j^{(L)}$ for all the output units using:

$$\delta_j^{(L)} = a_j^{(L)} - y_j$$

where $a_j^{(L)}$ denotes the activation calculated using forward propagation and $y_j$ denotes the example's label value.

Then we need to compute $\delta_j^{(l)}$ for all hidden units using:

$$\delta_j^{(l)} = \Big(\sum_{k=1}^{s_{l+1}} \Theta_{kj}^{(l)}\, \delta_k^{(l+1)}\Big)\, g'\big(z_j^{(l)}\big)$$

Again, $\delta_j^{(l)}$ denotes the "error" of the $j$-th unit in layer $l$, $\Theta_{kj}^{(l)}$ denotes the weight between the $k$-th unit in layer $l+1$ and the $j$-th unit in layer $l$, and $s_{l+1}$ denotes the number of units in layer $l+1$, which again does not include the bias unit.
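The delta computation above needs the derivative of the sigmoid, $g'(z) = g(z)\big(1 - g(z)\big)$. The sigmoidGradient helper called inside nnCostFunction below is not listed in this report; a minimal sketch:

```matlab
function g = sigmoidGradient(z)
%SIGMOIDGRADIENT Derivative of the logistic sigmoid, evaluated element-wise.
%   g'(z) = g(z) .* (1 - g(z))
s = sigmoid(z);
g = s .* (1 - s);
end
```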
In order to apply gradient descent or other advanced optimization algorithms, it is very important to initialize the parameters properly. Otherwise a symmetry problem may arise, in which all hidden units compute identical values for every input after each update.
One way to initialize the parameters is to choose the values randomly. But because the cost function of a Neural Network is not convex, the algorithm may sometimes get stuck in a local optimum. One possible way to mitigate this is to run the algorithm with different initial parameters hundreds of times and pick the one with the lowest validation error.
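The randInitializeWeights helper used in the nnTrainer script is not listed below; a minimal sketch that breaks symmetry by drawing each weight uniformly from [-epsilon_init, epsilon_init] (the value 0.12 is my assumption, following the course material [6]):

```matlab
function W = randInitializeWeights(L_in, L_out)
%RANDINITIALIZEWEIGHTS Randomly initialize the weights of a layer with L_in
%   incoming connections and L_out outgoing connections. The extra column
%   accounts for the bias unit.
epsilon_init = 0.12;                                   % assumed value
W = rand(L_out, 1 + L_in) * 2 * epsilon_init - epsilon_init;
end
```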
After defining and implementing back propagation, it is always useful to implement gradient checking to verify that the gradient is computed correctly. To do this, we use the numerical approximation:

$$\frac{\partial J(\theta)}{\partial \theta_i} \approx \frac{J(\theta + \epsilon\, e_i) - J(\theta - \epsilon\, e_i)}{2\epsilon}$$

where $e_i$ is the $i$-th unit vector. As $\epsilon$ decreases, the numerical approximation becomes closer to the true gradient, but the complexity and computation time also increase, so a small value such as $\epsilon = 10^{-4}$ is a perfectly reasonable default.
Because running the numerical approximation is quite slow and time-consuming, I turn it off before actually training the classifier.
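The checkNNGradients routine called in nnTrainer is not listed in this report; its core is a central-difference routine like the following sketch, which perturbs one parameter at a time:

```matlab
function numgrad = computeNumericalGradient(J, theta)
%COMPUTENUMERICALGRADIENT Central-difference approximation of the gradient
%   of the cost-function handle J at the point theta.
numgrad = zeros(size(theta));
perturb = zeros(size(theta));
e = 1e-4;                          % the default epsilon discussed above
for p = 1:numel(theta)
    perturb(p) = e;
    loss1 = J(theta - perturb);
    loss2 = J(theta + perturb);
    numgrad(p) = (loss2 - loss1) / (2*e);
    perturb(p) = 0;
end
end
```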
Acquiring Kobe's images is fairly easy because he is a famous basketball player, so there are a lot of his images on the internet.
The picture above is one I collected from the internet, and the rest of the images I collected look like this one. What we need to do is to extract the bounding face region from each picture.
In order to speed up the processing, I wrote a script to help with this work. The GUI of the script is shown below:
I used vision.CascadeObjectDetector(), which is included in the Matlab Computer Vision System Toolbox, to create a faceDetector. This script allows me to detect every face that appears in each picture and label it with yes or no.
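Outside the GUI, the same toolbox call can be used on a single picture; a minimal sketch (the file name test.jpg is only a placeholder):

```matlab
% Detect faces in one picture and crop the first bounding box
faceDetector = vision.CascadeObjectDetector();   % Computer Vision System Toolbox
imageOri = imread('test.jpg');                   % placeholder file name
BBox = step(faceDetector, imageOri);             % one row [x y w h] per detected face
if ~isempty(BBox)
    face = imcrop(imageOri, BBox(1, :));
    imshow(face);
end
```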
Then, in order to guarantee the correctness of the labeling result, I wrote a script to show all the face images belonging to the same label, which is shown below:
After data acquisition, a total of 1853 face images were obtained. I decided to split them 80%-10%-10% into training, cross-validation and testing sets respectively. The resulting database is summarized in the table below.
Section | Number | Kobe | Non-Kobe |
---|---|---|---|
Training | 1482(80%) | 563(38%) | 919(62%) |
Validation | 185(10%) | 70(38%) | 115(62%) |
Testing | 186(10%) | 73(39%) | 113(61%) |
Total | 1853 | 706(38%) | 1147(62%) |
Usually people use the training dataset to train the classifier and the validation dataset to stop the algorithm from overfitting. However, because I implemented the neural network and also collected the data myself, I use the training dataset to train the classifier and the validation dataset as the performance criterion, and I keep the testing dataset untouched until all the parameters of my Neural Network algorithm are settled.
At the very beginning I used each pixel intensity as a Neural Network input feature, which produced 16384 features. As expected, training is extremely slow and takes about 10 minutes per run.
However, considering the fact that natural image pixel intensities are highly correlated, I decided to apply PCA (Principal Component Analysis), which reduces the data from a high-dimensional to a low-dimensional feature space.

$$S = \frac{1}{N}\sum_{n=1}^{N}\big(x^{(n)} - \bar{x}\big)\big(x^{(n)} - \bar{x}\big)^{T}$$

$S$ denotes the empirical covariance matrix. An intuitive way to think about PCA is that it extracts the most important directions, namely the leading eigenvectors of $S$, from the original data.
The eigenvectors of $S$ can easily be obtained with the Matlab built-in function svd. Since $S$ is a $D \times D$ matrix, where $D$ is the dimension of the input features (16384 in our case), this step is quite computationally expensive.
After applying PCA, the training time drops dramatically from 10 minutes to only 30 seconds while 99% of the variance is retained.
After applying PCA, it becomes feasible to train the classifier many times with different values of the regularization term lambda in order to find the optimal one.
As shown in the figure above, plotting the training and validation errors for different values of lambda shows that the validation error is lowest when lambda is 10. Therefore, I chose 10 as my regularization lambda.
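The validationCurve helper that produces this plot is called in the nnTrainer script later in this report but is not listed there; a minimal sketch, mirroring the learningCurve functions that are listed, with the candidate lambda values and the error definition (100 minus accuracy in percent) being my assumptions:

```matlab
function [lambdaVec, errorTrain, errorVal] = validationCurve(X, y, Xval, yval, ...
    inputLayerUnit, hiddenLayerUnit, outputLayerUnit, nnParamsInit)
%VALIDATIONCURVE Train the network once per candidate lambda and record the
%   training and validation error.
lambdaVec = [0 0.01 0.03 0.1 0.3 1 3 10 30]';    % assumed candidate values
errorTrain = zeros(length(lambdaVec), 1);
errorVal = zeros(length(lambdaVec), 1);
options = optimset('MaxIter', 300);
for i = 1:length(lambdaVec)
    costFunction = @(p) nnCostFunction(p, inputLayerUnit, hiddenLayerUnit, ...
                                       outputLayerUnit, X, y, lambdaVec(i));
    nn_params = fmincg(costFunction, nnParamsInit, options);
    Theta1 = reshape(nn_params(1:hiddenLayerUnit * (inputLayerUnit + 1)), ...
                     hiddenLayerUnit, (inputLayerUnit + 1));
    Theta2 = reshape(nn_params((1 + (hiddenLayerUnit * (inputLayerUnit + 1))):end), ...
                     outputLayerUnit, (hiddenLayerUnit + 1));
    errorTrain(i) = 100 - mean(double(predict(Theta1, Theta2, X) == y)) * 100;
    errorVal(i) = 100 - mean(double(predict(Theta1, Theta2, Xval) == yval)) * 100;
end
end
```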
I also trained the classifier several times with different numbers of iterations to look for the optimal value.
As shown in the figure above, the accuracy on both the training and validation datasets increases significantly from around 65% to 85% by 300 iterations. After that the validation accuracy drops slowly while, as expected, the training accuracy keeps increasing. Therefore, I chose 300 as my number of training iterations.
Finally, I also used the PhD (Pretty helpful Development) face recognition toolbox [3][4] to apply LDA to my database.
After LDA, the feature dimension becomes 1 (the number of classes minus one) and the data distribution looks like the figure above. The training points are clearly separated.
Then I projected the validation points using the same feature mapping. After that I ran logistic regression to separate the data and obtained the results shown below (a sketch of the regularized logistic-regression helpers this step relies on follows the table):
Section | Accuracy |
---|---|
Training | 100% |
Validation | 82% |
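The costFunctionReg and predictReg helpers used in the PhD_LDA_LR script listed later are not included in this report; a minimal sketch of both, under the conventions of that script (labels 0/1, bias column already prepended to X):

```matlab
function [J, grad] = costFunctionReg(theta, X, y, lambda)
%COSTFUNCTIONREG Regularized logistic-regression cost and gradient.
%   The bias parameter theta(1) is not regularized.
m = length(y);
h = sigmoid(X * theta);
J = 1/m * sum(-y .* log(h) - (1 - y) .* log(1 - h)) ...
    + lambda/(2*m) * sum(theta(2:end).^2);
grad = 1/m * (X' * (h - y));
grad(2:end) = grad(2:end) + lambda/m * theta(2:end);
end

function p = predictReg(theta, X)
%PREDICTREG Predict label 1 when the estimated probability is at least 0.5.
p = double(sigmoid(X * theta) >= 0.5);
end
```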
In the parameter optimization stage, I also tried training the classifier with different numbers of training examples. The resulting learning curve is shown above. As the figure shows, the accuracy fluctuates considerably as the number of training examples keeps increasing.
First I tried to test the algorithm on hand-written digits, and the result looks like the figure above.
So I guessed the problem may be due to the features extracted from each image. To solve it, I would need to implement face alignment before extracting features from each face image. However, that would make the pipeline more complex and time-consuming.
Due to the time limitation, we were only able to finish Task 1, Task 2 and half of Task 3. We are therefore going to continue our project with Task 3 and Task 4.
The idea of Task 4 is to perform segmentation and motion estimation of Kobe so that we can extract valuable features over time sequences and train the classifier on them.
Zheng Jiongbin (StuID: 20303666)
Adetunji Adeolu Oluwaseun (StuID: 20322624)
Time Period | Details | Progress |
---|---|---|
Apr. 11th ~ Apr. 17th | Collect data; pre-processing for the data. | 100% |
Apr. 18th ~ Apr. 24th | Train the classifier for Task1 and Task2 separately. | 100% |
Apr. 25th ~ May. 1st | Train the classifier for Task 3 using SVM and NN; compare the result. | 50% |
May. 2nd ~ May. 8th | Preparation for the oral presentation and final paper. | 100% |
[1]: K. Okada, et al., "The Bochum/USC Face Recognition System and How it Fared in the FERET Phase III Test," Face Recognition: From Theory to Applications, Springer-Verlag, pp. 186-205, 1998.
[2]: W. Zhao, et al., "Subspace Linear Discriminant Analysis for Face Recognition," Technical Report CAR-TR-914, Center for Automation Research, University of Maryland, 1999.
[3]: V. Štruc and N. Pavešić, "The Complete Gabor-Fisher Classifier for Robust Face Recognition," EURASIP Advances in Signal Processing, p. 26, 2010, doi:10.1155/2010/847680.
[4]: V. Štruc and N. Pavešić, "Gabor-Based Kernel Partial-Least-Squares Discrimination Features for Face Recognition," Informatica (Vilnius), vol. 20, no. 1, pp. 115-138, 2009.
[5]: W. Zhao, et al., "Face Recognition: A Literature Survey," Technical Report CS-TR-4167, Center for Automation Research, University of Maryland, 2000.
[6]: A. Ng, "Machine Learning," Coursera, 2015 [Online]. Available: https://www.coursera.org/learn/machine-learning/. [Accessed: Apr. 18th].
%%---nnTrainer---
%% train a neural network using training image set
%% ---Written by Gizmosir 2016-04-19----
%% Initialization
close all;
clear;
clc;
%% Setup the structure parameters
inputLayerUnit = 128*128; %default input image size 128*128
hiddenLayerUnit = 25; %default
outputLayerUnit = 2; %only 2: yes or no
%% Load dataset
load trainingData % trainingData trainingDataTest
% imageTrain -- 1*1482 cell
% labelTrain -- 1*1482 double
% imageVal -- 1*185 cell
% labelVal -- 1*185 double
%------
% data parameters
numTrain = length(imageTrain);
numVal = length(imageVal);
X = zeros(numTrain, inputLayerUnit);
XVal = zeros(numVal, inputLayerUnit);
% reorder
for i = 1:numTrain
image = imageTrain{i};
image = imresize(image, [sqrt(inputLayerUnit), sqrt(inputLayerUnit)]);
X(i,:) = image(:);
end
for i = 1:numVal
image = imageVal{i};
image = imresize(image, [sqrt(inputLayerUnit), sqrt(inputLayerUnit)]);
XVal(i,:) = image(:);
end
y = labelTrain';
yVal = labelVal';
% clear dataset
clear image imageTrain imageVal labelTrain labelVal;
fprintf('Loading data finished...\n');
%% Visualizing Data
% randomly pick 20 row of data
sel = randperm(size(X, 1));
sel = sel(1:20);
displayData(X(sel, :));
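% NOTE: displayData.m is not listed in this report; it is assumed to tile the
% given rows into square grayscale patches and show them in one figure, roughly:
%   function displayData(X)
%   n = sqrt(size(X, 2));                      % patch side length
%   figure; colormap(gray);
%   for i = 1:size(X, 1)
%       subplot(ceil(sqrt(size(X, 1))), ceil(sqrt(size(X, 1))), i);
%       imagesc(reshape(X(i, :), n, n)); axis image off;
%   end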
fprintf('Visualizing Data finished...\n');
%% Apply PCA towards dataset to accelerate training speed
load PCA
% U_deduce -- 16384*428 double
% K -- 428 double
%------
% Because PCA is extremely time-consuming, I load the pre-computed data
% instead of running it again every time.
% [U_deduce, K] = PCA(X);
% update new X and XVal
X = X * U_deduce;
XVal = XVal * U_deduce;
% update new inputLayerUnit for Neural Network
inputLayerUnit = K;
%% Randomly initialize paremeters
theta1Init = randInitializeWeights(inputLayerUnit, hiddenLayerUnit);
theta2Init = randInitializeWeights(hiddenLayerUnit, outputLayerUnit);
% Unroll parameters
nnParamsInit = [theta1Init(:); theta2Init(:)];
fprintf('Randomly initialize parameters finished...\n');
%}
% load selected parameters for better performance
% since it highly related to the initial value
% load nnParams
%nnParamsInit -- 10777*1 double
%-------
%% Compute cost (Feedforward)
% regularization parameter
% lambda = 0 -- cost function without regularization term
lambda = 0;
J = nnCostFunction(nnParamsInit, inputLayerUnit, hiddenLayerUnit, ...
outputLayerUnit, X, y, lambda);
fprintf('Compute cost finished...\n');
fprintf('The cost with initial parameters is: %f\n', J);
%% Gradient checking (should be turned off in real training)
lambda = 3;
checkNNGradients(lambda);
%% Train Neural Network
% Using "fmincg" to find the optimal parameters
% changing the MaxIter number will give different results
options = optimset('MaxIter', 300); %300 is the optimal value
lambda = 23; %23 is the optimal value
% Create "short hand" for the cost function to be minimized
costFunction = @(p) nnCostFunction(p, ...
inputLayerUnit, ...
hiddenLayerUnit, ...
outputLayerUnit, X, y, lambda);
% Now, costFunction is a function that takes in only one argument (the
% neural network parameters)
tic
[nn_params, cost] = fmincg(costFunction, nnParamsInit, options);
toc
% Obtain Theta1 and Theta2 back from nn_params
Theta1 = reshape(nn_params(1:hiddenLayerUnit * (inputLayerUnit + 1)), ...
hiddenLayerUnit, (inputLayerUnit + 1));
Theta2 = reshape(nn_params((1 + (hiddenLayerUnit * (inputLayerUnit + 1))):end), ...
outputLayerUnit, (hiddenLayerUnit + 1));
fprintf('Training finished...\n');
%% Visualize weights
%round the parameter into a square number to display properly
M = floor(sqrt(size(Theta1,2)))^2 + 1;
figure;displayData(Theta1(:, 2:M));
%% Predict
pred = predict(Theta1, Theta2, X);
fprintf('Training set accuracy: %f\n', mean(double(pred == y)) *100);
pred = predict(Theta1, Theta2, XVal);
fprintf('Validation set accuracy: %f\n', mean(double(pred == yVal)) *100);
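% NOTE: predict.m is not listed in this report; it is assumed to forward-
% propagate the examples, pick the most active output unit, and map the second
% unit back to the -1 ('No') label used for y here, roughly:
%   function p = predict(Theta1, Theta2, X)
%   m = size(X, 1);
%   h1 = sigmoid([ones(m, 1), X] * Theta1');
%   h2 = sigmoid([ones(m, 1), h1] * Theta2');
%   [~, p] = max(h2, [], 2);
%   p(p == 2) = -1;     % assumed mapping: unit 1 -> 1 (Kobe), unit 2 -> -1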
% clear
% clearvars -except hiddenLayerUnit inputLayerUnit outputLayerUnit lambda;
%% Validation for selecting lambda
[lambdaVec, errorTrain, errorVal] = validationCurve(X, y, XVal, yVal, ...
inputLayerUnit, hiddenLayerUnit, ...
outputLayerUnit, nnParamsInit);
% plot
plot(lambdaVec, errorTrain, lambdaVec, errorVal);
legend('Train', 'Cross Validation');
xlabel('lambda');
ylabel('Error');
%}
% lambda = 23; %optimal value
%% Apply learning curve for selecting training example number
[x, errorTrain, errorVal] = learningCurve(X, y, XVal, yVal, lambda, ...
inputLayerUnit, hiddenLayerUnit, ...
outputLayerUnit, nnParamsInit);
% plot
figure;plot(x, errorTrain, x, errorVal);
title('Learning curve for Neural Network')
legend('Train', 'Cross Validation')
xlabel('Number of training examples')
ylabel('Accuracy')
%% Apply learning curve for selecting iteration times
[x, errorTrain, errorVal] = learningCurve2(X, y, XVal, yVal, lambda, ...
inputLayerUnit, hiddenLayerUnit, ...
outputLayerUnit, nnParamsInit);
% plot
figure;plot(x, errorTrain, x, errorVal);
title('Learning curve for Neural Network')
legend('Train', 'Cross Validation')
xlabel('Number of iteration times')
ylabel('Accuracy')
%}
function [J grad] = nnCostFunction(nn_params, ...
input_layer_size, ...
hidden_layer_size, ...
num_labels, ...
X, y, lambda)
%%----nnCostFunction----
%% Return the cost function error and gradient
%% ---Written by Gizmosir 2014-04-19---
m = size(X, 1);
% recover the weight matrices Theta1 and Theta2 from the unrolled parameters
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
hidden_layer_size, (input_layer_size + 1));
Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
num_labels, (hidden_layer_size + 1));
% forward propagation to calculate the cost function
a1 = [ones(1,m); X'];
a2 = [ones(1,m); sigmoid(Theta1 * a1)];
a3 = sigmoid(Theta2 * a2);
%y is a row vector contains 1 & -1.
%make it into 2*m matrix which each column contains [1 0] or [0 1]
Y = zeros(num_labels, m);
for i = 1:m
if y(i) == 1
Y(1, i) = 1;
else
Y(2, i) = 1;
end
end
% for algorithm testing label
% for i = 1:m
% Y(y(i), i) = 1;
% end
%%backpropagation
delta3 = a3 - Y;
delta2 = Theta2' * delta3 .* sigmoidGradient([ones(1,m);Theta1 * a1]);
delta2(1, :) = [];
Delta2 = delta3 * (a2)';
Delta1 = delta2 * (a1)';
% cost function and gradient
Theta1(:,1) = [];
Theta2(:,1) = [];
J = 1/m * sum(sum( - Y .* log(a3) - (1 - Y) .* log(1 - a3))) ...
+ lambda/(2*m) * ( sum(sum( Theta1.^2 )) + sum(sum( Theta2.^2 )));
Theta2_grad = 1/m .* Delta2;
Theta1_grad = 1/m .* Delta1;
%Regularized
Theta2_grad = [Theta2_grad(:,1), Theta2_grad(:,2:end) + lambda/m * Theta2];
Theta1_grad = [Theta1_grad(:,1), Theta1_grad(:,2:end) + lambda/m * Theta1];
% Unroll gradients
grad = [Theta1_grad(:) ; Theta2_grad(:)];
function [x, error_train, error_val] = ...
learningCurve(X, y, Xval, yval, lambda, inputLayerUnit, ...
hiddenLayerUnit, outputLayerUnit, nnParamsInit)
% Number of training examples
m = size(X, 1);
error_train = [];
error_val = [];
% parameter setup
options = optimset('MaxIter', 500);
x = 1:10:m;
for i = 1:10:m
costFunction = @(p) nnCostFunction(p, ...
inputLayerUnit, ...
hiddenLayerUnit, ...
outputLayerUnit, X(1:i,:), y(1:i,:), ...
lambda);
[nn_params, cost] = fmincg(costFunction, nnParamsInit, options);
%predict
Theta1 = reshape(nn_params(1:hiddenLayerUnit * (inputLayerUnit + 1)), ...
hiddenLayerUnit, (inputLayerUnit + 1));
Theta2 = reshape(nn_params((1 + (hiddenLayerUnit * (inputLayerUnit + 1))):end), ...
outputLayerUnit, (hiddenLayerUnit + 1));
pred = predict(Theta1, Theta2, X(1:i,:));
error_train = [error_train; mean(double(pred == y(1:i,:))) * 100];
pred = predict(Theta1, Theta2, Xval);
error_val = [error_val; mean(double(pred == yval)) *100];
end
function [x, error_train, error_val] = ...
learningCurve2(X, y, Xval, yval, lambda, inputLayerUnit, ...
hiddenLayerUnit, outputLayerUnit, nnParamsInit)
% Number of iteration times
x = 100:100:1000;
error_train = [];
error_val = [];
costFunction = @(p) nnCostFunction(p, ...
inputLayerUnit, ...
hiddenLayerUnit, ...
outputLayerUnit, X, y, ...
lambda);
% Compute the error
for i = 100:100:1000
% parameter setup
options = optimset('MaxIter', i);
[nn_params, cost] = fmincg(costFunction, nnParamsInit, options);
%predict
Theta1 = reshape(nn_params(1:hiddenLayerUnit * (inputLayerUnit + 1)), ...
hiddenLayerUnit, (inputLayerUnit + 1));
Theta2 = reshape(nn_params((1 + (hiddenLayerUnit * (inputLayerUnit + 1))):end), ...
outputLayerUnit, (hiddenLayerUnit + 1));
pred = predict(Theta1, Theta2, X);
error_train = [error_train; mean(double(pred == y)) * 100];
pred = predict(Theta1, Theta2, Xval);
error_val = [error_val; mean(double(pred == yval)) *100];
end
function [U_deduce, K] = PCA(X)
%%---PCA---
%% Apply PCA on given data
%%---Written by Gizmosir 2016-04-23---
%normalize X by subtracting the mean value from each feature
mu = mean(X);
X_norm = bsxfun(@minus, X, mu);
%calculate the covariance matrix
Sigma = 1/size(X, 1) .* (X_norm' * X_norm);
[U, S] = svd(Sigma);
% find the minimum K such that 99% of the variance is retained
singularValues = diag(S);
total = sum(singularValues);
for i = 1:size(Sigma, 1)
varianceRetained = sum(singularValues(1:i)) / total;
if varianceRetained > 0.99
K = i;
break;
end
end
%Keep only the importance of U
U_deduce = U(:,1:K);
%%---PhD_LDA_LR_FaceRecognition---
%% Apply lda based Logistic Regression for face recogntion
%% PhD_toolbox is used in this method
%% ---Written by Gizmosir 2016-04-24---
%% initialize
close all;
clear;
clc;
%% Setup the parameters
inputLayerUnit = 128*128; %default input image size 128*128
hiddenLayerUnit = 25; %default
outputLayerUnit = 2; %only 2: yes or no
%% Load dataset
load trainingData % trainingData trainingDataTest
% imageTrain -- 1*1482 cell
% labelTrain -- 1*1482 double
% imageVal -- 1*185 cell
% labelVal -- 1*185 double
%------
% data parameters
numTrain = length(imageTrain);
numVal = length(imageVal);
X = zeros(numTrain, inputLayerUnit);
XVal = zeros(numVal, inputLayerUnit);
% reorder
for i = 1:numTrain
image = imageTrain{i};
image = imresize(image, [sqrt(inputLayerUnit), sqrt(inputLayerUnit)]);
X(i,:) = image(:);
end
for i = 1:numVal
image = imageVal{i};
image = imresize(image, [sqrt(inputLayerUnit), sqrt(inputLayerUnit)]);
XVal(i,:) = image(:);
end
y = labelTrain';
y(y<0) = 0;
yVal = labelVal';
yVal(yVal<0) = 0;
% clear dataset
clear image imageTrain imageVal labelTrain labelVal;
fprintf('Loading data finished...\n');
%% Visualizing Data
% randomly pick 20 row of data
sel = randperm(size(X, 1));
sel = sel(1:20);
displayData(X(sel, :));
fprintf('Visualizing Data finished...\n');
%% Construct LDA subspace using PhD
model = perform_lda_PhD(X', y', 1);
featureVal = linear_subspace_projection_PhD(XVal', model, 1);
%update training data
X = model.train';
XVal = featureVal';
%% Visualizing Data
figure; plot(X, y, '.'); title('Training data point after LDA',...
'FontSize', 18);
figure; plot(XVal, yVal, '.'); title('Validation data point after LDA', ...
'FontSize', 18);
%% Compute Cost and Gradient
% initialize parameters
[m, n] = size(X);
X = [ones(m,1), X];
initialTheta = zeros(n+1, 1);
lambda = 10;
% cost function
[cost, grad] = costFunctionReg(initialTheta, X, y, lambda);
%% Optimize using fminunc function
% Set parameters
options = optimset('GradObj', 'on', 'Maxiter', 50);
% Optimize
[theta, J] = fminunc(@(t)(costFunctionReg(t, X, y, lambda)), ...
initialTheta, options);
%% Display boundary for validation set
xBdy = min(XVal):0.1:max(XVal);
yBdy = sigmf(xBdy, [theta(2), -theta(1)/theta(2)]); % sigmf(x,[a c]) = 1/(1+exp(-a*(x-c)))
hold on;plot(xBdy,yBdy,'r-');
xlabel('x');ylabel('t');
boundary = - theta(1) / theta(2);
%% Predict
pred = predictReg(theta, X);
fprintf('Training set accuracy: %f\n', mean(double(pred == y)) *100);
[m, n] = size(XVal);
XVal = [ones(m,1), XVal];
pred = predictReg(theta, XVal);
fprintf('Validation set accuracy: %f\n', mean(double(pred == yVal)) *100);
function varargout = GUI(varargin)
%% ----GUI----
%% GUI for face labeling
%% ----Written by Gizmosir 2016-04-18----
% Begin initialization code - DO NOT EDIT
gui_Singleton = 1;
gui_State = struct('gui_Name', mfilename, ...
'gui_Singleton', gui_Singleton, ...
'gui_OpeningFcn', @GUI_OpeningFcn, ...
'gui_OutputFcn', @GUI_OutputFcn, ...
'gui_LayoutFcn', [] , ...
'gui_Callback', []);
if nargin && ischar(varargin{1})
gui_State.gui_Callback = str2func(varargin{1});
end
if nargout
[varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});
else
gui_mainfcn(gui_State, varargin{:});
end
% End initialization code - DO NOT EDIT
% --- Executes just before GUI is made visible.
function GUI_OpeningFcn(hObject, eventdata, handles, varargin)
% Choose default command line output for GUI
handles.output = hObject;
% global variable
global imageName;
global imageLabel;
global imageFace;
global imageFaceFrame; %face in current frame
global faceDetector;
% load image file and get image Name
file = dir('../kobe3/*.jpg'); %kobe3
% file = dir('./test/*.jpg');
imageName = {file.name};
% define faceDetector
faceDetector = vision.CascadeObjectDetector();
% define image face as cell
imageFace = {};
imageFaceFrame = {};
% define image label as a two row matrix
imageLabel = []; % zeros(2, length(imageName));
% Update imageFace and label
update(handles);
% Update handles structure
guidata(hObject, handles);
% --- Outputs from this function are returned to the command line.
function varargout = GUI_OutputFcn(hObject, eventdata, handles)
% varargout cell array for returning output args (see VARARGOUT);
% hObject handle to figure
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
% Get default command line output from handles structure
varargout{1} = handles.output;
% --- Executes on button press in pushbutton1.(YES button)
function pushbutton1_Callback(hObject, eventdata, handles)
% record the current image label
global imageLabel;
imageLabel = [imageLabel, 1];
% display next image
update(handles);
% --- Executes on button press in pushbutton2.(NO button)
function pushbutton2_Callback(hObject, eventdata, handles)
% record the current image label
global imageLabel;
imageLabel = [imageLabel, -1];
% display next image
update(handles);
function imageFaceFrame = faceDetect(imageName, handles)
%global variable
global faceDetector;
% load original image
imageName = sprintf('../kobe3/%s',imageName);
imageOri = imread(imageName);
% detect the face using faceDetector
BBox = step(faceDetector, imageOri);
% delete faces smaller than 48x48 pixels
if ~isempty(BBox)
BBox = sortrows(BBox, -3);
index = min(find(BBox(:,3)<48));
BBox(index:end, :) = [];
end
% fall back to a default window if no face is detected
if isempty(BBox)
BBox = [1, 1, 128, 128];
end
% show face rectangle
B = insertObjectAnnotation(imageOri, 'rectangle', BBox, 'Face');
axes(handles.axes2); imshow(B);
% save detected face image
for i = 1:size(BBox, 1)
imageFaceFrame{i} = imageOri(BBox(i, 2):BBox(i, 2)+BBox(i, 4)-1, ...
BBox(i, 1):BBox(i, 1)+BBox(i, 3)-1);
end
function update(handles)
global imageName
global imageFace
global imageFaceFrame
if isempty(imageName)
finishLabeling(handles);
elseif ~isempty(imageFaceFrame)
%display and delete the next face image
image = imageFaceFrame{1};
imageFace = [imageFace, image];
imageFaceFrame(1) = [];
axes(handles.axes1); imshow(image);
else
% find boundingbox using faceDetecting
imageFaceFrame = faceDetect(imageName{1}, handles);
%store and delete the first image
image = imageFaceFrame{1};
imageFace = [imageFace, image];
imageFaceFrame(1) = [];
axes(handles.axes1); imshow(image);
%change display tag text
set(handles.text1, 'String', imageName(1));
% delete the picture
imageName(1) = [];
end
function finishLabeling(handles)
% save imageFace and imageLabel
global imageFace;
global imageLabel;
save('myData', 'imageFace', 'imageLabel');
%clear
close(GUI);
clear;
clc;
display('finish...');
%%---dataSeparate---
%% separate the image data set into three part
%% Trainset:80%, Validationset: 10% Testset:10%
%% ----Written by Gizmosir 2016-04-19----
% clear
close all;
clear;
clc;
% load data
load myData; %myDataCorrected dataAlgorithmTest
m = length(imageFace);
% First randomly reorder all the examples so that they are randomly
% distributed
sel = randperm(m);
index = [floor(m*0.8), floor(m*0.9)];
% training dataset
imageTrain = imageFace(sel(1:index(1)));
labelTrain = imageLabel(sel(1:index(1)));
% cross-validation dataset
imageVal = imageFace(sel(index(1)+1:index(2)));
labelVal = imageLabel(sel(index(1)+1:index(2)));
% testing dataset
imageTest = imageFace(sel(index(2)+1:end));
labelTest = imageLabel(sel(index(2)+1:end));
%% reshape the image(now move to nnTrainer)
% for i = 1:length(imageTrain)
% imageTrain{i} = imresize(imageTrain{i}, [128, 128]);
% end
%
% %% reshape the image
% for i = 1:length(imageVal)
% imageVal{i} = imresize(imageVal{i}, [128, 128]);
% end
save('trainingDataTest', 'imageTrain', 'labelTrain', ...
'imageVal', 'labelVal');
save('testingDataTest', 'imageTest', 'labelTest');
%%---LabelChecking---
%% display and check the labeled image
%% ---Written by Gizmosir 2016-04-21---
% initialize
close all;
clear;
clc;
% load data
load myDataCorrected
% separate into two groups
indexTrue = find(imageLabel>0);
indexFalse = find(imageLabel<0);
imageDisplay = [];
for i = 1:length(indexFalse) %indexTrue OR indexFalse
image = imageFace{indexFalse(i)}; %indexTrue indexFalse
image = imresize(image, [128 128]);
imageDisplay(rem(i, 64)+1,:) = image(:);
if rem(i, 64) == 0
displayData(imageDisplay(1:64, :));
imageDisplay = [];
pause;
end
end
%%----fileRenamer----
%% rename all the image store in image file
%% ----Written by Gizmosir 2016-04-18----
%clear;
close;
clear;
clc;
%%load all the image in image folder
fileDirectory = '../kobe3/';
file = dir(fullfile(fileDirectory, '*.jpg'));
fileNames = {file.name};
%rename the file using movefile
for i = 1:length(fileNames)
newName = fullfile(fileDirectory, sprintf('%04d.jpg', i));
movefile(fullfile(fileDirectory, fileNames{i}), newName);
end