@zhenni94
project homepage: http://www.stat.ucla.edu/~xianjie.chen/projects/pose_estimation/pose_estimation.html
Full score function:
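As in the Chen & Yuille (NIPS 2014) paper that this code implements (restated here; the notation may differ slightly from the paper's exact equations), the full score of a pose l with pairwise types t is

$$F(l, t \mid I) = \sum_{i \in \mathcal{V}} U(l_i \mid I) + \sum_{(i,j) \in \mathcal{E}} R(l_i, l_j, t_{ij}, t_{ji} \mid I)$$

with the image-dependent pairwise relation (IDPR) term

$$R(l_i, l_j, t_{ij}, t_{ji} \mid I) = \langle \mathbf{w}_{ij}^{t_{ij}}, \psi(l_j - l_i - r_{ij}^{t_{ij}}) \rangle + w_{ij}\, \varphi(t_{ij} \mid I(l_i); \theta) + \langle \mathbf{w}_{ji}^{t_{ji}}, \psi(l_i - l_j - r_{ji}^{t_{ji}}) \rangle + w_{ji}\, \varphi(t_{ji} \mid I(l_j); \theta)$$

where $\psi(\Delta) = [\Delta x \;\, \Delta x^2 \;\, \Delta y \;\, \Delta y^2]^{\top}$ is the quadratic deformation feature, $r_{ij}^{t_{ij}}$ is the mean relative position of cluster $t_{ij}$, and $t_{ij} \in \{1, \dots, T_{ij}\}$ is the pairwise type (the cluster index produced by learn_clusters below).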
Unary Term
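The unary term scores each part location with the DCNN appearance evidence:

$$U(l_i \mid I) = w_i\, \varphi(i \mid I(l_i); \theta)$$

where $\varphi(i \mid I(l_i); \theta)$ is the appearance score (log-probability) that the DCNN with parameters $\theta$ assigns to part $i$ on the image patch $I(l_i)$, and $w_i$ is its scalar weight; the per-part maps unary_map computed by imCNNdet below carry this appearance evidence.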
conf is a structure holding the given global configuration; conf.pa is the index of the parent of each joint, and p_no is the number of parts (joints).
The main part of the demo script is shown below:
% read data
[pos_train, pos_val, pos_test, neg_train, neg_val, tsize] = LSP_data();
% train dcnn
train_dcnn(pos_train, pos_val, neg_train, tsize, caffe_solver_file);
% train graphical model
model = train_model(note, pos_val, neg_val, tsize);
% testing
boxes = test_model([note,'_LSP'], model, pos_test);
% ...
% evaluation
show_eval(pos_test, ests, conf, eval_method);
LSP_data.m
Some variables and constants:
trainval_frs_pos = 1:1000;    % training frames for positives
test_frs_pos = 1001:2000;     % testing frames for positives
trainval_frs_neg = 615:1832;  % training frames for negatives (1218 frames)
frs_pos = cat(2, trainval_frs_pos, test_frs_pos);  % all frames for positives
all_pos  % num(pos)-by-1 struct array for positives
         % struct fields: im, joints, r_degree, isflip
neg      % num(neg)-by-1 struct array for negatives
pos_trainval = all_pos(1 : numel(trainval_frs_pos));  % training and validation positives
pos_test = all_pos(numel(trainval_frs_pos)+1 : end);  % testing positives
Data preparation:
- lsp_pc2oc (function joints = lsp_pc2oc(joints)) : convert the joint annotations from person-centric to observer-centric.
- pos_trainval(ii).joints = Trans * pos_trainval(ii).joints; : create the ground-truth joints for model training by augmenting the original 14 joint positions with midpoints between joints, giving 26 joints in total.
- add_flip : flip the trainval images horizontally (#pos_trainval *= 2).
- init_scale : initialize dataset-specific parameters.
- add_rotate : rotate the trainval images by a set of fixed angles.
- val_id = randperm(numel(pos_trainval), 2000); : split the positive data into training and validation sets (randomly choose 2000 images from pos_trainval as the validation set, so #pos_train = #pos_trainval - 2000 = 78000); see the sketch after this list.
- val_id = randperm(numel(neg), 500); : split the negative data into training and validation sets (randomly choose 500 images from neg as the validation set, so #neg_train = #neg - #neg_val = 1218 - 500 = 718).
- add_flip : flip the negative data (#neg_val *= 2; #neg_train *= 2).
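A minimal sketch of the positive train/validation split described above (the variable names follow this walkthrough; this is not the exact repository code):

% randomly pick 2000 of the augmented positives as the validation set;
% the remaining 80000 - 2000 = 78000 images become the training set
val_id = randperm(numel(pos_trainval), 2000);
is_val = false(numel(pos_trainval), 1);
is_val(val_id) = true;
pos_val   = pos_trainval(is_val);
pos_train = pos_trainval(~is_val);
% the negative split works the same way, with 500 validation images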
train_dcnn.m
Some variables and constants:
mean_pixel = [128, 128, 128];  % the mean pixel value (per channel)
K = conf.K;                    % K = T_{ij}, the number of mixture types for each pairwise relation
prepare_patches.m
Prepare the patches and derive their labels for training the DCNN.
% generate the labels
clusters = learn_clusters(pos_train, pos_val, tsize);
label_train = derive_labels('train', clusters, pos_train, tsize);
label_val = derive_labels('val', clusters, pos_val, tsize);
% dummy labels for the negatives
dummy_label = struct('mix_id', cell(numel(neg_train), 1), ...
    'global_id', cell(numel(neg_train), 1));
% all the training data
train_imdata = cat(1, num2cell(pos_train), num2cell(neg_train));
train_labels = cat(1, num2cell(label_train), num2cell(dummy_label));
% randomly permute the data and store it in LMDB format
perm_idx = randperm(numel(train_imdata));
train_imdata = train_imdata(perm_idx);
train_labels = train_labels(perm_idx);
if ~exist([cachedir, 'LMDB_train'], 'dir')
  store_patch(train_imdata, train_labels, psize, [cachedir, 'LMDB_train']);
end
% validation data for the positives
val_imdata = num2cell(pos_val);
val_labels = num2cell(label_val);
if ~exist([cachedir, 'LMDB_val'], 'dir')
  store_patch(val_imdata, val_labels, psize, [cachedir, 'LMDB_val']);
end
learn_clusters (calls cluster_rp to cluster the relative positions):
- nbh_IDs = get_IDs(pa, K); : get the neighbors of each part (joint).
- clusters{ii} : cell array holding the mean relative positions for the ii-th part.
- X(ii,:) = norm_rp(imdata(ii), cur, nbh, tsize); : relative position for the ii-th data item.
- mean_X = mean(X(valid_idx,:),1); normX = bsxfun(@minus, X(valid_idx,:), mean_X); : centralize (normalize) the relative positions.
- [gInd{trial}, cen{trial}, sumdist(trial)] = k_means(normX, K); : run R trials of the k-means algorithm and keep the one with the smallest total distance (see the sketch after the derive_labels list below).
- imgid of clusters{cur}{n}(k) : all the images that belong to cluster k, where clusters{cur}{n}(k) is the k-th cluster of the n-th neighbor of the cur-th joint.

derive_labels (calls assign_label):
- labels : an array of structs with fields mix_id, global_id, near, invalid.
- K : the number of clusters (mixture types).
- get_id : nbh_IDs{ii} : the neighbors of the ii-th part; target_IDs{ii} : the indexes of the ii-th part's neighbors; global_IDs{ii} : the global labels.
- For the n-th neighbor of the p-th part (joint) in the ii-th data image, nbh_idx = nbh_IDs{p}(n);
- labels(ii).mix_id{p}(n) : the index of the nearest cluster.
- labels(ii).near{p}{n} : the indexes of the near clusters (dist < 3*dist(nearest)).
- labels(ii).invalid(p) : for checking and debugging.
- labels(ii).global_id : translates mix_id{p} to global_id(p).
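For intuition, here is a minimal sketch of the relative-position clustering and the nearest-cluster labeling described above. It uses MATLAB's kmeans (Statistics Toolbox) as a stand-in for the repository's k_means helper, and the variable names are assumptions rather than the exact repository code:

% X: N-by-2 matrix of relative positions for one (part, neighbor) pair,
% as computed by norm_rp for the N training samples
mean_X = mean(X, 1);
normX  = bsxfun(@minus, X, mean_X);               % centralize the offsets
[gInd, cen] = kmeans(normX, K, 'Replicates', 5);  % K mixture types, best of 5 trials
% labeling a single centralized offset x (1-by-2):
d = sqrt(sum(bsxfun(@minus, cen, x).^2, 2));      % distance to each cluster center
[dmin, mix_id] = min(d);                          % nearest cluster gives mix_id
near = find(d < 3 * dmin);                        % "near" clusters (dist < 3x nearest)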
Call caffe through a system command to train the DCNN:
system([caffe_root, '/build/tools/caffe train ', sprintf('-gpu %d -solver %s', ...
conf.device_id, caffe_solver_file)]);
net_surgery.m
Change the fully-connected layers to convolutional layers.
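Why this works (a generic illustration; the exact layer shapes are handled by the reshape calls in the code below): a fully-connected layer with d outputs applied to an h x w x c input stores the same numbers as an h x w x c x d convolution kernel evaluated at a single location, so converting it is purely a reshape of the weights:

$$W_{\mathrm{fc}} \in \mathbb{R}^{d \times (h \cdot w \cdot c)} \;\longmapsto\; \mathrm{reshape}(W_{\mathrm{fc}}) \in \mathbb{R}^{h \times w \times c \times d}$$

Running the resulting fully-convolutional network on a larger image then produces the part/type probability maps for every location in a single forward pass.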
caffe matlab interface code: https://github.com/xianjiec/caffe/blob/dev/matlab/caffe/matcaffe.cpp
- caffe('reset'); caffe('init', deploy_file, model_file);
- fc_weights = caffe('get_weights'); : all the weights of the original network.
- fc_layer_ids(ii) : the index of the ii-th fully-connected layer in the original network.
- caffe('reset'); caffe('init', deploy_conv_file, model_file);
- conv_weights = caffe('get_weights'); : all the weights of the fully-convolutional network (FCN).
- conv_layer_ids(ii) : the index of the convolutional layer in the FCN that replaces the ii-th fully-connected layer.
- weights{1} : weights; weights{2} : bias.
trans_params = struct('weights', cell(numel(conv_names), 1), ...
'layer_names', cell(numel(conv_names), 1));
for ii = 1:numel(conv_names)
trans_params(ii).layer_names = conv_names{ii};
weights = cell(2, 1);
weights{1} = reshape(fc_weights(fc_layer_ids(ii)).weights{1}, size(conv_weights(conv_layer_ids(ii)).weights{1}));
weights{2} = reshape(fc_weights(fc_layer_ids(ii)).weights{2}, size(conv_weights(conv_layer_ids(ii)).weights{2}));
trans_params(ii).weights = weights;
end
caffe('set_weights', trans_params); caffe('save', fully_conv_model_file);
train_model
- label_val : the labels of the positive validation data (struct fields: mix_id, global_id, near, invalid).
- build_model : prepare the weight parameters of the full score formula for the SVM.
- train : model = train(cls, model, pos_val, neg_val, 1); : use the validation set to train the SVM.
- The parameters are grouped into bias, apps, pdef (0.01), and gaus.
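Since the full score is linear in all of these parameters, it can be written compactly as

$$F(l, t \mid I) = \langle \mathbf{w},\, \Phi(I, l, t) \rangle,$$

where w stacks the bias, appearance, prior-of-deformation, and Gaussian deformation weights, and Phi stacks the corresponding DCNN scores and deformation features. This linearity is what allows the linear-SVM / QP machinery below to learn the weights.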
Structure of the model parts:
model.len = 0;  % number of parameters in the model
% 'i' is the index of the parameters within the whole model
model.bias = struct('w',{},'i',{});  % bias
model.apps = struct('w',{},'i',{});  % appearance of each part
model.pdefs = struct('w',{},'i',{}); % prior of deformation (regressed)
model.gaus = struct('w',{},'i',{},'mean',{}, 'var', {}); % deformation Gaussian
% '***id' is the index of '***' in 'model.***'
model.components{1} = struct('parent',{}, 'pid', {}, 'nbh_IDs', {}, ...
    'biasid',{},'appid',{},'app_global_ids',{},'pdefid',{},'gauid',{},'idpr_global_ids',{});
train
- mining_onneg : mine hard negative examples; calls detect : [box,model] = detect(neg(i), model, -1, [], 0, i, -1);
- poslatent : relabel the latent positives; calls detect : box = detect(pos(ii), model, 0, bbox, overlap, ii, 1);
- sparselen : compute the maximum length of a sparse feature vector (for sizing the QP cache).
- The cached QP structure:
% qp.x(:,i) = examples
% qp.i(:,i) = id
% qp.b(:,i) = bias of the linear constraint
% qp.d(i)   = ||qp.x(:,i)||^2
% qp.a(i)   = i-th dual variable
qp_prune();
qp_opt();
% ...
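The cached constraints roughly correspond to the usual DPM-style cutting-plane QP (a sketch, not taken verbatim from this code):

$$\min_{\mathbf{w},\, \xi \ge 0} \; \tfrac{1}{2}\lVert \mathbf{w} \rVert^2 + C \sum_i \xi_i \quad \text{s.t.} \quad \mathbf{w}^{\top}\mathbf{x}_i \ge b_i - \xi_i \;\; \forall i$$

Each column qp.x(:,i) is a cached (signed) feature vector, qp.b(:,i) is its margin b_i, and qp.a(i) is the dual variable for that constraint; qp_opt optimizes the dual coordinate-wise, and qp_prune discards cached examples whose dual variables are zero (non support vectors).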
detect
function [boxes,model,ex] = detect(iminfo, model, thresh, bbox, overlap, id, label)
The description of this function given by the author:
Detect objects in an image using a model and a score threshold.
A higher threshold leads to fewer detections.
The function returns a matrix with one row per detected object. The last column of each row gives the score of the detection. The column before last specifies the component used for the detection. Each set of the first 4 columns specifies the bounding box for a part. If bbox is not empty, we pick the best detection with significant overlap with it.
If label is included, we write feature vectors to a global QP structure.
This function updates the model (by running the QP solver) if the upper and lower bounds differ.
im = imreadx(iminfo);
[im, bbox] = cropscale_pos(im, bbox, model.cnn.psize);
imCNNdet : [pyra, unary_map, idpr_map] = imCNNdet(im,model,useGpu);
- pyra = impyra_fun(im, model, upS);
- For the i-th pyramid level, the p-th part, and the n-th neighbor, with joint_prob coming from the DCNN:
unary_map{i}{p} = sum(joint_prob(:,:,app_global_ids), 3);
idpr_map{i}{p}{n}(:,:,m) = sum(joint_prob(:,:,idpr_global_ids{n}{m}),3);
levels = levels(randperm(length(levels)));
% Walk from the leaves to the root of the tree, passing messages to parents
for p = p_no:-1:2
  child = parts(p);
  par = parts(p).parent;
  parent = parts(par);
  cbid = find(child.nbh_IDs == parent.pid);
  pbid = find(parent.nbh_IDs == child.pid);
  [msg,parts(p).Ix,parts(p).Iy,parts(p).Im{cbid},parts(par).Im{pbid}] ...
    = passmsg(child, parent, cbid, pbid);
  parts(par).score = parts(par).score + msg;
end
% Add the bias to the root score
parts(1).score = parts(1).score + parts(1).b;
rscore = parts(1).score;
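In equation form, the loop above performs max-sum message passing on the tree (the max over l_c is computed efficiently inside passmsg, typically with a generalized distance transform):

$$m_{c \to p}(l_p) = \max_{l_c,\, t_{cp},\, t_{pc}} \Big[ \mathrm{score}_c(l_c) + R(l_c, l_p, t_{cp}, t_{pc} \mid I) \Big]$$
$$\mathrm{score}_p(l_p) = U(l_p \mid I) + \sum_{c \in \mathrm{children}(p)} m_{c \to p}(l_p)$$

rscore is the resulting score map at the root; backtrack then recovers the arg-max part locations using the stored Ix, Iy, and Im indices.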
function [box,ex] = backtrack(x,y,parts,pyra,ex,write)
model = optimize(model);
test_model
function boxes = test_model(note,model,test)
Returns candidate bounding boxes after non-maximum suppression.
- detect_fast : box = detect_fast(test(i), model, model.thresh, par); : similar to detect.
- nms_pose : boxes{i} = nms_pose(box, overlap); with overlap = 0.3, followed by boxes{i} = boxes{i}(1,:); to keep only the first (top-scoring) box.
% estimated joints and scores generated from the detected boxes
ests = conf.box2det(boxes, p_no);
% generate part sticks from the joint locations
for ii = 1:numel(ests)
  ests(ii).sticks = conf.joint2stick(ests(ii).joints);
  pos_test(ii).sticks = conf.joint2stick(pos_test(ii).joints);
end
% evaluation and plotting of the results
eval_method = {'strict_pcp', 'pdj'};
show_eval(pos_test, ests, conf, eval_method);
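For reference, the two metrics as they are commonly defined (general background, not taken from the repository code): strict PCP counts a predicted limb (stick) as correct only if both of its endpoints are close to the ground truth relative to the limb length, and PDJ counts a joint as detected if its error is within a fraction of the torso diameter:

$$\text{strict PCP:}\quad \max\big(\lVert \hat{e}_1 - e_1 \rVert,\, \lVert \hat{e}_2 - e_2 \rVert\big) \le 0.5\,\lVert e_1 - e_2 \rVert$$
$$\text{PDJ:}\quad \lVert \hat{j} - j \rVert \le \alpha \cdot \text{(torso diameter)}, \;\text{reported as a curve over } \alpha$$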