@zhenni94
2015-08-26T04:55:26.000000Z
project homepage: http://www.stat.ucla.edu/~xianjie.chen/projects/pose_estimation/pose_estimation.html
Full score function:
Unary Term
conf is a struct that holds the global configuration. conf.pa gives the index of the parent of each joint. p_no is the number of parts (joints).
The main part of this function is shown below.
    % read data
    [pos_train, pos_val, pos_test, neg_train, neg_val, tsize] = LSP_data();
    % train dcnn
    train_dcnn(pos_train, pos_val, neg_train, tsize, caffe_solver_file);
    % train graphical model
    model = train_model(note, pos_val, neg_val, tsize);
    % testing
    boxes = test_model([note,'_LSP'], model, pos_test);
    % ...
    % evaluation
    show_eval(pos_test, ests, conf, eval_method);
LSP_data.m
Some variables and constants:
    trainval_frs_pos = 1:1000;   % training frames for positive
    test_frs_pos = 1001:2000;    % testing frames for positive
    trainval_frs_neg = 615:1832; % training frames for negative (of size 1218)
    frs_pos = cat(2, trainval_frs_pos, test_frs_pos); % frames for positive
    % all_pos : num(pos)*1 struct array for positive
    %           struct: im, joints, r_degree, isflip
    % neg     : num(neg)*1 struct array for negative
    pos_trainval = all_pos(1 : numel(trainval_frs_pos)); % training and validation image struct for positive
    pos_test = all_pos(numel(trainval_frs_pos)+1 : end); % testing image struct for positive
Data preparation:
lsp_pc2oc : function joints = lsp_pc2oc(joints) : convert to person-centric
pos_trainval(ii).joints = Trans * pos_trainval(ii).joints; : create ground-truth joints for model training. Augment the original 14 joint positions with midpoints of joints, defining a total of 26 joints
add_flip : flip the trainval images horizontally (#pos_trainval *= 2)
init_scale : initialize dataset-specific parameters
add_rotate : rotate the trainval images (every ...)
val_id = randperm(numel(pos_trainval), 2000); : split the positive data into training and validation sets (randomly choose 2000 images from pos_trainval as the validation set, #training = #pos_trainval - 2000 = 78000)
val_id = randperm(numel(neg), 500); : split the negative data into training and validation sets (randomly choose 500 images from neg as the validation set, #neg_train = #neg - #neg_val = 1218 - 500 = 718)
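The two randperm splits above can be sketched as follows. This is a minimal Python illustration, not the project's MATLAB code: split_train_val is a hypothetical helper, random.sample stands in for randperm, and lists of integers stand in for the image structs. The positive total of 80000 is simply the 78000 training items plus the 2000 validation items stated above.

```python
import random

def split_train_val(items, n_val, seed=0):
    """Randomly hold out n_val items for validation; the rest are training."""
    rng = random.Random(seed)
    # Analogue of val_id = randperm(numel(items), n_val) in MATLAB.
    val_idx = set(rng.sample(range(len(items)), n_val))
    val = [x for i, x in enumerate(items) if i in val_idx]
    train = [x for i, x in enumerate(items) if i not in val_idx]
    return train, val

# Placeholder data: integers instead of image structs (assumption for illustration).
pos_train, pos_val = split_train_val(list(range(80000)), 2000)
neg_train, neg_val = split_train_val(list(range(1218)), 500)

print(len(pos_train), len(pos_val))  # 78000 2000
print(len(neg_train), len(neg_val))  # 718 500
```

Note that these counts are taken before add_flip is applied to the negative sets.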
add_flip : flip the negative data (#neg_val *= 2; #neg_train *= 2)
train_dcnn.m
Some variables and constants:
    mean_pixel = [128, 128, 128]; % the mean value of each pixel
    K = conf.K; % K = T_{ij}
prepare_patches.m
Prepare the patches and derive their labels to train the dcnn.
    % generate the labels
    clusters = learn_clusters(pos_train, pos_val, tsize);
    label_train = derive_labels('train', clusters, pos_train, tsize);
    label_val = derive_labels('val', clusters, pos_val, tsize);
    % labels for negative (dummy)
    dummy_label = struct('mix_id', cell(numel(neg_train), 1), ...
        'global_id', cell(numel(neg_train), 1));
    % all the training data
    train_imdata = cat(1, num2cell(pos_train), num2cell(neg_train));
    train_labels = cat(1, num2cell(label_train), num2cell(dummy_label));
    % randomly permute the data and store it in LMDB format
    perm_idx = randperm(numel(train_imdata));
    train_imdata = train_imdata(perm_idx);
    train_labels = train_labels(perm_idx);
    if ~exist([cachedir, 'LMDB_train'], 'dir')
        store_patch(train_imdata, train_labels, psize, [cachedir, 'LMDB_train']);
    end
    % validation data for positive
    val_imdata = num2cell(pos_val);
    val_labels = num2cell(label_val);
    if ~exist([cachedir, 'LMDB_val'], 'dir')
        store_patch(val_imdata, val_labels, psize, [cachedir, 'LMDB_val']);
    end
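The key detail in the code above is that the images and their labels are permuted with the same perm_idx, so each patch keeps its own label after shuffling. A minimal Python sketch of that idea, where shuffle_together and the toy strings and dictionaries are hypothetical stand-ins for the image structs and label structs:

```python
import random

def shuffle_together(data, labels, seed=0):
    """Apply one shared random permutation to two parallel lists."""
    assert len(data) == len(labels)
    perm = list(range(len(data)))
    random.Random(seed).shuffle(perm)  # analogue of perm_idx = randperm(n)
    return [data[i] for i in perm], [labels[i] for i in perm]

# Toy stand-ins (assumptions): names instead of images, dicts instead of label structs.
pos = ["pos_a", "pos_b", "pos_c"]
neg = ["neg_a", "neg_b"]
pos_labels = [{"global_id": 3}, {"global_id": 7}, {"global_id": 1}]
neg_labels = [{"global_id": None} for _ in neg]  # dummy labels for negatives

data, labels = shuffle_together(pos + neg, pos_labels + neg_labels)
for d, l in zip(data, labels):
    print(d, l)
```

Shuffling the two lists independently would silently break the pairing, which is why a single index vector is used.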
learn_clusters (calls cluster_rp to cluster the relative positions)
nbh_IDs = get_IDs(pa, K); : get the neighbors of each part (joint)
clusters{ii} : cell : the mean relative position of the ii-th part
X(ii,:) = norm_rp(imdata(ii), cur, nbh, tsize); : relative position for the ii-th data item
mean_X = mean(X(valid_idx,:),1); normX = bsxfun(@minus, X(valid_idx,:), mean_X); : center (normalize) the relative positions
Run R trials of the k-means algorithm and keep the one with the smallest total distance: [gInd{trial}, cen{trial}, sumdist(trial)] = k_means(normX, K);
imgid (all the images that belong to cluster k) of clusters{cur}{n}(k), where clusters{cur}{n}(k) is the k-th cluster of the n-th neighbor of the cur-th joint.
derive_labels (calls assign_label)
labels : an array of structs : mix_id, global_id, near, invalid
K :
get_id: nbh_IDs{ii} : the neighbors of the ii-th part, target_IDs{ii} : indexes of ii-th neighbors in global_IDs{ii} :
the labels : n-th neighbor of the p-th part (joint) in the ii-th data image. nbh_idx = nbh_IDs{p}(n); labels(ii).mix_id{p}(n) : the index of the nearest cluster, labels(ii).near{p}{n} : the indexes of the near clusters (dist < 3*dist(nearest))
labels(ii).invalid(p) : for checking and debugging
labels(ii).global_id : translate mix_id{p} to global_id(p)
System call to caffe to train the dcnn
    system([caffe_root, '/build/tools/caffe train ', sprintf('-gpu %d -solver %s', ...
        conf.device_id, caffe_solver_file)]);
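The clustering and labeling steps described for learn_clusters and derive_labels can be sketched in a few lines. This is a rough Python illustration under stated assumptions, not the project's implementation: it uses plain Lloyd's k-means, Euclidean distance, and hypothetical helper names (cluster_relative_positions, assign_label). Only the overall recipe comes from the notes: center the relative positions, run several k-means trials and keep the one with the smallest total distance, then mark as "near" every cluster with dist < 3*dist(nearest).

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_center(p, centers):
    return min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))

def k_means(points, k, rng, iters=50):
    """Plain Lloyd's k-means; returns (assignments, centers, total squared distance)."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        assign = [nearest_center(p, centers) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    assign = [nearest_center(p, centers) for p in points]
    total = sum(sq_dist(p, centers[a]) for p, a in zip(points, assign))
    return assign, centers, total

def cluster_relative_positions(X, k, trials=5, seed=0):
    """Center X, run `trials` k-means restarts, keep the lowest-distance one."""
    rng = random.Random(seed)
    mean = tuple(sum(col) / len(X) for col in zip(*X))
    centered = [tuple(x - m for x, m in zip(p, mean)) for p in X]
    best = min((k_means(centered, k, rng) for _ in range(trials)), key=lambda r: r[2])
    return mean, best

def assign_label(p, mean, centers, ratio=3.0):
    """Index of the nearest cluster plus all clusters with dist < ratio * dist(nearest)."""
    q = tuple(x - m for x, m in zip(p, mean))
    d = [sq_dist(q, c) ** 0.5 for c in centers]
    nearest = min(range(len(d)), key=d.__getitem__)
    # The nearest cluster is always kept, even when its distance is exactly zero.
    near = [i for i, di in enumerate(d) if i == nearest or di < ratio * d[nearest]]
    return nearest, near

# Toy relative positions (assumption): two well-separated groups.
X = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
mean, (assign, centers, total) = cluster_relative_positions(X, k=2)
nearest, near = assign_label(X[0], mean, centers)
print(assign, nearest, near)
```

Restarting k-means matters because a single run depends on its random initial centers; keeping the trial with the smallest summed distance is a cheap way to reduce that sensitivity.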

net_surgery.m
Change the fully-connected layers to convolutional layers.
caffe matlab interface code: https://github.com/xianjiec/caffe/blob/dev/matlab/caffe/matcaffe.cpp
caffe('reset'); caffe('init', deploy_file, model_file);
fc_weights = caffe('get_weights'); : all the weights in the network
fc_layer_ids(ii) : the index of the ii-th fully connected layer in the original network
caffe('reset'); caffe('init', deploy_conv_file, model_file);
conv_weights = caffe('get_weights'); : all the weights in the fcn
conv_layer_ids(ii) : the index of the layer in the fcn that corresponds to the ii-th fully connected layer
weights{1} : weights
weights{2} : bias
    trans_params = struct('weights', cell(numel(conv_names), 1), ...
        'layer_names', cell(numel(conv_names), 1));
    for ii = 1:numel(conv_names)
        trans_params(ii).layer_names = conv_names{ii};
        weights = cell(2, 1);
        weights{1} = reshape(fc_weights(fc_layer_ids(ii)).weights{1}, size(conv_weights(conv_layer_ids(ii)).weights{1}));
        weights{2} = reshape(fc_weights(fc_layer_ids(ii)).weights{2}, size(conv_weights(conv_layer_ids(ii)).weights{2}));
        trans_params(ii).weights = weights;
    end
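Why a plain reshape is enough can be checked on a toy example: a fully connected unit over a flattened C x H x W patch computes the same dot product as a convolution kernel of size C x H x W placed on that patch. The Python sketch below is only an illustration of that equivalence. The helper names, the row-major layout, and the tiny sizes are assumptions, and the actual memory layout used by Caffe and MATLAB is not modeled.

```python
def reshape_chw(flat, C, H, W):
    """Reshape a flat row-major vector into nested [C][H][W] lists."""
    assert len(flat) == C * H * W
    return [[[flat[(c * H + h) * W + w] for w in range(W)]
             for h in range(H)] for c in range(C)]

def fc_forward(flat_w, bias, flat_x):
    """One fully connected output unit."""
    return sum(w * x for w, x in zip(flat_w, flat_x)) + bias

def conv_forward(kernel, bias, image):
    """Valid cross-correlation of a [C][H][W] kernel over a [C][IH][IW] image."""
    C, H, W = len(kernel), len(kernel[0]), len(kernel[0][0])
    IH, IW = len(image[0]), len(image[0][0])
    return [[sum(kernel[c][h][w] * image[c][y + h][x + w]
                 for c in range(C) for h in range(H) for w in range(W)) + bias
             for x in range(IW - W + 1)]
            for y in range(IH - H + 1)]

C, H, W = 2, 2, 2
flat_w = [1, 2, 3, 4, 5, 6, 7, 8]   # weights of one FC unit
bias = 0.5
kernel = reshape_chw(flat_w, C, H, W)  # the "surgery": same numbers, new shape

patch = [1, 0, 2, 1, 0, 1, 1, 3]    # a flattened 2x2x2 input patch
fc_out = fc_forward(flat_w, bias, patch)
conv_out = conv_forward(kernel, bias, reshape_chw(patch, C, H, W))

# On a larger image the same kernel yields a score map instead of one number.
image = [[[1, 0, 0], [2, 1, 0], [0, 0, 0]],
         [[0, 1, 0], [1, 3, 0], [0, 0, 0]]]
score_map = conv_forward(kernel, bias, image)
print(fc_out, conv_out, score_map[0][0])
```

This is what makes the converted network usable for detection: it can be run once on a whole image and produces a dense map of patch scores.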
caffe('set_weights', trans_params); caffe('save', fully_conv_model_file);
train_model
label_val : the labels of the positive validation data (struct: mix_id, global_id, near, invalid)
build_model : prepare the weight parameters of the full score formula for the SVM
train : model = train(cls, model, pos_val, neg_val, 1); : use the validation set to train the SVM
bias: apps: pdef: 0.01, gaus:
Structure of the parts of the model:
    model.len = 0; % number of parameters in the model
    % 'i' is the index of the parameters in the whole model
    model.bias = struct('w',{},'i',{}); % bias
    model.apps = struct('w',{},'i',{}); % appearance of each part
    model.pdefs = struct('w',{},'i',{}); % prior of deformation (regressed)
    model.gaus = struct('w',{},'i',{},'mean',{}, 'var', {}); % deformation gaussian
    % '***id' is the index of '***' in 'model.***'
    model.components{1} = struct('parent',{}, 'pid', {}, 'nbh_IDs', {}, ...
        'biasid',{},'appid',{},'app_global_ids',{},'pdefid',{},'gauid',{},'idpr_global_ids',{});
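The role of 'i' and model.len is easier to see with a small sketch: every weight block remembers where it starts in one flat parameter vector, and model.len grows as blocks are added. The Python below is an assumed illustration of that bookkeeping, not a transcription of build_model. The helpers add_block and flatten are hypothetical, the block sizes and values are made up, and the 1-based start index only mirrors MATLAB conventions.

```python
def add_block(model, group, weights):
    """Append a weight block to model[group]; return its '***id' within that group."""
    block = {"w": list(weights), "i": model["len"] + 1}  # 1-based start index
    model[group].append(block)
    model["len"] += len(block["w"])
    return len(model[group])

def flatten(model, groups=("bias", "apps", "pdefs", "gaus")):
    """Write every block into one flat vector at the position given by its 'i'."""
    vec = [0.0] * model["len"]
    for g in groups:
        for b in model[g]:
            start = b["i"] - 1
            vec[start:start + len(b["w"])] = b["w"]
    return vec

model = {"len": 0, "bias": [], "apps": [], "pdefs": [], "gaus": []}
biasid = add_block(model, "bias", [0.0])                # illustrative values
appid = add_block(model, "apps", [0.01])
gauid = add_block(model, "gaus", [0.1, 0.2, 0.3, 0.4])

print(model["len"], model["gaus"][gauid - 1]["i"])  # 6 3
print(flatten(model))
```

A part in model.components then only needs to store ids such as biasid or gauid, and the solver can work on the single flat vector.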
train
mining_onneg detect: [box,model] = detect(neg(i), model, -1, [], 0, i, -1);
poslatent detect: box = detect(pos(ii), model, 0, bbox, overlap, ii, 1);
sparselen
    % qp.x(:,i) = examples
    % qp.i(:,i) = id
    % qp.b(:,i) = bias of linear constraint
    % qp.d(i) = ||qp.x(:,i)||^2
    % qp.a(i) = ith dual variable
    qp_prune();
    qp_opt();
    % ...
detect
function [boxes,model,ex] = detect(iminfo, model, thresh, bbox, overlap, id, label)
The description of this function given by the author:
Detect objects in image using a model and a score threshold.
Higher threshold leads to fewer detections.
The function returns a matrix with one row per detected object. The last column of each row gives the score of the detection. The column before last specifies the component used for the detection. Each set of the first 4 columns specify the bounding box for a part.
If bbox is not empty, we pick best detection with significant overlap.
If label is included, we write feature vectors to a global QP structure.
This function updates the model (by running the QP solver) if upper and lower bound differs.
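The row layout described above (4 columns per part, then the component, then the score) can be unpacked as in the following sketch. It is a Python illustration with hypothetical helper names (parse_detection, keep_above). The meaning of the 4 box numbers is not spelled out in these notes, so they are kept as an opaque tuple, and the strict comparison against the threshold is an assumption.

```python
def parse_detection(row, p_no):
    """Split one detection row into per-part boxes, component id, and score."""
    assert len(row) == 4 * p_no + 2
    boxes = [tuple(row[4 * p:4 * p + 4]) for p in range(p_no)]
    component, score = row[-2], row[-1]
    return boxes, component, score

def keep_above(rows, thresh):
    """A higher threshold leaves fewer detections."""
    return [r for r in rows if r[-1] > thresh]  # strict '>' is an assumption

# Toy rows for p_no = 2 parts (made-up numbers).
rows = [
    [0, 0, 10, 10, 5, 5, 15, 15, 1, 0.75],
    [2, 2, 12, 12, 7, 7, 17, 17, 1, -0.5],
]
boxes, component, score = parse_detection(rows[0], p_no=2)
print(boxes, component, score)
print(len(keep_above(rows, 0.0)), len(keep_above(rows, -1.0)))  # 1 2
```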
im = imreadx(iminfo);
[im, bbox] = cropscale_pos(im, bbox, model.cnn.psize);
imCNNdet: [pyra, unary_map, idpr_map] = imCNNdet(im,model,useGpu);
pyra = impyra_fun(im, model, upS);
i-th level, p-th part, n-th neighbor, joint_prob from the DCNN
unary_map{i}{p} = sum(joint_prob(:,:,app_global_ids), 3);
idpr_map{i}{p}{n}(:,:,m) = sum(joint_prob(:,:,idpr_global_ids{n}{m}),3);
levels = levels(randperm(length(levels)));
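Both unary_map and idpr_map above are obtained by summing selected channels of joint_prob at every pixel. A small Python sketch of that marginalization follows. The helper sum_channels, the 1 x 2 toy map, and the channel groupings are made up for illustration, and indices are 0-based here while the MATLAB code is 1-based.

```python
def sum_channels(joint_prob, channel_ids):
    """joint_prob[y][x][c] -> map[y][x], summing only the selected channels."""
    return [[sum(px[c] for c in channel_ids) for px in row] for row in joint_prob]

# Toy DCNN output (assumption): 1 row, 2 columns, 4 channels per pixel.
joint_prob = [[[0.5, 0.25, 0.125, 0.125],
               [0.125, 0.125, 0.25, 0.5]]]

# Channels that belong to this part, regardless of its pairwise type.
app_global_ids = [0, 1]
unary_map = sum_channels(joint_prob, app_global_ids)

# For one neighbor n: the channels that belong to each pairwise type m.
idpr_global_ids_n = [[0], [1]]
idpr_map_n = [sum_channels(joint_prob, ids) for ids in idpr_global_ids_n]

print(unary_map)   # [[0.75, 0.25]]
print(idpr_map_n)  # [[[0.5, 0.125]], [[0.25, 0.125]]]
```

The unary map therefore answers "is the part here at all", while each idpr slice answers "is the part here with pairwise type m towards neighbor n".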
    % Walk from leaves to root of tree, passing message to parent
    for p = p_no:-1:2
        child = parts(p);
        par = parts(p).parent;
        parent = parts(par);
        cbid = find(child.nbh_IDs == parent.pid);
        pbid = find(parent.nbh_IDs == child.pid);
        [msg,parts(p).Ix,parts(p).Iy,parts(p).Im{cbid},parts(par).Im{pbid}] ...
            = passmsg(child, parent, cbid, pbid);
        parts(par).score = parts(par).score + msg;
    end
    % Add bias to root score
    parts(1).score = parts(1).score + parts(1).b;
    rscore = parts(1).score;
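The leaves-to-root loop above is max-sum dynamic programming on a tree. The Python sketch below shows the same control flow on a toy problem. It is deliberately simplified and should be read as an assumption-laden illustration: each part has a handful of discrete states instead of image locations and mixture types, the maximization is brute force rather than whatever passmsg does internally, the root bias is omitted, and the names pass_messages, backtrack, and pair are hypothetical.

```python
def pass_messages(unary, parent, pair):
    """unary[p][s]: score of part p in state s.
    parent[p]: index of p's parent, with parent[p] < p and parent[0] == -1.
    pair(p, sc, sp): score for child p in state sc with its parent in state sp.
    Returns the root score table and, per part, the best child state for each parent state."""
    n = len(unary)
    score = [list(u) for u in unary]
    best_child = [None] * n
    for p in range(n - 1, 0, -1):          # walk from leaves to root
        par = parent[p]
        msg, arg = [], []
        for sp in range(len(score[par])):
            cands = [score[p][sc] + pair(p, sc, sp) for sc in range(len(score[p]))]
            best = max(range(len(cands)), key=cands.__getitem__)
            msg.append(cands[best])
            arg.append(best)
        best_child[p] = arg
        score[par] = [a + b for a, b in zip(score[par], msg)]
    return score[0], best_child

def backtrack(root_scores, best_child, parent):
    """Recover the best state of every part, starting at the root."""
    n = len(parent)
    state = [0] * n
    state[0] = max(range(len(root_scores)), key=root_scores.__getitem__)
    for p in range(1, n):                  # parents are resolved before children
        state[p] = best_child[p][state[parent[p]]]
    return state

def pair(p, sc, sp):
    # Toy deformation cost (assumption): a child prefers to sit one step right of its parent.
    return -(sc - sp - 1) ** 2

unary = [[0, 1, 0, 0], [0, 0, 0, 2], [1, 0, 0, 0]]  # a chain of 3 parts, 4 states each
parent = [-1, 0, 1]
root_scores, best_child = pass_messages(unary, parent, pair)
state = backtrack(root_scores, best_child, parent)
print(root_scores, state)
```

Because every child sends exactly one message to its parent, the cost is linear in the number of parts, and the stored argmax tables play the role of Ix, Iy, and Im when backtracking.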
function [box,ex] = backtrack(x,y,parts,pyra,ex,write)
model = optimize(model);
test_model
function boxes = test_model(note,model,test)
Returns candidate bounding boxes after non-maximum suppression
detect_fast : box = detect_fast(test(i), model, model.thresh, par); : similar to detect
nms_pose : boxes{i} = nms_pose(box, overlap); (overlap=0.3)
boxes{i} = boxes{i}(1,:);
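For readers unfamiliar with non-maximum suppression, here is a generic greedy version in Python. It is not nms_pose: how nms_pose measures overlap between poses is not described in these notes, so this sketch assumes one (x1, y1, x2, y2, score) box per detection and intersection-over-union as the overlap measure. Only the threshold 0.3 and the final "keep the first row" step come from the notes.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2, ...) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def nms(dets, overlap=0.3):
    """Greedy NMS: visit detections by decreasing score, drop those overlapping a kept one."""
    kept = []
    for d in sorted(dets, key=lambda d: d[4], reverse=True):
        if all(iou(d, k) <= overlap for k in kept):
            kept.append(d)
    return kept

dets = [
    (0, 0, 10, 10, 0.9),
    (1, 1, 11, 11, 0.8),    # overlaps the first box heavily
    (20, 20, 30, 30, 0.7),  # far away from both
]
kept = nms(dets)
top = kept[0]  # analogue of boxes{i} = boxes{i}(1,:)
print(kept)
print(top)
```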
    % estimated joints and scores generated from the detected boxes
    ests = conf.box2det(boxes, p_no);
    % generate part sticks from the joint locations
    for ii = 1:numel(ests)
        ests(ii).sticks = conf.joint2stick(ests(ii).joints);
        pos_test(ii).sticks = conf.joint2stick(pos_test(ii).joints);
    end
    % evaluate and plot the results
    eval_method = {'strict_pcp', 'pdj'};
    show_eval(pos_test, ests, conf, eval_method);
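The two evaluation methods named in eval_method can be sketched with their commonly used definitions. This is a Python illustration under assumptions, and show_eval may differ in details: strict PCP counts a stick as correct when both estimated endpoints lie within half the ground-truth stick length of their ground-truth endpoints, and PDJ counts a joint as correct when it lies within a fraction of the torso diameter of the ground truth. The function names, the default fraction 0.2, and the toy coordinates are hypothetical.

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def strict_pcp(est_sticks, gt_sticks, thresh=0.5):
    """Per-stick correctness; a stick is a pair of (x, y) endpoints."""
    result = []
    for (e1, e2), (g1, g2) in zip(est_sticks, gt_sticks):
        limit = thresh * dist(g1, g2)
        result.append(dist(e1, g1) <= limit and dist(e2, g2) <= limit)
    return result

def pdj(est_joints, gt_joints, torso_diameter, frac=0.2):
    """Per-joint correctness relative to the torso diameter."""
    return [dist(e, g) <= frac * torso_diameter for e, g in zip(est_joints, gt_joints)]

# Toy data (made up): two sticks and two joints.
gt_sticks = [((0, 0), (0, 10)), ((0, 0), (0, 10))]
est_sticks = [((1, 0), (0, 9)), ((0, 0), (0, 16))]
print(strict_pcp(est_sticks, gt_sticks))  # [True, False]

gt_joints = [(0, 0), (0, 0)]
est_joints = [(1, 1), (3, 0)]
print(pdj(est_joints, gt_joints, torso_diameter=10))  # [True, False]
```

Averaging these booleans over all test images gives one percentage per stick (PCP) or per joint (PDJ).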