@zhenni94 2015-08-26T04:55:26Z

Articulated Pose Estimation by a Graphical Model with Image Dependent Pairwise Relations

project homepage: http://www.stat.ucla.edu/~xianjie.chen/projects/pose_estimation/pose_estimation.html

Paper

Graphical Model

Score Function

DCNN

Inference

$$S_i(l_i \mid I) = U(l_i \mid I) + \sum_{k \in K(i)} \max_{l_k, t_{ik}, t_{ki}} \big( R(l_i, l_k, t_{ik}, t_{ki} \mid I) + S_k(l_k \mid I) \big)$$
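
To make the recursion concrete, here is a minimal sketch of one update for a single parent i with a single child k, reading K(i) as the set of children of part i. The variable names and all numbers are illustrative assumptions, locations are reduced to three indices, and the relation types t_ik, t_ki are assumed to be already maximized out, so R is just a table over (l_i, l_k).

```matlab
% Toy max-sum update: S_i(l_i) = U_i(l_i) + max_{l_k} ( R(l_i, l_k) + S_k(l_k) )
% All values are made up for illustration; locations are indices 1..3.
U_i = [0.2; 1.0; 0.1];          % unary score of parent i at each location
S_k = [0.5; 0.0; 2.0];          % already-computed score of child k (a leaf)
R   = [ 0.0 -1.0 -4.0;          % R(l_i, l_k): rows index l_i, columns index l_k
       -1.0  0.0 -1.0;
       -4.0 -1.0  0.0];

% Add the child score to every row of R, then maximize over the child location.
[msg, best_lk] = max(R + repmat(S_k', 3, 1), [], 2);

S_i = U_i + msg;                % with several children, their messages are summed
disp([S_i, best_lk]);
```

The second output of max records which child location achieved the maximum, which is what allows the best configuration to be read back once the root has been scored.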

Learning


Implementation


demo.m

conf is a struct holding the global configuration. conf.pa gives the index of each joint's parent, and p_no is the number of parts (joints).
The main part of the script is shown below.

    % read data
    [pos_train, pos_val, pos_test, neg_train, neg_val, tsize] = LSP_data();
    % train the DCNN
    train_dcnn(pos_train, pos_val, neg_train, tsize, caffe_solver_file);
    % train the graphical model
    model = train_model(note, pos_val, neg_val, tsize);
    % testing
    boxes = test_model([note,'_LSP'], model, pos_test);
    % ...
    % evaluation
    show_eval(pos_test, ests, conf, eval_method);

Read data : LSP_data.m

Some variables and constants:

    trainval_frs_pos = 1:1000;      % positive frames for training and validation
    test_frs_pos = 1001:2000;       % positive frames for testing
    trainval_frs_neg = 615:1832;    % negative frames for training and validation (1218 frames)
    frs_pos = cat(2, trainval_frs_pos, test_frs_pos);   % all positive frames
    % all_pos: num(pos)-by-1 struct array of positive examples
    %          (fields: im, joints, r_degree, isflip)
    % neg:     num(neg)-by-1 struct array of negative examples
    pos_trainval = all_pos(1 : numel(trainval_frs_pos));    % positive training and validation images
    pos_test = all_pos(numel(trainval_frs_pos)+1 : end);    % positive testing images
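
The index bookkeeping above can be checked with a small stand-alone sketch. The struct array here is only a placeholder with a single hypothetical field `id`; the real all_pos carries im, joints, r_degree and isflip.

```matlab
trainval_frs_pos = 1:1000;                          % positive frames for training/validation
test_frs_pos     = 1001:2000;                       % positive frames for testing
trainval_frs_neg = 615:1832;                        % negative frames for training/validation
frs_pos = cat(2, trainval_frs_pos, test_frs_pos);   % all positive frames

% Placeholder struct array, one entry per positive frame (field name is hypothetical).
all_pos = struct('id', num2cell(frs_pos(:)));

n_trainval   = numel(trainval_frs_pos);
pos_trainval = all_pos(1 : n_trainval);
pos_test     = all_pos(n_trainval + 1 : end);

fprintf('%d trainval, %d test, %d negative frames\n', ...
        numel(pos_trainval), numel(pos_test), numel(trainval_frs_neg));
```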

Data preparation:


Train DCNN : train_dcnn.m

Some variables and constants:

    mean_pixel = [128, 128, 128];   % mean pixel value, one entry per channel
    K = conf.K;                     % K = T_{ij}

Prepare patches : prepare_patches.m

Prepares the image patches and derives their labels for training the DCNN.

K-means: get $r_{ij}$, $t_{ij}$ and the labels $\bigcup_{c=0}^{K} \{c\} \times \left( \times_{j \in N(i)} \{1, 2, \dots, T_{ij}\} \right)$
    % generate the labels
    clusters = learn_clusters(pos_train, pos_val, tsize);
    label_train = derive_labels('train', clusters, pos_train, tsize);
    label_val = derive_labels('val', clusters, pos_val, tsize);
    % dummy labels for the negatives
    dummy_label = struct('mix_id', cell(numel(neg_train), 1), ...
        'global_id', cell(numel(neg_train), 1));
    % all the training data
    train_imdata = cat(1, num2cell(pos_train), num2cell(neg_train));
    train_labels = cat(1, num2cell(label_train), num2cell(dummy_label));
    % randomly permute the data and store it in LMDB format
    perm_idx = randperm(numel(train_imdata));
    train_imdata = train_imdata(perm_idx);
    train_labels = train_labels(perm_idx);
    if ~exist([cachedir, 'LMDB_train'], 'dir')
        store_patch(train_imdata, train_labels, psize, [cachedir, 'LMDB_train']);
    end
    % validation data (positives only)
    val_imdata = num2cell(pos_val);
    val_labels = num2cell(label_val);
    if ~exist([cachedir, 'LMDB_val'], 'dir')
        store_patch(val_imdata, val_labels, psize, [cachedir, 'LMDB_val']);
    end
Learn clusters: learn_clusters (calls cluster_rp to cluster the relative positions)
Derive labels: derive_labels (calls assign_label)
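
As a rough illustration of how clustering relative positions yields a type label per example, the sketch below runs plain Lloyd-style k-means on a handful of made-up offsets. It is not the author's cluster_rp; the data, the number of types T and the deterministic initialization are all assumptions chosen to keep the example small.

```matlab
% Toy relative positions r_ij = (x_j - x_i, y_j - y_i) of a neighbor j w.r.t. part i,
% one row per training example. The numbers are made up.
rp = [ 10  0;  11  1;   9 -1;     % neighbor roughly to the right
        0 10;   1 11;  -1  9;     % neighbor roughly below
      -10  0; -11  1;  -9 -1];    % neighbor roughly to the left
T = 3;                            % number of relation types T_ij (assumed)

centers = rp([1 4 7], :);         % deterministic initialization for the sketch
for iter = 1:20
    % Assignment step: squared distance from every sample to every center.
    d = zeros(size(rp, 1), T);
    for t = 1:T
        diff_t = rp - repmat(centers(t, :), size(rp, 1), 1);
        d(:, t) = sum(diff_t .^ 2, 2);
    end
    [~, t_ij] = min(d, [], 2);    % t_ij: type label of each example

    % Update step: each center becomes the mean of its assigned samples.
    new_centers = centers;
    for t = 1:T
        if any(t_ij == t)
            new_centers(t, :) = mean(rp(t_ij == t, :), 1);
        end
    end
    if isequal(new_centers, centers), break; end
    centers = new_centers;
end
disp(centers);    % cluster means of the relative positions
disp(t_ij');      % relation type per example
```

The cluster index assigned to each example plays the role of the type t_ij, and the cluster mean plays the role of the mean relative position r_ij.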

Train dcnn

Caffe is invoked through a system call to train the DCNN.

    system([caffe_root, '/build/tools/caffe train ', sprintf('-gpu %d -solver %s', ...
        conf.device_id, caffe_solver_file)]);
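
A slightly more defensive variant of this call might look like the sketch below. The paths are hypothetical placeholders and device_id stands in for conf.device_id; the command is only launched if the binary exists, and a non-zero exit status is turned into an error.

```matlab
caffe_root        = '/path/to/caffe';          % hypothetical install location
caffe_solver_file = 'models/solver.prototxt';  % hypothetical solver file
device_id         = 0;                         % GPU index, stands in for conf.device_id

cmd = [caffe_root, '/build/tools/caffe train ', ...
       sprintf('-gpu %d -solver %s', device_id, caffe_solver_file)];
disp(cmd);

% Only launch training if the binary is actually present on this machine.
if exist([caffe_root, '/build/tools/caffe'], 'file')
    status = system(cmd);
    if status ~= 0
        error('caffe train exited with status %d', status);
    end
end
```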

network

Get fully-convolutional net : net_surgery.m

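Converts the fully-connected layers into convolutional layers by reshaping their weights, so that the network can be applied to whole images rather than fixed-size patches.

Caffe MATLAB interface code: https://github.com/xianjiec/caffe/blob/dev/matlab/caffe/matcaffe.cpp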

    trans_params = struct('weights', cell(numel(conv_names), 1), ...
        'layer_names', cell(numel(conv_names), 1));
    for ii = 1:numel(conv_names)
        trans_params(ii).layer_names = conv_names{ii};
        weights = cell(2, 1);
        weights{1} = reshape(fc_weights(fc_layer_ids(ii)).weights{1}, size(conv_weights(conv_layer_ids(ii)).weights{1}));
        weights{2} = reshape(fc_weights(fc_layer_ids(ii)).weights{2}, size(conv_weights(conv_layer_ids(ii)).weights{2}));
        trans_params(ii).weights = weights;
    end
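
Why a plain reshape is enough can be seen on a toy layer. The sketch below assumes MATLAB's column-major layout and made-up sizes; the actual blob layout in the code above is whatever matcaffe returns, so this only illustrates the principle that a fully-connected unit and a filter covering the whole patch compute the same dot product.

```matlab
% Toy sizes (assumed): the fc layer sees an h-by-w-by-c input and has n_out units.
h = 3; w = 3; c = 2; n_out = 4;

W_fc = reshape(1:(n_out * h * w * c), n_out, h * w * c) / 100;  % fc weights, one row per unit
b_fc = (1:n_out)' / 10;                                          % fc biases
x    = reshape(1:(h * w * c), [h, w, c]);                        % one input patch

% Fully-connected forward pass on the flattened patch.
y_fc = W_fc * x(:) + b_fc;

% "Surgery": reinterpret every row of W_fc as an h-by-w-by-c filter.
% No value changes, only the shape does.
W_conv = reshape(W_fc', [h, w, c, n_out]);

% Applying filter k at the single valid position of the patch is a dot product.
y_conv = zeros(n_out, 1);
for k = 1:n_out
    filt = W_conv(:, :, :, k);
    y_conv(k) = sum(filt(:) .* x(:)) + b_fc(k);
end

disp(max(abs(y_fc - y_conv)));   % difference is at rounding level
```

On a larger input the same filters slide over every position, which produces a score map instead of a single output vector.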

Train graphical model : train_model

build_model

Structure of the model's fields:

    model.len = 0;      % number of parameters in the model
    % 'i' is the index of the parameters within the whole model
    model.bias = struct('w',{},'i',{});     % bias
    model.apps = struct('w',{},'i',{});     % appearance of each part
    model.pdefs = struct('w',{},'i',{});    % prior of deformation (regressed)
    model.gaus = struct('w',{},'i',{},'mean',{}, 'var', {});    % deformation Gaussian
    % '***id' is the index of '***' in 'model.***'
    model.components{1} = struct('parent',{}, 'pid', {}, 'nbh_IDs', {}, ...
        'biasid',{},'appid',{},'app_global_ids',{},'pdefid',{},'gauid',{},'idpr_global_ids',{});
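
One plausible reading of the 'w' / 'i' / model.len bookkeeping is sketched below: each block stores its weights in w and the position of its first weight in the full parameter vector in i. This is an assumption about how the fields are used, not a copy of build_model, and the block sizes and values are made up.

```matlab
model.len  = 0;
model.bias = struct('w', {}, 'i', {});
model.apps = struct('w', {}, 'i', {});

% Append a bias block with 1 parameter.
nb = numel(model.bias) + 1;
model.bias(nb).w = 0;
model.bias(nb).i = model.len + 1;          % first index of this block in the full vector
model.len = model.len + numel(model.bias(nb).w);

% Append an appearance block with 3 parameters (size is made up).
na = numel(model.apps) + 1;
model.apps(na).w = [0.1 0.2 0.3];
model.apps(na).i = model.len + 1;
model.len = model.len + numel(model.apps(na).w);

% Flatten all blocks into one parameter vector using the stored offsets.
w = zeros(model.len, 1);
blocks = {model.bias, model.apps};
for g = 1:numel(blocks)
    s = blocks{g};
    for n = 1:numel(s)
        w(s(n).i : s(n).i + numel(s(n).w) - 1) = s(n).w(:);
    end
end
disp(w');
```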

Train graphical model: train

    % qp.x(:,i) = examples
    % qp.i(:,i) = id
    % qp.b(:,i) = bias of linear constraint
    % qp.d(i) = ||qp.x(:,i)||^2
    % qp.a(i) = ith dual variable
    qp_prune();
    qp_opt();
    % ...
Detect objects in image : detect

function [boxes,model,ex] = detect(iminfo, model, thresh, bbox, overlap, id, label)

The description of this function given by the author:

Detect objects in image using a model and a score threshold.
Higher threshold leads to fewer detections.
The function returns a matrix with one row per detected object. The last column of each row gives the score of the detection. The column before last specifies the component used for the detection. Each set of the first 4 columns specify the bounding box for a part.

If bbox is not empty, we pick best detection with significant overlap.
If label is included, we write feature vectors to a global QP structure.
This function updates the model (by running the QP solver) if upper and lower bound differs.
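
The row layout described above can be unpacked as in the following sketch. The coordinate order [x1 y1 x2 y2] within each group of four columns and the use of box centers as joint estimates are assumptions made for illustration; conf.box2det may do this differently.

```matlab
p_no = 3;                                  % number of parts in this toy example
% One detection row: four columns per part, then component id, then score.
box = [10 10 20 20,  15 25 25 35,  20 40 30 50,  1,  0.87];
assert(numel(box) == 4 * p_no + 2);

part_boxes = reshape(box(1 : 4 * p_no), 4, p_no)';   % p_no-by-4, one row per part
component  = box(end - 1);                           % column before last
score      = box(end);                               % last column

% Part centers, one plausible way to turn part boxes into joint estimates.
centers = [(part_boxes(:, 1) + part_boxes(:, 3)) / 2, ...
           (part_boxes(:, 2) + part_boxes(:, 4)) / 2];
disp(centers);
```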

    % Walk from leaves to root of tree, passing message to parent
    for p = p_no:-1:2
        child = parts(p);
        par = parts(p).parent;
        parent = parts(par);
        cbid = find(child.nbh_IDs == parent.pid);
        pbid = find(parent.nbh_IDs == child.pid);
        [msg,parts(p).Ix,parts(p).Iy,parts(p).Im{cbid},parts(par).Im{pbid}] ...
            = passmsg(child, parent, cbid, pbid);
        parts(par).score = parts(par).score + msg;
    end
    % Add bias to root score
    parts(1).score = parts(1).score + parts(1).b;
    rscore = parts(1).score;
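
The leaves-to-root loop, together with the backtracking step that the stored argmax indices make possible, can be sketched on a toy tree. Everything here is a simplification under stated assumptions: locations are 1-D indices, there is a single relation type, the pairwise score is a made-up distance penalty, and the bias term is omitted. The array Ix plays a role similar to parts(p).Ix above.

```matlab
pa   = [0 1 1];            % parent of each part; part 1 is the root (toy tree)
p_no = numel(pa);
L    = 4;                  % number of candidate locations (1-D for brevity)

% Toy unary scores, one column per part (made-up numbers).
score = [0 1.5 0;
         2 0   0;
         0 0   3;
         0 0   0];
% Toy pairwise score R(l_parent, l_child): penalize distance between the two.
[lp, lc] = ndgrid(1:L, 1:L);
R = -abs(lp - lc);

Ix = zeros(L, p_no);       % Ix(l_parent, p): best location of part p given its parent
for p = p_no:-1:2          % leaves to root; children have larger indices than parents
    cand = R + repmat(score(:, p)', L, 1);
    [msg, Ix(:, p)] = max(cand, [], 2);
    score(:, pa(p)) = score(:, pa(p)) + msg;
end

% Best root location, then walk back down to read off the children.
loc = zeros(1, p_no);
[best, loc(1)] = max(score(:, 1));
for p = 2:p_no
    loc(p) = Ix(loc(pa(p)), p);
end
fprintf('best score %g at locations %s\n', best, mat2str(loc));
```

Because each message already contains the best score of the whole subtree below the child, a single pass up the tree followed by a single pass down recovers the jointly optimal configuration.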

Test Model: test_model

function boxes = test_model(note,model,test)
Returns candidate bounding boxes after non-maximum suppression
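
For readers unfamiliar with non-maximum suppression, a generic greedy version is sketched below. It operates on single boxes of the form [x1 y1 x2 y2 score] with an assumed overlap threshold, whereas the rows returned by detect hold one box per part, so this shows the technique rather than the exact routine used by test_model.

```matlab
% Each row: [x1 y1 x2 y2 score]; toy boxes for illustration.
boxes = [ 10  10  50  50 0.9;
          12  12  52  52 0.8;     % overlaps heavily with the first box
         100 100 140 140 0.7];
overlap_thresh = 0.5;             % assumed intersection-over-union threshold

x1 = boxes(:, 1); y1 = boxes(:, 2); x2 = boxes(:, 3); y2 = boxes(:, 4);
area = (x2 - x1 + 1) .* (y2 - y1 + 1);
[~, order] = sort(boxes(:, 5), 'descend');

keep = [];
while ~isempty(order)
    i = order(1);                 % highest-scoring remaining box
    keep(end + 1) = i;            %#ok<AGROW>
    rest = order(2:end);

    % Intersection of box i with every remaining box.
    iw = max(0, min(x2(i), x2(rest)) - max(x1(i), x1(rest)) + 1);
    ih = max(0, min(y2(i), y2(rest)) - max(y1(i), y1(rest)) + 1);
    inter = iw .* ih;
    iou = inter ./ (area(i) + area(rest) - inter);

    order = rest(iou <= overlap_thresh);   % drop boxes that overlap too much
end
disp(boxes(keep, :));
```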

Evaluation

    % estimated joints and scores derived from the detected boxes
    ests = conf.box2det(boxes, p_no);
    % generate part sticks from the joint locations
    for ii = 1:numel(ests)
        ests(ii).sticks = conf.joint2stick(ests(ii).joints);
        pos_test(ii).sticks = conf.joint2stick(pos_test(ii).joints);
    end
    % evaluate and plot the results
    eval_method = {'strict_pcp', 'pdj'};
    show_eval(pos_test, ests, conf, eval_method);
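
As a hedged illustration of the 'strict_pcp' criterion, the sketch below counts a stick as correct only if both of its endpoints lie within half the ground-truth stick length of their true positions. The stick layout [x1 y1 x2 y2], the 0.5 threshold and the toy coordinates are assumptions; show_eval may organize its data differently.

```matlab
% One stick = [x1 y1 x2 y2] (its two endpoints); toy ground truth and estimates.
gt_sticks  = [0 0 0 10;     % stick of length 10
              0 0 6  8];    % stick of length 10
est_sticks = [1 0 0 12;     % both endpoints within 5 px of the truth: correct
              0 0 6 14];    % second endpoint is 6 px off: wrong under strict PCP
thresh = 0.5;               % commonly used PCP threshold

len = sqrt(sum((gt_sticks(:, 1:2) - gt_sticks(:, 3:4)) .^ 2, 2));
d1  = sqrt(sum((gt_sticks(:, 1:2) - est_sticks(:, 1:2)) .^ 2, 2));
d2  = sqrt(sum((gt_sticks(:, 3:4) - est_sticks(:, 3:4)) .^ 2, 2));

% Strict PCP: BOTH endpoints must lie within thresh * stick length.
correct = (d1 <= thresh * len) & (d2 <= thresh * len);
pcp = mean(correct);
fprintf('strict PCP = %.2f\n', pcp);
```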