HOG detector for hand recognition
Hello,
I am trying to detect hands in images using HOG features and an SVM classifier.
Is this a good idea? I tried it with a dataset and it doesn't work properly ...
Edit:
So basically I use this code: https://github.com/lcit/people_detect...
I only changed the dataset; I used this one: https://expirebox.com/download/f61af2...
I modified the training cpp file like this:
/* =========================================================================
Author: Leonardo Citraro
Company:
Filename: training.cpp
Last modifed: 22.12.2016 by Leonardo Citraro
Description: Training of the classifier using the HOG feature
=========================================================================
=========================================================================
*/
#include "HOG.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/ml/ml.hpp"
#include "opencv2/core/utility.hpp"
#include "opencv2/imgcodecs.hpp"
#include <iostream>
#include <fstream>
#include <vector>
#include <algorithm>
#include <memory>
#include <random>
#include <functional>
#include <ctime>
#include <iomanip>
#include <numeric> // std::accumulate, std::inner_product
#include <math.h>
static int MatTYPE = CV_32FC1;
using TYPE = float;
TYPE compute_mean(std::vector<TYPE> v) {
return std::accumulate(std::begin(v), std::end(v), 0.0f)/v.size();
}
void feature_mean_variance(const cv::Mat& data, std::vector<TYPE>& mean, std::vector<TYPE>& var) {
mean.resize(data.cols);
var.resize(data.cols);
for(size_t col=0; col<data.cols; ++col) {
std::vector<TYPE> feature(data.rows);
for(size_t i = 0; i < data.rows; ++i) {
const TYPE* ptr_row = data.ptr<TYPE>(i);
feature[i] = ptr_row[col];
}
TYPE m = std::accumulate(std::begin(feature), std::end(feature), 0.0)/feature.size();
mean[col] = m;
std::vector<TYPE> diff(data.rows);
// subtract the mean from every sample (std::bind2nd was removed in C++17, so use a lambda)
std::transform(std::begin(feature), std::end(feature), std::begin(diff), [m](TYPE x) { return x - m; });
TYPE v = std::inner_product(std::begin(diff), std::end(diff), std::begin(diff), 0.0)/feature.size();
var[col] = v;
}
}
template<class T>
void save_vector( const std::string& filename, const std::vector<T>& v ) {
try {
std::ofstream f(filename, std::ios::binary);
unsigned int len = v.size();
f.write( (char*)&len, sizeof(len) );
f.write( (const char*)&v[0], len * sizeof(T) );
f.close();
} catch(...) {
throw;
}
}
int main(int argc, char* argv[]) {
// size of the box that should contain a person
cv::Size person_size(50,75);
// setting up the HOG
size_t cellsize = 5;
size_t blocksize = cellsize*2;
size_t stride = cellsize;
size_t binning = 9;
HOG hog(blocksize, cellsize, stride, binning, HOG::GRADIENT_SIGNED, HOG::BLOCK_NORM::L2norm);
hog.save("hog.ext");
// matrix of data and labels
std::vector<std::vector<TYPE>> data;
std::vector<int> labels;
// open the subimages of the persons one by one
for(int i=1; i<400; ++i){
std::string filename;
if(i<10){
filename = "/home/xavier/Bureau/developpement/NeuralNetwork/dataset/hand/Marcel-Train/Five/Five-train00" + std::to_string(i) + ".ppm";
}else if(i<100){
filename = "/home/xavier/Bureau/developpement/NeuralNetwork/dataset/hand/Marcel-Train/Five/Five-train0" + std::to_string(i) + ".ppm";
}else if(i<1000){
filename = "/home/xavier/Bureau/developpement/NeuralNetwork/dataset/hand/Marcel-Train/Five/Five-train" + std::to_string(i) + ".ppm";
}
try {
//std::cout << "debut filename" << filename << std::endl;
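// open the image as grayscale
cv::Mat image = cv::imread(filename, cv::IMREAD_GRAYSCALE);
/*cv::imshow("image",image);
cv::waitKey(-1);*/
if(image.data) {
// resize only after confirming the image loaded; cv::resize fails on an empty Mat
cv::resize(image,image,person_size);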
//std::cout << "la0" << std::endl;
// Retrieve the ...
How about this one? hog detector
Consider showing exactly what you have tried and your results thus far. It will be hard for others to assist without you providing more information. Additionally, if you show effort in describing what you are trying to solve, others will be far more inclined to assist you.
you should probably use opencv's training tool for this.
above code looks quite shoddy. (code repetition, it's also doing a classification, while you need a regression for the detection case)
and, my 2ct.: unless you restrict it to a single pose, chances of success are low.
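On the code repetition point: the three filename branches in the training loop differ only in how many zeros pad the index, so a small helper could replace them. A minimal sketch, assuming the 3-digit padding implied by the Five-train00N.ppm pattern in the question; `make_numbered_filename` is a made-up name, not something from the original repo.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: builds "<prefix><index zero-padded to `width` digits><suffix>".
// An index with more digits than `width` is written in full, not truncated.
std::string make_numbered_filename(const std::string& prefix, int index,
                                   int width, const std::string& suffix) {
    std::ostringstream name;
    name << prefix << std::setw(width) << std::setfill('0') << index << suffix;
    return name.str();
}
```

In the loop this would become a single call such as `make_numbered_filename(dir + "Five-train", i, 3, ".ppm")`, where `dir` stands for the dataset directory.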
Should the image size be a multiple of 8?
not sure, but probably yes.
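To make the size question concrete: as far as I know, OpenCV's HOGDescriptor requires the block size to be a multiple of the cell size, and the window size minus the block size to be a multiple of the block stride. With the default 8-pixel cells and stride that works out to sizes that are multiples of 8. The sketch below checks those constraints and computes the descriptor length; `HogGeometry`, `is_valid` and `descriptor_length` are hypothetical names, and the custom HOG class used in the question may have different rules.

```cpp
#include <cstddef>

// Hypothetical plain struct holding HOG layout parameters (all in pixels, except bins).
struct HogGeometry {
    int win_w, win_h;  // detection window size
    int block;         // side of a square block
    int stride;        // block stride
    int cell;          // side of a square cell
    int bins;          // orientation bins per cell
};

// True when blocks tile into whole cells and slide across the window without a remainder.
bool is_valid(const HogGeometry& g) {
    return g.block % g.cell == 0
        && g.win_w >= g.block && g.win_h >= g.block
        && (g.win_w - g.block) % g.stride == 0
        && (g.win_h - g.block) % g.stride == 0;
}

// Number of floats in one window descriptor; only meaningful when is_valid(g) holds.
std::size_t descriptor_length(const HogGeometry& g) {
    const std::size_t cells_per_block =
        static_cast<std::size_t>(g.block / g.cell) * static_cast<std::size_t>(g.block / g.cell);
    const std::size_t blocks_x = static_cast<std::size_t>((g.win_w - g.block) / g.stride + 1);
    const std::size_t blocks_y = static_cast<std::size_t>((g.win_h - g.block) / g.stride + 1);
    return static_cast<std::size_t>(g.bins) * cells_per_block * blocks_x * blocks_y;
}
```

By this check the 50x75 window from the question with 5-pixel cells, 10-pixel blocks and a 5-pixel stride is consistent, while the same window with 16-pixel blocks and an 8-pixel stride is not.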
I also get wrong behavior. What should be the size of the dataset?
you mean the size of the images? the positives should be cropped / resized to the HOGDescriptor's winSize.
(that's also the minimum size that can be detected later)
the negatives can be larger; the training tool will automatically sample a region from them
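The negative sampling idea can be sketched without OpenCV: pick a random window of the detector's size that lies fully inside a larger negative image. This is only an illustration of the idea, not the training sample's exact code; `Window` is a hypothetical stand-in for cv::Rect and `sample_negative_window` is a made-up name.

```cpp
#include <random>

// Hypothetical stand-in for cv::Rect.
struct Window { int x, y, width, height; };

// Pick a random win_w x win_h window that lies fully inside an img_w x img_h image.
// Assumes img_w >= win_w and img_h >= win_h.
Window sample_negative_window(int img_w, int img_h, int win_w, int win_h,
                              std::mt19937& rng) {
    std::uniform_int_distribution<int> x_dist(0, img_w - win_w);
    std::uniform_int_distribution<int> y_dist(0, img_h - win_h);
    const int x = x_dist(rng);
    const int y = y_dist(rng);
    return Window{x, y, win_w, win_h};
}
```

The returned window would then be cropped from the negative image and passed through the same HOG computation as the positives, so both classes yield descriptors of the same length.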
Where could I find an example of how to use the train_HOG sample?
what do you mean? a pretrained svm detector?
maybe you can find this post and this repo useful