A naive approach to evaluating exercise performance: how can I make it better?

Hi, I made a small extension of OpenPose with TensorFlow.

It detects and evaluates a user's pose while the user performs an exercise. My ultimate goal is a system that can run on mobile devices.

Here are the demos on YouTube: https://www.youtube.com/watch?v=TRLYHUn8yJ4

https://www.youtube.com/watch?v=gpA6o7hF57s

Description: the text terminal in the middle sends a countdown message to the user. When the exercise session starts, the window on the right shows a real-time image of the user.

After the capture session is over, the window on the left replays the captured images frame by frame and labels each one as "correct" or "wrong".

Finally, the text terminal shows the user's overall performance.


The movements during a specific exercise should be repeatable and the rhythm should be consistent, so I set the session duration to 4.0 seconds and split it into 20 slices of 0.2 seconds each. I created one softmax model per slice: the input is a pose (18 keypoints, each an (x, y) coordinate pair) and the output is either 0 (correct) or 1 (wrong). I trained these models on my own data (2000 samples per slice), and the final result is what you see in the demo videos.
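For illustration, here is a minimal sketch of that setup in TensorFlow/Keras (the names and the dummy data are placeholders, not my exact code):

```python
import numpy as np
import tensorflow as tf

NUM_SLICES = 20                # 4.0 s session split into 0.2 s slices
NUM_KEYPOINTS = 18             # OpenPose COCO-style keypoints
FEATURES = NUM_KEYPOINTS * 2   # (x, y) per keypoint

def build_slice_model():
    # One tiny softmax classifier per time slice:
    # 36 pose features in, 2 class probabilities out (0 = correct, 1 = wrong).
    return tf.keras.Sequential([
        tf.keras.Input(shape=(FEATURES,)),
        tf.keras.layers.Dense(2, activation="softmax"),
    ])

models = [build_slice_model() for _ in range(NUM_SLICES)]
for m in models:
    m.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Dummy stand-in for the real training data: 2000 poses per slice,
# each labeled 0 (correct) or 1 (wrong).
poses = np.random.rand(2000, FEATURES).astype(np.float32)
labels = np.random.randint(0, 2, size=2000)
models[0].fit(poses, labels, epochs=1, batch_size=32, verbose=0)
```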

It certainly works and has the potential to be ported to mobile platforms, but it is not good enough yet: the false positives and false negatives are quite noticeable.

So here are my questions: Has anyone created something similar to this?

How can I improve it with OpenCV techniques such as affine transforms?
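For example, one direction I am considering (just a sketch, using OpenCV's cv2.estimateAffinePartial2D; normalize_pose is a hypothetical helper, not code I have in the project): align every detected pose to a canonical reference pose before feeding it to the classifiers, so differences in the user's position and distance from the camera are factored out.

```python
import cv2
import numpy as np

def normalize_pose(pose, reference_pose):
    # Fit a similarity transform (rotation + uniform scale + translation)
    # mapping the detected pose onto a canonical reference pose, so the
    # per-slice classifiers see position/scale-invariant coordinates.
    src = np.asarray(pose, np.float32).reshape(-1, 1, 2)
    dst = np.asarray(reference_pose, np.float32).reshape(-1, 1, 2)
    M, _inliers = cv2.estimateAffinePartial2D(src, dst)
    if M is None:  # estimation can fail on degenerate keypoint sets
        return np.asarray(pose, np.float32)
    return cv2.transform(src, M).reshape(-1, 2)

# Toy check: a pose that is a shifted, scaled copy of the reference
reference = np.random.rand(18, 2).astype(np.float32)
detected = reference * 1.5 + np.array([40.0, 20.0], np.float32)
aligned = normalize_pose(detected, reference)
print(np.allclose(aligned, reference, atol=1e-3))  # expect True
```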

PS: I did study "action recognition" systems, but they are too computationally expensive to run on a mobile device.

Edited 2018.08.18: There are two noteworthy projects on GitHub:

https://github.com/ildoonet/tf-pose-estimation <== in this project, the author used a MobileNet backbone to speed up inference.

https://github.com/tensorflow/tfjs-models/tree/master/posenet <== this is not OpenPose, but it produces similar output and runs very fast.
