May 9, 2021
The following text is a bit longer and technical. Don't worry if you're not familiar with the concepts.
This work aimed to create a system for gesture recognition using skeletal information. The first part of the thesis presented state-of-the-art object detection and hand pose estimation methods along with a short review of gesture recognition. Available datasets for these tasks were listed as well. This work then proposed a system for gesture recognition from skeletal features represented by 21 key points. The developed system consists of three fundamental stages: detection, pose estimation, and gesture recognition. Each of these stages was evaluated separately on public datasets. A dataset consisting of four thousand images was captured to evaluate the proposed system, which was also evaluated in real time.
Tiny YOLOv3 was selected for object detection for its speed and robustness. It was trained on the HandSeg dataset with great success, achieving an IoU score of 86.35~\%. A series of preprocessing steps was developed for background removal and proper normalization. The preprocessed image is the input to the hand pose estimator, JGR-P2O, which infers the precise position of the hand's skeleton. The evaluation demonstrated that the estimator's performance is highly dependent on the dataset. The low variety of poses in the dataset causes the model to generalize poorly to unseen poses in a natural environment. Several experiments were conducted, some of which improved the estimator's performance. On the MSRA dataset, the model reaches a mean joint error of 14.7 mm, while on Bighand, a more complex dataset, it achieves roughly 24.9 mm. Testing revealed that the estimator works better for the right hand than the left, which is due to the composition of the training dataset.
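For readers unfamiliar with the two metrics mentioned above, here is a minimal sketch of how they are commonly defined. It assumes axis-aligned bounding boxes given as (x_min, y_min, x_max, y_max) and joints given as 3D points in millimetres; the function names are hypothetical, and the thesis's exact evaluation protocol may differ.

```python
import math


def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max); this layout is an assumption.
    """
    # Width and height of the overlap, clamped at zero when boxes are disjoint.
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = overlap_w * overlap_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def mean_joint_error(predicted, ground_truth):
    """Mean Euclidean distance between corresponding joints.

    Both arguments are equal-length sequences of (x, y, z) points,
    e.g. 21 key points per hand, in the same unit (assumed mm).
    """
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("joint lists must be non-empty and of equal length")
    total = sum(math.dist(p, t) for p, t in zip(predicted, ground_truth))
    return total / len(predicted)
```

A score such as 86.35 % would then typically be the average of `box_iou` over all test images, and a value such as 14.7 mm the average of `mean_joint_error` over all test frames.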
The proposed skeleton-based gesture recognition system allows the user to specify the target gesture and determine the hand's maximum rotation relative to that target gesture. On the other hand, the disadvantage of this solution is the need for accurate determination of the skeleton, which proves to be a challenging task.
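To illustrate the idea of matching a skeleton against a target gesture with a rotation limit, here is a rough sketch. It is not the method from the thesis. It assumes 21 key points as (x, y, z) tuples, with joint 0 as the wrist, joint 9 as the base of the middle finger, and joints 4, 8, 12, 16, 20 as fingertips; these indices, the function names, and the `shape_tol` threshold are all hypothetical.

```python
import math

# Assumed joint layout; the thesis does not specify these indices.
WRIST, MIDDLE_BASE = 0, 9
FINGERTIPS = (4, 8, 12, 16, 20)


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _norm(v):
    return math.sqrt(sum(x * x for x in v))


def angle_deg(u, v):
    """Angle between two non-zero vectors, in degrees."""
    cos = sum(x * y for x, y in zip(u, v)) / (_norm(u) * _norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def shape_descriptor(skeleton):
    """Wrist-to-fingertip distances divided by palm length.

    The result does not change when the whole hand is rotated or scaled.
    """
    palm = _norm(_sub(skeleton[MIDDLE_BASE], skeleton[WRIST]))
    if palm == 0:
        raise ValueError("degenerate skeleton: palm length is zero")
    return [_norm(_sub(skeleton[i], skeleton[WRIST])) / palm for i in FINGERTIPS]


def matches(skeleton, target, max_rotation_deg, shape_tol=0.2):
    """True if the hand shape is close to the target and the palm axis
    is rotated by at most max_rotation_deg relative to the target."""
    observed, wanted = shape_descriptor(skeleton), shape_descriptor(target)
    shape_diff = max(abs(a - b) for a, b in zip(observed, wanted))
    rotation = angle_deg(
        _sub(skeleton[MIDDLE_BASE], skeleton[WRIST]),
        _sub(target[MIDDLE_BASE], target[WRIST]),
    )
    return shape_diff <= shape_tol and rotation <= max_rotation_deg
```

Note that this sketch only measures how far the wrist-to-middle-finger axis has turned, so it ignores roll around that axis. It also shows why an inaccurate skeleton is a problem: both checks depend directly on the estimated joint positions.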
Each of these stages [was -> were] evaluated separately on public datasets.
Not a big deal, a lot of native speakers would use "was" here too, but for the sake of correctness
It was trained on the HandSeg dataset with great success, achieving an IoU score of 86.35~\%.
Just a note on the score: ~\% looks weird (unless that is syntax specifically used for IoU scores). Is the \ just an escape character? The tilde (~) usually means "around" and would go before the number, but that doesn't really seem right when you're talking about a precise number.
On the MSRA dataset, the model reaches a mean joint error of 14.7 mm, while on the Bighand, a more complex dataset, achieves roughly 24.9 mm.
Just removed the space between the number and mm; that's how I see it in most writing.
Bachelor's Thesis - Conclusion