Dynamic Hand Gesture-Based Object Removal and Replacement in Video Frames Using Finite State Machine and PIX-MIX Algorithm

Zulaikha Syah

Authors

Zulaikha Syah, Malaysia

Keywords:

Hand gesture recognition, object replacement, finite state machine (FSM), PIX-MIX algorithm, video frames, human-computer interaction, augmented reality, inpainting, real-time processing

Abstract

Removing and replacing objects in video frames through hand gestures extends human-computer interaction and can improve the user experience in applications such as augmented reality, gaming, and video editing. This paper presents a method that combines a finite state machine (FSM) with the PIX-MIX algorithm to manipulate objects seamlessly and in real time. The FSM interprets a set of predefined hand gestures, allowing users to select, remove, and replace objects dynamically within video frames. The PIX-MIX algorithm then inpaints the background left by a removed object and integrates the new object with high fidelity. The approach is robust to diverse lighting conditions and complex backgrounds while maintaining high accuracy and low latency. Experimental results show that the proposed method is efficient and practical, indicating its potential for adoption in interactive media applications.
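The gesture-driven control flow described in the abstract can be illustrated with a minimal sketch. The abstract does not list the actual states or gestures, so the state names (IDLE, SELECTED, REMOVED, REPLACED) and gesture labels ("point", "swipe", "pinch", "open_palm") below are assumptions chosen only for illustration. Gesture recognition and the PIX-MIX inpainting step are outside the scope of the sketch.

```python
from enum import Enum, auto


class State(Enum):
    """Hypothetical interaction states; the paper's actual states may differ."""
    IDLE = auto()       # no object chosen
    SELECTED = auto()   # an object has been selected
    REMOVED = auto()    # the object was removed and the background inpainted
    REPLACED = auto()   # a new object was placed in the frame


# Transition table keyed by (current state, gesture label).
# The gesture labels are illustrative assumptions, not taken from the paper.
TRANSITIONS = {
    (State.IDLE, "point"): State.SELECTED,
    (State.SELECTED, "swipe"): State.REMOVED,
    (State.SELECTED, "open_palm"): State.IDLE,   # cancel the selection
    (State.REMOVED, "pinch"): State.REPLACED,
    (State.REPLACED, "open_palm"): State.IDLE,   # finish and reset
}


def step(state, gesture):
    """Return the next state; gestures with no defined transition are ignored."""
    return TRANSITIONS.get((state, gesture), state)


def run(gestures, start=State.IDLE):
    """Feed a sequence of recognized gestures and return the visited states."""
    state = start
    trace = [state]
    for gesture in gestures:
        state = step(state, gesture)
        trace.append(state)
    return trace


if __name__ == "__main__":
    for visited in run(["point", "swipe", "pinch", "open_palm"]):
        print(visited.name)
```

Keeping the transitions in a lookup table means a gesture that is not valid in the current state leaves the state unchanged, which is one simple way to keep spurious detections from triggering a removal or replacement.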
