Unveiling the Hidden Patterns: AI-Driven Innovations in Image Processing and Acoustic Signal Detection

Hemanth Kumar Gollangi; Sanjay Ramdas Bauskar; Chandrakanth Rao Madhavaram; Eswar Prasad Galla; Janardhana Rao Sunkara; Mohit Surender Reddy

doi:10.70589/JRTCSE.2020.1.3

Authors

Hemanth Kumar Gollangi, ServiceNow Admin, TTech Digital India Limited
Sanjay Ramdas Bauskar, Sr. Database Administrator, Pharmavite LLC.
Chandrakanth Rao Madhavaram, Technology Lead, Infosys
Eswar Prasad Galla, Senior Support Engineer, Infosys
Janardhana Rao Sunkara, Sr. Oracle Database Administrator, Siri Info Solutions Inc.
Mohit Surender Reddy, Sr. Network Engineer, Motorola Solutions

Keywords:

AI, Image Processing, Acoustic Signal Detection, CNN, RNN, Deep Learning, Pattern Recognition, Feature Extraction

Abstract

Image processing and acoustic signal detection have advanced substantially in recent years, largely because of artificial intelligence (AI). Earlier algorithms relied on basic signal processing: features had to be extracted manually and hand-written rules applied to them, which became harder to sustain as data volumes grew. Deep learning models, for example, offer a durable solution to this problem by eliminating the need for manual feature engineering and improving detection rates in healthcare, surveillance, and industrial applications. This paper offers a comprehensive analysis of emerging AI-driven innovations in image processing and acoustic signal detection, focusing on the hidden patterns identified by techniques such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and other sophisticated algorithms. It also describes how combining AI with image processing and acoustic detection can add value to the results produced. Given a large number of cases and sufficient training data, these models can learn patterns for tasks such as image classification, object detection, process anomaly detection in industrial systems, and acoustic event recognition in noisy environments. The paper aims to provide an understanding of the AI methodologies adopted in both domains and, to this end, offers examples of specific industries and their rationales for implementing these technologies. It discusses the fundamentals of neural networks and their variants at length, with emphasis on applying these architectures to automated image feature extraction and acoustic pattern recognition. We also examine comparative performance, accuracy, computational complexity, and the ability of AI models to function in similar conditions.
This article also shows how AI models can be enhanced by integrating image processing with acoustic signal detection methods, and it proposes possible research directions for improving AI performance. Finally, the authors summarize the main findings, survey advanced methods in the field, and outline possible future uses in self-driving cars, robots and drones, and meteorological monitoring.
