Human Action Recognition (HAR) is a challenging domain in computer vision, involving recognizing complex patterns by analyzing the spatiotemporal dynamics of individuals’ movements in videos. These patterns arise in sequential data, such as video frames, which are often essential to accurately distinguish actions that would be ambiguous in a single image. HAR has garnered considerable interest due to its broad applicability, ranging from robotics and surveillance systems to sports motion analysis, healthcare, and the burgeoning field of autonomous vehicles. While several taxonomies have been proposed to categorize HAR approaches in surveys, they often overlook hybrid methodologies and fail to demonstrate how different models incorporate various architectures and modalities. In this comprehensive survey, we present the novel SMART-Vision taxonomy, which illustrates how innovations in deep learning for HAR complement one another, leading to hybrid approaches beyond traditional categories. Our survey provides a clear roadmap from foundational HAR works to current state-of-the-art systems, highlighting emerging research directions and addressing unresolved challenges in discussion sections for architectures within the HAR domain. We provide details of the research datasets that various approaches used to measure and compare HAR approaches. We also explore the rapidly emerging field of Open-HAR systems, which challenges HAR systems by presenting samples from unknown, novel classes during test-time.
Tạp chí khoa học Trường Đại học Cần Thơ
Lầu 4, Nhà Điều Hành, Khu II, đường 3/2, P. Xuân Khánh, Q. Ninh Kiều, TP. Cần Thơ
Điện thoại: (0292) 3 872 157; Email: tapchidhct@ctu.edu.vn
Chương trình chạy tốt nhất trên trình duyệt IE 9+ & FF 16+, độ phân giải màn hình 1024x768 trở lên