Action Tube Generation by Person Query Matching for Spatio-Temporal Action Detection
Topics: Deep Learning for Visual Understanding; Event and Human Activity Recognition
In Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP, 261-268, 2025, Porto, Portugal