
Transactions on Intelligent Transportation Systems, 
12(4), 1292–1304. doi:10.1109/TITS.2011.2158424. 
Kitani, K. M., Ziebart, B. D., Bagnell, J. A., and Hebert, 
M. (n.d.). Activity Forecasting, 1–14. 
Koppula, H. S. (2013). Learning Spatio-Temporal 
Structure from RGB-D Videos for Human Activity 
Detection and Anticipation, 28. 
Koppula, H., and Saxena, A. (2013). Anticipating Human 
Activities using Object Affordances for Reactive 
Robotic Response. Robotics: Science and Systems. 
Li, K., Hu, J., and Fu, Y. (2012). Modeling complex 
temporal composition of actionlets for activity 
prediction. Computer Vision–ECCV 2012, 286–299. 
Liu, N., Lovell, B. C., Kootsookos, P. J., and Davis, 
R. I. A. (n.d.). Understanding HMM Training for 
Video Gesture Recognition, 2–5. 
Lopes, P.F., Jardim, D., Alexandre, I.M., Math4Kids, 
Information Systems and Technologies (CISTI), 2011 
6th Iberian Conference on, pp.1-6, 15-18 June 2011. 
Moore, D. (n.d.). Recognizing Multitasked Activities from 
Video using Stochastic Context-Free Grammar. 
AAAI-02, 770–776. 
Nevatia, R., Zhao, T., Hongeng, S., Hierarchical 
Language-based Representation of Events in Video 
Streams, Computer Vision and Pattern Recognition 
Workshop, 2003. CVPRW '03. Conference on, vol.4, 
pp.39, 16-22 June 2003. doi: 
10.1109/CVPRW.2003.10038. 
Nguyen, N.T., Phung, D.Q., Venkatesh, S., Bui, H., 
Learning and detecting activities from movement 
trajectories using the hierarchical hidden Markov 
model, Computer Vision and Pattern Recognition, 
2005. CVPR 2005. IEEE Computer Society 
Conference on, vol.2, pp.955-960, 20-25 June 2005. 
doi: 10.1109/CVPR.2005.203. 
Niu, W., Long, J., Han, D., and Wang, Y. (n.d.). Human 
Activity Detection and Recognition for Video 
Surveillance, 1–4. 
Oliver, N., Horvitz, E., Garg, A., Layered representations 
for human activity recognition, Multimodal Interfaces, 
2002. Proceedings. Fourth IEEE International 
Conference on, pp.3-8, 2002. doi: 
10.1109/ICMI.2002.1166960. 
O'Rourke, J. and N. I. Badler. 1980. Model-based image 
analysis of human motion using constraint 
propagation. IEEE PAMI, 2(4). 
Pentland, A. and Liu, A. (1999). Modeling and prediction 
of human behavior. Neural Computation, 11(1), 
229–42. Retrieved from 
http://www.ncbi.nlm.nih.gov/pubmed/9950731. 
Pinhanez, C.S., Bobick, A.F., Human action detection 
using PNF propagation of temporal constraints, 
Computer Vision and Pattern Recognition, 1998. 
Proceedings. 1998 IEEE Computer Society 
Conference on, pp.898-904, 23-25 Jun 1998. 
doi: 10.1109/CVPR.1998.698711. 
Popa, M., Koc, A. K., Rothkrantz, L. J. M., Shan, C., and 
Wiggers, P. (n.d.). Kinect Sensing of Shopping related 
Actions. 
Rashid, Rick. 1980. LIGHTS: a system for interpretation 
of moving light displays. Ph.D. thesis, University of 
Rochester Computer Science Department. 
Ryoo, M. (2011). Human activity prediction: Early 
recognition of ongoing activities from streaming 
videos. Computer Vision (ICCV), 2011 IEEE. 
Retrieved from 
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6126349. 
Ryoo, M. S., and Aggarwal, J. K. (2008). Semantic 
Representation and Recognition of Continued and 
Recursive Human Activities. International Journal of 
Computer Vision, 82(1), 1–24. 
doi:10.1007/s11263-008-0181-1. 
Ryoo, M.S., Aggarwal, J.K., Semantic Understanding of 
Continued and Recursive Human Activities, Pattern 
Recognition, 2006. ICPR 2006. 18th International 
Conference on, vol.1, pp.379-378. doi: 
10.1109/ICPR.2006.1043. 
Ryoo, M.S., Aggarwal, J.K., Recognition of Composite 
Human Activities through Context-Free Grammar 
Based Representation, Computer Vision and Pattern 
Recognition, 2006 IEEE Computer Society 
Conference on, vol.2, pp.1709-1718, 2006. doi: 
10.1109/CVPR.2006.242. 
Sinha, S. N., Frahm, J., Pollefeys, M., and Genc, Y. 
(2006). GPU-based Video Feature Tracking And 
Matching, 012(May), 1–15. 
Starner, T., Pentland, A., Real-time American Sign 
Language recognition from video using hidden 
Markov models, Computer Vision, 1995. 
Proceedings., International Symposium on, 
pp.265-270, 21-23 Nov 1995. doi: 
10.1109/ISCV.1995.477012. 
Uddin, M. Z., Byun, K., Cho, M., Lee, S., Khang, G., and 
Kim, T.-S. (2011). A Spanning Tree-Based Human 
Activity Prediction System Using Life Logs from 
Depth Silhouette-Based Human Activity Recognition. 
In P. Real, D. Diaz-Pernil, H. Molina-Abril, A. 
Berciano, and W. Kropatsch (Eds.), Computer 
Analysis of Images and Patterns (Vol. 6854, 
pp. 302–309). Springer Berlin Heidelberg. 
doi:10.1007/978-3-642-23672-3_37. 
Vu, V., Bremond, F., and Thonnat, M. (2004). 
Automatic Video Interpretation: A Novel Algorithm 
for Temporal Scenario Recognition, 1–6. 
Wang, J. (n.d.). Mining Actionlet Ensemble for Action 
Recognition with Depth Cameras. 
Yamato, J., Ohya, J., Ishii, K., Recognizing human action 
in time-sequential images using hidden Markov 
model, Computer Vision and Pattern Recognition, 
1992. Proceedings CVPR '92., 1992 IEEE Computer 
Society Conference on, pp.379-385, 15-18 Jun 1992. 
doi: 10.1109/CVPR.1992.223161. 
Yu, E., Aggarwal, J.K., Detection of Fence Climbing from 
Monocular Video, Pattern Recognition, 2006. ICPR 