Satinder P Singh, Michael J Kearns, Diane J Litman, Marilyn
A Walker (2000). Reinforcement learning for spoken
dialogue systems. Neural Information Processing
Systems, pages 956-962.
Sarwar, G. Karypis, J. Konstan, J. Riedl (2001), Item-based
collaborative filtering recommendation algorithms, 10th
International Conference on World Wide Web, ACM,
pp. 285-295.
Sebastian Thrun and Anton Schwartz (1993). Issues in using
function approximation for reinforcement learning.
Connectionist Models Summer School Hillsdale, NJ.
Lawrence Erlbaum.
Shambour, J. Lu (2011), A hybrid trust-enhanced
collaborative filtering recommendation approach for
personalized government-to-business eservices,
International Journal of Intelligent Systems, 26, 814-843.
Shamim Nemati, Mohammad M Ghassemi, and Gari D
Clifford (2016). Optimal medication dosing from suboptimal
clinical examples: A deep reinforcement learning
approach. Engineering in Medicine and Biology Society,
pages 2978-2981. IEEE.
Shi-Yong Chen, Yang Yu, Qing Da, Jun Tan, Hai-Kuan
Huang, and Hai-Hong Tang (2018). Stabilizing
reinforcement learning in dynamic environment with
application to online recommendation. In Proceedings of
the 24th ACM SIGKDD International Conference on
Knowledge Discovery & Data Mining.
Shuai Zhang, Lina Yao, Aixin Sun, Yi Tay (2019). Deep
learning based recommender system: A survey and new
perspectives. Computing Surveys (CSUR), 52(1):1-38.
Smyth, P. Cotter (2000), A personalised TV listings service
for the digital TV age, Knowledge-Based Systems.
Su Liu, Ye Chen, Hui Huang, Liang Xiao, and Xiaojun Hei
(2018). Towards smart educational recommendations
with reinforcement learning in classroom. International
Conference on Teaching, Assessment, and Learning for
Engineering pages 1079-1084. IEEE.
Tariq Mahmood and Francesco Ricci (2007). Learning and
adaptivity in interactive recommender systems.
Conference on Electronic commerce, pages 75-84.
Thomas Degris (2015). Deep reinforcement learning in large
discrete action spaces. arXiv:1512.07679.
Thorsten Bohnenberger and Anthony Jameson (2001). When
policies are better than plans: Decision theoretic planning
of recommendation sequences. International Conference
on intelligent user interfaces, pages 21-24.
Thorsten Joachims, Dayne Freitag, Tom Mitchell (1997).
Webwatcher: A tour guide for the world wide web. In
IJCAI (1), pages 770-777. Citeseer.
Timothy P Lillicrap, Alexander Pritzel, Jonathan J Hunt,
Nicolas Heess, Yuval Tassa, Tom Erez, David Silver,
Daan Wierstra (2015). Continuous control with deep
reinforcement learning. arXiv.
Tong Yu, Yilin Shen, Ruiyi Zhang, Xiangyu Zeng, and
Hongxia Jin (2019). Vision-language recommendation
via attribute augmented multimodal reinforcement
learning. ACM International Conference on Multimedia,
pages 39-47.
Vladimir Vapnik (2013). The nature of statistical learning
theory. Springer Science & Business Media.
Wacharawan Intayoad, Chayapol Kamyod, and Punnarumol
Temdee (2018). Reinforcement learning for online
learning recommendation system. In 2018 Global
Wireless Summit (GWS), pages 167-170. IEEE.
Yufan Zhao, Donglin Zeng, Mark A Socinski, and Michael
R Kosorok (2011). Reinforcement learning strategies
for clinical trials in nonsmall cell lung cancer.
Nima Taghipour, Ahmad Kardan, Saeed Shiry Ghidary
(2007). Usage based web recommendations: a
reinforcement learning approach. In Proceedings of the
2007 ACM conference on Recommender systems.
Tejas D Kulkarni, Karthik Narasimhan, Ardavan Saeedi, and
Josh Tenenbaum (2016). Hierarchical deep
reinforcement learning: Integrating temporal abstraction
and intrinsic motivation. Neural information processing
systems, pages 3675-3683.
Long-Ji Lin (1992). Self-improving reactive agents based on
reinforcement learning, planning and teaching. Machine
learning, 8(3-4):293-321.
Yikun Xian, Zuohui Fu, S Muthukrishnan, Gerard De Melo,
Yongfeng Zhang (2019). Reinforcement knowledge
graph reasoning for explainable recommendation. ACM
SIGIR Conference on Research and Development in
Information Retrieval, pages 285-294.
Wilson, B. Smyth, D. O’Sullivan (2003), Sparsity reduction
in collaborative recommendation: A case-based
approach, Journal of Pattern Recognition and Artificial
Intelligence, 17, 863-884.
Xiangyu Zhao, Long Xia, Dawei Yin, and Jiliang Tang
(2019). Model-based reinforcement learning for whole-
chain recommendations. arXiv preprint
arXiv:1902.03987.
Xinshi Chen, Shuang Li, Hui Li, Shaohua Jiang, Yuan Qi,
and Le Song (2019). Generative adversarial user model
for reinforcement learning based recommendation
system. In International Conference on Machine
Learning, pages 1052-1061.
Xiting Wang, Yiru Chen, Jie Yang, Le Wu, Zhengtao Wu,
Xing Xie (2018). A reinforcement learning framework
for explainable recommendation. Conference on Data
Mining, pages 587-596. IEEE.
Yongfeng Zhang, Xu Chen (2018). Explainable
recommendation: A survey and new perspectives.
arXiv:1804.11192.
Yu Wang (2020). A hybrid recommendation for music based
on reinforcement learning. In Pacific-Asia Conference on
Knowledge Discovery and Data Mining, pages 91-103.
Springer.
Zachary C Lipton (2018). The mythos of model
interpretability. Queue, 16(3):31-57.
Zhengyao Jiang, Dixing Xu, Jinjun Liang (2017). A deep
reinforcement learning framework for the financial
portfolio management problem. arXiv.
Zimdars, D. M. Chickering, C. Meek (2001). Using temporal
data for making recommendations. In 17th Conference in
Uncertainty in Artificial Intelligence.
Ziyu Wang, Tom Schaul, Matteo Hessel, Hado Hasselt, Marc
Lanctot, Nando Freitas (2016). Dueling network
architectures for deep reinforcement learning. In
International conference on machine learning.