We present a technical architecture for user preference modeling, formulating the problem as a Markov decision process solved with an adaptive reinforcement learning algorithm. The approach models a user dynamically so as to satisfy the user's expected preferences from minimal or missing information, and it also explores how to evaluate the user experience when selecting service providers. Simulations on representative user models show that the adaptive reinforcement learning solutions are effective.
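The combination described above can be illustrated with a minimal sketch: tabular Q-learning over a toy MDP in which states stand for user-preference contexts, actions for candidate service providers, and the reward for user satisfaction. Everything here (the state/action sizes, the `TRUE_REWARD` table, the noise model) is an illustrative assumption, not the paper's actual model.

```python
import random

N_STATES, N_ACTIONS = 4, 3          # preference contexts x service providers
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

# Hypothetical "true" satisfaction per (context, provider); hidden from the agent.
TRUE_REWARD = [[0.2, 0.9, 0.1],
               [0.8, 0.2, 0.3],
               [0.1, 0.2, 0.9],
               [0.3, 0.8, 0.2]]

def step(state, action, rng):
    """Return (reward, next_state); Gaussian noise mimics uncertain user feedback."""
    reward = TRUE_REWARD[state][action] + rng.gauss(0, 0.05)
    return reward, rng.randrange(N_STATES)   # next context drawn uniformly

def train(episodes=5000, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    state = rng.randrange(N_STATES)
    for _ in range(episodes):
        # epsilon-greedy exploration over providers
        if rng.random() < EPSILON:
            action = rng.randrange(N_ACTIONS)
        else:
            action = max(range(N_ACTIONS), key=lambda a: q[state][a])
        reward, nxt = step(state, action, rng)
        # standard Q-learning update
        q[state][action] += ALPHA * (reward + GAMMA * max(q[nxt]) - q[state][action])
        state = nxt
    return q

q = train()
# Greedy provider choice per preference context after training.
policy = [max(range(N_ACTIONS), key=lambda a: q[s][a]) for s in range(N_STATES)]
print(policy)
```

With enough episodes the greedy policy recovers the best provider for each context (the argmax of each `TRUE_REWARD` row), even though the agent only ever sees noisy satisfaction signals; this is the sense in which the adaptive learner compensates for minimal or missing preference information.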
Key words
utility theory /
user preference /
Markov decision process /
reinforcement learning