A Behavioral Model for Exploration vs. Exploitation: Theoretical Framework and Experimental Evidence

Published: 2025-04-24 Department: Glorious Sun School of Business and Management (旭日工商管理学院)

About the speaker:

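Ding Jingying (丁婧颖) is a postdoctoral research fellow at the National University of Singapore Business School. Ding holds a PhD in Management Science and Engineering from Shanghai Jiao Tong University and a bachelor's degree in Management from Nanjing University. Research interests include data-driven decision problems, online learning algorithms, and behavioral operations management. Ding has published in international journals such as Manufacturing & Service Operations Management and Omega, has participated in Key Program and General Program projects of the National Natural Science Foundation of China, and has presented at international and domestic management conferences, including the INFORMS Annual Meeting, the POMS Annual Conference, and the annual meeting of the Society of Management Science and Engineering of China.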

Abstract:

How do people navigate the exploration-exploitation (EE) trade-off when making repeated choices with unknown rewards? We study this question through the lens of multi-armed bandit problems and introduce a novel behavioral model, Quantal Choice with Adaptive Reduction of Exploration (QCARE). It generalizes Thompson Sampling, allowing for a principled way to quantify the EE trade-off and reflect human decision-making patterns. The model adaptively reduces exploration as information accumulates, with the reduction rate serving as a parameter to quantify the EE trade-off dynamics. We theoretically analyze how varying reduction rates influence decision quality, shedding light on the effects of “over-exploration” and “under-exploration.” Empirically, we validate QCARE through experiments collecting behavioral data from human participants. QCARE not only captures critical behavioral patterns in the EE trade-off but also outperforms alternative models in predictive power. Our analysis reveals a behavioral tendency toward over-exploration.
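For readers unfamiliar with the baseline the abstract mentions, the following is a minimal sketch of standard Thompson Sampling on a Bernoulli multi-armed bandit. It is not QCARE: the abstract does not give QCARE's functional form or how its reduction rate enters, so neither is reproduced here. The function name `thompson_sampling`, the uniform Beta(1, 1) priors, and the example reward probabilities are illustrative assumptions.

```python
import random


def thompson_sampling(true_means, horizon, seed=0):
    """Run standard Thompson Sampling on a Bernoulli bandit.

    true_means: hypothetical success probability of each arm (unknown to the learner).
    horizon: number of rounds to play.
    Returns (pulls per arm, total reward collected).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [0] * n_arms  # observed rewards of 1 per arm
    failures = [0] * n_arms   # observed rewards of 0 per arm
    pulls = [0] * n_arms
    total_reward = 0

    for _ in range(horizon):
        # Draw one sample per arm from its Beta posterior (assumed Beta(1, 1) prior).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Play the arm whose sampled mean is largest. Exploration comes from
        # posterior randomness and shrinks as observations accumulate.
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Simulate a Bernoulli reward from the chosen arm.
        reward = 1 if rng.random() < true_means[arm] else 0
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1
        pulls[arm] += 1
        total_reward += reward

    return pulls, total_reward


if __name__ == "__main__":
    pulls, total = thompson_sampling([0.1, 0.9], horizon=2000, seed=1)
    print("pulls per arm:", pulls, "total reward:", total)
```

In this baseline the amount of exploration is fixed by the posterior itself. According to the abstract, QCARE generalizes this scheme by making the rate at which exploration is reduced an explicit parameter.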


Photography:
Editor: Li Yingjie (李盈颉)
Information officer: Zhou Lili (周莉莉)
Written by: School of Management (管理学院)