亚洲综合日韩精品欧美综合区,超碰97人人做人人爱网站,无码人妻少妇狂喷V999AⅤ ,男人猛躁进女人免费播放

您當(dāng)前所在位置: 首頁 > 講座報(bào)告 > 正文
講座報(bào)告

Fast Linear Programming Based Approaches for Solving Markov Decision Processes

來源:數(shù)學(xué)與統(tǒng)計(jì)學(xué)院          點(diǎn)擊:
報(bào)告人 陳彩華 副教授 時(shí)間 11月18日14:30
地點(diǎn) 南校區(qū)信遠(yuǎn)Ⅱ區(qū)206報(bào)告廳 報(bào)告時(shí)間

講座名稱:Fast Linear Programming Based Approaches for Solving Markov Decision Processes

講座人:陳彩華 副教授

講座時(shí)間:11月18日14:30

講座地點(diǎn):南校區(qū)信遠(yuǎn)Ⅱ區(qū)206報(bào)告廳


 

講座人介紹:

陳彩華,副教授,南京大學(xué)理學(xué)博士,新加坡國(guó)立大學(xué)聯(lián)合培養(yǎng)博士,曾赴新加坡國(guó)立大學(xué)、香港中文大學(xué)等學(xué)習(xí)與訪問。主持/完成的基金包括國(guó)家自然科學(xué)基金面上項(xiàng)目、青年項(xiàng)目,江蘇省自然科學(xué)基金面上項(xiàng)目、青年項(xiàng)目,參與國(guó)家自然科學(xué)基金重點(diǎn)項(xiàng)目,代表作發(fā)表在《Mathematical Programming》,《SIAM Journal on Optimization》,《SIAM Journal on Imaging Science》及CVPR、NIPS等國(guó)際知名學(xué)術(shù)期刊與會(huì)議,其中多篇論文入選ESI高被引論文。獲華人數(shù)學(xué)家聯(lián)盟最佳論文獎(jiǎng)(2017、2018連續(xù)兩年),中國(guó)運(yùn)籌學(xué)會(huì)青年科技獎(jiǎng)(2018),南京大學(xué)青年五四獎(jiǎng)?wù)?2019),入選首批南京大學(xué)仲英青年學(xué)者(全校10人,2018)。

 

講座內(nèi)容:

Markov Decision Processes (MDPs) are widely applied in the study of sequential decision making, whose applications are almost endless. In practice, MDPs are often solved using dynamic programming method such as policy iteration and value iteration. Compared with the dynamic programming method, the linear programming based approach for MDPs is powerful in theory but often not quite e?icient in practice as a solution method. To improve its practicability, this paper studies the linear programming based approach for MDPs from a computational perspective. Specifically, we consider a discrete stage, infinite horizon discounted MDP, which has a nice property ensuring the existence of deterministic optimal policies. In this respect, we introduce a weakly and a strongly constrained sparse formulation for MDPs to find optimal deterministic policies approximately. By exploiting the sparse structure in the formulations, we propose a class of multi-block alternating direction methods of multipliers (ADMM) to solve MDPs e?iciently. The numerical study shows that, ADMM applied to the strongly constrained formulation achieves a comparable runtime as policy iteration, and beats other methods by a large margin. To the best of our knowledge, this is the first linear programming based method that achieves a similar performance as policy iteration. Theoretically, we also prove the convergence of the multi-block ADMM using recent techniques of convex and nonconvex optimization. These findings are encouraging that we take a first step in showing that linear programming based methods can be e?icient for solving MDPs, which offers a different perspective, including its elegant theory and great toolbox, to the study of sequential decision making.

 

主辦單位:數(shù)學(xué)與統(tǒng)計(jì)學(xué)院

123

南校區(qū)地址:陜西省西安市西灃路興隆段266號(hào)

郵編:710126

北校區(qū)地址:陜西省西安市太白南路2號(hào)

郵編:710071

訪問量:

版權(quán)所有:西安電子科技大學(xué)    建設(shè)與運(yùn)維:信息網(wǎng)絡(luò)技術(shù)中心     陜ICP備05016463號(hào)    陜公網(wǎng)安備61019002002681號(hào)