Views: 12646 | Replies: 43

OP · Posted 2025-3-21 16:35:22
Title: Reinforcement Learning
Editor: Richard S. Sutton
Video: http://file.papertrans.cn/826/825930/825930.mp4
Series: The Springer International Series in Engineering and Computer Science
Publication: Book, 1992
Description: Reinforcement learning is the learning of a mapping from situations to actions so as to maximize a scalar reward or reinforcement signal. The learner is not told which action to take, as in most forms of machine learning, but instead must discover which actions yield the highest reward by trying them. In the most interesting and challenging cases, actions may affect not only the immediate reward, but also the next situation, and through that all subsequent rewards. These two characteristics -- trial-and-error search and delayed reward -- are the most important distinguishing features of reinforcement learning. Reinforcement learning is both a new and a very old topic in AI. The term appears to have been coined by Minsky (1961), and independently in control theory by Waltz and Fu (1965). The earliest machine learning research now viewed as directly relevant was Samuel's (1959) checker player, which used temporal-difference learning to manage delayed reward much as it is used today. Of course learning and reinforcement have been studied in psychology for almost a century, and that work has had a very strong impact on the AI/engineering work. One could in fact consider all of reinforcement learning to …
Keywords: agents; algorithms; artificial intelligence; control; learning; machine learning; proving; reinforcement le
Edition: 1
DOI: https://doi.org/10.1007/978-1-4615-3618-5
ISBN (softcover): 978-1-4613-6608-9
ISBN (eBook): 978-1-4615-3618-5
Series ISSN: 0893-3405
Copyright: Springer Science+Business Media New York 1992
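The description's central idea -- discovering high-reward actions by trying them rather than being told -- can be sketched as an ε-greedy action-value learner on a toy bandit task. This is a hypothetical illustration; the arm count, reward means, and parameter values below are assumptions, not material from the book:

```python
import random

random.seed(0)

# Toy 3-armed bandit: true mean reward of each action (assumed values).
true_means = [0.2, 0.5, 0.8]
estimates = [0.0, 0.0, 0.0]   # learned action-value estimates
counts = [0, 0, 0]

epsilon = 0.1  # fraction of trials spent exploring at random
for t in range(2000):
    if random.random() < epsilon:
        a = random.randrange(3)                        # explore: try any action
    else:
        a = max(range(3), key=lambda i: estimates[i])  # exploit the best so far
    reward = random.gauss(true_means[a], 0.1)          # noisy scalar reward
    counts[a] += 1
    estimates[a] += (reward - estimates[a]) / counts[a]  # incremental mean

# The learner settles on the highest-reward action without ever
# being told which one it is -- trial-and-error search in miniature.
best = max(range(3), key=lambda i: estimates[i])
```

The delayed-reward half of the definition is what this sketch deliberately omits: here each action's consequence ends with its immediate reward, whereas in the full problem an action also changes the next situation.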

[Site charts for this title: impact factor, impact-factor subject ranking, web visibility, web-visibility subject ranking, citation count, citation-count subject ranking, annual citations, annual-citation subject ranking, reader feedback, reader-feedback subject ranking (no data shown).]
Poll (single choice, 1 participant):
- Perfect with Aesthetics: 0 votes (0.00%)
- Better Implies Difficulty: 0 votes (0.00%)
- Good and Satisfactory: 0 votes (0.00%)
- Adverse Performance: 1 vote (100.00%)
- Disdainful Garbage: 0 votes (0.00%)
#6 · Posted 2025-3-22 14:24:18
Technical Note: …the action-values are represented discretely. We also sketch extensions to the cases of non-discounted, but absorbing, Markov environments, and where many Q values can be changed each iteration, rather than just one.
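The excerpt above concerns Q-learning with discretely represented action-values. A generic one-step tabular Q-learning update can be sketched as follows -- this is the standard textbook update, not code from this book, and the function name, step size, and toy two-state chain are all assumptions:

```python
def q_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.9):
    """One-step tabular Q-learning update on a dict-of-dicts table Q[s][a]."""
    best_next = max(Q[s_next].values())            # max over a' of Q(s', a')
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
    return Q

# Toy deterministic chain: in state 0, action 1 moves to state 1;
# in state 1, action 0 yields reward 1 and stays in state 1.
Q = {0: {0: 0.0, 1: 0.0}, 1: {0: 0.0, 1: 0.0}}
for _ in range(100):
    q_update(Q, 0, 1, 0.0, 1)   # transition 0 --(a=1)--> 1, reward 0
    q_update(Q, 1, 0, 1.0, 1)   # stay in state 1, reward 1

# Q[1][0] approaches 1/(1 - gamma) = 10, and Q[0][1] approaches
# gamma/(1 - gamma) = 9, the discounted values of these transitions.
```

Each update changes a single Q value; the "many Q values per iteration" extension mentioned in the excerpt would apply this same rule to a batch of state-action pairs at once.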
#8 · Posted 2025-3-22 21:41:06
Introduction: The Challenge of Reinforcement Learning: …In the most interesting and challenging cases, actions may affect not only the immediate reward, but also the next situation, and through that all subsequent rewards. These two characteristics -- trial-and-error search and delayed reward -- are the two most important distinguishing features of reinforcement learning.