Reinforcement learning revisited

Miquel Noguer i Alonso, Daniel Bloch and David Pacheco Aznar

For an in-depth look at reinforcement learning we refer the reader to the books by Blackwell (1969) and Sutton and Barto (2018).

4.1 OVERVIEW

Supervised learning is learning from examples provided by a knowledgeable external supervisor, but it does not allow learning from interaction, so it is poorly suited to interactive problems. Reinforcement learning (RL), by contrast, is learning how to map situations to actions so as to maximise a numerical reward signal. The learner must discover which actions yield the most reward by trying them. In general, actions may affect not only the immediate reward but also the next situation and, through that, all subsequent rewards. RL explicitly considers the whole problem of a goal-directed agent interacting with an uncertain environment. Agents have explicit goals, can sense aspects of their environments and can choose actions to influence their environments.
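To make the trial-and-error idea concrete, the following is a minimal sketch and is not taken from the text. It assumes a hypothetical two-state environment (the function `step`) and uses tabular Q-learning with epsilon-greedy exploration as one illustrative learning rule; the names `step` and `learn` and the parameters `alpha`, `gamma` and `epsilon` are all assumptions introduced here. The environment is built so that the action with the best immediate reward in state 0 is not the best action in the long run.

```python
import random

# Hypothetical two-state environment (an assumption, not from the text).
# Action 0 = "stay", action 1 = "move" to the other state.
# Staying in state 0 pays 1, staying in state 1 pays 3, moving pays 0.
def step(state, action):
    if action == 1:
        return 1 - state, 0.0          # move: no immediate reward
    return state, (1.0 if state == 0 else 3.0)

def learn(steps=20000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative choice)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]       # q[state][action] value estimates
    state = 0
    for _ in range(steps):
        # Try a random action occasionally; otherwise act greedily.
        if rng.random() < epsilon:
            action = rng.randrange(2)
        else:
            action = max(range(2), key=lambda a: q[state][a])
        next_state, reward = step(state, action)
        # The update target includes the value of the next situation,
        # so an action is credited for the rewards it makes reachable later.
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q

q = learn()
# Greedy action in each state under the learned estimates.
policy = [max(range(2), key=lambda a: q[s][a]) for s in range(2)]
print(policy)
```

Under these assumptions the learned greedy policy should prefer moving out of state 0, giving up an immediate reward of 1, and then staying in state 1, which shows how an action can matter through the next situation and all subsequent rewards.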

4.1.1 The agent and its environment

Consider an agent continually interacting with an environment: the agent selects some actions, and the environment responds to those actions and presents new situations to the agent. The environment also gives rise to rewards, which are special numerical
