Deep reinforcement learning
Miquel Noguer i Alonso, Daniel Bloch and David Pacheco Aznar
Deep reinforcement learning
Preface
Introduction
Markov decision problems
Learning the optimal policy
Reinforcement learning revisited
Temporal difference learning revisited
Stochastic approximation in Markov decision processes
Large language models: reasoning and reinforcement learning
Deep reinforcement learning
Applications of artificial intelligence in finance
Pricing options with temporal difference backpropagation
Pricing American options
Daily price limits
Portfolio optimisation
Appendix
This chapter provides a comprehensive, in-depth introduction to deep reinforcement learning (DRL). Chapter 3 focuses, in particular, on obtaining optimal reinforcement learning (RL) policies. DRL is a framework that merges deep learning techniques with reinforcement learning principles to solve complex decision-making tasks. At its core, the DRL framework consists of an agent that interacts with an environment, aiming to maximise cumulative rewards over time. The environment is modelled as a Markov decision process (MDP), with a state space containing all possible states of the environment, an action space containing all available decisions, a transition function defining the probabilities of moving from one state to another given an action, and a reward function that provides immediate feedback based on the agent’s actions. A discount factor down-weights future rewards relative to immediate ones.
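As a rough illustration of the MDP ingredients listed above, the following Python sketch builds a toy two-state environment and computes a discounted return along one simulated trajectory. Everything in it is an assumption made for illustration and does not come from the chapter: the names `MDP`, `sample`, `rollout` and `discounted_return`, the states `"low"` and `"high"`, the actions `"wait"` and `"act"`, and all probabilities and reward values are hypothetical. The policy here is a fixed lookup table rather than a neural network, to keep the example self-contained.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class MDP:
    """Hypothetical container for the five MDP ingredients."""
    states: tuple
    actions: tuple
    transition: dict  # (state, action) -> {next_state: probability}
    reward: dict      # (state, action) -> immediate reward
    gamma: float      # discount factor in [0, 1)


def sample(dist, rng):
    """Draw one key from a {outcome: probability} dictionary."""
    outcomes, probs = zip(*dist.items())
    return rng.choices(outcomes, weights=probs)[0]


def discounted_return(rewards, gamma):
    """Return sum_t gamma**t * rewards[t], accumulated backwards."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def rollout(mdp, policy, start, steps, seed=0):
    """Simulate `steps` agent-environment interactions and collect rewards."""
    rng = random.Random(seed)
    state, rewards = start, []
    for _ in range(steps):
        action = sample(policy[state], rng)       # probabilistic policy
        rewards.append(mdp.reward[(state, action)])
        state = sample(mdp.transition[(state, action)], rng)
    return rewards


# Toy environment: all numbers are illustrative assumptions.
toy = MDP(
    states=("low", "high"),
    actions=("wait", "act"),
    transition={
        ("low", "wait"): {"low": 1.0},
        ("low", "act"): {"high": 0.8, "low": 0.2},
        ("high", "wait"): {"high": 0.9, "low": 0.1},
        ("high", "act"): {"low": 1.0},
    },
    reward={
        ("low", "wait"): 0.0,
        ("low", "act"): -1.0,
        ("high", "wait"): 2.0,
        ("high", "act"): 0.0,
    },
    gamma=0.9,
)

# A probabilistic mapping from states to actions (tabular, not a network).
policy = {
    "low": {"act": 0.9, "wait": 0.1},
    "high": {"wait": 0.95, "act": 0.05},
}

rewards = rollout(toy, policy, start="low", steps=20)
print(discounted_return(rewards, toy.gamma))
```

The backward accumulation in `discounted_return` avoids computing powers of the discount factor explicitly; with `gamma` below one, rewards further in the future contribute geometrically less to the total.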
The agent’s behaviour is governed by a policy, which is a probabilistic mapping from states to actions. In DRL, this policy is often represented as a deep neural network with trainable parameters. Through a process of repeated interactions with the environment, the agent learns to refine its policy
Copyright Infopro Digital Limited. All rights reserved.
As outlined in our terms and conditions, https://www.infopro-digital.com/terms-and-conditions/subscriptions/ (point 2.4), printing is limited to a single copy.
If you would like to purchase additional rights please email info@risk.net