Graduation thesis: Deep Reinforcement Learning for Financial Portfolio Management

  • 50 pages
  • 2021

Excerpts from the work

We have implemented Deep Reinforcement Learning models that act as autonomous portfolio optimization agents. In particular, we focus on the Deep Deterministic Policy Gradient and Deep Q-Network algorithms: model-free reinforcement learning algorithms that learn the value of actions and thereby tell the agent which action to take in each state.
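
As a concrete illustration of the Q-learning idea behind DQN, the sketch below shows one temporal-difference update step in PyTorch. It is a minimal example under assumed names and layer sizes (QNetwork, dqn_update, GAMMA), not the implementation from the thesis listings.

```python
import torch
import torch.nn as nn

GAMMA = 0.99  # discount factor (assumed value)

class QNetwork(nn.Module):
    """Maps a state vector to one Q-value per discrete action."""
    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

def dqn_update(q_net, target_net, optimizer, batch):
    """One gradient step on the temporal-difference error for a batch."""
    states, actions, rewards, next_states, dones = batch
    # Q(s, a) for the actions that were actually taken.
    q_values = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    # Bootstrapped target: r + gamma * max_a' Q_target(s', a'),
    # zeroed out at terminal states.
    with torch.no_grad():
        next_q = target_net(next_states).max(dim=1).values
        targets = rewards + GAMMA * next_q * (1.0 - dones)
    loss = nn.functional.mse_loss(q_values, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

A separate, periodically synchronized target network stabilizes the bootstrapped targets, and the batch is drawn from a replay buffer (see Liu and Zou, 2017, in the references). DDPG carries the same critic update over to continuous action spaces by training a deterministic actor network alongside the critic.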

We have performed a comparative analysis of the reinforcement-learning-based optimization strategy against the more traditional "Follow the Winner", "Follow the Loser", "Random" and "Uniformly Balanced" strategies to determine which one outperforms the others.
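
Each baseline strategy can be expressed as a rule that maps recent per-asset returns to a vector of portfolio weights. The sketch below is one plausible reading of those rules; the function names and the single-period look-back are our assumptions, not the thesis code.

```python
import numpy as np

def uniformly_balanced(returns: np.ndarray) -> np.ndarray:
    """Equal weight for every asset, regardless of history."""
    n = returns.shape[-1]
    return np.full(n, 1.0 / n)

def follow_the_winner(returns: np.ndarray) -> np.ndarray:
    """Put all weight on the asset with the best recent return."""
    weights = np.zeros(returns.shape[-1])
    weights[np.argmax(returns)] = 1.0
    return weights

def follow_the_loser(returns: np.ndarray) -> np.ndarray:
    """Put all weight on the asset with the worst recent return."""
    weights = np.zeros(returns.shape[-1])
    weights[np.argmin(returns)] = 1.0
    return weights

def random_strategy(returns: np.ndarray, rng=None) -> np.ndarray:
    """Spread weight randomly across the simplex of portfolios."""
    rng = rng or np.random.default_rng()
    w = rng.random(returns.shape[-1])
    return w / w.sum()
```

"Follow the Winner" bets on momentum, "Follow the Loser" bets on mean reversion, and the uniformly balanced portfolio serves as a strategy-free control.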

Table of Contents

List of abbreviations
Notation
Introduction
Part 1. Theoretical Part (Background)
Chapter 1. Deep Learning
  1.1 Perceptron
  1.2 Neural Network
  1.3 Activation Function
  1.4 Loss Function
  1.5 Backpropagation
  1.6 Optimization Algorithms
  1.7 Gradient Descent Optimization Algorithms
  1.8 Overfitting
Chapter 2. Reinforcement Learning
  2.1 Key Concepts
  2.2 Taxonomy of RL Algorithms
Chapter 3. Deep Reinforcement Learning
  3.1 Vanilla Policy Gradient (VPG)
  3.2 Deep Deterministic Policy Gradient (DDPG)
Chapter 4. Financial Theory
  4.1 Financial Terms and Concepts
  4.2 Statistical Moments
Part 2. Practical Part
Chapter 5. Trading Environment
  5.1 OpenAI Gym
  5.2 MDP Model
  5.3 Action Space
  5.4 State and Observation Space
  5.5 Reward Signal
  5.6 Trading Environment Implementation
  5.7 Dataset
Chapter 6. Trading Agents
  6.1 Base Agent
  6.2 Regular Agents
  6.3 DQN Agent
  6.4 DDPG Agent
Chapter 7. Experiments
  7.1 Experiments in the OpenAI Gym Environment
  7.2 Experiments in the Trading Environment
Results
Conclusions
References
Appendix 1. Trading environment package listing
Appendix 2. DDPG agent package listing
Appendix 3. Financial metrics package listing


This paper demonstrates the capabilities of Deep Reinforcement Learning algorithms in the area of financial portfolio management. The field has developed rapidly in recent years, driven by increased computational power and by growing research on sequential decision making and control theory.

In this paper we have designed a trading environment compatible with the OpenAI Gym framework. It simulates real market behavior and can be used to assess different portfolio optimization strategies; it is also used to train the reinforcement learning agents (DDPG and DQN). The agents act in this environment by reallocating the portfolio weights across stocks at each time step.
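
The actual environment code is given in the appendices. As a rough sketch of what such a Gym-compatible environment looks like, the class below exposes a weight vector as the action space and pays the one-step log return of the portfolio as reward; the class name, observation window, and reward choice are assumptions, and the classic Gym reset/step API (pre-0.26) is used, matching the era of the thesis.

```python
import gym
import numpy as np
from gym import spaces

class PortfolioEnv(gym.Env):
    """Agent reallocates weights over n_assets at each time step."""

    def __init__(self, prices: np.ndarray, window: int = 30):
        super().__init__()
        self.prices = prices  # price matrix, shape (T, n_assets)
        self.window = window
        n_assets = prices.shape[1]
        # Action: one weight per asset, normalized to a simplex in step().
        self.action_space = spaces.Box(0.0, 1.0, shape=(n_assets,))
        # Observation: a window of recent prices for every asset.
        self.observation_space = spaces.Box(
            -np.inf, np.inf, shape=(window, n_assets))
        self.t = window

    def reset(self):
        self.t = self.window
        return self.prices[self.t - self.window:self.t]

    def step(self, action):
        weights = action / (action.sum() + 1e-8)
        # Reward: log return of the portfolio over one step.
        growth = self.prices[self.t] / self.prices[self.t - 1]
        reward = float(np.log(weights @ growth))
        self.t += 1
        done = self.t >= len(self.prices)
        obs = self.prices[self.t - self.window:self.t]
        return obs, reward, done, {}
```

Because the action is a point on the weight simplex, the same environment serves both the continuous-action DDPG agent and, with a discretized set of allocations, the DQN agent.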

References

Achiam, J. Spinning Up in Deep Reinforcement Learning, 2018.

Baviera, R., Pasquini, M., Serva, M. and Vulpiani, A. Optimal Strategies for Prudent Investors, arXiv:cond-mat/9804297, 1998.

Benhamou, E., Saltiel, D., Guez, B. and Paris, N. Testing Sharpe Ratio: Luck or Skill?, arXiv:1905.08042, 2019.

Brockman, G., Cheung, V., Pettersson, L., Schneider, J., Schulman, J., Tang, J. and Zaremba, W. OpenAI Gym, arXiv:1606.01540, 2016.

Chen, J. Skewness. Investopedia: https://www.investopedia.com/terms/s/skewness.asp, 2021.

De, S., Mukherjee, A. and Ullah, E. Convergence Guarantees for RMSProp and ADAM in Non-Convex Optimization and an Empirical Comparison to Nesterov Acceleration, arXiv:1807.06766, 2018.

Duchi, J., Hazan, E. and Singer, Y. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, Journal of Machine Learning Research, 2011.

Fernando, J. Sharpe Ratio. Investopedia: https://www.investopedia.com/terms/s/sharperatio.asp, 2021.

Goodfellow, I., Bengio, Y. and Courville, A. Deep Learning, 2016.

Han, M., Zhang, L., Wang, J. and Pan, W. Actor-Critic Reinforcement Learning for Control with Stability Guarantee, arXiv:2004.14288, 2020.

Hester, T., Vecerik, M., Pietquin, O., Lanctot, M., Schaul, T., Piot, B., Horgan, D., Quan, J., Sendonaris, A., Dulac-Arnold, G., Osband, I., Agapiou, J., Leibo, J. Z. and Gruslys, A. Deep Q-learning from Demonstrations, arXiv:1704.03732, 2017.

Kharitonov, G. D. Deep Reinforcement Learning in Financial Portfolio Management. In: Information and Telecommunication Technologies and Mathematical Modeling of High-Tech Systems: Materials of the All-Russian Conference with International Participation (Moscow, RUDN, April 19–23, 2021). Moscow: RUDN, 2021, pp. 288–294.

Kukačka, J., Golkov, V. and Cremers, D. Regularization for Deep Learning: A Taxonomy, arXiv:1710.10686, 2017.

Lehle, B. and Peinke, J. Analyzing a Stochastic Process Driven by Ornstein-Uhlenbeck Noise, Phys. Rev. E 97, 012113 (2018), arXiv:1702.00032, 2017.

Liu, R. and Zou, J. The Effects of Memory Replay in Reinforcement Learning, arXiv:1710.06574, 2017.

Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S. and Hassabis, D. Human-Level Control through Deep Reinforcement Learning, Nature 518:529–533, 2015.

Nwankpa, C., Ijomah, W., Gachagan, A. and Marshall, S. Activation Functions: Comparison of Trends in Practice and Research for Deep Learning, arXiv:1811.03378, 2018.

Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Kopf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., Bai, J. and Chintala, S. PyTorch: An Imperative Style, High-Performance Deep Learning Library. Advances in Neural Information Processing Systems 32, pp. 8024–8035, 2019.

Quandl. Quandl API, 2016.

Shiloh-Perl, L. and Giryes, R. Introduction to Deep Learning, arXiv:2003.03253, 2020.

Streeter, M. Learning Effective Loss Functions Efficiently, arXiv:1907.00103, 2019.

Sutton, R. and Barto, A. Reinforcement Learning: An Introduction, 2018.


