%0 Conference Proceedings
%T Reinforcement Learning Approach for Multi-period Inventory with Stochastic Demand
%+ Nanyang Technological University [Singapore]
%+ Singapore Institute of Manufacturing Technology (SIMTech)
%A Shakya, Manoj
%A Ng, Huey, Yuen
%A Ong, Darrell, Joshua
%A Lee, Bu-Sung
%Z Part 4: Deep Learning - Recurrent/Reinforcement
%< peer-reviewed
%@ 978-3-031-08332-7
%( IFIP Advances in Information and Communication Technology
%B 18th IFIP International Conference on Artificial Intelligence Applications and Innovations (AIAI)
%C Hersonissos, Greece
%Y Ilias Maglogiannis
%Y Lazaros Iliadis
%Y John Macintyre
%Y Paulo Cortez
%I Springer International Publishing
%3 Artificial Intelligence Applications and Innovations
%V AICT-646
%N Part I
%P 282-291
%8 2022-06-17
%D 2022
%R 10.1007/978-3-031-08333-4_23
%K Reinforcement learning
%K Multi-period inventory management
%K Q-learning
%Z Computer Science [cs]Conference papers
%X Finding an optimal solution to multi-period inventory ordering decision problems with uncertain demand is important for any manufacturing organization. Moreover, these problems are NP-hard, as there are many factors to consider, including customer demand and lead time, which are stochastic in nature. This paper describes a reinforcement learning (RL) approach, Q-learning in particular, to decide on ordering policies. We formulated the finite-horizon, single-product, multi-period problem as a reinforcement learning model in the form of a Markov decision process (MDP) and solved it to obtain near-optimal solutions. Mixed integer linear programming (MILP) is still commonly used to solve these problems, but MILP models usually lack simplicity and may not yield near-optimal solutions. We formulated the same problem as a MILP model to serve as the baseline against which the RL approach is compared.
In comparison to MILP, the reinforcement learning agent performed better in making ordering decisions over the finite horizon. Obtaining better performance on multi-period problems would help businesses make appropriate inventory decisions and reduce total inventory costs.
%G English
%Z TC 12
%Z WG 12.5
%2 https://inria.hal.science/hal-04317172/document
%2 https://inria.hal.science/hal-04317172/file/527511_1_En_23_Chapter.pdf
%L hal-04317172
%U https://inria.hal.science/hal-04317172
%~ IFIP
%~ IFIP-AICT
%~ IFIP-TC
%~ IFIP-WG
%~ IFIP-TC12
%~ IFIP-AIAI
%~ IFIP-WG12-5
%~ IFIP-AICT-646