Deep Reinforcement Learning Techniques For Solving Hybrid Flow Shop Scheduling Problems: Proximal Policy Optimization (PPO) and Asynchronous Advantage Actor-Critic (A3C)

Nahhas, Abdulrahman
Kharitonov, Andrey
Turowski, Klaus
Well-studied scheduling practices are fundamental to supporting core business processes in any manufacturing environment. Hybrid Flow Shop (HFS) scheduling problems, in particular, arise in many manufacturing environments. Recent advances in Deep Reinforcement Learning (DRL) have attracted the attention of both practitioners and academics, prompting investigation of its adoption beyond synthetic game-like applications. We therefore present an approach that combines DRL techniques with a discrete event simulation model to solve a real-world four-stage HFS scheduling problem. The main idea behind the presented concepts is to expose a DRL agent to a game-like environment using an indirect encoding. Two DRL techniques, namely Proximal Policy Optimization (PPO) and Asynchronous Advantage Actor-Critic (A3C), are evaluated on problems of different complexity. The computational results suggest that the DRL agents successfully learn appropriate policies for solving the investigated problem. In addition, the investigation shows that the agents can adjust their policies when exposed to a different problem. We further evaluate the approach on problem instances published in the literature to establish a comparison.
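The abstract does not spell out the encoding, so the following is only a minimal sketch under one assumption: that "indirect encoding" means each agent action selects a dispatching rule for a stage instead of placing individual jobs directly. The rule set (FIFO, SPT, LPT), the function name `simulate`, and the two-stage toy instance are hypothetical illustrations and are not taken from the paper. The negative makespan is shown as one plausible reward signal that a PPO or A3C agent could be trained on.

```python
import heapq
from typing import Callable, Dict, List, Sequence

# Hypothetical rule set: each rule maps (job ready time, processing time)
# to a sort key; jobs with smaller keys are dispatched first.
RULE_KEYS: Dict[str, Callable[[int, int], int]] = {
    "FIFO": lambda ready, p: ready,  # first job to arrive at the stage
    "SPT": lambda ready, p: p,       # shortest processing time first
    "LPT": lambda ready, p: -p,      # longest processing time first
}


def simulate(
    proc_times: Sequence[Sequence[int]],
    machines_per_stage: Sequence[int],
    actions: Sequence[str],
) -> int:
    """Return the makespan of a toy hybrid flow shop.

    proc_times[j][s] is the processing time of job j at stage s.
    actions[s] names the dispatching rule applied at stage s
    (the "indirect" action assumed in this sketch).
    """
    n_jobs = len(proc_times)
    ready: List[int] = [0] * n_jobs  # time each job becomes available
    for s, (n_machines, action) in enumerate(zip(machines_per_stage, actions)):
        key = RULE_KEYS[action]
        order = sorted(
            range(n_jobs), key=lambda j: key(ready[j], proc_times[j][s])
        )
        free = [0] * n_machines  # min-heap of machine release times
        heapq.heapify(free)
        for j in order:
            # Assign the job to the machine that becomes free earliest.
            start = max(heapq.heappop(free), ready[j])
            ready[j] = start + proc_times[j][s]
            heapq.heappush(free, ready[j])
    return max(ready)


# Toy instance: 3 jobs, 2 stages, with 1 machine and then 2 parallel machines.
jobs = [[3, 2], [1, 4], [2, 1]]
machines = [1, 2]

for first_rule in ("SPT", "LPT"):
    makespan = simulate(jobs, machines, [first_rule, "FIFO"])
    print(first_rule, "makespan:", makespan, "reward:", -makespan)
```

In this toy setting, the choice of rule at the first stage changes the resulting makespan, which is the kind of signal a learning agent could exploit. A discrete event simulation of a real four-stage shop would be considerably richer than this list-scheduling loop.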
Intelligent Decision Support for Logistics and Supply Chain Management, asynchronous advantage actor-critic (a3c), deep reinforcement learning, hybrid flow shop scheduling, proximal policy optimization (ppo), simulation