Safe Reinforcement Learning via Observation Shielding

Mccalmon, Joe; Liu, Tongtong; Goldsmith, Reid; Cyhaniuk, Andrew; Halabi, Talal; Alqahtani, Sarra

Files

0643.pdf (1.68 MB)

Date

2023-01-03

Starting Page

6603

Abstract

Reinforcement Learning (RL) algorithms have shown success in scaling up to large problems. However, deploying those algorithms in real-world applications remains challenging due to their vulnerability to adversarial perturbations. Existing RL robustness methods against adversarial attacks are weak against large perturbations, a scenario that cannot be ruled out for RL adversarial threats, as is the case for deep neural networks in classification tasks. This paper proposes a method called observation-shielding RL (OSRL) to increase the robustness of RL against large perturbations using predictive models and threat detection. Instead of changing the RL algorithms with robustness regularization or retraining them with adversarial perturbations, we depart considerably from previous approaches and develop an add-on safety feature for existing RL algorithms during runtime. OSRL builds on the idea of model predictive shielding, where an observation predictive model is used to override the perturbed observations as needed to ensure safety. Extensive experiments on various MuJoCo environments (Ant, Hopper) and the classical pendulum environment demonstrate that our proposed OSRL is safer and more efficient than state-of-the-art robustness methods under large perturbations.
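
The shielding idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ObservationShield`, the L2 prediction-error test, the fixed `threshold`, and the toy one-dimensional dynamics are all assumptions made for illustration. The sketch only shows the general pattern of a runtime add-on that compares each incoming observation with a model's prediction and substitutes the prediction when the two disagree too much, leaving the policy itself unchanged.

```python
from typing import Callable, List, Optional, Tuple

Vector = List[float]


def l2_distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


class ObservationShield:
    """Hypothetical runtime wrapper around an already-trained policy.

    Assumptions (not taken from the paper):
      - predictor(prev_obs, prev_action) returns the expected next observation
      - an observation is treated as perturbed when its L2 distance from
        the prediction exceeds a fixed threshold
    """

    def __init__(
        self,
        policy: Callable[[Vector], Vector],
        predictor: Callable[[Vector, Vector], Vector],
        threshold: float,
    ) -> None:
        self.policy = policy
        self.predictor = predictor
        self.threshold = threshold
        self.prev_obs: Optional[Vector] = None
        self.prev_action: Optional[Vector] = None

    def act(self, observed: Vector) -> Tuple[Vector, bool]:
        """Return (action, overridden) for the current observation."""
        used = list(observed)
        overridden = False
        if self.prev_obs is not None and self.prev_action is not None:
            predicted = self.predictor(self.prev_obs, self.prev_action)
            # Simple threat detection: large prediction error triggers the shield.
            if l2_distance(predicted, observed) > self.threshold:
                used = list(predicted)
                overridden = True
        action = self.policy(used)
        # Remember what the policy actually acted on for the next prediction.
        self.prev_obs, self.prev_action = used, action
        return action, overridden


# Toy example: x_{t+1} = x_t + a_t, with a proportional policy a_t = -0.5 * x_t.
shield = ObservationShield(
    policy=lambda obs: [-0.5 * obs[0]],
    predictor=lambda obs, act: [obs[0] + act[0]],
    threshold=0.1,
)
first_action, first_flag = shield.act([1.0])    # clean observation
second_action, second_flag = shield.act([5.0])  # perturbed; the true state is 0.5
```

In the toy run the second observation is far from the predicted value, so the policy acts on the prediction instead of the perturbed input. A practical system would need a learned predictor and a principled detection rule, both of which are outside the scope of this sketch.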

Keywords

Cyber Operations, Defense, and Forensics, adversarial examples, reinforcement learning, robustness, safety, shielding

URI

https://hdl.handle.net/10125/103433

Extent

10

Related To

Proceedings of the 56th Hawaii International Conference on System Sciences

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International

Collections

Cyber Operations, Defense, and Forensics
