Detection of Important States through an Iterative Q-value Algorithm for Explainable Reinforcement Learning

Milani, Rudy; Moll, Maximilian; De Leone, Renato

dc.contributor.author	Milani, Rudy
dc.contributor.author	Moll, Maximilian
dc.contributor.author	De Leone, Renato
dc.date.accessioned	2023-12-26T18:37:06Z
dc.date.available	2023-12-26T18:37:06Z
dc.date.issued	2024-01-03
dc.identifier.doi	10.24251/HICSS.2024.174
dc.identifier.isbn	978-0-9981331-7-1
dc.identifier.other	13df9219-88ea-4b3a-bbc8-69115062495c
dc.identifier.uri	https://hdl.handle.net/10125/106551
dc.language.iso	eng
dc.relation.ispartof	Proceedings of the 57th Hawaii International Conference on System Sciences
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri	https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject	Intelligent Decision Support on Networks – Data-driven Optimization, Augmented and Explainable AI in Complex Supply Chains
dc.subject	explainable reinforcement learning
dc.subject	importance analysis
dc.subject	important states
dc.subject	safe reinforcement learning
dc.title	Detection of Important States through an Iterative Q-value Algorithm for Explainable Reinforcement Learning
dc.type	Conference Paper
dc.type.dcmi	Text
dcterms.abstract	To generate safe and trustworthy Reinforcement Learning agents, it is fundamental to recognize meaningful states where a particular action should be performed. Doing so makes it possible to produce more accurate explanations of the behaviour of the trained agent and simultaneously reduce the risk of committing a fatal error. In this study, we improve existing metrics that use Q-values to detect essential states in Reinforcement Learning by introducing a scaled iterated algorithm called IQVA. The key observation of our approach is that a state is important not only if the action taken there has a high impact but also if the state appears often across different episodes. We compare our approach with the two baseline measures and a newly introduced value in grid-world environments to demonstrate its efficacy. In this way, we show how the proposed methodology can highlight only the states that are meaningful for that particular agent instead of emphasizing the importance of states that are rarely visited.
dcterms.extent	8 pages
prism.startingpage	1401

Files

Original bundle

Name: 0136.pdf
Size: 873.77 KB
Format: Adobe Portable Document Format

Collections

Intelligent Decision Support on Networks - Data-driven Optimization, Augmented and Explainable AI in Complex Supply Chains