Publications
Beware Untrusted Simulators – Reward-Free Backdoor Attacks in Reinforcement Learning (2026)
ICLR
Simulated environments are a key piece in the success of Reinforcement Learning (RL), allowing practitioners and researchers to train decision-making agents without running expensive experiments on real hardware. Simulators remain a security blind spot, however, enabling adversarial developers to alter the dynamics of their released simulators for malicious purposes. In this work we therefore highlight a novel threat, demonstrating how simulator dynamics can be exploited to stealthily implant action-level backdoors into RL agents. The backdoor then allows an adversary to reliably activate targeted actions in an agent when the agent observes a predefined “trigger”, leading to potentially dangerous consequences. Traditional backdoor attacks are limited by their strong threat models, which assume the adversary has near-full control over an agent’s training pipeline and can both alter and observe the agent’s rewards. Because these assumptions are infeasible within a simulator, we propose a new attack, “Daze”, which is able to reliably and stealthily implant backdoors into RL agents trained for real-world tasks without altering or even observing their rewards. We provide a formal proof of Daze’s effectiveness in guaranteeing attack success across general RL tasks, along with extensive empirical evaluations on both discrete and continuous action space domains. We additionally provide the first example of RL backdoor attacks transferring to real robotic hardware. These developments motivate further research into securing all components of the RL training pipeline to prevent malicious attacks.
TBD (2025)
TBD
In Review
Toward Life-Long Creative Problem Solving: Using World Models for Increased Performance in Novelty Resolution (2022)
ICCC
Creative problem solving (CPS) is a skill which enables innovation, oftentimes through repeated exploration of an agent’s world. In this work, we investigate methods for life-long creative problem solving (LLCPS), with the goal of increasing CPS capability over time. We develop two world models to facilitate LLCPS which use sub-symbolic action and object information to predict symbolic meta-outcomes of actions. We experiment with three CPS scenarios run sequentially and in simulation. Results suggest that LLCPS is possible through the use of a world model, which can be trained on CPS exploration trials and used to guide future CPS exploration.
A framework for creative problem solving through action discovery (2021)
RSS Workshop
Creative problem solving (CPS) is a process through which an agent discovers previously unknown information about itself and its environment in order to accomplish a previously unsolvable task. In this paper, we introduce a unified framework for CPS through action discovery. We describe two methods which enable action discovery at a declarative and a neurosymbolic level, namely action primitive segmentation and behavior babbling, respectively. We review experimental evaluations of our framework, and end with a discussion of limitations and future work considerations for CPS.
Toward creative problem solving agents: Action discovery through behavior babbling (2021)
IEEE Conference
Creative problem solving (CPS) is the process by which an agent discovers unknown information about itself and its environment, allowing it to accomplish a previously impossible goal. We propose a framework for CPS by robots that discovers novel actions via behavior babbling and is capable of learning a representation of novel actions at both a symbolic planning level and a sub-symbolic action controller level. Our framework employs two modes of discovery – a focused incubation method that scopes its search to the actions and entities composing the failed plan, and a defocused incubation method that enables exploration of actions and entities outside of the failed plan. We implemented and tested our framework using a Baxter robot in a 3D physics-based simulation environment, where we ran three proof-of-concept object manipulation scenarios. Results suggest that it is possible to use behavior babbling as a method for the autonomous discovery of flexible and reusable actions.