
Generalizing Value Estimation over Timescale

Sherstan, C., MacGlashan, J., Pilarski, P. M. (2018) Generalizing Value Estimation over Timescale. FAIM Workshop: Prediction and Generative Modeling in Reinforcement Learning (PGMRL). Stockholm, Sweden. July 15.

General value functions (GVFs) are an approach to representing models of an agent’s world as a collection of predictive questions. A GVF is defined by three components: a policy, a prediction target, and a timescale. Traditionally, predictions for a given timescale must be specified by the engineer, and each timescale is learned independently. Here we present γ-nets, a method for generalizing value function estimation over timescale, allowing a given GVF to be trained and queried for any fixed timescale. The key to our approach is to use timescale as one of the network inputs. The prediction target for any fixed timescale is then available at every timestep, and we are free to train on any number of timescales. We present preliminary results on a simple test signal.
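
To make the idea of feeding timescale to the estimator concrete, here is a minimal sketch. It is not the authors' implementation. It assumes a linear estimator in place of a neural network, a discrete "phase" as the state of a periodic test signal, and the common convention that the return at time t starts from the signal at t+1. The names `discounted_returns`, `features`, and `td_update` are hypothetical, as are the polynomial-in-γ features.

```python
import numpy as np

def discounted_returns(signal, gamma):
    """G_t = sum_k gamma^k * signal[t+1+k], truncated at the end of the signal."""
    returns = np.zeros(len(signal))
    g = 0.0
    for t in reversed(range(len(signal) - 1)):
        g = signal[t + 1] + gamma * g
        returns[t] = g
    return returns

def features(phase, gamma, n_phases, degree=3):
    """Hypothetical features: one-hot phase crossed with powers of gamma,
    so that the estimate can vary with the timescale input."""
    one_hot = np.zeros(n_phases)
    one_hot[phase] = 1.0
    powers = gamma ** np.arange(degree + 1)  # [1, gamma, gamma^2, ...]
    return np.outer(one_hot, powers).ravel()

def td_update(w, phase, next_phase, cumulant, gamma, n_phases, alpha=0.1):
    """One TD(0) step for a single timescale gamma (assumed update rule)."""
    x = features(phase, gamma, n_phases)
    x_next = features(next_phase, gamma, n_phases)
    # The bootstrapped target can be formed for any gamma at every timestep.
    target = cumulant + gamma * (w @ x_next)
    return w + alpha * (target - w @ x) * x
```

Because the target depends on γ only through the bootstrap term, the same transition can be reused with several sampled values of γ in one training step, and `discounted_returns` gives a reference value to compare against on a recorded signal.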