Autonomous vehicle operations use learned constraints over beliefs to traverse a vehicle transportation network. Sensor data and user demonstration data are used to determine a belief path using a partially observable Markov decision process (POMDP) model. The belief path can be updated based on the learned constraints. Candidate actions are determined based on the POMDP model. The candidate actions are constrained by the updated belief path. An action is selected, and the vehicle traverses the vehicle network using the selected action.