A video of a driver of a vehicle is obtained. Based on a subset of frames of the video, a series of body poses is identified by detecting landmark points associated with respective body parts of the driver. The landmark points correspond to coordinates of pixel locations that represent one or more joints of the respective body parts of the driver in the subset of frames. A driver behavior is identified based on the series of body poses. An assistive vehicle control action for the vehicle is output based on the driver behavior.
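The steps described above can be illustrated with a minimal sketch. It is not the claimed implementation. Every name in it (`BodyPose`, `sample_frames`, `detect_landmarks`, `classify_behavior`, `select_action`, `process_video`), the landmark labels, the behavior labels, the action labels, and the majority-vote rule are hypothetical and chosen only for illustration. The landmark detector is a stand-in: each "frame" here is assumed to already carry its joint pixel coordinates, where a real system would run a pose-estimation model on image data.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Point = Tuple[int, int]  # (x, y) pixel coordinates; y grows downward in image space


@dataclass
class BodyPose:
    """One body pose: joint name -> pixel coordinates of that joint."""
    landmarks: Dict[str, Point]


def sample_frames(frames: List[dict], step: int) -> List[dict]:
    """Select a subset of frames by keeping every `step`-th frame (assumed policy)."""
    return frames[::step]


def detect_landmarks(frame: dict) -> BodyPose:
    """Stand-in for a pose estimator: the frame is assumed to hold its landmarks."""
    return BodyPose(landmarks=dict(frame))


def classify_behavior(poses: List[BodyPose]) -> str:
    """Hypothetical rule: the wrist is above the shoulder in most sampled frames."""
    if not poses:
        return "unknown"
    raised = 0
    for pose in poses:
        wrist = pose.landmarks.get("right_wrist")
        shoulder = pose.landmarks.get("right_shoulder")
        # A smaller y value means higher in the image.
        if wrist is not None and shoulder is not None and wrist[1] < shoulder[1]:
            raised += 1
    return "hand_off_wheel" if raised * 2 > len(poses) else "normal_driving"


# Hypothetical mapping from driver behavior to an assistive control action.
ACTIONS = {
    "hand_off_wheel": "issue_alert",
    "normal_driving": "no_action",
}


def select_action(behavior: str) -> str:
    """Look up the action for a behavior; default to no action."""
    return ACTIONS.get(behavior, "no_action")


def process_video(frames: List[dict], step: int = 2) -> str:
    """Frames -> subset -> series of body poses -> behavior -> control action."""
    subset = sample_frames(frames, step)
    poses = [detect_landmarks(frame) for frame in subset]
    behavior = classify_behavior(poses)
    return select_action(behavior)
```

For example, under these assumptions, six frames in which `right_wrist` sits above `right_shoulder` would be sampled down to three poses, classified as `hand_off_wheel`, and mapped to `issue_alert`. A real system would likely use a learned classifier over the pose series in place of the fixed rule.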