Here's a simplified example using PyTorch and a 3D convolutional network (I3D) for feature extraction: