Researchers at Columbia University recently published a paper on ‘fail’ videos. Besides being a ton of fun to watch, these videos pose a difficult machine learning problem: classifying ‘fail’ (unintentional) action. The project, by Dave Epstein, Boyuan Chen and Carl Vondrick, includes the development of the ‘Oops! dataset’, which contains hours of ‘fail’ videos from YouTube with the unintentional action annotated.
The dataset consists of 20,338 videos from YouTube fail compilation videos, adding up to over 50 hours of data. These clips, filmed by amateur videographers in the real world, are diverse in action, environment, and intention.
To recognize and predict unintentional action in the videos, the team used a 3D convolutional network. The still image above shows a snapshot of outputs from the model, which labels action in the video as intentional, transitional, or unintentional. The team also investigated self-supervised training signals, including the intrinsic speed of the video. The speed-based approach is competitive with heavily supervised pretraining, but neither comes close to human performance.
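To make the three classes concrete, here is a minimal sketch of how short clips could be labeled from a single annotated failure time. It is an illustration only: the function names, the one-second clip length, and the rule that a clip overlapping the annotated moment counts as "transitional" are assumptions made for this example, not details taken from the post or confirmed against the authors' code.

```python
def label_clip(clip_start, clip_end, failure_time):
    """Label one clip spanning [clip_start, clip_end] seconds.

    Assumed rule (hypothetical): a clip that ends before the annotated
    failure is intentional, one that starts after it is unintentional,
    and one that overlaps it is transitional.
    """
    if clip_end < failure_time:
        return "intentional"
    if clip_start > failure_time:
        return "unintentional"
    return "transitional"


def label_video(duration, failure_time, clip_length=1.0):
    """Split a video into back-to-back clips and label each one."""
    num_clips = int(duration // clip_length)
    labels = []
    for i in range(num_clips):
        start = i * clip_length
        labels.append(label_clip(start, start + clip_length, failure_time))
    return labels


if __name__ == "__main__":
    # A 5-second video where things go wrong at 2.5 seconds.
    print(label_video(duration=5.0, failure_time=2.5))
```

A classifier such as the 3D convolutional network mentioned above would then be trained to predict these labels from the clip's frames.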
We train a supervised neural network as a baseline and analyze its performance compared to human consistency on the tasks. We also investigate self-supervised representations that leverage natural signals in our dataset, and show the effectiveness of an approach that uses the intrinsic speed of video to perform competitively with highly-supervised pretraining. However, a significant gap between machine and human performance remains.
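For readers curious how video speed can act as a training signal without human labels, the sketch below shows one common setup: subsample frames at a randomly chosen stride and use the stride's index as the label a network must predict. The stride values, the function name `sample_speed_clip`, and the sampling scheme are assumptions for illustration; the authors' actual implementation may differ.

```python
import random

# Hypothetical playback speeds, expressed as frame strides.
SPEEDS = (1, 2, 4, 8)


def sample_speed_clip(frames, clip_len, rng):
    """Return (clip, speed_label) for a self-supervised speed task.

    The clip holds clip_len frames taken every SPEEDS[speed_label]
    frames from a random starting point. No human annotation is needed:
    the label comes from the sampling process itself.
    """
    # Keep only the speeds whose full span fits inside the video.
    valid = [
        i for i, stride in enumerate(SPEEDS)
        if (clip_len - 1) * stride + 1 <= len(frames)
    ]
    if not valid:
        raise ValueError("video is too short for the requested clip length")

    label = rng.choice(valid)
    stride = SPEEDS[label]
    span = (clip_len - 1) * stride + 1
    start = rng.randrange(len(frames) - span + 1)
    clip = frames[start:start + span:stride]
    return clip, label


if __name__ == "__main__":
    rng = random.Random(0)
    frames = list(range(100))  # stand-in for decoded video frames
    clip, label = sample_speed_clip(frames, clip_len=8, rng=rng)
    print("stride:", SPEEDS[label], "frame indices:", clip)
```

A network pretrained to predict the speed label can then be fine-tuned on the intentionality labels, which is the general idea behind using self-supervised representations as a starting point.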
If you would like to learn more, check out the blog post for “Oops! Predicting Unintentional Action in Video”. You can also download the Oops! dataset or check out the code on the GitHub repo.
Written by Rebecca Minich, Product Analyst, Data Science at Google. Opinions expressed are solely my own and do not express the views or opinions of my employer.