Thursday 1 September 2016

FFMPEG detecting black frames and generating keystamps


In my previous post I looked at using FFMPEG to add black frames, colour bars and slates using the -filter_complex switch and a number of the libavfilter filters included with FFMPEG. Today I'm going to follow on from that, using the video filter (-vf) switch to look at generating information from the incoming media stream.

Detecting Black Frames
This is quite a typical video processing application, particularly at the professional end where content may intentionally have a sequence of black frames inserted to identify commercial breaks. It's useful to be able to identify those and 'segment' the video sequence to be put into an editor or automated process. 

FFMPEG has two filters for this: blackdetect and blackframe. We're going to use the former, which has a syntax like the following:

ffmpeg -i myfile.mxf -vf "blackdetect=d=2:pix_th=0.00" -an -f null -


The blackdetect filter takes a parameter for the minimum duration of black to report, in seconds (d=2), and a threshold below which a pixel counts as black (pix_th=0.00). The other options are just there to stub off the output: -an drops the audio and -f null - discards the result instead of writing a file.

This writes the information to the console, where it is pretty easily processed. Note that FFMPEG logs to stderr rather than stdout, so redirect it with 2>&1 if you want to pipe it into another tool. There are quite a lot of online discussions about how to process that output into text or CSV files for ongoing work, and some options using ffprobe that are worth exploring.
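As a minimal sketch of that kind of post-processing, the function below turns blackdetect log lines into CSV rows. The function name blackdetect_to_csv and the output file name are my own choices for illustration, and it assumes the log lines have the form "black_start:… black_end:… black_duration:…".

```shell
# Hypothetical helper: read FFMPEG log text on stdin and emit one CSV row
# (start,end,duration, all in seconds) per detected black section.
# Assumes lines like:
#   [blackdetect @ 0x...] black_start:0 black_end:2.04 black_duration:2.04
blackdetect_to_csv() {
  echo "start,end,duration"
  sed -n 's/.*black_start:\([0-9.]*\) black_end:\([0-9.]*\) black_duration:\([0-9.]*\).*/\1,\2,\3/p'
}

# Usage (2>&1 is needed because FFMPEG logs to stderr):
# ffmpeg -i myfile.mxf -vf "blackdetect=d=2:pix_th=0.00" -an -f null - 2>&1 | blackdetect_to_csv > black.csv
```

Lines that do not match the pattern, such as the progress counter, are simply dropped.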

There is a similar audio filter, silencedetect, for detecting silence.
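For comparison, here is a rough sketch of the audio equivalent using the silencedetect filter. The wrapper name detect_silence is hypothetical, and the -50dB noise floor and 2 second duration are illustrative values rather than recommendations.

```shell
# Hypothetical helper: print only the silencedetect log lines for a file.
# noise=-50dB and d=2 are example values; tune them for your content.
detect_silence() {
  ffmpeg -i "$1" -af "silencedetect=noise=-50dB:d=2" -vn -f null - 2>&1 |
    grep 'silence_'
}

# Usage:
# detect_silence myfile.mxf
```

Here -vn drops the video in the same way that -an dropped the audio in the black detection command.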

Generating Scene-Change Frames
Another typical video processing task is generating a sequence of representative keystamps (image frames) for the video sequence, ideally with each of those representing a 'scene' in the sequence. There's a whole lot of discussion we could go into here on what constitutes a scene and the processing techniques for identifying one. That's not the topic here. This is about demonstrating what FFMPEG can offer, take it or leave it!

One of the filtering functions offered by FFMPEG is the ability to make a conditional decision based on the processed data. In this case we are going to use the select filter with the gt (greater than) function to compare the 'scene' score of each frame against a threshold value between 0 and 1.0. We'll then scale the selected frames to our desired keystamp resolution and indicate the number of frames we want to generate. The syntax looks like this:


ffmpeg -i myfile.mxf -vf "select=gt(scene\,0.4),scale=640:320" -frames:v 5 -vsync vfr thumbs%0d.png

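It appears that there is a 'bug', or at least some argued-over functionality, here: if you do not include the -vsync vfr option, FFMPEG seems to pad the output back to a constant frame rate by duplicating frames, and you'll only get the first detected frame repeated.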
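If you want to experiment with different thresholds, a small wrapper along these lines may help. This is only a sketch: the function name scene_thumbs and its argument order are made up for illustration, and it keeps the same scale and output pattern as the command above. Newer FFMPEG releases deprecate -vsync in favour of -fps_mode, so you may need to swap that option.

```shell
# Hypothetical helper: scene_thumbs INPUT [THRESHOLD] [COUNT]
# THRESHOLD defaults to 0.4 and COUNT to 5, matching the command above.
scene_thumbs() {
  input=$1
  threshold=${2:-0.4}
  count=${3:-5}
  # The backslash before the comma stops FFMPEG treating it as a filter separator.
  ffmpeg -i "$input" \
    -vf "select=gt(scene\,${threshold}),scale=640:320" \
    -frames:v "$count" -vsync vfr "thumbs%0d.png"
}

# Usage: a lower threshold picks up more subtle scene changes.
# scene_thumbs myfile.mxf 0.3 10
```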

Generating a Single Scene Change Tile
Another pretty typical operation is wanting to summarise a whole file into a single keystamp with a number of tiled images giving a summary of the whole content. FFMPEG nicely provides us with a function for doing that as well. Warning, processing this seems to be quite slow.

ffmpeg -i myfile.mxf -vf "select=gt(scene\,0.4),tile,scale=640:320" thumbs%0d.png


In this case what we're doing is passing the frames selected by the scene comparison into the 'tile' filter in the video processing chain, and then scaling the tiled image.


Some of this, along with a well-written general introduction to FFMPEG that is easier going than the canonical documentation, can be found in this article on the swiss knife of video processing.