Wednesday 24 August 2016

FFMPEG for adding black, colourbars, tone and slates

There are quite a few scattered comments online about doing parts of this with FFMPEG, but nothing cohesive, and it seemed a bit hit and miss getting some of the bits to work at first. I've collected together the steps I worked through along with some of the useful references that helped along the way. It's not complete by any means, but it should be a good starter for building upon.

The first reference I turned up showed how to add black frames at the start and end of the video, giving a command line as follows:

ffmpeg -i XX.mp4 -vf "
    color=c=black:s=720x576:d=10 [pre] ;
    color=c=black:s=720x576:d=30 [post] ;
    [pre] [in] [post] concat=n=3" -an -vcodec mpeg2video -pix_fmt yuv422p -s 720x576 -aspect 16:9 -r 25 -minrate 30000k -maxrate 30000k -b 30000k output.mpg

This gives the starting point, using the -vf option to set up a concat video filter. The -vf option only allows a single input file into the filter graph, but it does allow processing options to be defined inside the filter string. The post goes on to look at some of the challenges the author was facing, mostly related to differences between the input file resolution and aspect ratio. I played around with this in a much more simplified manner using an MXF sample input file as follows:

ffmpeg -i XX.mxf -vf "
    color=c=black:s=1920x1080:d=10 [pre] ;
    color=c=black:s=1920x1080:d=30 [post] ;
    [pre] [in] [post] concat=n=3" -y output.mxf

In my case here I'm using the same output codec as the input, hence the simpler command line without additional parameters. The -y option means overwrite any existing files, which I've just used for ease. The key to understanding this is the concat filter, which took me quite a bit of work.

As a note, I’ve laid this out on multiple lines for readability, but it needs to be on a single command line to work.

Concat Video Filter
The concat video filter is the key to this operation, which is stitching together the three components. There are a couple of concepts to explore, so I'll take them bit by bit. Some of the referenced links use these concepts earlier in their examples, so it might be worthwhile skipping the references at first, reading through to the end, and then dipping into them as needed to explore more detail once you have the full context.

Methods of concatenating files explains a range of ffmpeg options including concat, which is described with this example:

ffmpeg -i opening.mkv -i episode.mkv -i ending.mkv \
  -filter_complex '[0:0] [0:1] [1:0] [1:1] [2:0] [2:1] concat=n=3:v=1:a=1 [v] [a]' \
  -map '[v]' -map '[a]' output.mkv

Notice in this example the \ is used to spread the command line over multiple lines as might be used in a Linux script (this doesn't work on Windows). In this case there are multiple input files and the -filter_complex option is used instead. As most of the examples use -filter_complex rather than -vf, I'll use that from now on. I had a number of problems getting this to work initially, which I'll describe as I go through.

In this case the concat command has a few more options:

concat=n=3:v=1:a=1

concat means use the media concatenate (joining) filter.
n is the total number of input segments being joined.
v is the number of video streams in each segment (0 = no video).
a is the number of audio streams in each segment (0 = no audio).

Some clues to understanding how this works are given with this nice little diagram indicating how the inputs, streams and outputs are mapped together:

                   Video     Audio
                   Stream    Stream
                   
input_file_1 ----> [0:1]     [0:0]
                     |         |
input_file_2 ----> [1:1]     [1:0]
                     |         |
                   "concat" filter
                     |         |
                    [v]       [a]
                     |         |
                   "map"     "map"
                     |         |
Output_file <-------------------

Along with the following description:

ffmpeg -i input_1 -i input_2
 -filter_complex "[0:1] [0:0] [1:1] [1:0] concat=n=2:v=1:a=1 [v] [a]"
-map [v] -map [a] output_file
The above command uses:
  • Two input files are specified: "-i input_1" and "-i input_2".
  • The "concat" filter is used in the "-filter_complex" option to concatenate 2 segments of input streams.
  • "[0:1] [0:0] [1:1] [1:0]" provides a list of input streams to the "concat" filter. "[0:1]" refers to the first (index 0:) input file and the second (index :1) stream, and so on.
  • "concat=n=2:v=1:a=1" specifies the "concat" filter and its arguments: "n=2" specifies 2 segments of input streams; "v=1" specifies 1 video stream in each segment; "a=1" specifies 1 audio stream in each segment.
  • "[v] [a]" defines link labels for the 2 streams coming out of the "concat" filter.
  • "-map [v]" forces the stream labeled as [v] to go to the output file.
  • "-map [a]" forces the stream labeled as [a] to go to the output file.
  • "output_file" specifies the output file.

Filter_Complex input mapping
Before we get onto the output mapping, let's look at what this input syntax means. I can't quite remember where I found the information, but basically the n, v and a arguments of the concat filter give the 'dimensions' of its inputs and outputs: there will be v+a outputs and n*(v+a) inputs.

The inputs are referenced as follows: [0:1] means input 0, track 1; the alternative form [0:v:0] means input 0, video track 0.

There need to be n*(v+a) of these in front of the concat filter, arranged as n groups of (v+a), one group per segment. For example:

Concat two input video sequences
"[0:0] [1:0] concat=n=2:v=1:a=0"

Concat two input audio sequences (assuming audio is on the second track)
"[0:1] [1:1] concat=n=2:v=0:a=1"

Concat two input AV sequences (assuming audio is on the second and third tracks)
"[0:0] [0:1] [0:2] [1:0] [1:1] [1:2] concat=n=2:v=1:a=2"

Getting this wrong kept producing this cryptic message:

"[AVFilterGraph @ 036e2fc0] No such filter: '
'
Error initializing complex filters.
Invalid argument"

Which seems to be the root of some other people's problems as well.
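
One way to take the guesswork out of which stream index is video and which is audio is to inspect the input with ffprobe before building the filter. A quick sketch (the filename is just a placeholder):

ffprobe -v error -show_entries stream=index,codec_type -of csv=p=0 XX.mxf

This prints each stream index alongside its type (video or audio), so you can check that [0:1] really is the track you think it is before it goes into the concat mapping.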

Output mapping
Taking the rest of the filter, it is possible to put link labels after the concat filter to identify the outputs:

concat=n=2:v=1:a=1 [v] [a]"
-map [v] -map [a]

These can then be routed to the output tracks using the -map option; the tracks are created in order, so if you had four audio tracks it would look something like this:

concat=n=2:v=1:a=4 [v] [a1] [a2] [a3] [a4]"
-map [v] -map [a1] -map [a2] -map [a3] -map [a4]

The output labels and -map options can be omitted, and it seems to work fine with a default mapping.

Notice in this case that the output tracks have all been given names that are convenient to understand; equally they could be written as follows:

concat=n=2:v=1:a=4 [vid] [engl] [engr] [frl] [frr]"
-map [vid] -map [engl] -map [engr] -map [frl] -map [frr]

Naming is also possible on the input side, as follows:

"[0:0] [vid1]; [1:0] [vid2]; [vid1] [vid2] concat=n=2:v=1:a=0"

Which is pretty neat and allows a rather clearer description.
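
Pulling the pieces together, a full command with labelled outputs and explicit mapping might look like this. This is just a sketch: in1.mxf and in2.mxf are placeholder filenames, and it assumes each file has video on track 0 and audio on track 1.

ffmpeg -i in1.mxf -i in2.mxf -filter_complex "[0:0] [0:1] [1:0] [1:1] concat=n=2:v=1:a=1 [vid] [aud]" -map [vid] -map [aud] -y joined.mxf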

Generating Black
Now we've got the groundwork in place, we can create some black video. I found two ways of doing this, the first generating the black inside the filtergraph itself:

ffmpeg -i XX.mxf -filter_complex "[0:0] [video]; color=c=black:s=1920x1080:d=30 [black]; [black] [video] concat=n=2:v=1:a=0" -y output.mxf

This has a single input file, from which we take the video track, then creates 30 seconds of black video and feeds both into the concat filter to produce a single video stream.

Alternatively, a number of samples show the black being created as a second input stream like this:

ffmpeg -i XX.mxf -f lavfi -i "color=c=black:s=1920x1080:d=10" -filter_complex "[0:0] [video]; [1:0] [black]; [black] [video] concat=n=2:v=1:a=0" -y output.mxf

This all works fine, so let's now add some audio tracks in. We'll need to generate a matching audio track for the black video; when I was first playing with this I found that without one the output files were getting truncated, because the output duration only ran to the duration of the original input video. The way I'll generate the silence is with a sine wave source with the frequency set to zero, which saves explaining another generator later.

ffmpeg -i XX.mxf -filter_complex "color=c=black:s=1920x1080:d=10 [black]; sine=frequency=0:sample_rate=48000:d=10 [silence]; [black] [silence] [0:0] [0:1] concat=n=2:v=1:a=1" -y output.mxf

And similarly if we wanted to top and tail with black it works like this:

ffmpeg -i XX.mxf -filter_complex "color=c=black:s=1920x1080:d=10 [black]; sine=frequency=0:sample_rate=48000:d=10 [silence]; [black] [silence] [0:0] [0:1] [black] [silence] concat=n=3:v=1:a=1" -y output.mxf

Or it doesn't! It seems that you can only use each labelled stream once in the mapping… so it's easy enough to modify to this:

ffmpeg -i hottubmxf.mxf -filter_complex "color=c=black:s=1920x1080:d=10 [preblack]; sine=frequency=0:sample_rate=48000:d=10 [presilence]; color=c=black:s=1920x1080:d=10 [postblack]; sine=frequency=0:sample_rate=48000:d=10 [postsilence]; [preblack] [presilence] [0:0] [0:1] [postblack] [postsilence] concat=n=3:v=1:a=1" -y output.mxf

Which is a bit more of a faff, but works.
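
As an aside, the duplication can be avoided with the split and asplit filters, which copy one stream into two labelled outputs, so the black and the silence only need generating once. A sketch of the same top-and-tail command using that approach (I haven't tested this variant) would be:

ffmpeg -i hottubmxf.mxf -filter_complex "color=c=black:s=1920x1080:d=10,split [preblack] [postblack]; sine=frequency=0:sample_rate=48000:d=10,asplit [presilence] [postsilence]; [preblack] [presilence] [0:0] [0:1] [postblack] [postsilence] concat=n=3:v=1:a=1" -y output.mxf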

ColourBars and Slates
Adding colour bars is simply a matter of using another generator; this replaces the black at the front with a test pattern and this time also generates a tone:

ffmpeg -i hottubmxf.mxf -filter_complex "
testsrc=d=10:s=1920x1080 [prebars];
sine=frequency=1000:sample_rate=48000:d=10 [pretone];
color=c=black:s=1920x1080:d=10 [postblack];
sine=frequency=0:sample_rate=48000:d=10 [postsilence];
[prebars] [pretone] [0:0] [0:1] [postblack] [postsilence]
concat=n=3:v=1:a=1" -y output.mxf
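
As a side note, testsrc actually produces FFmpeg's moving test pattern rather than traditional colour bars; if proper bars are wanted, my understanding is that the smptebars or smptehdbars sources can be swapped in on the same line, for example:

smptehdbars=d=10:s=1920x1080 [prebars];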

Let's add the black back in at the start as well:

ffmpeg -i hottubmxf.mxf -filter_complex "
testsrc=d=10:s=1920x1080 [prebars];
sine=frequency=1000:sample_rate=48000:d=10 [pretone];
color=c=black:s=1920x1080:d=10 [preblack];
sine=frequency=0:sample_rate=48000:d=10 [presilence];
color=c=black:s=1920x1080:d=10 [postblack];
sine=frequency=0:sample_rate=48000:d=10 [postsilence];
[prebars] [pretone] [preblack] [presilence] [0:0] [0:1] [postblack] [postsilence]
concat=n=4:v=1:a=1" -y output.mxf

Now let’s add the title to the black as a slate, which can be done with the following:

drawtext=fontfile=OpenSans-Regular.ttf:text='Title of this Video':fontcolor=white:fontsize=24:x=(w-tw)/2:y=(h/PHI)+th

Which I found along with some additional explanation for adding text boxes.

This can be achieved in the ffmpeg filtergraph syntax by adding the filter into the chain. On each of the inputs, extra filters can be added as comma-separated items, so for the [preblack] input, which we'll now call [slate], it would look like this:

color=c=black:s=1920x1080:d=10,drawtext=fontfile='C\:\\Windows\\Fonts\\arial.ttf':text='Text to write':fontsize=30:fontcolor=white:x=(w-text_w)/2:y=(h-text_h-line_h)/2 [slate];

Note the syntax of how to refer to the Windows path for the font.

This puts the text in the middle of the screen. Multiple lines are then easy to add.
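
For example, a slate built up with a title and a couple of extra lines (a sketch only; the text, sizes and positions are just placeholders) could look like this, with each drawtext chained on as another comma-separated filter:

color=c=black:s=1920x1080:d=10,
drawtext=fontfile='C\:\\Windows\\Fonts\\arial.ttf':text='Title of this Video':fontsize=48:fontcolor=white:x=(w-text_w)/2:y=(h/2)-120,
drawtext=fontfile='C\:\\Windows\\Fonts\\arial.ttf':text='Duration 10 minutes':fontsize=30:fontcolor=white:x=(w-text_w)/2:y=(h/2),
drawtext=fontfile='C\:\\Windows\\Fonts\\arial.ttf':text='TX 24 August 2016':fontsize=30:fontcolor=white:x=(w-text_w)/2:y=(h/2)+60 [slate];

As before, this is laid out over several lines for readability but has to end up on a single command line.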


