Systems and
Formalisms Lab

Recording and editing a talk for an online conference

A step-by-step guide on using guvcview, ffmpeg, and aegisub to assemble a talk video.

Most PL conferences moved online this year, and many are asking authors to pre-record talks to avoid livestreaming difficulties. Here are the steps we followed to record and edit our Kôika talk at PLDI 2020 and our extraction talk at IJCAR 2020. This post covers preparing and recording the talk, adding the slides, and captioning the final video.

Many people at PLDI swore by OBS Studio, though, so you should check that out too. My colleague Jason Gross also points out that Zoom has an automatic captioning feature for paid plans, so if you're confident recording everything at once and you have Zoom, you can create a private room and use that feature.

Step 1: Make the slides

The exact format doesn't matter too much, since you'll record a video of your screen flipping through the slide deck anyway. We used Google Slides to facilitate remote collaboration (my favorite is usually beamer). For some reason it will sometimes switch the background color of all the liner notes to dark grey, and if you try to change the aspect ratio of the slides it will diligently resize all images to that new aspect ratio. It's terrible, but it worked.

We used the liner note boxes to build a fairly detailed outline of the talk — much more detailed than I would usually have for a talk. The assumption was that the usual caveats about learning your speech by heart don't apply as strongly to a pre-recorded talk. This ended up being useful in the captioning phase.

Step 2: Record the speech

I suppose you could do this at the same time as you're recording the slides, but it's 30°C in Boston right now and recording both my screen and my camera caused my 2014 laptop to overheat and lag.

For some reason VLC's latency is abominable with my webcam (> 1s), so I used Guvcview instead (packaged as guvcview in Ubuntu), which conveniently re-exposes most of the settings that are typically only accessible through v4l2-ctl (package v4l-utils) or qv4l2 (package qv4l2). (Ubuntu comes pre-packaged with Cheese, but it doesn't give you quite as much control over the recording format.)

In practice, you don't need to record in HD, since you'll be embedding the camera image into your slides (in fact a properly stabilized phone would probably work just as well as a computer webcam). I recommend 480x360.

If you want to post-process the audio to reduce noise, leave a few seconds at the beginning or at the end to record a usable noise profile.

Step 3: Record the slides

I used Simple screen recorder (package simplescreenrecorder), which is much more elaborate than the name suggests and works very nicely, to record my screen as I flipped through the slides.

The trick to align the slides and the webcam's audio is to listen to the webcam recording as you go through the slides and to start recording the slides at the same time as you start listening to the audio recording.
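If the two recordings still end up slightly out of step, one possible fix is to delay one input at assembly time. This is a sketch under assumptions, not a step from the workflow described here: `offset_seconds` and `assemble_with_offset` are hypothetical helper names, and the two timestamps are ones you would read off by hand (say, the moment the first word is spoken in each recording). It relies on FFmpeg's `-itsoffset` input option, which shifts the timestamps of the input that follows it.

```shell
#!/bin/sh
# Sketch only: helper names and file names are hypothetical.

# Print how long (in seconds) the webcam input must be delayed so that an
# event seen at time $1 in the slides recording and at time $2 in the
# webcam recording lines up.
offset_seconds() {
    LC_ALL=C awk -v s="$1" -v w="$2" 'BEGIN { printf "%.3f\n", s - w }'
}

# Same composition as the command in Step 4, with the webcam input delayed.
# $1: slides video, $2: webcam video, $3: delay in seconds, $4: output file
assemble_with_offset() {
    ffmpeg -i "$1" -itsoffset "$3" -i "$2" \
        -filter_complex "[0:v] pad=width=1920:height=1080:x=0:y=0:color=black [left]; [1:v] scale=480:360 [right]; [left][right] overlay=x=main_w-overlay_w:y=0 [out]" \
        -map "[out]" -map 1:a:0 "$4"
}
```

For example, `assemble_with_offset slides.mkv webcam.mp4 "$(offset_seconds 4.0 3.2)" koika.mp4` would delay the webcam by 0.8 seconds. A negative result means the slides are the input that needs the delay instead.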

It helps to record the slides in the resolution you'll need them at, to avoid having to resize them (though if your conference is going to record a volunteer's screen playing a scaled version of your video in VLC and stream that to YouTube using Zoom, paying attention to pixel-perfection is likely a waste of time). I recommend 1440x1080.
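The two recommended sizes fit together: 1440x1080 slides next to a 480x360 webcam image fill a 1920x1080 frame. Here is a minimal sketch of that arithmetic, with hypothetical helper names (`video_size`, `canvas_size`). The `ffprobe` call is a commonly used recipe for reading a file's dimensions and assumes `ffprobe` is installed alongside FFmpeg.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Print WIDTHxHEIGHT of the first video stream of a file (requires ffprobe).
video_size() {
    ffprobe -v error -select_streams v:0 \
        -show_entries stream=width,height -of csv=s=x:p=0 "$1"
}

# Print the size of the frame obtained by placing the webcam image to the
# right of the slides: widths add up, height is the larger of the two.
# $1: slides size, $2: webcam size, both written as WIDTHxHEIGHT
canvas_size() {
    slides_w=${1%x*}; slides_h=${1#*x}
    webcam_w=${2%x*}; webcam_h=${2#*x}
    height=$slides_h
    if [ "$webcam_h" -gt "$height" ]; then
        height=$webcam_h
    fi
    echo "$((slides_w + webcam_w))x$height"
}
```

Running `canvas_size "$(video_size slides.mkv)" 480x360` before assembling the video is a cheap way to catch a recording made at the wrong size.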

Unless your conference plans to do it, make sure to include your name and the title of your talk on all slides, not just on the title slide (people who connect in the middle of the stream will thank you, or at least I think they will, because we didn't do it, so I don't know for sure). Slide numbers are nice too, though not as useful as in a physical conference.

If you plan to add captions, leave some space at the bottom of the slides (unless you plan to add your webcam stream to the side of the slides, in which case you'll have space on the side).

With Google Slides, recording to exactly the right dimensions was pretty easy, because opening the web developer tools (right click → Inspect) and resizing the browser window shows the exact pixel dimensions of the webpage area.

Step 4: Put everything together

I used FFmpeg and promised myself that if I ever get an academic job I'll hire a student to write a DSL for video manipulation that doesn't look like FFmpeg's filter graph syntax, until I realized that the Racket folks already did it, so our version will probably have to be a dependently typed DSL in Coq and no one will want to use it.

Here's how to do the editing with FFmpeg. If you know the steps in Racket's #lang video please make a PR to this post; I'll offer you a beer emoji next time we're together in an online conference chatroom:

ffmpeg -i slides.mkv -i webcam.mp4 -filter_complex "[0:v] pad=width=1920:height=1080:x=0:y=0:color=black [left]; [1:v] scale=480:360 [right]; [left][right] overlay=x=main_w-overlay_w:y=0 [out]" -map "[out]" -map 1:a:0 koika.mp4

The syntax of this monster, explained:
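Broken into pieces, the filter graph reads as follows. This is the same command as above, only split into shell variables so that each part can carry a comment; the variable names and the `assemble` function are hypothetical.

```shell
#!/bin/sh
# Sketch only: variable and function names are hypothetical.

# [0:v] is the video stream of the first input (the slides). pad enlarges
# the frame to 1920x1080, keeps the slides in the top-left corner (x=0, y=0)
# and fills the added area with black. The result is labeled [left].
PAD='[0:v] pad=width=1920:height=1080:x=0:y=0:color=black [left]'

# [1:v] is the video stream of the second input (the webcam). scale resizes
# it to 480x360. The result is labeled [right].
SCALE='[1:v] scale=480:360 [right]'

# overlay draws [right] on top of [left]. main_w and overlay_w are the widths
# of the two frames, so x = 1920 - 480 = 1440: the webcam image lands in the
# top-right corner, just past the slides. The result is labeled [out].
OVERLAY='[left][right] overlay=x=main_w-overlay_w:y=0 [out]'

# Filter chains are separated by semicolons.
FILTER="$PAD; $SCALE; $OVERLAY"

# -map "[out]" selects the composited video; -map 1:a:0 selects the first
# audio stream of the second input, i.e. the webcam's microphone track.
# $1: slides video, $2: webcam video, $3: output file
assemble() {
    ffmpeg -i "$1" -i "$2" -filter_complex "$FILTER" \
        -map "[out]" -map 1:a:0 "$3"
}
```

Calling `assemble slides.mkv webcam.mp4 koika.mp4` runs the same command as the one-liner.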

Step 5: Add captions

This is really easy to do if you have a detailed outline, and people will thank you (this time I speak from experience).
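With start and end times noted next to each line of the outline, a first draft of the captions is mostly mechanical. Here is a small sketch, assuming you want a SubRip (`.srt`) file to refine afterwards in a subtitle editor; `srt_time` and `srt_cue` are hypothetical helpers, and the sample text and times in the usage note are made up.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Convert a time in milliseconds to the SRT timestamp format HH:MM:SS,mmm.
srt_time() {
    printf '%02d:%02d:%02d,%03d\n' \
        $(( $1 / 3600000 )) $(( $1 / 60000 % 60 )) \
        $(( $1 / 1000 % 60 )) $(( $1 % 1000 ))
}

# Print one SRT cue: index, time range, text, then a blank separator line.
# $1: cue number, $2: start (ms), $3: end (ms), $4: caption text
srt_cue() {
    printf '%s\n%s --> %s\n%s\n\n' \
        "$1" "$(srt_time "$2")" "$(srt_time "$3")" "$4"
}
```

For instance, `srt_cue 1 0 4200 "Hi, and welcome to this talk." > captions.srt` writes a single cue running from 0 to 4.2 seconds; further cues can be appended with `>>`.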

This government website has a lot of information and tips on making videos more accessible.

Other notes

Cleaning up the audio with Audacity

My first audio take was about 50% voice and 50% loud fan noises from my computer, with a hint of trucks passing down the street (I had opened the window to try to cool down the computer), so I had to post-process the audio to get a decent recording, which was a decent occasion to learn two important lessons: 1. ice packs are not an efficient way to cool down an overheating laptop (I tried) and 2. Audacity (package audacity) works pretty well for reducing noise [1]. Hopefully you'll need neither of those tips.

  • Use ffmpeg -i webcam.mkv -vn -acodec copy koika.aac to extract the audio.

  • To remove the background noise:

      - Open the audio file in Audacity.
      - Select a few seconds of noise only, then click Effect → Noise reduction and in the window that opens click Get Noise Profile. At this point Audacity has captured a noise profile, based on your selection.
      - Select the full track (Ctrl+A) and click Effect → Noise reduction again. The defaults are fine in my experience, so click Preview and then OK.
      - Use File → Export → Export Audio… to export the cleaned-up audio back to MP4 (AAC).

  • Use ffmpeg -i koika.mkv -i koika.wav -c:a copy -c:v copy -map 0:v:0 -map 1:a:0 koika-audacity.mkv to reassemble the video and the audio without recoding either.
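One caveat with `-acodec copy`: the stream is copied as-is, so the output extension has to suit whatever codec the webcam recording actually contains (a `.aac` file can only hold AAC audio). The sketch below, with hypothetical helper names and a deliberately short codec table, asks `ffprobe` for the codec and picks an extension accordingly, falling back to Matroska audio (`.mka`) for anything it does not recognize.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical and the table is not exhaustive.

# Print the codec name of the first audio stream of a file (requires ffprobe).
audio_codec() {
    ffprobe -v error -select_streams a:0 \
        -show_entries stream=codec_name \
        -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Map a codec name to a file extension whose container can hold that codec.
audio_ext() {
    case "$1" in
        aac)       echo aac ;;
        mp3)       echo mp3 ;;
        vorbis)    echo ogg ;;
        opus)      echo opus ;;
        pcm_s16le) echo wav ;;
        *)         echo mka ;;  # Matroska audio accepts most codecs
    esac
}

# Extract the audio track without re-encoding it.
# $1: input video, $2: output name without extension
extract_audio() {
    ffmpeg -i "$1" -vn -acodec copy "$2.$(audio_ext "$(audio_codec "$1")")"
}
```

With an AAC track, `extract_audio webcam.mkv koika` produces `koika.aac`, matching the first command in the list above.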