A Dirty Hack to Enable Acceptable Sway WM Screen Recording

Aaron D. Fields included in Other Technical Writeups

2017-08-22

TL;DR

https://gist.github.com/Spirotot/9eb68d7e1ea4984c3afa3ef28e2c046d

Background

I’ve been running Wayland on my XPS 9550 for a while now, using sway, a drop-in replacement for the i3 window manager. One of the significant changes you’ll notice moving from X to Wayland is that “regular” screen capture tools¹ don’t work. This is due to Wayland’s security architecture: applications are no longer allowed access to the full frame buffer. In other words, apps can’t see what other apps are displaying unless they get special permission from the window manager. Because there is apparently no standardized screenshot/screen-recording API shared across window managers, each window manager is effectively responsible for developing and shipping its own screenshot/screen-recording tool. Sway’s is called swaygrab.

Recording with swaygrab

Screen Recording

It’s really easy. I typically just use the following command to get a high-quality recording:

	swaygrab -c -o [display] [output].mkv

The -c means “capture” a display continuously; without it, swaygrab will just take a screenshot.

If you’re wondering what to use for [display], you can list the displays that Sway sees by running swaymsg -t get_outputs.²

I’ll leave it up to you to decide what to call your .mkv file.

Audio Recording

You might also want to capture audio while you’re doing this. swaygrab does not capture audio, so my recommendation is to use arecord, which ships in the alsa-utils package.

	arecord -f cd -D [device] [output].wav

The -f cd argument is a preset for 16-bit, 44100 Hz, stereo quality. -D [device] specifies the recording device; you can list available devices by running arecord -l, which will spit out something like the following:

	**** List of CAPTURE Hardware Devices ****
	card 0: PCH [HDA Intel PCH], device 0: ALC3266 Analog [ALC3266 Analog]
	  Subdevices: 1/1
	  Subdevice #0: subdevice #0

To use that device, your argument would be -D hw:0,0. The first 0 specifies the card, and the second 0 specifies the device.
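If you'd rather not read the card and device numbers off by eye, something like the following sketch can turn the listing into ready-to-use -D arguments. Note that list_hw_devices is a hypothetical helper name of my own, not part of arecord, and it assumes the English-language "card N: ..., device M: ..." line format shown in the listing above.

```shell
#!/bin/bash
# Hypothetical helper (not part of alsa-utils): read `arecord -l` style
# output on stdin and print one "hw:<card>,<device>" string for every
# capture device line it finds. Assumes the English line format.
list_hw_devices() {
    sed -n 's/^card \([0-9][0-9]*\):.*, device \([0-9][0-9]*\):.*/hw:\1,\2/p'
}

# Sample input copied from the listing above; on a real system you would
# run `arecord -l | list_hw_devices` instead.
list_hw_devices <<'EOF'
**** List of CAPTURE Hardware Devices ****
card 0: PCH [HDA Intel PCH], device 0: ALC3266 Analog [ALC3266 Analog]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
EOF
```

For the sample listing this prints hw:0,0, which is exactly what gets passed to -D.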

The Dirty Hack

If you’ve recorded your screen and audio, you’ll notice that the resulting output files are of differing lengths, and this is the problem the hack exists to fix. The .wav audio file you recorded will be the right length… but your .mkv video file will come up short, often around 50% of the length of the .wav file. This is no good. And even if they were the same length, it’d still be nice to have the audio and video embedded into one media file so you can easily share/upload the video.

First thing to do is calculate the length of the .wav and .mkv in seconds. Then, divide the longer length (should be the .wav's length) by the shorter length (the .mkv's length); you should end up with a floating-point number between ~1.5 and ~2.5, although this may vary greatly. We’ll call this number the [multiplier]. We’ll use the [multiplier] to “stretch” the length of the .mkv file to match that of the .wav file:

	ffmpeg -i [input_mkv] -filter:v "setpts=[multiplier]*PTS" -preset ultrafast [output].mkv

[input_mkv] is actually the original, too-short .mkv you already created. [output].mkv is the new, proper-length .mkv the preceding command will create.
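The [multiplier] arithmetic described above can be sketched roughly as follows. The helper names to_seconds and stretch_multiplier are hypothetical, the two durations are made up for illustration, and I'm using awk for the division here purely to keep the example dependency-free.

```shell
#!/bin/bash
# Hypothetical helper: convert an ffmpeg-style HH:MM:SS.ss timestamp
# into (possibly fractional) seconds.
to_seconds() {
    echo "$1" | awk -F: '{ print ($1 * 3600) + ($2 * 60) + $3 }'
}

# Hypothetical helper: divide the audio length by the video length to
# get the factor the video has to be stretched by.
stretch_multiplier() {
    awk -v a="$1" -v v="$2" 'BEGIN { printf "%.4f\n", a / v }'
}

# Made-up durations: a 95.5 second .wav and a 47.75 second .mkv.
audio=$(to_seconds "00:01:35.50")
video=$(to_seconds "00:00:47.75")
stretch_multiplier "$audio" "$video"
```

With these made-up numbers the video is exactly half as long as the audio, so the multiplier comes out as 2.0000, and that value is what goes in place of [multiplier] in the setpts filter.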

Now, we just need to squash our new .mkv and our original .wav into one media file, which can be done with the following command:

	ffmpeg -i [input_mkv] -i [input_wav] -c:v copy -c:a aac [output].mkv

In this case, [input_mkv] is the newer, proper-length .mkv, and [input_wav] is the audio you recorded. The -c:v copy tells ffmpeg to just keep the current video format/bitrate/etc., and the -c:a aac tells ffmpeg to convert the .wav audio to the compressed AAC format for the final [output].mkv file.
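To see how the two ffmpeg steps chain together for a concrete recording, here is a small dry-run sketch that only prints the commands and does not run them. The function name print_fixup_commands is hypothetical, and the _orig/_tmp file naming is just an assumption that happens to match the script further down.

```shell
#!/bin/bash
# Hypothetical dry-run helper: print (but do not run) the stretch and
# mux commands for a given project name and multiplier.
print_fixup_commands() {
    local name=$1 multiplier=$2
    # Step 1: stretch the too-short video by the multiplier.
    echo "ffmpeg -i ${name}_orig.mkv -filter:v \"setpts=${multiplier}*PTS\" -preset ultrafast ${name}_tmp.mkv"
    # Step 2: mux the stretched video with the original audio.
    echo "ffmpeg -i ${name}_tmp.mkv -i ${name}_orig.wav -c:v copy -c:a aac ${name}.mkv"
}

print_fixup_commands demo 2.0
```

The point is that the output of the first command (demo_tmp.mkv here) is the input of the second, and only the second command touches the audio.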

Conclusion & Automation!

So was that confusing and poorly explained, or what? Probably… Fortunately, I love writing crappy Bash scripts, and so I’ve turned all of the information here into a script you can download and run yourself. It has the added bonus of automatically starting & stopping the screen and audio recording simultaneously!

	#!/bin/bash
	# Sway WM screen + audio recorder
	# Usage: ./record.sh -d [display] -a [audio_device] -o [project_output_name]
	#
	# Displays can be listed with `swaymsg -t get_outputs`.
	# Audio devices can be listed with `arecord -l`.
	# Probably best not to put spaces in the "-o" argument, sorry...
	#
	# Dependencies: swaygrab, ffmpeg, arecord (alsa-utils), bc
	#
	# Example: ./record.sh -d eDP-1 -a hw:0,0 -o my_recording
	#
	# Note: If this file is sorely out of date, it's either no longer relevant,
	# and/or I decided to push changes here: https://github.com/Spirotot/dotFiles

	# Define some variables we're going to use...
	DISP=""
	AUDIO=""
	OUTPUT=""
	SCREEN_CMD=""
	AUDIO_CMD=""
	SCREEN_PID=""
	AUDIO_PID=""
	START=""

	# Set a trap for Ctrl+C (SIGINT) so that we can forward the
	# Ctrl+C to the `swaygrab` and `arecord` subprocesses.
	# Inspired by: https://stackoverflow.com/questions/8993655/can-a-bash-script-run-simultaneous-commands-then-wait-for-them-to-complete
	trap killandconvert SIGINT

	# `killandconvert()` kills the `swaygrab` and `arecord` subprocesses
	# when Ctrl+C is pressed, and then proceeds to fix up the length
	# discrepancies, and create the final output MKV.
	killandconvert() {
	# Forward the SIGINT to `swaygrab` and `arecord` so they can shut
	# themselves down properly.
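	kill -2 $SCREEN_PID
	# AUDIO_PID is empty when recording without audio, so only signal
	# `arecord` if it was actually started.
	if [ -n "$AUDIO_PID" ]; then
	kill -2 $AUDIO_PID
	fi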

	# Wait for them to exit...
	wait $AUDIO_PID
	wait $SCREEN_PID

	# Get the lengths:
	# * https://forum.videolan.org/viewtopic.php?t=56438
	# * https://stackoverflow.com/questions/20323640/ffmpeg-deocde-without-producing-output-file
	# Convert the lengths with awk: https://askubuntu.com/questions/407743/convert-time-stamp-to-seconds-in-bash
	SCREEN_LENGTH=`ffmpeg -i ${OUTPUT}_orig.mkv -f null /dev/null 2>&1 | \
	grep Duration | awk '{print $2}' | tr -d "," | \
	awk -F: '{print ($1 * 3600) + ($2 * 60) + $3}'`

	if [ "$START" = "" ]; then
	AUDIO_LENGTH=`ffmpeg -i ${OUTPUT}_orig.wav -f null /dev/null 2>&1 | \
	grep Duration | awk '{print $2}' | tr -d "," | \
	awk -F: '{print ($1 * 3600) + ($2 * 60) + $3}'`
	else
	# https://unix.stackexchange.com/questions/53841/how-to-use-a-timer-in-bash
	AUDIO_LENGTH=$((SECONDS - START))
	fi

	# Calculate the multiplier used to sync the video to the audio.
	# https://stackoverflow.com/questions/12722095/how-do-i-use-floating-point-division-in-bash
	MULTIPLIER=`bc -l <<< "scale=8; $AUDIO_LENGTH/$SCREEN_LENGTH"`

	# "Sync" the video to the audio by stretching it.
	# https://trac.ffmpeg.org/wiki/How%20to%20speed%20up%20/%20slow%20down%20a%20video
	ffmpeg -i ${OUTPUT}_orig.mkv -filter:v "setpts=${MULTIPLIER}*PTS" \
	-preset ultrafast ${OUTPUT}_tmp.mkv

	if [ "$START" = "" ]; then
	# Combine the video and audio streams into one output file.
	ffmpeg -i ${OUTPUT}_tmp.mkv -i ${OUTPUT}_orig.wav \
	-c:v copy -c:a aac ${OUTPUT}.mkv
	else
	# If there is no audio stream, then just rename the video stream
	# as the final output file.
	mv ${OUTPUT}_tmp.mkv ${OUTPUT}.mkv
	fi

	# Cleanup
	rm -f ${OUTPUT}_orig.mkv
	rm -f ${OUTPUT}_tmp.mkv
	rm -f ${OUTPUT}_orig.wav
	}

	# Parse the command line options...
	# http://abhipandey.com/2016/03/getopt-vs-getopts/
	while getopts d:a:o: FLAG; do
	case $FLAG in
	d)
	DISP=$OPTARG
	;;
	a)
	AUDIO=$OPTARG
	;;
	o)
	OUTPUT=$OPTARG
	;;
	esac
	done

	# Check the user's options to make sure they're somewhat sane.
	if [ "$OUTPUT" = "" ]; then
	echo "No output specified."
	exit 1
	fi

	if [ "$DISP" = "" ]; then
	echo "No display specified."
	exit 1
	else
	# Build the command used for screen recording.
	SCREEN_CMD="swaygrab -c -o $DISP ${OUTPUT}_orig.mkv"
	fi

	if [ "$AUDIO" = "" ]; then
	echo "Proceeding without audio recording."
	else
	# Build the command used for audio recording.
	AUDIO_CMD="arecord -f cd -D $AUDIO ${OUTPUT}_orig.wav"
	fi

	# Start the screen recorder...
	$SCREEN_CMD &
	# ... and save the PID so we can kill it gracefully later.
	SCREEN_PID=$!

	if [ ! "$AUDIO_CMD" = "" ]; then
	# Start the audio recorder...
	$AUDIO_CMD &
	# ... and save the PID so we can kill it gracefully later.
	AUDIO_PID=$!
	else
	# Unless we're not going to record audio, in which case we'll
	# simply use a timer to figure out how much we need to stretch
	# the video...
	# https://unix.stackexchange.com/questions/53841/how-to-use-a-timer-in-bash
	START=$SECONDS
	fi

	# Just hang out until the user presses Ctrl+C
	wait


¹ If you are still using X, shutter is by far my personal favorite screenshot tool, and recordMyDesktop is definitely my favorite screen recording tool.
² Lots of other good Sway info here, especially if you have a HiDPI (i.e. 4k) display: https://github.com/SirCmpwn/sway/wiki