Making Viewer UIs for Pitivi

Since I had already experimented with two transformation box approaches for Pitivi in the past, the maintainers thought I might be the right person to write a modern one.

Creating a user interface for a video transformation requires three things:

  • An implementation of the transformation
  • A way to draw the widgets over the viewer
  • A mapping from the input to the inverse transformation

The transformation

First of all, the transformation itself, which in our case is scaling and translation, is currently done by GES.UriSource and calculated on the CPU. In the first Pitivi transformation box, which I wrote for GSoC 2012, this was done by the notorious Frei0r plugins from GStreamer Plugins Bad, which are also a CPU implementation. In the second version it was done on the GPU with the gltransformation element I wrote for GSoC 2014.

A method to draw widgets over the viewer

In Pitivi’s case, the viewer is a GStreamer sink. All three versions render the overlay widgets with Cairo, but each one does it differently, since each uses a different sink.

2012: The first one used a hacky solution where the sink and Cairo drew into the same Gtk drawing area, acquired for GStreamer with the Gst Overlay API. Many Gtk and GStreamer devs wondered how this worked at all. This and Pitivi switching to more modern sinks were the reasons why the first version of the box didn’t stay upstream for long.


2014: Still using the Gst Overlay API, but this time with the glimagesink. The Cairo widgets are rendered into OpenGL textures and composed in the glimagesink draw callback in the GStreamer GL context. It worked pretty smoothly for me, but didn’t provide a fallback solution for users without GL. Clearly an approach for the future, but how about something solid?


2016: Now we have the almighty GtkSink in Pitivi. It is a Gtk widget, and overlays can be added via Gtk Overlay. The sink is also exchangeable with GtkGLSink, which uses a GStreamer GL context to display the video texture and can also use GStreamer GL plugins like gltransformation without needing to download the frames from GPU memory with gldownload. Cairo rendering can now be easily added over GStreamer sinks, yay.


Linking the UI with the components doing the transformation

The mapping of the input from the UI to the transformation is clearly dependent on the transformation you are using. In the 2012 version I needed to map the input to frei0r^-1. In 2014 I used an OpenGL Model-View-Projection matrix calculated in Graphene, which could also do rotations and 3D transformations (we have a z-axis, yay).

The 2016 implementation uses the inverse of the GES.UriSource transformation, which is done by the GStreamer elements videomixer and videoscale. Of course, things like keeping the aspect ratio, maintaining limits and transforming Gtk widget coordinates to the transformation’s coordinates are part of this third ingredient.
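As a rough illustration of this third ingredient, here is a minimal Python sketch of mapping a pointer drag from widget pixels to video pixels, clamping to limits and resizing with a fixed aspect ratio. The names Viewer, drag_clip and resize_keep_aspect are made up for this example and are not Pitivi's actual API. It also assumes that the video fills the whole widget, without letterboxing.

```python
from dataclasses import dataclass


@dataclass
class Viewer:
    """Hypothetical viewer: widget size in pixels and project video size."""
    widget_width: float
    widget_height: float
    project_width: int
    project_height: int

    def widget_to_project(self, x, y):
        """Map a position in widget pixels to project (video) pixels."""
        return (x * self.project_width / self.widget_width,
                y * self.project_height / self.widget_height)


def clamp(value, lower, upper):
    """Keep value inside the closed interval [lower, upper]."""
    return max(lower, min(value, upper))


def drag_clip(viewer, pos, start, end, limits):
    """Move a clip position (video pixels) by a drag given in widget pixels."""
    sx, sy = viewer.widget_to_project(*start)
    ex, ey = viewer.widget_to_project(*end)
    (min_x, max_x), (min_y, max_y) = limits
    return (clamp(round(pos[0] + ex - sx), min_x, max_x),
            clamp(round(pos[1] + ey - sy), min_y, max_y))


def resize_keep_aspect(width, height, new_width, min_size=1):
    """Return a new (width, height) that keeps the original aspect ratio."""
    new_width = max(min_size, new_width)
    new_height = max(min_size, round(new_width * height / width))
    return new_width, new_height
```

With a 960x540 widget showing a 1920x1080 project, a drag of 50 widget pixels moves the clip by 100 video pixels, which is the kind of scaling the overlay has to undo before it sets the position on the source.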

 

Extensibility

The new transformation box fits great with Pitivi by making clips selectable from the viewer, so you can manage multiple overlapping clips quite easily. But the best part of this implementation may be its extensibility. I already made two overlay classes, one for the normal clips which uses a GES.UriSource transformation and one for title clips aka GES.TextSource, which is using different coordinates and different GStreamer plugins. In this fashion other plugins can be written for the Pitivi viewer, for example for 3D transformations with gltransformation. Or you could do crazy stuff like a UI for barrel distortion etc.
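To give an idea of what such an extension point could look like, here is a small Python sketch with one overlay class per coordinate system. All class and function names are hypothetical and do not match the real Pitivi code. The assumption that title clips are positioned in a normalized 0..1 range, while normal clips use project pixels, is mine and only serves as an example of "different coordinates".

```python
class Overlay:
    """Hypothetical base class: converts viewer-relative positions (0..1)
    into whatever coordinates the underlying source expects."""

    def __init__(self, project_size):
        self.project_size = project_size

    def to_source(self, rel_x, rel_y):
        raise NotImplementedError


class PixelOverlay(Overlay):
    """For sources positioned in project pixels (assumed for normal clips)."""

    def to_source(self, rel_x, rel_y):
        width, height = self.project_size
        return round(rel_x * width), round(rel_y * height)


class NormalizedOverlay(Overlay):
    """For sources positioned in a 0..1 range (assumed for title clips)."""

    def to_source(self, rel_x, rel_y):
        return rel_x, rel_y


# New overlay types, e.g. one for 3D transformations, would be added here.
OVERLAY_TYPES = {"uri": PixelOverlay, "title": NormalizedOverlay}


def overlay_for(kind, project_size):
    """Create the overlay registered for the given kind of source."""
    return OVERLAY_TYPES[kind](project_size)
```

The viewer only needs to talk to the base class, so adding a UI for another plugin comes down to writing one more subclass and registering it.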

Clone the code and contribute! 🙂

If you have any questions about this, ask me. I’m lubosz on Freenode IRC.

Transforming Video on the GPU

OpenGL is very suitable for calculating transformations like rotation, scale and translation. Since the video will end up on one rectangular plane, the vertex shader only needs to transform 4 vertices (or 5 with GL_TRIANGLE_STRIP) and map the texture to it. This is a piece of cake for the GPU, since it was designed to do that with many many more vertices, so the performance bottleneck will be uploading the video frame into GPU memory and downloading it again.
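To show how little work this is per frame, here is a plain Python sketch of what a vertex shader would do to the four corners of the video plane. It is not the gltransformation code, which uses 4x4 matrices on the GPU. The order scale, rotate, translate and the function name transform_quad are assumptions made for this example.

```python
import math


def transform_quad(vertices, angle=0.0, scale=(1.0, 1.0), translation=(0.0, 0.0)):
    """Scale, rotate about the origin, then translate each 2D vertex."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    result = []
    for x, y in vertices:
        x, y = x * scale[0], y * scale[1]
        x, y = x * cos_a - y * sin_a, x * sin_a + y * cos_a
        result.append((x + translation[0], y + translation[1]))
    return result


# The video plane as four corners in normalized device coordinates.
QUAD = [(-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0), (1.0, 1.0)]

rotated = transform_quad(QUAD, angle=math.pi / 2)
moved = transform_quad(QUAD, scale=(2.0, 2.0), translation=(0.5, 0.0))
```

Only these four positions are computed per frame. The rasterizer then interpolates the texture coordinates between them, so the cost of the transformation itself does not depend on the video resolution.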

The transformations

GStreamer already provides some separate plugins that are basically suitable for doing one of these transformations.

Translation

videomixer: The videomixer does translation of the video with the xpos and ypos properties.

frei0r-filter-scale0tilt: The frei0r plugin is very slow, but it has the advantage of doing scale and tilt (translate) in one plugin. This is why I used it in my 2011 GSoC. It also provides a “clip” property for cropping the video.

Rotation

rotate: The rotate element is able to rotate the video, but it has to be applied after the other transformations, unless you want borders.

Scale

videoscale: The videoscale element is able to resize the video, but has to be applied after the translation. Additionally it resizes the whole canvas, so it’s also not perfect.

frei0r-filter-scale0tilt: This plugin is able to scale the video and leave the canvas size as it is. Its disadvantage is being very slow.

So we have some plugins that do transformation in GStreamer, but you can see that using them together is quite impossible and also slow. But how slow?

Let’s see how the performance of gltransformation compares to the GStreamer CPU transformation plugins.

Benchmark time

All the commands are measured with “time”. The tests were done on the nouveau driver, using Mesa as the OpenGL implementation. All GPUs should have similar results, since not much is really calculated on them. The bottleneck should be the upload.
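If you prefer to script the measurements instead of calling “time” by hand, a minimal Python sketch could look like the following. It measures wall-clock time only, which is an assumption about which figure of “time” matters here, and the function name time_command is made up for this example. The command that is actually run is a harmless stand-in so that the sketch works on machines without GStreamer.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run a command to completion and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


# Stand-in command. For a real measurement, pass something like:
# ["gst-launch-1.0", "videotestsrc", "num-buffers=10000", "!", "fakesink"]
elapsed = time_command([sys.executable, "-c", "pass"])
print(f"{elapsed:.3f}s")
```

Passing the pipeline as an argument list avoids shell quoting problems with the “!” separators.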

Pure video generation

gst-launch-1.0 videotestsrc num-buffers=10000 ! fakesink

CPU 3.259s

gst-launch-1.0 gltestsrc num-buffers=10000 ! fakesink

OpenGL 1.168s

Cool, gltestsrc seems to run faster than the classical videotestsrc. But we are not uploading real video to the GPU! This is cheating! Don’t worry, we will do real world tests with files soon.

Rotating the test source

gst-launch-1.0 videotestsrc num-buffers=10000 ! rotate angle=1.1 ! fakesink

CPU 10.158s

gst-launch-1.0 gltestsrc num-buffers=10000 ! gltransformation zrotation=1.1 ! fakesink

OpenGL 4.856s

Oh cool, we’re twice as fast in OpenGL. This is without uploading the video to the GPU though.

Rotating a video file

In this test we will rotate an HD video file with a duration of 45 seconds. I’m replacing only the sink with fakesink. Note that the CPU rotation needs videoconvert elements.

gst-launch-1.0 filesrc location=/home/bmonkey/workspace/ges/data/hd/fluidsimulation.mp4 ! decodebin ! videoconvert ! rotate angle=1.1 ! videoconvert ! fakesink

CPU 17.121s

gst-launch-1.0 filesrc location=/home/bmonkey/workspace/ges/data/hd/fluidsimulation.mp4 ! decodebin ! gltransformation zrotation=1.1 ! fakesink

OpenGL 11.074s

Even with uploading the video to the GPU, we’re still faster!

Doing all 3 operations

OK, now let’s see how we perform when doing translation, scale and rotation. Note that the CPU pipeline does contain the problems described earlier.

gst-launch-1.0 videomixer sink_0::ypos=540 name=mix ! videoconvert ! fakesink filesrc location=/home/bmonkey/workspace/ges/data/hd/fluidsimulation.mp4 ! decodebin ! videoconvert ! rotate angle=1.1 ! videoscale ! video/x-raw, width=150 ! mix.

CPU 17.117s

gst-launch-1.0 filesrc location=/home/bmonkey/workspace/ges/data/hd/fluidsimulation.mp4 ! decodebin ! gltransformation zrotation=1.1 xtranslation=2.0 yscale=2.0 ! fakesink

OpenGL 9.465s

No surprise, it’s still faster and even correct.

frei0r-filter-scale0tilt

Let’s be unfair and benchmark the frei0r plugin. There is one advantage, that it can do translation and scale correctly, but rotation can only be applied at the end. So no rotation at different pivot points is possible.

gst-launch-1.0 filesrc location=/home/bmonkey/workspace/ges/data/hd/fluidsimulation.mp4 ! decodebin ! videoconvert ! rotate angle=1.1 ! frei0r-filter-scale0tilt scale-x=0.9 tilt-x=0.5 ! fakesink

CPU 35.227s

Damn, that is horribly slow.

The gltransformation plugin is up to 3 times faster than that!
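For a quick overview, the ratios between the CPU and OpenGL timings above can be computed with a few lines of Python. The numbers are copied from the measurements in this post. Comparing the frei0r pipeline against the OpenGL pipeline that does all three operations is my assumption about which pair the sentence above refers to.

```python
# Timings in seconds as (CPU, OpenGL), copied from the measurements above.
TIMINGS = {
    "test source": (3.259, 1.168),
    "rotate test source": (10.158, 4.856),
    "rotate video file": (17.121, 11.074),
    "all three operations": (17.117, 9.465),
}
FREI0R_SECONDS = 35.227


def speedup(cpu_seconds, gl_seconds):
    """How many times faster the OpenGL pipeline finished."""
    return cpu_seconds / gl_seconds


for name, (cpu, gl) in TIMINGS.items():
    print(f"{name}: {speedup(cpu, gl):.2f}x")

# frei0r pipeline versus the OpenGL pipeline doing all three operations.
gl_all = TIMINGS["all three operations"][1]
print(f"frei0r vs. gltransformation: {speedup(FREI0R_SECONDS, gl_all):.2f}x")
```

Keep in mind that these are single runs on one machine, so the ratios are only a rough indication.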

Results

The gltransformation plugin does all 3 transformations together in a correct fashion and is fast as well. Furthermore, three-dimensional transformations are possible, like rotating around the X axis or translating in Z. If you want, you can even use orthographic projection.
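The difference between the two projections is easiest to see when the video plane is tilted around the X axis. The following Python sketch is not the plugin's implementation, which builds its matrices with Graphene. It uses a simple pinhole camera on the +Z axis, and the function names and the camera distance of 3.0 are arbitrary choices for this example.

```python
import math


def rotate_x(point, angle):
    """Rotate a 3D point around the X axis by angle (radians)."""
    x, y, z = point
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return (x, y * cos_a - z * sin_a, y * sin_a + z * cos_a)


def project_orthographic(point):
    """Drop the depth: distance to the camera has no effect on size."""
    x, y, _ = point
    return (x, y)


def project_perspective(point, camera_distance=3.0):
    """Pinhole projection for a camera on the +Z axis looking at the origin."""
    x, y, z = point
    factor = camera_distance / (camera_distance - z)
    return (x * factor, y * factor)


ANGLE = math.radians(60)
top = rotate_x((1.0, 1.0, 0.0), ANGLE)      # top right corner, tilted towards the camera
bottom = rotate_x((1.0, -1.0, 0.0), ANGLE)  # bottom right corner, tilted away
```

With the perspective projection the top edge ends up wider than the bottom edge, so the video looks like a trapezoid. With the orthographic projection both edges keep the same width and the plane only appears squashed vertically.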

I also want to thank ystreet00 for helping me to get into the world of the GStreamer OpenGL plugins.

To run the test yourself, check out my patch for gst-plugins-bad:

https://bugzilla.gnome.org/show_bug.cgi?id=731722

Also don’t forget to use my python testing script:

https://github.com/lubosz/gst-gl-tests/blob/master/transformation.py

Graphene

gltransformation utilizes ebassi’s new graphene library, which implements linear algebra calculations needed for new world OpenGL without the fixed function pipeline.

Alternatives worth mentioning for C++ are Qt’s QMatrix4x4 and of course g-truc’s glm. These are not usable with GStreamer, and I was very happy that there was a GLib alternative.

After I wrote some tests, and with ebassi’s wonderful and quick help, Graphene was ready for usage with GStreamer!

Implementation in Pitivi

To make this transformation usable in Pitivi, we need some transformation interface. The last one I did was rendered in Cairo. Mathieu managed to get this rendered with the ClutterSink, but using GStreamer OpenGL plugins with the clutter sink is currently impossible. The solution will be either to extend the glvideosink to draw an interface over it or to make the clutter sink work with the OpenGL plugins. But I am not much of a fan of the clutter sink, since it introduced problems to Pitivi.

View Side-by-Side Stereoscopic Videos with GStreamer and Oculus Rift

GStreamer can do a lot. Most importantly it has the exact functionality I was looking for when I wanted to play a stereoscopic video on the Oculus Rift: Decoding a video stream and applying a GLSL fragment shader to it.

Available Players

I found a few solutions that try to achieve that goal, but they were very unsatisfactory. Mostly they failed to decode the video stream or didn’t start for other reasons. They are not maintained that well, since they are recent one-man projects with binary-only releases on forums. And worst of all, they only support Windows.

Surprisingly, I experienced the best results with OculusOverlay and VLC Player. OculusOverlay transforms a hardcoded part of your desktop in a very hacky way with XNA. It also works with YouTube.

VideoPal is a player written in Java and using JOGL. In theory it could work in Linux but:

Exception in thread "main" java.lang.UnsatisfiedLinkError: Could not load SWT library. Reasons:
no swt-win32-3740 in java.library.path

Yeah.. no time for jar reverse engineering and no link to the source. I was able to run it on Windows, but it couldn’t open an H.264 video.

There is also OculusPlayer, which uses libvlc but does not release its source. The idea is good, but it didn’t work.

VR Player is a GPLv2 licensed player written in C#. It also couldn’t decode the stream.

Another player is VR Cinema 3D, which does not play a stereo video, but simulates a virtual cinema with a 2D film. Funny idea.

Get some Stereo Videos

You can search for stereoscopic videos on YouTube with the 3D search filter. There are tons of stereoscopic videos, like this video of piranhas.

Download the YouTube video with the YouTube downloader of your choice, as long as it supports 3D videos. For example PwnYouTube.

For convenient usage in the terminal you should rename the file to something short and without spaces.

Using GStreamer

The minimal GStreamer pipeline for playing the video stream of an mp4 file (QuickTime / H.264 / AAC) looks like this:

$ gst-launch-1.0 filesrc location=piranhas.mp4 ! qtdemux ! avdec_h264 ! autovideosink

It contains the GStreamer elements for file source, QuickTime demuxer, H264 decoder and the automatic video sink.

If you want more information on the elements, try gst-inspect

$ gst-inspect-1.0 qtdemux

If you want audio, you need to name the demuxer and add an audio queue with a decoder and an audio sink.

$ gst-launch-1.0 filesrc location=piranhas.mp4 ! qtdemux name=dmux ! avdec_h264 ! autovideosink dmux. ! queue ! faad ! autoaudiosink

Let’s add some Oculus Rift distortion now. We will use a GLSL fragment shader and the glshader element from the gst-plugins-gl package for that. Since the GStreamer GL Plugins are not released yet, you need to build them yourself. You could use my Archlinux AUR package or the GStreamer SDK build system cerbero. Here is a tutorial on how to build GStreamer with cerbero.

In the GLSL shader you can change some parameters like video resolution, eye distance, scale and kappa. This could also be done with uniforms and set by a GUI.
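The shader itself is not reproduced here, so the following Python sketch only illustrates the general idea behind such a distortion: each texture coordinate's offset from the lens center is scaled by a polynomial in the squared radius, with the kappa values as coefficients. Whether the shader linked in this post has exactly this form is an assumption, and the default kappa values below are illustrative, not taken from the post.

```python
def distort(u, v, center=(0.5, 0.5), kappa=(1.0, 0.22, 0.24, 0.0)):
    """Scale the offset from the lens center by a polynomial in r^2.

    u, v and center are texture coordinates in the 0..1 range.
    kappa holds the polynomial coefficients (illustrative values).
    """
    dx, dy = u - center[0], v - center[1]
    r2 = dx * dx + dy * dy
    factor = kappa[0] + kappa[1] * r2 + kappa[2] * r2 ** 2 + kappa[3] * r2 ** 3
    return (center[0] + dx * factor, center[1] + dy * factor)
```

The lens center is left untouched, while coordinates further out are pushed further away. In a fragment shader the same calculation would run once per pixel and per eye, and exposing center and kappa as uniforms is what would make them adjustable from a GUI.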

The final GStreamer pipeline looks like this. Since we are using a GL plugin, we need to use the glimagesink.

TL;DR

$ gst-launch-1.0 filesrc location=piranhas.mp4 ! qtdemux name=dmux ! avdec_h264 ! glshader location=distortion.frag ! glimagesink dmux. ! queue ! faad ! autoaudiosink
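If you switch between the variants of this pipeline a lot, you can let a small Python script assemble the arguments. This is only a convenience sketch. The function names build_pipeline and play are made up, and the elements are exactly the ones used in the commands above.

```python
import shutil
import subprocess


def build_pipeline(location, shader=None, audio=True):
    """Return the gst-launch-1.0 arguments for the pipelines shown in this post."""
    args = ["filesrc", f"location={location}", "!",
            "qtdemux", "name=dmux", "!", "avdec_h264", "!"]
    if shader:
        # GL plugins need the GL sink.
        args += ["glshader", f"location={shader}", "!", "glimagesink"]
    else:
        args += ["autovideosink"]
    if audio:
        # Second branch from the named demuxer for the audio stream.
        args += ["dmux.", "!", "queue", "!", "faad", "!", "autoaudiosink"]
    return args


def play(args):
    """Run the pipeline with gst-launch-1.0 if the tool is installed."""
    tool = shutil.which("gst-launch-1.0")
    if tool is None:
        raise RuntimeError("gst-launch-1.0 not found in PATH")
    subprocess.run([tool] + args, check=True)


args = build_pipeline("piranhas.mp4", shader="distortion.frag")
print("gst-launch-1.0 " + " ".join(args))
```

Calling play(args) starts the playback, provided that the video file and the shader are in the current directory.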

Seeking and full screen are features that could be achieved in a GStreamer Python application.