Jul 152016
 

H.265/HEVC logoSince I’ve started to release x265 binaries for Windows XP / XP x64 (and above), I’ve started to take interest in how it’s performance has developed over time. When looking back on x264, we can see that the encoders’ performance has always gradually improved with more and faster assembly code, optimizations for new CPUs’ instruction set extensions (think AMD FMA4, SSE4a or Intel AVX/AVX2) and so on. Over many years x264 has become a ton faster, as you can also see when comparing an old x264 v0.107.1745 (white results) with newer ones (red results) for the same CPUs in the [x264 benchmark].

So, x265 is much newer, but still, it’s been around for a few years now – since 2013 to be more precise – so I’d like to take a look at its performance trends using the options I typically apply for transcoding animated content. You can find the encoding settings and other information about the test below.

Note that I don’t compile every revision of x265, so I can compare only a few, namely the ones I chose to [release] periodically, excluding the new 2.0+2, which I haven’t released yet (waiting a bit more for a release). Here we go:

x265 performance trend across versions (2017-01-26)

x265 performance trend across versions, state 2017-01-26 (click to enlarge)

previous charts
x265 performance trends across versions, 2016-07-15

x265 performance trends across versions, state 2016-07-15

Note that this runtime graph is inverted, in the sense that “higher = better”. Version 1.7 was the first which could actually encode using the options of my choice, older versions don’t understand all of them and are not comparable in my case because of that. We start off with pretty bad performance and then we get an ample boost with the earlier 1.9 versions.

From 1.9+15 to 1.9+141 we can see a gradual performance increase, as expected from continued development and optimization. Naturally, 8-bits per color channel is fastest, as it makes use of a lot of 8-bit arithmetic internally. For higher bit depths (10 and 12 bits per channel, or 30 and 36 bits per pixel respectively) the internal arithmetic is boosted to a 16 bit precision, resulting in better outputs for finer gradients. This costs performance of course.

As expected, 10 and 12 bit runtimes are relatively close in terms of speed with 8 bit being quite far ahead.

Now the most surprising thing is the nose dive we take from 1.9+141 to 1.9+200. I have no idea what happened here?! Did they fix a major bug? Or did some performance-critical options change in the --preset veryslow that I’m using? I have no idea. But to put is in numbers easier to understand: For 8 bit, my test encode went from a runtime of 30:39.312 (1.9+141) to 41:20.296 (1.9+200)! Ah, the time format being MM:SS.sss, M = minutes, S = seconds and s = milliseconds. That’s almost -35%! It’s less pronounced for the higher bit depths with -25% for 10 bits and -24% for 12 bits, but that’s still significant. Maybe I shouldn’t have deleted the output data, might have been useful to look at the H.265 stream headers.

Well, from there on out we continue to see a gradual performance increase again, in a steady and stable fashion. But what happened between 1.9+141 and 1.9+200? I don’t know. Something major must have changed, I just don’t know what it was exactly…

Maybe somebody can enlighten me a bit[1], so here are the options used, right from the benchmarking script:

expand/collapse source code
  1. @ECHO OFF
  2.  
  3. :: ###################################################################################
  4. :: #                                                                                 #
  5. :: #  Please note, that command lines passed to timethis.exe may not be longer than  #
  6. :: #  1024 characters including expanded variables, so we need to keep it short      #
  7. :: #  enough. Some of the options passed to x265 may already be superflous due to    #
  8. :: #  being included in the veryslow preset, but I did not really double-check that. #
  9. :: #  Instead, I kept it just short enough by using shorter file names instead.      #
  10. :: #                                                                                 #
  11. :: #  Where to get the files:                                                        #
  12. :: #    * x265.exe     : http://wp.xin.at/x265-for-winxp-and-winxp-x64               #
  13. :: #    * avconv.exe   : http://wp.xin.at/x265-for-winxp-and-winxp-x64               #
  14. :: #    * colorecho.exe: http://wp.xin.at/archives/3064                              #
  15. :: #    * timethis.exe : https://support.microsoft.com/en-us/kb/927229               #
  16. :: #                                                                                 #
  17. :: #  GPL v3 license for this scripts code, colorecho.exe, parts of avconv.exe and   #
  18. :: #  the CygWin companion libraries for avconv.exe:                                 #
  19. :: #    * https://www.gnu.org/licenses/gpl-3.0.en.html                               #
  20. :: #                                                                                 #
  21. :: #  GPL v2 license for x265.exe:                                                   #
  22. :: #    * https://www.gnu.org/licenses/old-licenses/gpl-2.0.html                     #
  23. :: #                                                                                 #
  24. :: #  LGPL v3 license for other parts of avconv.exe:                                 #
  25. :: #    * https://www.gnu.org/licenses/lgpl-3.0.html                                 #
  26. :: #                                                                                 #
  27. :: #  NT Resource Kit license for timethis.exe:                                      #
  28. :: #    * https://enterprise.dejacode.com/license_library/Demo/ms-nt-resource-kit    #
  29. :: #                                                                                 #
  30. :: #  The x265 encoder  : http://x265.org                                            #
  31. :: #  The libav         : https://libav.org                                          #
  32. :: #  The Cygwin project: https://www.cygwin.com                                     #
  33. :: #                                                                                 #
  34. :: ###################################################################################
  35.  
  36. :: First round, 8 bits per channel color depth (MAIN profile), 8 bit arithmetic:
  37. FOR %%I IN (1.7-8b 1.9+15 1.9+108 1.9+141 1.9+200 1.9+210 1.9+230) DO .\colorecho.exe ^
  38.  ""Testing v%%I, 8 bits..."" 10 & .\timethis.exe ""echo x265 v%%I-8b & .\avconv.exe -r ^
  39.  24000/1001 -i input.h264 -f yuv4mpegpipe -pix_fmt yuv420p -r 24000/1001 - 2>NUL | ^
  40.  .\x%%I.exe - --y4m -D 8 --fps 24000/1001 -p veryslow --open-gop --ref 6 --bframes ^
  41.  16 --b-pyramid --bitrate 2000 --rect --amp --aq-mode 3 --no-sao --qcomp 0.75 ^
  42.  --no-strong-intra-smoothing --psy-rd 1.6 --psy-rdoq 5.0 --rdoq-level 1 ^
  43.  --tu-inter-depth 4 --tu-intra-depth 4 --ctu 32 --max-tu-size 16 --pass 1 ^
  44.  --slow-firstpass --stats %%I-8b.stats --sar 1 --range full -o %%I-8b-p1.h265 2>NUL & ^
  45.  .\avconv.exe -r 24000/1001 -i input.h264 -f yuv4mpegpipe -pix_fmt yuv420p -r ^
  46.  24000/1001 - 2>NUL | .\x%%I.exe - --y4m -D 8 --fps 24000/1001 -p veryslow ^
  47.  --open-gop --ref 6 --bframes 16 --b-pyramid --bitrate 2000 --rect --amp --aq-mode 3 ^
  48.  --no-sao --qcomp 0.75 --no-strong-intra-smoothing --psy-rd 1.6 --psy-rdoq 5.0 ^
  49.  --rdoq-level 1 --tu-inter-depth 4 --tu-intra-depth 4 --ctu 32 --max-tu-size 16 ^
  50.  --pass 2 --stats %%I-8b.stats --sar 1 --range full -o %%I-8b-p2.h265 2>NUL"" 1> ^
  51.  .\results-%%I-8b.txt 2>.\timethis-errorlog-%%I-8b.txt
  52.  
  53. :: Second round, 10 bits per channel color depth (MAIN10 profile), 16 bit arithmetic:
  54. FOR %%J IN (1.7-10b 1.9+15 1.9+108 1.9+141 1.9+200 1.9+210 1.9+230) DO .\colorecho.exe ^
  55.  ""Testing v%%J, 10 bits..."" 10 & .\timethis.exe ""echo x265 v%%J-10b & .\avconv.exe -r ^
  56.  24000/1001 -i input.h264 -f yuv4mpegpipe -pix_fmt yuv420p -r 24000/1001 - 2>NUL | ^
  57.  .\x%%J.exe - --y4m -D 10 --fps 24000/1001 -p veryslow --open-gop --ref 6 --bframes ^
  58.  16 --b-pyramid --bitrate 2000 --rect --amp --aq-mode 3 --no-sao --qcomp 0.75 ^
  59.  --no-strong-intra-smoothing --psy-rd 1.6 --psy-rdoq 5.0 --rdoq-level 1 ^
  60.  --tu-inter-depth 4 --tu-intra-depth 4 --ctu 32 --max-tu-size 16 --pass 1 ^
  61.  --slow-firstpass --stats %%J-10b.stats --sar 1 --range full -o %%J-10b-p1.h265 2>NUL ^
  62.  & .\avconv.exe -r 24000/1001 -i input.h264 -f yuv4mpegpipe -pix_fmt yuv420p -r ^
  63.  24000/1001 - 2>NUL | .\x%%J.exe - --y4m -D 10 --fps 24000/1001 -p veryslow ^
  64.  --open-gop --ref 6 --bframes 16 --b-pyramid --bitrate 2000 --rect --amp --aq-mode 3 ^
  65.  --no-sao --qcomp 0.75 --no-strong-intra-smoothing --psy-rd 1.6 --psy-rdoq 5.0 ^
  66.  --rdoq-level 1 --tu-inter-depth 4 --tu-intra-depth 4 --ctu 32 --max-tu-size 16 --pass ^
  67.  2 --stats %%J-10b.stats --sar 1 --range full -o %%J-10b-p2.h265 2>NUL"" 1> ^
  68.  .\results-%%J-10b.txt 2>.\timethis-errorlog-%%J-10b.txt
  69.  
  70. :: Second round, 12 bits per channel color depth (MAIN12 profile), 16 bit arithmetic:
  71. FOR %%K IN (1.7-12b 1.9+15 1.9+108 1.9+141 1.9+200 1.9+210 1.9+230) DO .\colorecho.exe ^
  72.  ""Testing v%%K, 12 bits"" 10 & .\timethis.exe ""echo x265 v%%K-12b & .\avconv.exe -r ^
  73.  24000/1001 -i input.h264 -f yuv4mpegpipe -pix_fmt yuv420p -r 24000/1001 - 2>NUL | ^
  74.  .\x%%K.exe - --y4m -D 12 --fps 24000/1001 -p veryslow --open-gop --ref 6 --bframes ^
  75.  16 --b-pyramid --bitrate 2000 --rect --amp --aq-mode 3 --no-sao --qcomp 0.75 ^
  76.  --no-strong-intra-smoothing --psy-rd 1.6 --psy-rdoq 5.0 --rdoq-level 1 ^
  77.  --tu-inter-depth 4 --tu-intra-depth 4 --ctu 32 --max-tu-size 16 --pass 1 ^
  78.  --slow-firstpass --stats %%K-12b.stats --sar 1 --range full -o %%K-12b-p1.h265 2>NUL ^
  79.  & .\avconv.exe -r 24000/1001 -i input.h264 -f yuv4mpegpipe -pix_fmt yuv420p -r ^
  80.  24000/1001 - 2>NUL | .\x%%K.exe - --y4m -D 12 --fps 24000/1001 -p veryslow ^
  81.  --open-gop --ref 6 --bframes 16 --b-pyramid --bitrate 2000 --rect --amp --aq-mode 3 ^
  82.  --no-sao --qcomp 0.75 --no-strong-intra-smoothing --psy-rd 1.6 --psy-rdoq 5.0 ^
  83.  --rdoq-level 1 --tu-inter-depth 4 --tu-intra-depth 4 --ctu 32 --max-tu-size 16 --pass ^
  84.  2 --stats %%K-12b.stats --sar 1 --range full -o %%K-12b-p2.h265 2>NUL"" 1> ^
  85.  .\results-%%K-12b.txt 2>.\timethis-errorlog-%%K-12b.txt
  86.  
  87. .\colorecho.exe "Benchmarks completed, cleaning up..." 10
  88.  
  89. :: Removing output and statistics files:
  90. del /Q *p*.h265 *stats*
  91.  
  92. .\colorecho.exe "All done, results are to be found in the results-*.txt files!" 10

So if I missed some critical changes that happened in between 1.9+141 and 1.9+200, please let me know! Oh, and here are the exact system specifications, in case it matters (it probably doesn’t):

  • Intel Xeon X5690 3.33GHz (SSE4.2 is max, no AVX), running at an all-core turbo of 3.6GHz
  • 24GB DDR-III/1066 10-10-10 2T
  • X58 chipset
  • Windows XP Professional x64 Edition SP2 with all Server 2003 R2 x64 updates

I guess I’ll keep doing the performance evaluations from here on out just to see how the encoder evolves over time, performance-wise… And maybe I’ll redo 1.9+141 and one of the newer versions and parse the stream headers to see if the effective encoding options differ anywhere after all. If yes, I’ll update this post!

Update 2017-01-27:

Three new x265 versions have been added to the performance trend analysis, 2.0+54, 2.1+60 and 2.2+22. Much to my dismay, there haven’t been any big developments when it comes to x265s’ performance on x86 machines over the last 6-8 months. Since 1.9+230 it’s pretty much linear, with only minimal variance.

So there have been changes and new options and all that, but it seems that the basic speed of the encoder in the --preset veryslow, implying --no-rskip has stayed the same. Maybe a Google Summer of Code performance challenge would do something for x265? As far as I can remember this worked miracles for x264 in the past as well, if my memory serves me right. Let’s see if we’ll get any significant improvements in the future…

[1] Thanks to “Particular” this [has been clarified] in the comments.

Feb 232016
 

H.265/HEVC logoAfter [compiling] and running the x265 HEVC encoder, and after [looking at its quality] for animated content, here’s another little piece of information about my experiments with H.265/HEVC. And this time it’s the decoding part. Playing H.265-encoded videos on the PC is relatively easy. On Windows I tend to use [MPC-HC] for this, and on Linux/UNIX you can use [mplayer] or [VLC]. The newest versions of those players are all linked against a modern libav or ffmpeg library collection, so they can decode anything any H.265 encoder can throw at them.

The questions are: At what price? And: What about mobile devices?

H.265/HEVC is costly in terms of computation, and not just in the encoding stage. Decoding this stuff is hard as well. So I looked at two older Core 2 processors to see how they fare when decoding regular 10-bit H.264/AVC and the same content encoded as 10-bit H.265/HEVC, both times at the same bitrate of 3Mbit/s ABR. Again, the marvelous Anime movie “The Garden of Words” was used for this. The video player of my choice for playback was [MPC-HC] v1.7.10, rendering to a VMR9 surface.

On top of that, I can also provide some insight on how relatively modern Android devices will handle this (devices partly without a H.265 hardware decoder chip however!), all thanks to [Umlüx], who’s been willing to install the necessary Apps and run some tests! On Android, [MX Player] was used.

For the record, the encoding settings were like this for x264 (pass 1 & pass 2), …

--fps 24000/1001 --preset veryslow --tune animation --open-gop --b-adapt 2 --b-pyramid normal -f -2:0
--bitrate 3000 --aq-mode 1 -p 1 --slow-firstpass --stats v.stats -t 2 --no-fast-pskip --cqm flat
--non-deterministic

--fps 24000/1001 --preset veryslow --tune animation --open-gop --b-adapt 2 --b-pyramid normal -f -2:0
--bitrate 3000 --aq-mode 1 -p 2 --stats v.stats -t 2 --no-fast-pskip --cqm flat --non-deterministic

…and this for x265:

--y4m -D 10 --fps 24000/1001 -p veryslow --open-gop --bframes 16 --b-pyramid --bitrate 3000 --rect
--amp --aq-mode 3 --no-sao --qcomp 0.75 --no-strong-intra-smoothing --psy-rd 1.6 --psy-rdoq 5.0
--rdoq-level 1 --tu-inter-depth 4 --tu-intra-depth 4 --ctu 32 --max-tu-size 16 --pass 1
--slow-firstpass --stats v.stats --sar 1 --range full

--y4m -D 10 --fps 24000/1001 -p veryslow --open-gop --bframes 16 --b-pyramid --bitrate 3000 --rect
--amp --aq-mode 3 --no-sao --qcomp 0.75 --no-strong-intra-smoothing --psy-rd 1.6 --psy-rdoq 5.0
--rdoq-level 1 --tu-inter-depth 4 --tu-intra-depth 4 --ctu 32 --max-tu-size 16 --pass 2
--stats v.stats --sar 1 --range full

PC first:

Contender #1 is my old Sony TT45X subnotebook, and this is the processor inside:

CPU-Z on a Core 2 Duo SU9600

A Core 2 Duo SU9600 Penryn, which is an ULV processor at 1.6GHz, shown under all-core load, where it doesn’t boost to 1.83GHz any longer (And yeah, this is still WinXP+POSReady2009).

Contender #2 is my secondary workstation at work, that I recently upgraded with an SSD and a better processor we had lying around. It’s using this chip now:

CPU-Z on a Core 2 Quad Q9505

A Core 2 Quad Q9505 Yorkfield, which has been overclocked to a rock solid 3.4GHz.

Let’s throw some video at them! First, we’ll try some good old H.264 on the slow-clocked Core 2 Duo mobile chip:

Core 2 Duo playing the beginning of "The Garden of Words" as 3Mbit H.264/AVC

The Core 2 Duo SU9600 playing the beginning of “The Garden of Words” as 3Mbit H.264/AVC (on a German version of Windows).

We can see some high load there. Mind you, since this is 10-bit H.264, there is no GPU acceleration whatsoever, minus maybe the bicubic scaler, which is implemented as Pixel Shader 2.0 code. Everything else has to be done by the CPU and its SSE extensions. Let’s take a look at the new H.265 then:

Core 2 Duo playing the beginning of "The Garden of Words" as 3Mbit H.265/HEVC

Same thing as before, but with 3Mbit H.265/HEVC.

The beginning of the movie, which shows only some moving logo on mostly uniform background and then some static text does just fine. You can really see what’s happening on screen when looking at the usage curve above. As soon as the first serious video frame with lots of movement starts to fade in, the machine just falls on its face, as the bitrate is rising. Realtime playback can no longer be achieved, A/V synchronization is being lost and a ton of frames are being dropped, resulting in a horrible experience.

Thing is, even with all the frame drops, it’s still losing sync even and falling short of delivering those frames which are still being decoded with the proper timing. It’s just abysmal.

So let’s change our environment to a CPU with twice the cores and roughly twice the clock speed, while staying with the same architecture / instruction set, H.264 first:

Core 2 Quad playing the beginning of "The Garden of Words" as 3Mbit H.264/AVC

The quad-core Q9505 hardly has any trouble with H.264 at all. It’s completely smooth sailing.

And with the harder stuff:

Core 2 Quad playing the beginning of "The Garden of Words" as 3Mbit H.265/HEVC

We can see that the load has roughly doubled with H.265.

Now, H.265 is clearly putting some load even on the quad-core. I’m assuming that with higher bitrates as we would use for 4K/UHD material, this processor might be in trouble. For 1080p and bitrates up to maybe 6Mbit/s it should still be ok however. Of course, with the most modern graphics cards you can give even a Core 2 a companion device which can do H.265 decoding in hardware, like an NVidia GPU that can provide you with [PureVideo] feature sets E or even F, or maybe an AMD Radeon with  [UVD] level 6. But even then, 10-bit content might not be accelerated, so you need to be careful when choosing your GPU. Intel has recently added 10-bit support to some of their onboard GPUs (HD Graphics 5500 & 6000, Iris Graphics 6100) with driver v15.36.14.4080, nVidia added it starting with the Maxwell generation and AMD Radeons still don’t have it as far as I know.

Now, what about mobile devices? A fast enough PC can do H.265/HEVC even without hardware assist, at the very least in 1080p, and likely 4K as well, when we look at modern Core i3/i5/i7 and somewhat comparable AMD Athlon FX chips. How about multicore ARMv7/v8 chips with NEON instruction set extensions?

Let’s look at two Android devices, a Sony Xperia Tablet Z2 (without a H.265 hardware decoder) and a Sony Xperia Z5 phone:

The Tablet features the following CPU:

The Xperia Tablet Z2 uses a Qualcomm Snapdragon 801

The Xperia Tablet Z2 uses a Qualcomm Snapdragon 801 quad-core.[1]

That’s not exactly a slow chip, but an out-of-order execution pipeline with NEON extensions at a decent clock rate. And the phone:

The Xperia Z5 has a Qualcomm Snapdragon 810

The Xperia Z5 has a Qualcomm Snapdragon 810 in big.LITTLE configuration.[1]

So the phone has a very modern 64-bit ARMv8 big.LITTLE CPU setup with four faster out-of-order cores, and four slower, energy-efficient in-order cores. Optimally, all of them should be used as much as possible when throwing a seriously demanding task at the device. Let’s look at how it goes, but first on the older tablet with H.264 for starters:

The Xperia Tablet Z2 playing H.264/AVC.

The Z2 manages to play the H.264/AVC version without any stuttering, but just barely. It has to boost its clock speed all the way to the top to manage.[1]

You’re probably thinking “There’s no way this can work with H.265”, right? Well, you’d be correct:

With H.265, the Xperia Tablet Z2 stumbles.

With 3-Mbit H.265, the Xperia Tablet Z2 stumbles. Full clock speed, all cores under load, no chance to play demanding scenes without massive problems.[1]

It just bombs. Frame drops and stuttering, no way you’d want to watch anything when it goes like that. Some of the calmer scenes (=lower bitrate) still work, but lots of rain all over the frame and the Z2 is done for. So let’s move to the more modern hybrid-core processor of the Sony Xperia Z5 smartphone:

The Z5 playing the 10-bit H.264 on CPU only

The Z5 playing the 10-bit H.264 on CPU only. The octa-core big.LITTLE processor seems to load its fat cores mostly, which makes sense with demanding workloads.[1]

The big Cortex-A57 cores seem to be doing most of the work here, clocking at their maximum speed. Given we’re over 50% load with the most difficult scenes however, some threads seem to be pushed over to the smaller Cortex-A53 cores as well. In any case, the end result is that H.264 works smoothly throughout the movie. But still, load is high, and our only headroom left are some free, slow in-order cores… So, H.265?

The Z5 fails with 10-bit H.264 too. Where is my hardware decoder?!

The Z5 fails as well. While it can cope a little better, scenes like the above are just too much. Where is my hardware decoder?![1]

Something strange is happening now: H.265/HEVC hardware decoders should support this 10-bit H.265 file just fine. But for some reason, MX Player still falls back to software decoding, despite the player being up-to-date – something even a CPU as powerful as a Snapdragon 810 cannot handle for all parts of “The Garden of Words” at the given settings. The really demanding scenes will still fail to play back decently. CPU clock is also lowered a bit now, probably because of power and/or heat management.

My assumption would be that MX Player simply doesn’t support the HW decoder in this phone yet, which is a shame if it’s true. Another reason might be that I used some parameters and/or features of H.265 in this encode, that are not implemented in this chip. Whichever the case may be, the Snapdragon 810 alone cannot handle it either!

Update: After further research it turns out that almost no hardware accelerator supports Hi10p or Main-10 for H.265. In other words: No 10-bit decoding for H.265! It’s possible on NVidias Tegra K1 & X1 due to some CUDA hacks in MX Player, but nowhere else it seems. The upcoming Snapdragon 820 should however support it, and devices based on it should become available around March 2016:

Snapdragon 820s' Advancements

This is what Snapdragon 820 (MSM8996) should give us over Snapdragon 810 (MSM8994), including 10-bit H.265/HEVC decoding in hardware ([source]Russian flag)!

And this concludes my little performance analysis, after which I can say that either you need a relatively ok PC to use H.265, or a hardware decoder chip that works with your files and your software, if you’re targeting other platforms like hardware players or smartphones and tablets!

PS.: Thanks fly out to Umlüx for doing the Android tests!

[1] Images are © 2016 Umlüx, used with express permission.

Jan 262014
 

Brotherhood of Nod logoSometimes when you have a seriously weird problem and cannot solve it by yourself, the solution might just turn up – years later. As it happened just now. So, I was playing Command & Conquer 3 – Tiberium Wars again, a game that was released in 2007, so it’s somewhat old by now. Still a classic in my book, as its gameplay is so similar to the very first Command & Conquer released by Westwood Studios in 1995.

At some point though, out of nowhere, the cutscene videos started to stutter. I wasn’t sure if it was some video codec fucking things up, or some graphics driver update, who knows? I stripped my system clean of all DirectShow codecs, tried different drivers, tried to dis-/enable vertical synchronization, everything. Nothing worked. And today, I thought “why not search the web for this one last time”.

And I found [this]German flag, which probably referenced [this thread] on the Steam forums. Those guys are talking about changing a weird option in a C&C3 configuration file that doesn’t sound as if it could be related to video playback performance. But why not try? So I navigated to %APPDATA%\Command & Conquer 3 Tiberium Wars\Profiles\<ingame username>\ (adjust for Windows Vista and newer, this path is XP style) and opened the Options.ini, which looked like this:

AlternateMouseSetup = 0
AmbientVolume = 50.000000
AnimationLOD = UltraHigh
AntiAliasingLOD = 8
AudioQualitySetting = High
Brightness = 50
DecalLOD = High
EffectsLOD = UltraHigh
EnableSpecialEditionContent = 1
GameSpyIPAddress = <I CUT THIS OUT, THIS IS AN IP ADDRESS>
HasSeenLogoMovies = yes
IdealStaticGameLOD = VeryLow
ModelLOD = High
MovieVolume = 70.000000
MusicVolume = 70.000000
Resolution = 2560 1600
SFXVolume = 70.000000
ScrollFactor = 50
SendDelay = 0
ShaderLOD = UltraHigh
ShadowLOD = UltraHigh
ShowTickerAds = 1
ShowTickerNews = 1
StaticGameLOD = Custom
TerrainLOD = UltraHigh
TextureQualityLOD = High
ToolTipDelay = 0
VoiceVolume = 70.000000
WaterLOD = UltraHigh

Then I changed the option IdealStaticGameLOD from VeryLow to UltraHigh, just like those people were suggesting, saved and re-launched the game. And gone were all woes! Ha! And it took me so many years to find this. I guess when I searched the web back then, the solution was not yet discovered. I still don’t get what a level-of-detail setting has to do with smooth video playback, especially since it gets faster when you set it to a supposedly more demanding value, but oh well. At least it finally works now. And in a few years, when I install the game again (if I still can, depending on my operating system), I know where to look for the solution. ;)

Kane is pleased with video playback performance now

Kane is now pleased with video playback performance