Oct 16, 2017
 

Does anyone still remember [Fraps]? Its development seems to have come to a halt in 2013, but it was once the standard tool for measuring frame rates in 3D games and programs under Windows. Also, of course, for recording in-game video footage. All that up to Direct3D 11 and any version of OpenGL. And while the latest (and probably last) version 3.5.99 still officially supports Windows XP, it was [recently brought to my attention] that it throws the “not a valid Win32 application” error there.

Since I’ve been using Fraps myself, I thought I’d take a quick look at this. After confirming the error, I inspected its binary files fraps32.dll, fraps64.dat, fraps64.dll, frapslcd.dll and fraps.exe with NTCore’s [CFF Explorer]. It was as expected: their platform target was set to Windows NT 6.0 (Windows Vista), which is why they refused to start on older operating systems.

So I patched the headers of the 32-bit files to an NT 5.1 target (Windows XP) and those of the 64-bit files to an NT 5.2 target (XP x64, Server 2003):

Patching Fraps with CFF Explorer

Patching Fraps to an NT 5.1 platform target with CFF Explorer
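By the way, if you want to double-check those fields without CFF Explorer, GNU binutils can dump them as well. Just a minimal sketch, assuming an objdump built with PE support (the field names below are what its PE output calls them; after patching, both the OS and subsystem version pairs of the 32-bit files should read 5.1):

$ objdump -p fraps.exe | grep -E 'OSystemVersion|SubsystemVersion'
MajorOSystemVersion     5
MinorOSystemVersion     1
MajorSubsystemVersion   5
MinorSubsystemVersion   1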

And that’s all, because Fraps 3.5.99 is still fully API compatible with Windows XP. This basically proves that choosing that NT 6.0 platform target was either a wrong decision on the developers’ side, or just a slip-up that went unnoticed because nobody’s testing this on XP any longer:

Running the binary-patched Fraps 3.5.99 on XP x64

Running the binary-patched Fraps 3.5.99 on XP x64 (Click to enlarge)

So much for that.

If you want to run Fraps 3.5.99 on XP, please install the [official version] first, and then unpack the following archive into your Fraps installation folder, overwriting all of the files there:

  • [Fraps 3.5.99], patched for Windows XP / XP x64 compatibility.

And no, I’m not going to build a complete installer with the patched files inside, because I’m far too lazy to do that! :roll:

I tested this briefly with 3DMark 2001 SE b330 on XP and XP x64 (I didn’t have anything better / more modern on the machine at hand right now), and it did indeed work, showing its fps counter just fine!

Fraps 3.5.99 measuring the frame rate in 3DMark 2001 SE under XP

Fraps 3.5.99 measuring the frame rate in 3DMark 2001 SE under XP. And no, I didn’t pick a good scene to snap, I know… (click to enlarge)

So that’s it, cheers! :)

Aug 18, 2017
 

1. Introduction

While I’ve been a Corel Photopaint user for about 20 years now (!), I’ve recently bought myself a license for Adobe Photoshop CS6 Extended, simply because I’ve hit some memory and stability limitations with Corel’s solution when working with very large files (A0 @ 300dpi) and lots of objects (=”layers”), sometimes hundreds of them. Generally, Photopaint tends to become unstable as you keep working with very large files, and things like masks or object copies start failing silently even well before its 32-bit address space is exhausted. Well, Adobe Photoshop CS6 was the last standalone version that didn’t rely on the “Creative Cloud”, and the last that would work on Windows XP / XP x64 – even in its 64-bit version on the latter. So there you have it. I was thinking of migrating.

2. Licensing woes

The first thing that really got on my nerves was Adobe’s licensing system. In my version of CS6, Adobe’s application manager just wouldn’t talk to the activation servers anymore, and offline activation was broken as well, as the tool wouldn’t generate any request codes (the option just isn’t there anymore). Why is this? During testing I found that Microsoft’s ancient Internet Explorer 8 – the latest on XP – can’t talk to certain activation servers via encrypted connections any longer. I guess Adobe upgraded SSL on some of their servers.

My assumption is that the Adobe application manager uses IE’s browsing engine to connect to those servers via HTTPS, which fails on anything older than Vista with its more modern versions of Internet Explorer.

So? I was sitting there with a valid license, but unable to activate the software with it. It’s sad, but in the end, I had to crack Photoshop, which is the one thing I didn’t want to do. Even though Photoshop CS6 officially supported XP, it can no longer be activated on such old operating systems, and nobody cares as it’s all deprecated anyway. It’s annoying, but there was no other choice but to remove the copy protection. But hey, my conscience is clean, I did buy it after all.

3. Language woes

Well, I bought a German version of Photoshop. Which I thought was ok, as Adobe would never do something as stupid as to region-lock their licenses, right? Riiight? [Wrong]. You have to download the [correct version] for your region, or the license won’t work. I mean, German is my native language, but I’m using operating systems and software almost exclusively in English, so that was unacceptable. Thankfully, there is an easy fix for that.

First: close Photoshop. Then, let’s assume you’ve installed it in the default location. If so, enter the folder %PROGRAMFILES%\Adobe\Adobe Photoshop CS6 (64 Bit)\Locales\de_DE\Support Files\ (swap the de_DE\ part for your locale; you’ll find it in that Locales\ folder). There you’ll find a .dat file, in my case called tw10428.dat. Rename it to something like tw10428.dat-backup. Relaunch Photoshop. It’ll be English from now on[1]!
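If you prefer a command line for this and happen to have some Unix-style shell on the box (MSYS/Cygwin-style path notation assumed below; Explorer does the job just as well), the whole fix boils down to:

$ cd "/c/Program Files/Adobe/Adobe Photoshop CS6 (64 Bit)/Locales/de_DE/Support Files"
$ mv tw10428.dat tw10428.dat-backup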

4. The hardware I used for testing

This is important, as some features do depend on certain hardware specifications, like 512MiB of VRAM on the graphics card or certain levels of OpenGL and OpenCL support, those kinds of things. So here’s the hardware, which should be modern enough for CS6:

  • Intel Xeon X5690 3.46GHz hexcore with HT (Westmere architecture, think “Core i7 990X”)
  • 48GiB of RAM
  • nVidia GeForce GTX Titan Black 6GiB (Kepler architecture, somewhat similar to a GeForce GTX 780Ti)

Also, the driver used for the Titan Black is the latest and last one available for XP, 368.81. It even gives you modern things like OpenGL 4.5, OpenCL 1.2 and CUDA 8.0:

nVidia control panel for driver 368.81 on XP x64

nVidia control panel for driver 368.81 on XP x64

5. What it looks like without any compatibility hacks

However, even though the 64-bit version of Photoshop CS6 runs on Windows XP x64 (when cracked :( ), a few things are missing regarding GPU support and 2D/3D acceleration. Let’s launch the application:

Photoshop CS6 Extended on Windows XP x64 without any modifications

Photoshop CS6 Extended on Windows XP x64 without any modifications (click to enlarge)

What we can see immediately is that the 3D menu is missing from the menu bar. This is because Adobe generally forbids the use of extended 3D features (import and rendering of 3D models) on all XP systems. There was an explanation about that on the Adobe Labs pages, but that link is dead by now, so I can’t be sure why it was originally disabled. I’ll talk about that stuff later. First, we’ll look at the hardware we do get to use for now; this can be found under Edit \ Preferences \ Performance...:

 

Well, we do get to use a ton of memory, but as you can see, my relatively powerful GTX Titan Black is not being used. Strangely, it says it’s because of me not running the Extended version of Photoshop CS6, but that’s not quite correct:

Photoshop CS6 Extended splash screen

Photoshop CS6 Extended splash screen

I’m not actually sure whether it always said that, as I’ve been messing with Photoshop for a while now, so maybe there was a different message there at first. But it doesn’t matter. The bottom line is, we’re not getting anything out of that GPU!

And we’re gonna fix that.

6. Getting 2D and OpenCL GPU acceleration to work

This is actually easy. We’ll fool Photoshop CS6 into thinking that my GTX Titan Black (which is newer than the software) is an “old GPU”. You can tell Photoshop to still do its stuff on older, slower GPUs despite them not being officially supported. And when that switch is flipped, it seems to apply to all unknown GPUs and graphics drivers, no matter what. Check out the following registry hack:

Windows Registry Editor Version 5.00

[HKEY_CURRENT_USER\Software\Adobe\Photoshop\60.0]
"AllowOldGPUS"=dword:00000001

Save that text into a .reg file, for instance AllowOldGPUS-in-Photoshop-CS6.reg. Close Photoshop, then double-click on that file, and it’ll create that dword value AllowOldGPUS for you and set it to 1[2].

Note that the registry path will be different for older versions of Photoshop, so the last part might not be 60.0 but something else, like 11.0 in such a case. Just check first if you have to, by launching Windows’ Regedit tool and navigating to that part of your user registry.
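If you’d rather check from a terminal, reg.exe (which ships with XP and later) can list the versioned subkeys as well; a quick sketch, run from any shell or cmd prompt:

$ reg query "HKCU\Software\Adobe\Photoshop"

The version number you need then shows up as a subkey, like 60.0 for CS6.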

After that, launch Photoshop, and again look at Edit \ Preferences \ Performance...:

Graphics settings when "AllowOldGPUS" is enabled

Graphics settings when “AllowOldGPUS” is enabled

Aha! Looking better. But if you click on that “Advanced Settings…” button, you get this:

Advanced graphics settings with "AllowOldGPUS" enabled

Advanced graphics settings with “AllowOldGPUS” enabled

Almost everything checks out – even OpenCL – but we still don’t get to use the advanced OpenGL rendering mode that is supposed to make things even faster. But most of the GPU-accelerated 2D functionality is there now; it’s just not working at maximum performance, at least according to Adobe. You’ll get to use stuff like Oil Paint and Scrubby Zoom etc. now though.

But the 3D stuff that’s been disabled on XP is still missing entirely. And we’ll get to that final part now!

7. Getting 3D and the advanced level of OpenGL 2D acceleration to work

This is a bit more tricky. This stuff is blocked by Photoshop’s operating system detection routines. You can check their results by clicking on Help \ System Info... and looking at the top part of that report:

Adobe Photoshop Version: 13.0.1 (13.0.1.3 20131024.r.34 2013/10/24:21:00:00) x64
Operating System: Windows XP Professional 64-bit
Version: 5.2 Service Pack 2

So we have to fool it into thinking that it’s sitting on a more modern operating system. The tool for the job is Microsoft’s own [Application Verifier]. It’s a tool meant to be used during the testing phases of a software development process, but it’ll do just what we need. Download the 64-bit edition (the 32-bit one is included in that as well; you can ignore it for our purpose), and run it.

Microsoft's Application Verifier

Microsoft’s Application Verifier

Right-click into its Applications pane, and add two applications that you can find in your Photoshop installation directory: Photoshop.exe and sniffer_gpu.exe. The latter is a command line tool that detects GPU VRAM, driver version, OpenGL / OpenCL versions etc., and Photoshop launches it on every startup to determine GPU and graphics driver+API capabilities. You can launch it in a terminal window as well; it prints everything to stdout, so you can see right there what it’s reporting.
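A quick sketch of such a manual run, assuming the default 64-bit install path (MSYS/Cygwin-style path notation again; the report just goes to stdout):

$ cd "/c/Program Files/Adobe/Adobe Photoshop CS6 (64 Bit)"
$ ./sniffer_gpu.exe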

Well, uncheck everything in the Tests pane, check just Compatibility \ HighVersionLie, right-click it, pick Properties and enter the following[3]:

Faking Windows 7 x64 SP2

Faking Windows 7 SP2

So we’re reporting an NT 6.1 kernel (Windows 7) with build number 7601 and service pack 2, which would be current as of today. Keep it like that, and launch Photoshop. You’ll see this:

Photoshop CS6's full functionality, ready to be used?

Photoshop CS6’s full functionality, ready to be used? (Click to enlarge)

And there we go, our 3D menu is active on top, right between the “Filter” and “View” menus. But that doesn’t mean that its tools are really working… Anyway, the System Info will now report a different operating system:

Adobe Photoshop Version: 13.0.1 (13.0.1.3 20131024.r.34 2013/10/24:21:00:00) x64
Operating System: Windows 7 64-bit
Version: 6.1

Oddly enough, the service pack 2 part is missing, but who cares. Let’s take a look at the graphics settings!

Advanced graphics settings, revisited

Advanced graphics settings, revisited

Alright, we can use advanced OpenGL effects after applying our Windows 7 fake! Very good. That means all 2D OpenGL / OpenCL accelerated parts are working at maximum performance now! And as a first indicator of 3D also working, we can now access the 3D settings as well. Look at this:

3D settings

3D settings, even the full 6GB VRAM minus some 32MiB of reserved memory have been detected (click to enlarge)

For the 3D test, I decided to download a 3DStudioMax sample model from GrabCAD, [here] (that website requires registration). A complete render of that model can also be seen there; it looks like this:

"Chainmail maelstrom"

It’s called “Chainmail maelstrom unit”

So, let’s create a new document, import that model via 3D \ New 3D Layer from File... and switch to Photoshop’s 3D view! And…

OpenGL accelerated 3D preview

OpenGL accelerated 3D preview (click to enlarge)

It’s totally working! Strangely you don’t get anti-aliasing even though the card can do it, but at least it’s something. I’m almost feeling like sitting in front of some kind of CAD software now, with all that realtime 3D rendering. Photoshop does interpret the material properties a bit weirdly however, so it’s not glossy enough and the color is wrong for something that’s supposed to be “steel”. I didn’t fix it up though, just left it as-is and clicked “render”! Then, after lots of waiting, you get some nicer output:

Final render

Final render (click to enlarge)

Please note that the above screenshots are 8-bit PNG, so they’re not true representations of the source at all. Still, should be good enough.

One disappointing part is that the final render – or rather, the raytracing output – isn’t being crunched on the GPU. Instead, it’s using a lot of CPU cores. In this case, my hexcore Xeon with 12 threads was loaded up to 70-90%. But it would still be a lot faster if done by an OpenCL or CUDA raytracer on the GPU. I will need to check whether there are any plugins for that.

Anyway, this means that the full potential of Adobe Photoshop CS6 Extended can be unlocked on Windows XP Professional x64 Edition SP2! Cheers! :)

8. What about Windows XP 32-bit?

No idea. I haven’t tested it. According to users on the web, the OpenCL part wouldn’t work on 32-bit at all, no matter which operating system. It’s supposed to be “by design” for whatever reason. I’m not sure if it’s just disabled like with 3D, or whether the libraries using it are really not there in the 32-bit version of Photoshop CS6. You would need to test that by yourself, but I wouldn’t get my hopes up. At least the other parts should be doable, including 3D.

Now, all that remains is to learn how to actually use Photoshop CS6. :roll: It’s probably going to be a bit painful coming from the Corel side of things…

 

[1] Adobe Photoshop CS6 – Change language to English, Tim Barrett, 2014-08-17 on YouTube

[2] Enable GPU Acceleration in x64 XP, op2rules, 2009-12-11 on YouTube

[3] A comment, Schlaubstar, 2012 on YouTube

May 31, 2017
 

1.) What for?

Usually, porting my favorite manga ripper [HakuNeko][1] would involve slightly more exotic target platforms like [FreeBSD UNIX]. This changed with version 1.4.2 however, as this version – the most current at the time of writing – would no longer compile on Windows machines due to some issues with its build toolchain. And that’s the most common platform for the tool!

This is what the lead developer had to say about the issue while even suggesting the use of [FMD] instead of HakuNeko on Windows:

“The latest release does not compile under windows due to some header include contradictions of sockets […]”
    -in a [comment][1] to HakuNeko ticket #142 by [Ronny Wegener], HakuNeko project leader

[1] Edit: Links have been fixed, as the HakuNeko project has now been moved to HakuNeko Legacy due to the development of its replacement, [HakuNeko S].

Normally I wouldn’t mind that much and would keep using 1.4.1 for now, but unfortunately that’s not an option. Quite a few manga websites have changed by now, breaking compatibility with the previous version. As this breaks most of HakuNeko’s functionality for some important sites, it became quite unusable on Windows, leaving Linux as the only officially supported platform.

As using virtual machines or remote VNC/X11 servers for HakuNeko proved to be too tedious for me, I thought I’d try to build this by myself. As the MSYS2/MinGW Windows toolchain seemed to be broken for 1.4.2, I tried – for the very first time – to cross-compile on Linux, choosing CentOS 7.3 x86_64 and MinGW32 4.9.3 for the task. This was quite the challenge, given that HakuNeko comes completely unprepared for cross-compiling.
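In case you want to reproduce this: the cross-toolchain itself comes from EPEL on CentOS 7. Something along these lines should pull it in – the package names are from memory though, so double-check them:

# yum install epel-release
# yum install mingw32-gcc mingw32-gcc-c++ mingw32-binutils mingw32-winpthreads-static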

2.) First, the files

Took me many hours / days to get it done – 100% statically linked too – and it’s finished now! What I won’t provide is an installer (don’t care about that), but here are my v1.4.2 builds for Windows:

As of today, those versions have been tested successfully on the following operating systems:

  • Windows XP Professional SP3 / POSReady2009
  • Windows XP Professional x64 Edition SP2 w. Server 2003 updates
  • Windows Server 2003 R2 x64 SP2
  • Windows Vista Enterprise x64 SP2
  • Windows 7 Professional x64 SP1
  • Windows 10 Professional x64 build #1607

Please be aware that not all of the functionality has been tested by me, just a few downloads that wouldn’t have worked with 1.4.1, a few more random downloads here and there, plus chapter-wise .cbz packaging. It’s looking pretty good, I think. Here are some sample screenshots as “proof” of it working (click to enlarge):

HakuNeko 1.4.2 downloading "Kobayashi-san Chi no Maid Dragon" on XP x64

HakuNeko 1.4.2 downloading “Kobayashi-san Chi no Maid Dragon” on XP x64 (Note: I own that Manga in paper form)

 

3.) What has been done to make this work?

3a.) Initial work:

First of all, cross-compiling is a bottomless, hellish pit, a horrible place that you never want to enter unless a.) the build toolchain of the software you wanna compile is very well prepared for it, b.) you really have/want to get your hands on that build, c.) you hate yourself so much you have to hurt yourself, or d.) you actually enjoy c.).

The reasons for choosing cross-compiling were that Ronny Wegener had said that the MSYS2/MinGW32 build would fail on Windows, plus it would require GCC version 5.3 to link with the bundled, pre-built static libraries (OpenSSL, cURL, wxWidgets).

So I thought it would be more likely to work if I were to run my own MinGW setup on Linux, not relying on the bundled stuff but linking against the libraries that come with MinGW32 4.9.3 on my platform of choice – CentOS 7.3 Linux.

One exception was the GUI library wxWidgets 3.0.2, which I had to cross-compile and statically link by myself as well, but luckily that’s easy despite its size. wxWidgets is one piece of software that does come well-prepared for cross-compiling! In my case, that made it as simple as this (parallel compile with 6 CPUs):

$ ./configure --prefix=/usr/local/i686-w64-mingw32 --host=i686-w64-mingw32 --build=x86_64-linux \
 --enable-unicode --with-expat --with-regex --with-opengl --with-libpng --with-libjpeg --with-libtiff \
 --with-zlib --with-msw --enable-ole --enable-uxtheme --disable-shared
$ make -j6
# make install

3b.) HakuNeko build toolchain / Makefile modifications for cross-compiling:

HakuNeko is much harder, and I don’t even remember half of what I did, but most of it was manually editing the ./Makefile after $ ./configure --config-mingw32 had produced something rather broken.

Let’s get to it; file paths are relative to the source root. Edit the following parts of the ./Makefile (you’ll need to look for them in different places of the file). First, the PREFIX, which should be in the bottom half of the file:

PREFIX = /usr/local/i686-w64-mingw32/

CC and the CFLAGS:

CC = i686-w64-mingw32-g++
CFLAGS = -c -Wall -O2 -std=c++11 \
 -I/usr/local/i686-w64-mingw32/lib/wx/include/i686-w64-mingw32-msw-unicode-static-3.0 \
 -I/usr/local/i686-w64-mingw32/include/wx-3.0 -D_FILE_OFFSET_BITS=64 -D__WXMSW__ -mthreads \
 -DCURL_STATICLIB -I/usr/i686-w64-mingw32/sys-root/mingw/include

Add -DPORTABLE to the CFLAGS if you want to build the portable version of HakuNeko.

Then, the Windows resource compiler, controlled by RC and RCFLAGS:

RC = /usr/bin/i686-w64-mingw32-windres
RCFLAGS = -J rc -O coff -F pe-i386 -I/usr/i686-w64-mingw32/sys-root/mingw/include \
 -I/usr/local/i686-w64-mingw32/include

And finally, the static linking part, which is the hardest stuff to get done right, LD, LDFLAGS and LDLIBS:

LD = i686-w64-mingw32-g++
LDFLAGS = -s -static -static-libgcc -static-libstdc++ -mwindows -DCURL_STATICLIB
LDLIBS = -L/usr/local/i686-w64-mingw32/lib   -Wl,--subsystem,windows -mwindows \
 -lwx_mswu_xrc-3.0-i686-w64-mingw32 -lwx_mswu_webview-3.0-i686-w64-mingw32 \
 -lwx_mswu_qa-3.0-i686-w64-mingw32 -lwx_baseu_net-3.0-i686-w64-mingw32 \
 -lwx_mswu_html-3.0-i686-w64-mingw32 -lwx_mswu_adv-3.0-i686-w64-mingw32 \
 -lwx_mswu_core-3.0-i686-w64-mingw32 -lwx_baseu_xml-3.0-i686-w64-mingw32 \
 -lwx_baseu-3.0-i686-w64-mingw32 -L/usr/i686-w64-mingw32/sys-root/mingw/lib -lcurl -lidn -liconv \
 -lssh2 -lssl -lcrypto -lpng -ljpeg -ltiff -lexpat -lwxregexu-3.0-i686-w64-mingw32 -lz -lrpcrt4 \
 -lwldap32 -loleaut32 -lole32 -luuid -lws2_32 -lwinspool -lwinmm -lshell32 -lcomctl32 -lcomdlg32 \
 -ladvapi32 -lwsock32 -lgdi32

Took a while to find the libraries (and static library order!) necessary to satisfy all the dependencies properly.

If you need it, here is the modified Makefile I’ve used to cross-compile:

  • [HakuNeko Makefile] for cross-compiling HakuNeko 1.4.2 for Windows on CentOS 7.3 x86_64 Linux (needs statically linked & installed wxWidgets first).

3c.) Source code modifications:

However, something will still not be quite right, because some of the crypto libraries provide the MD5 functions MD5_Init(), MD5_Update() as well as MD5_Final(), and those are already defined by HakuNeko itself. This will break the static linking, as redundant definitions won’t work. We’ll rely on the libraries (libcrypto, libssl), and comment the built-in declarations out in src/v7/v7.c:

void MD5_Init(MD5_CTX *c);
void MD5_Update(MD5_CTX *c, const unsigned char *data, size_t len);
void MD5_Final(unsigned char *md, MD5_CTX *c);

…becomes:

/* void MD5_Init(MD5_CTX *c);
 * void MD5_Update(MD5_CTX *c, const unsigned char *data, size_t len);
 * void MD5_Final(unsigned char *md, MD5_CTX *c);
 */

On top of that, the configure system may have generated src/main.cpp as well as src/main.h. Those are bogus files, turning the entire tool into nothing but one large “Hello World” program. Plus, that’s hard to debug, as the binary won’t even output “Hello World” on a Windows terminal when it’s built as a GUI tool. I only found the issue when testing it with Wine on Linux. ;)

Please delete src/main.cpp and src/main.h before continuing.

Now, if you’re really lucky, you should be able to run something like $ make -j6 and watch everything work out nicely. Or watch it crash and burn, which is the much, much more likely scenario, given that I’ve probably only shown you half of what I did to the build tools.

Well, in any case, there’s no need to run $ make install of course; just grab the binary build/msw/bin/hakuneko.exe, copy it off to some Windows machine and try to run it. If you’ve built the portable version, you may wish to rename the file to hakuneko-portable.exe, just like the official developers do.
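By the way, you can give the freshly built binary a quick smoke test without even leaving Linux – Wine is good enough to at least bring up the GUI, which is also how I caught the “Hello World” issue mentioned above in the first place:

$ wine build/msw/bin/hakuneko.exe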

4.) The future

Let’s just hope that the developers of HakuNeko can get this fixed for versions >=1.4.3, because I really, really don’t want to keep doing this. It’s extremely painful; cross-compiling is exactly the kind of living hell a lot of people say it is! I think it’s a miracle I managed to compile and run it at all, and it was so frustrating and tedious for somebody like me (who isn’t a developer).

The statement that this took “hours / days” wasn’t an exaggeration. I think it was something like 10-12 man hours of pure frustration to get it done. I mean, it does feel pretty nice when it finally works, but I wouldn’t bet on myself being able to do this again for future versions…

So please, make it work on Windows again, if possible, and keep HakuNeko cross-platform! It’s still my favorite tool for the task! Thanks! :)

Mar 13, 2017
 

Anyone who has logged in to a UNIX or Linux machine remotely from a Windows box probably knows puTTY, which is practically “the” SSH and telnet client for Windows. In conjunction with a small X11 server like Xming you can even do remote X. To my surprise, a new version has been released just last month, as a colleague told me! So there is version 0.68 now, and it comes in both a 32-bit and a 64-bit flavor.

Of course I had to try the 64-bit version on XP x64, and it did fail to execute:

64-bit puTTY failure on XP x64

A classic: 64-bit puTTY failure on XP x64, because it’s “not a valid Win32 application”

Out of curiosity, I fetched the puTTY source code from [here], because I thought I could just compile & link it myself. Building for 32-bit proved to be relatively easy: just load the corresponding solution file into Microsoft Visual Studio 2010, build the whole project, done. But I wanted 64-bit! So I created an x64 build target and gave it a shot, but the compiler couldn’t find the _addcarry_u64 intrinsic.

After a bit of searching on the web, it became clear that the intrinsics header of Visual Studio 2010 doesn’t provide it. It’s too old; you need Visual Studio 2013 or newer for that. The funny part is, puTTY only comes with project files for 2010 and 2012, so how are they building their x64 version? No idea. Maybe they’re linking against a different library version or something.

One attempt that I was going to make (build it with VS2013, linking against an older platform SDK) isn’t done yet, because I need to prepare my Windows 7 VM for it. I did manage to compile and run puTTY as 64-bit code on XP x64 by hacking up their program though! In the unpacked source tree, open sshbn.h and take a look at line #70:

#elif defined _MSC_VER && defined _M_AMD64

  /*
   * 64-bit BignumInt, using Visual Studio x86-64 compiler intrinsics.
   *
   * 64-bit Visual Studio doesn't provide very much in the way of help
   * here: there's no int128 type, and also no inline assembler giving
   * us direct access to the x86-64 MUL or ADC instructions. However,
   * there are compiler intrinsics giving us that access, so we can
   * use those - though it turns out we have to be a little careful,
   * since they seem to generate wrong code if their pointer-typed
   * output parameters alias their inputs. Hence all the internal temp
   * variables inside the macros.
   */

  #include <intrin.h>
  typedef unsigned char BignumCarry; /* the type _addcarry_u64 likes to use */
  typedef unsigned __int64 BignumInt;
  #define BIGNUM_INT_BITS 64
  #define BignumADC(ret, retc, a, b, c) do                \
      {                                                   \
          BignumInt ADC_tmp;                              \
          (retc) = _addcarry_u64(c, a, b, &ADC_tmp);      \
          (ret) = ADC_tmp;                                \
      } while (0)
  #define BignumMUL(rh, rl, a, b) do              \
      {                                           \
          BignumInt MULADD_hi;                    \
          (rl) = _umul128(a, b, &MULADD_hi);      \
          (rh) = MULADD_hi;                       \
      } while (0)
  #define BignumMULADD(rh, rl, a, b, addend) do                           \
      {                                                                   \
          BignumInt MULADD_lo, MULADD_hi;                                 \
          MULADD_lo = _umul128(a, b, &MULADD_hi);                         \
          MULADD_hi += _addcarry_u64(0, MULADD_lo, (addend), &(rl));      \
          (rh) = MULADD_hi;                                               \
      } while (0)
  #define BignumMULADD2(rh, rl, a, b, addend1, addend2) do                \
      {                                                                   \
          BignumInt MULADD_lo1, MULADD_lo2, MULADD_hi;                    \
          MULADD_lo1 = _umul128(a, b, &MULADD_hi);                        \
          MULADD_hi += _addcarry_u64(0, MULADD_lo1, (addend1), &MULADD_lo2); \
          MULADD_hi += _addcarry_u64(0, MULADD_lo2, (addend2), &(rl));    \
          (rh) = MULADD_hi;                                               \
      } while (0)

I just commented out the entire code block using that modern _addcarry_u64 intrinsic, and replaced it with the code being used for the 32-bit version:

#elif defined _MSC_VER && defined _M_AMD64

  /* 32-bit BignumInt, using Visual Studio __int64 as BignumDblInt
   * This is compatible with VS2010 & VS2012 for building a x86_64
   * version of puTTY (no __int128 with those compilers).
   */

  typedef unsigned int BignumInt;
  #define BIGNUM_INT_BITS  32
  #define DEFINE_BIGNUMDBLINT typedef unsigned __int64 BignumDblInt

I built that and it works, even though I keep thinking I should be using a wide 128-bit data type here (just like the original x64 code), but then we don’t have __int128 in MSVC before 2013, and I’m on 2010. And I don’t know how to use SSE registers in that context with things like __m128, which is why I left it alone. Looking good anyway:

puTTY 64-bit XP x64 version logged in to Windows 2000

puTTY 64-bit XP x64 version logged in to Windows 2000 using a modern SSH server (click to enlarge)

And:

puTTY 64-bit XP x64 version logged in to FreeBSD 10.3 UNIX

puTTY 64-bit XP x64 version logged in to FreeBSD 10.3 UNIX (click to enlarge)

In any case, here is a complete 64-bit build of puTTY that works on NT5.2 operating systems like Windows XP Professional x64 Edition or Windows Server 2003 x64:

  • [puTTY 0.68][1] (x64 version for NT5.2, portable without installer)

Maybe I’ll try to build a version with VS2013 on Windows 7 for the same platform target, we’ll see. But at least this works!

Oh and… No, I don’t really think anyone actually needs a 64-bit version of puTTY ;). Plus the 32-bit one works just fine on XP x64 / Server 2003 out-of-the-box anyway. But hey… You know… :roll:

[1] puTTY is © 1997-2017 Simon Tatham and is licensed under the MIT license

Mar 01, 2017
 

It’s rather rare for me to look for a Linux/UNIX replacement for some good Windows software instead of the other way around, but the source code editor [Notepad++] is one example of such software. The program gedit on the Gnome 2 desktop environment of my old CentOS 6 enterprise Linux isn’t bad, but it isn’t exactly good either. The thing I was missing most was a search & replace engine capable of regular expressions.

Of course, vi can do it, but at times vi can be a bit hard to use, so I kinda looked for a Notepad++ replacement. What I found was [Notepadqq], which is basically a clone based on the Qt5 UI toolkit. This editor is clearly made for more modern systems, but I still looked for a way to get it to compile and run on my CentOS 6.8 x86_64 Linux system. And I found one. Here are the most important prerequisites:

  • A new enough GCC (I used 6.2.0), because the v4.4.7 platform compiler won’t work with the modern C++ stuff
  • Qt5 libraries from the [EPEL] repository
  • git

First, you’ll want a new compiler. That part is surprisingly easy, but a bit time-consuming. Download a fresh GCC tarball from a server on the [mirrors list]; those sit in the releases/ subdirectory, so you want a file like gcc-6.3.0.tar.bz2 (my own version is still 6.2.0). It seems Notepadqq only needs some GCC 5, but since our platform compiler won’t cut it anyway, why not just use the latest?

Now, once more, this will take time – could well be hours – so you might wanna do the compilation step overnight. The last step needs root privileges:

$ tar -xjvf ./gcc-6.3.0.tar.bz2
$ cd ./gcc-6.3.0/
$ ./configure --program-suffix="-6.3.0"
$ make
# make install

And when you do this, please never forget to add a --program-suffix for the configuration step!  You might seriously fuck things up if you miss that! So double-check it!
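A quick sanity check after # make install doesn’t hurt either – the suffixed compiler should show up next to the untouched platform GCC:

$ gcc-6.3.0 --version   # the fresh compiler, installed to /usr/local/bin/
$ gcc --version         # must still report the stock 4.4.7 platform GCC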

When that’s finally done, let’s handle Qt5 next. I’ll be using a binary distribution to make things easy, and yeah, I didn’t just install the necessary packages, I got the whole Qt5 blob instead, too lazy to do the cherry picking. Oh, and if you don’t have it, add git as well:

# yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-6.noarch.rpm
# yum install qt5* git

I assume # yum install qt5-qtwebkit qt5-qtwebkit-devel qt5-qtsvg qt5-qtsvg-devel qt5-qttools qt5-qttools-devel should also be enough according to the requirements, but I didn’t try that. Now, enter some directory you generally use for source code and fetch the latest Notepadqq version (this will create a subfolder we’ll cd to):

$ git clone https://github.com/notepadqq/notepadqq.git
$ cd ./notepadqq

After that, we need to make sure that we’ll be using the correct compiler and that we’re linking against the correct libraries that came with it (like libstdc++.so.6.*). To do that, set the following environment variables, assuming you’re using bash as your shell (use lib/ instead of lib64/ folders if you’re on 32-bit x86):

$ export CC="gcc-6.3.0"
$ export CXX="g++-6.3.0"
$ export CPP="cpp-6.3.0"
$ export CFLAGS="-I/usr/local/include/ -L/usr/local/lib64/"
$ export CXXFLAGS="-I/usr/local/include/ -L/usr/local/lib64/"
$ export LDFLAGS="-L/usr/local/lib64/"

The C-related settings are probably not necessary as Qt5 stuff should be pure C++, but you’ll never know, so let’s play it safe.

With that, we’re including and linking against the correct libraries, and we’ll be using our modern compiler as well. Time to actually compile Notepadqq. To do that, we still need to tell it where to find the Qt5 versions of the qmake and lrelease binaries, but luckily we can solve that with some simple configuration options. So let’s do this – the last step requires root privileges again – from within the notepadqq/ directory that git clone created for us:

$ ./configure --qmake /usr/bin/qmake-qt5 --lrelease /usr/bin/lrelease-qt5
$ make
# make install

Now, there are some weird linking issues that I never got fixed on CentOS (some developer please tell me how – I have the same crap when building x265!). Because of that, we still can’t launch Notepadqq as-is; we need to give it an LD_LIBRARY_PATH to find the proper libraries at runtime. Let’s just create an executable launcher script /usr/local/sbin/notepadqq.sh for that. Edit it and enter the following code:

#!/bin/sh
LD_LIBRARY_PATH="/usr/local/lib64" /usr/local/bin/notepadqq "$@"

Make it executable with # chmod +x /usr/local/sbin/notepadqq.sh, use it as your launcher for Notepadqq, and you’re good to go with your Notepad++ replacement on good old CentOS 6.x:

Running Notepadqq on CentOS 6 Linux

Running the latest Notepadqq on CentOS 6 Linux with Qt5 version 5.6.1, state 2017-03-01

Now, let’s see whether it’s actually even that good… :roll:

Nov 19, 2016
 

Recently, after finding out that the old Intel GMA950 profits greatly from added memory bandwidth (see [here]), I wondered if the overclocking mechanism applied by the Windows tool [here] had leaked into the public after all this time. The developer of said tool refused to open-source the software even after it turned into abandonware – announced support for GMA X3100 and X4500 as well as MacOS X and Linux never came to be. Also, he never said how he managed to overclock the GMA950 in the first place.

Some hackers disassembled the code of the GMABooster however, and found out that all that’s needed is a simple PCI register modification, one that you could probably apply by yourself on Microsoft Windows using H.Oda!’s [WPCREdit].

Tools for PCI register modification do exist on Linux and UNIX as well of course, so I wondered whether I could apply this knowledge on FreeBSD UNIX too. Of course, I’m a few years late to the party, because people already solved this back in 2011! But just in case the scripts and commands disappear from the web, I wanted this to be documented here as well. First, let’s see whether we even have a GMA950 (of course I do, but still). It should be PCI device 0:0:2:0; you can use FreeBSD’s own pciconf utility or the lspci command from Linux:

# lspci | grep "00:02.0"
00:02.0 VGA compatible controller: Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller (rev 03)
 
# pciconf -lv pci0:0:2:0
vgapci0@pci0:0:2:0:    class=0x030000 card=0x30aa103c chip=0x27a28086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller'
    class      = display
    subclass   = VGA

Ok, to alter the GMA950’s render clock speed (we are not going to touch its 2D “desktop” speed), we have to write certain values into some PCI registers of that chip at 0xF0 and 0xF1. There are three different values regulating clock speed. Since we’re going to use setpci, you’ll need to install the sysutils/pciutils package on your machine via # pkg install pciutils. I tried to do it with FreeBSD’s native pciconf tool, but all I managed was to crash the machine a lot! Couldn’t get it solved that way (just me being too stupid, I guess), so we’ll rely on a Linux tool for this. Here is my version of the script, which I call gmaboost.sh. I placed it in /usr/local/sbin/ for global execution:

#!/bin/sh

case "$1" in
  200) clockStep=34 ;;
  250) clockStep=31 ;;
  400) clockStep=33 ;;
  *)
    echo "Wrong or no argument specified! You need to specify a GMA clock speed!" >&2
    echo "Usage: $0 [200|250|400]" >&2
    exit 1
  ;;
esac

setpci -s 02.0 F0.B=00,60
setpci -s 02.0 F0.B=$clockStep,05

echo "Clockspeed set to "$1"MHz"

Now you can do something like this: # gmaboost.sh 200 or # gmaboost.sh 400, etc. Interestingly, FreeBSD’s i915_kms graphics driver seems to have set the 3D render clock speed of my GMA950 to 400MHz already, so there was nothing to be gained for me in terms of performance. I can still clock it down to conserve energy though. A quick performance comparison using a crappy custom-recorded ioquake3 demo shows the following results:

  • 200MHz: 30.6fps
  • 250MHz: 35.8fps
  • 400MHz: 42.6fps

Hardware was a Core 2 Duo T7600 and the GPU was making use of two DDR-II/667 4-4-4 memory modules in dual channel configuration. Resolution was 1400×1050 with quite a few changes in the Quake III configuration to achieve more performance, so your results won’t be comparable, even when running ioquake3 on identical hardware. I’d post my ~/.ioquake3/baseq3/q3config.cfg here, but in my stupidity I just managed to freaking wipe the file out. Now I have to redo all the tuning, pfh.

But in any case, this really works!

Unfortunately, it only applies to the GMA950. And I still wonder what was so wrong with # pciconf -w -h pci0:0:2:0 0xF0 0060 && pciconf -w -h pci0:0:2:0 0xF0 3405 and the like. I tried a few combinations, just in case my byte order was messed up or in case I really had to write single bytes instead of half-words, but either the change wouldn’t apply at all, or the machine would just lock up. It would be nice to do this with only BSD tools on actual FreeBSD UNIX, but I guess I’m just too stupid for pciconf…

Sep 07, 2016
 

I’m not exactly a big fan of TeamViewer, since you never know what’s going to happen with that traffic of yours, so I prefer VNC over SSH instead. A few weeks ago I got TeamViewer access to a remote workstation machine for the purpose of processing A/V files however. Basically, it was about video and audio transcoding on said machine.

Since the stream meta data (like the language of an audio stream) wasn’t always there, I wanted to check it by playing back the files remotely in foobar2000 or MPC-HC. TeamViewer does offer a feature to relay the audio from a remote machine to your local box, as long as the remote server has some kind of soundcard / sound chip installed. I was using TeamViewer 11 – the newest version at the time of writing – to connect from CentOS 6.8 Linux to a Windows 7 Professional machine. Playing back audio yielded nothing but silence though.

Now, TeamViewer is actually not native Linux software. Both its Linux and MacOS X versions come with a bundled Wine 1.6 distribution preconfigured to run the 32-bit TeamViewer Windows binary. It was thus logical to assume that the configuration of TeamViewer’s built-in Wine was broken. This may happen in cases where you upgrade TeamViewer from previous releases (which is what I had done: 7 -> 8 -> 9 -> 11).

There is a multitude of proposed solutions to fix this, and since none of them worked for me as-is, I’d like to add my own to the mix. The first useful hint came from [here]. You absolutely need a working system-wide Wine setup for this. I already had one that I needed for work anyway, namely Wine 1.8.6 from the [EPEL] repository, configured using [winetricks]. We’re going to take some files from that installation and essentially replace TeamViewer’s own Wine with the one distributed by EPEL.

So I had TeamViewer 11 installed in /opt/teamviewer/, with some important configuration files in ~/.local/share/teamviewer11/ and ~/.config/teamviewer/. First, we back up TeamViewer’s Wine files and replace them with the platform ones (the paths may vary depending on your Linux distribution, but the file names should not):

# mv /opt/teamviewer/tv_bin/wine/bin/wine /opt/teamviewer/tv_bin/wine/bin/wine.BACKUP
# mv /opt/teamviewer/tv_bin/wine/bin/wineserver /opt/teamviewer/tv_bin/wine/bin/wineserver.BACKUP
# mv /opt/teamviewer/tv_bin/wine/bin/wine-preloader /opt/teamviewer/tv_bin/wine/bin/wine-preloader.BACKUP
# mv /opt/teamviewer/tv_bin/wine/lib/libwine.so.1.0 /opt/teamviewer/tv_bin/wine/lib/libwine.so.1.0.BACKUP
# mv /opt/teamviewer/tv_bin/wine/lib/wine/ /opt/teamviewer/tv_bin/wine/lib/wine.BACKUP/
# cp /usr/bin/wine /usr/bin/wineserver /usr/bin/wine-preloader /opt/teamviewer/tv_bin/wine/bin/
# cp /usr/lib/libwine.so.1.0 /opt/teamviewer/tv_bin/wine/lib/
# cp -r /usr/lib/wine/ /opt/teamviewer/tv_bin/wine/lib/

This will replace all the binaries and libraries, in my case shoving Wine 1.8.6 underneath TeamViewer. This isn’t all that’s needed however. We’ll also need the system registry hive of your working Wine installation (the one with sound). That should be stored in ~/.wine/system.reg! Let’s replace TeamViewer’s own hive with it:

$ mv ~/.local/share/teamviewer11/system.reg ~/.local/share/teamviewer11/system.reg.BACKUP
$ cp ~/.wine/system.reg ~/.local/share/teamviewer11/

Ok, and the final part is adding the proper Linux audio backend to this Wine’s configuration. That part is stored in ~/.wine/user.reg. Replacing the whole file didn’t work for me though, as TeamViewer would crash upon launch, probably missing some keys from its own user.reg. So let’s just edit its file instead: open ~/.local/share/teamviewer11/user.reg with your favorite text editor and add the following line in the proper location (it’s sorted alphabetically):

[Software\\Wine\\Drivers\\winepulse.drv] 1473239241

The corresponding driver file should now be found within TeamViewer’s replaced Wine distribution, by the way; in my case it’s /opt/teamviewer/tv_bin/wine/lib/wine/fakedlls/winepulse.drv.
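If you want to verify that the driver actually came along when copying the EPEL Wine files over, a quick search does it:

$ find /opt/teamviewer/tv_bin/wine -name 'winepulse*'
/opt/teamviewer/tv_bin/wine/lib/wine/fakedlls/winepulse.drv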

Now, run the TeamViewer profile updater (some people say it’s required to make this work; it wasn’t for me, but it didn’t hurt either): $ /opt/teamviewer/tv_bin/TeamViewer --update-profile, and then its Wine configuration: $ /opt/teamviewer/tv_bin/TeamViewer --winecfg. After that, you should be greeted with this:

TeamViewer 11 running its own winecfg

TeamViewer 11 running its own copy of winecfg.

Before the modifications, the configuration window would show “None” as the driver, without any way to change it. So no audio, whereas we have Pulseaudio now. Press “Test Sound” if you want to check whether it truly works. I haven’t tested the ALSA backend by the way. In my case, as soon as the registry was fixed, Wine just autoselected Pulseaudio, which is fine for me.

Now launch TeamViewer and check out the audio options in this submenu:

TeamViewer preferences

The TeamViewer 11 preferences can be found here.

It should look like this:

TeamViewers audio options

Make sure “Play computer sounds and music” is checked! (click to enlarge)

Now, after having connected and logged in, you may also wish to verify the conference audio settings in TeamViewer’s top menu:

TeamViewer 11 conference audio settings

TeamViewer 11 conference audio settings, make sure “Computer sound” is checked!

When you play a sound file on the remote computer, you should hear it on your local one as well. With that, I can finally test the audio files I’m supposed to use on that remote machine for their actual language (which is a rather important detail) where meta data isn’t available.

This seems to be a problem of TeamViewer’s installation / update procedure which hasn’t been addressed for several major releases now. I presume just removing all traces of TeamViewer and installing it from scratch might also do the trick, but I didn’t try it myself.

Ah, and one more thing: If you can’t launch TeamViewer on CentOS 6.x because you’re getting the following error…

teamviewerd error

TeamViewer Daemon not running…

…forget about the solutions on the web, and about what this message is telling you. TeamViewer 11 uses a systemd-style script for launching its daemon on Linux now, and that won’t do on SysV init systems. Just become root and launch the crap manually: # /opt/teamviewer/tv_bin/teamviewerd &, then press <CTRL>+<d> and it works!
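If you don’t want to do that manually after every reboot, a crude SysV-style init script can stand in for the missing one. Just a minimal, untested sketch – save it as /etc/init.d/teamviewerd, # chmod +x it and register it with # chkconfig --add teamviewerd:

#!/bin/sh
# chkconfig: 345 99 01
# description: crude SysV wrapper for the TeamViewer 11 daemon (sketch!)
case "$1" in
  start) /opt/teamviewer/tv_bin/teamviewerd & ;;
  stop)  killall teamviewerd ;;
  *)     echo "Usage: $0 {start|stop}" >&2; exit 1 ;;
esac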

Let’s hope that daemon isn’t doing anything evil while running as root. :roll:

Aug 05, 2016
 

Short story: It’s [VirtualDimension].

Long story… It’s most definitely not what Microsoft added in Windows 10. Besides being limited to Windows 10, that one just sucks for a multitude of reasons. And there I was, having hopes for it as well… If you’ve ever used multiple desktops on a graphical, X11-based Linux or UNIX window manager / desktop environment, you know what I’m talking about. Usually, what you get on those systems, whether KDE, Gnome, Xfce4, LXDE or whatever, is just one small, configurable panel which allows you to control multiple virtual desktops. On my Gnome 2 on CentOS 6.8 Linux, it looks like this (others are very similar):

Virtual desktops on Gnome 2

Virtual desktops on Gnome 2

The leftmost desktop is my usual “Internet” environment: here I chat, read emails, browse the web for anything work-related and so on. The second one is a Linux distribution development desktop; here I’m building a derivative of Klaus Knopper’s [Knoppix] distro. Then comes the testing environment for said distribution on desktop #3, usually some shells and one VMware Player instance. Next to that are two more VM desktops for software testing and for writing user guides for software installation on different operating systems; at the moment that’s MacOS X and a Windows XP x64 software build VM. Usually there’s also a Windows 7 one. One desktop is empty (for arbitrary stuff), then comes the server administration desktop with 9 open shells, one for each server. And the last one is my private desktop with yet another web browser, and some shells for spawning screen sessions that run software compilations, encoding runs and the like.

Now, I have a 30″ screen both at home and at work, with a resolution of 2560×1600. But it’s just never enough screen real estate. So I wanted well-integrated virtual desktops for Windows as well, but the last time I tried out some software, I couldn’t find a good one. Recently, I tried again for some reason, like “let’s give this one last shot”. And I tried a lot of programs!

Among the software tested were [Dexpot], [Finestra], [VirtuaWin], [WindowsPager], Xilisoft [Multiple Desktops] and the Windows PowerToy predecessor of [Desktops 2.0] written by Windows Hacker Mark Russinovich and Bryce Cogswell. And finally, [VirtualDimension]. Some of those are free and open source software, others are not.

One of my primary requirements was compatibility with Windows XP x64. Of course it’d be nice if it worked on Windows 10 as well. But most of the above had important features missing or were severely misbehaving on XP. Some were just very, very sluggish when switching desktops. Others had missing features to begin with, like previews on the desktop tiles. A blank desktop tile doesn’t help at all, as I need to see roughly what’s running where at a glance.

I’m not gonna make this a lengthy top list or anything, I’m just gonna show you what the software of my choice – VirtualDimension – could do for me, let’s look at the tiles first:

VirtualDimension on XP x64

VirtualDimension on XP x64

We’ll start with my good old XP x64 first. Here you can see my system tray, and Miranda being open. VirtualDimension cannot be embedded into the taskbar properly (damn), but it has an “always on top” feature. Since the contact list in my docked and always-launched Miranda doesn’t go all the way down, there is free and unused space there. Perfect for VirtualDimension! And since it’s always on top, it doesn’t disappear when clicking on Miranda for chatting.

The source code quite clearly comes from a UNIX or Linux user (he built it with GCC/MinGW after all), so some features immediately ring a bell. Like “mouse warp”, where you switch desktops by moving your mouse to the border of the screen. I disabled that – don’t like it. But yeah, it’s there.

Important: While it doesn’t give you live window geometry previews, it does give you iconized previews, so you can always identify any desktop quickly by seeing what’s running there. The desktops can also be named, and there is an OSD that you can have pop up on you when switching, like so:

VirtualDimension OSD

OSD showing right after a desktop switch

In this case I had just switched to desktop #2, which is for A/V processing exclusively. This is just the top left part of the screen, where one of my eight transcoding shells was running an x265 benchmark prototype test. Color, display duration, font and size of the OSD are configurable, as is its transparency to mouse clicks.

Also, you can freely define keyboard shortcuts for switching desktops. I chose CTRL+Shift+Right and CTRL+Shift+Left for switching desktops, and Alt+Right / Alt+Left for pushing a window to the next/previous desktop, as those don’t conflict with other shortcuts I’m using.

What else can it do? Let’s right click on one of the icons in the preview tiles:

VirtualDimension iconized window right click

Clicking on a program icon in VirtualDimensions’ preview tiles gives you this menu

The first five options from the top are global ones. The ones below, however, are specific to the icon you right-clicked. With “Activate”, you switch to the target desktop and put focus on that program’s window. The others are pretty self-explanatory as well, I guess. We also get a graceful “Close” option, and a brutal “Kill” option that’s equivalent to murdering the process in the Task Manager. Maybe useful, since it’s faster that way.

And if we click on the free area?

VirtualDimension, right click on the free area of a preview tile

Right-clicking on the free area of a preview tile gives you a list of all programs on that desktop.

Ok, not sure how useful that is, but at least it may help with identifying the windows on a desktop in more detail, as you get the window titles here. For my encoding shells I can get a very quick glance at the progress, but not exactly in great detail. So the helpfulness of this is limited.

What else?

Well, it’s extremely fast! That’s one major plus for VirtualDimension, as several of the other solutions (open source ones as well) were abysmally slow, at least on XP x64. But damn, VirtualDimension just flies! And its memory footprint is minimal: I saw less than 12MiB of consumption here. Even if you add a truckload of desktops (there seems to be no upper limit), it just won’t slow down unless you spawn like 50 CPU-intensive processes all over the place, killing your CPU or maxing out your RAM. But that wouldn’t be VirtualDimension’s fault then. Its memory footprint grows linearly with more desktops, so with eight you may see around 20MiB. Still neat.

And what’s bad about it?

Well, sometimes, if you have a lot of windows on one desktop, the icons aren’t cut off in the right spot at the bottom of the preview tile, so they overflow just a little bit. Just a cosmetic issue. Also, you should maybe deactivate the shell integration. With this, VirtualDimension hooks itself into all windows (such a DLL hook means entering another process’s memory space). In return, you get its functions via right-clicking on a window’s title bar, like on UNIX.

Nice, but dangerous! This can trigger anti-cheat systems in online games, because they really don’t like you stepping into their process’s memory! That’s what cheating tools do to modify a game’s parameters on the fly as well. You don’t wanna be banned because of such a thing!

In my case, I managed to lock myself out of Mechwarrior Online because of this. I wasn’t banned, but the login process wouldn’t even let me launch its window. Disabling the feature, launching MWO, then re-enabling it and trying to log in caused a pretty abnormal process termination:

Mechwarrior Online really doesn't like VirtualDimensions' shell integration feature

Mechwarrior Online really doesn’t like VirtualDimensions’ shell integration feature! And no, there was no “update available”… (click to enlarge)

There is an exception list for this feature, and I added all of MWO’s .exe files to it, to no avail. Better to stay away from this one.

Now, well, this otherwise beautiful piece of software was dropped by its developer around 2005, about the time my XP x64 came out. The latest alpha build is from some time in 2006. So this is ancient! It even supports Windows 98 and NT 4.0, I mean… So, how about Windows 10 then? Windows 10 doesn’t even have a GDI UI anymore; this is like one completely different world. Since I do have a Windows 10 machine (yeah, ew), let’s check it out:

VirtualDimension on Windows 10

VirtualDimension on Windows 10 – hey, it really works!?

Miranda seemingly can’t dock properly on Windows 10 anymore. It kinda… floats near the desktop border when docked. It’s strange and it wastes space, but well, I don’t know how to fix that yet. But anyway, I embedded VirtualDimension into Miranda (by just moving the window there, removing its title bar and resizing it properly again). And guess what?

It just works™!

I launched some Metro / Modern UI apps in a window as well, and while those aren’t shown in the preview tiles, they can be controlled with keyboard shortcuts, just like regular windows. Also, it’s just as blazing fast as it is on XP. Ah, and… yeah, it actually does work on all 64-bit x86 Windows versions, it seems! It’s amazing, but an ancient piece of 32-bit software that alters a Windows system’s usage pattern quite fundamentally still works fine on Windows 10 64-bit. I gotta say, I’m pretty relieved, because Windows 10’s own solution just sucks – where is my live preview? – and I don’t want to change my usage paradigm too much when switching operating systems (even from Linux/UNIX to Windows and back).

Some of the other solutions like Dexpot or Finestra may be faster on Windows 10 than they are on a just half-supported XP x64, but nah, with this tool, I don’t need them anymore, no matter the operating system.

VirtualDimension is as perfect as it gets, despite its age! Or maybe because of it?

Still, anyone interested in picking up the [VirtualDimension project on SourceForge] and continuing its development? ;) I guess I can’t touch that code; I’d probably just mess everything up. Ah, it’s C++ by the way…

A few things could use some fixing: the icon overflow issue, Modern UI window detection, certain rare windows being sticky on all desktops even if nobody told them to be (Miranda, X-Chat DCC windows), and that exclusion list for the shell integration – which doesn’t hook into all windows properly when active anyway.

It would be so nice if somebody could continue working on this! :)

Until that happens (I know, it never will), I’ll just continue using v0.95 alpha. ;)

Feb 252016
 

SlySoft logoIt’s over – after 13 years of being almost constantly under pressure from US-based companies, SlySoft finally had to close its doors. Best known for software such as CloneCD or AnyDVD, the Antigua-based company has provided people all over the world with ways to quickly and easily circumvent disc-based copy protection mechanisms such as Sony ARccOS, CSS, AACS, BD+ and many others for years.

The company’s founder, a certain Mr. Giancarlo Bettini, had already been taken to court – successfully so – in Antigua. While it was strictly up to the Antiguan authorities to prosecute SlySoft (because the AACS-LA could not do so themselves due to some legal constraints), this finally happened in 2012, resulting in a fine of USD $30,000 for Mr. Bettini. That didn’t result in SlySoft closing down, however.

What exactly happened a few days ago is unclear, as SlySoft seems to be under NDA or legal pressure not to release any statement regarding the reasons for the shutdown; quote: “We were not allowed to respond to any request nor to post any statement”. The only thing we have besides a forum thread with next to zero information is the statement on the official website, which is rather concise as well:

“Due to recent regulatory requirements we have had to cease all activities relating to SlySoft Inc.
We wish to thank our loyal customers/clients for their patronage over the years.”

It should be relatively clear, however, that this has something to do with the AACS-LA, several movie studios, and software and hardware companies “reminding” the United States government of SlySoft’s illicit activities just recently. That reminder would have resulted in Antigua being put onto the US priority [watchlist] of countries violating US/international copyright laws. Ultimately, being put on that list can lead to trade barriers going up within a short time, hurting a country’s economy – thus escalating the whole SlySoft affair into an international incident. More information [here].

AnyDVD HD

This little program and its little brothers made it all the way to the top and became an international incident! Quite the career…

It seems – and here is where my pure speculation starts – that some kind of agreement was reached between SlySoft’s founder and the AACS-LA and/or the Antiguan and US governments, resulting in the immediate shutdown of SlySoft without further consequences for either its founder or other members of the company. If true, SlySoft will surely also have to break their promise of releasing a “final” version of AnyDVD HD including all the decryption keys from the online database in case they ever had to close their doors for good. That is, after all, what “[…] we have had to cease all activities relating to SlySoft Inc. […]” means.

So what are the consequences, technically?

I can only speak for AnyDVD HD, but according to the forums over at SlySoft, the latest version 7.6.9.0 supposedly includes some 130,000 AACS keys and should still be able to decrypt a lot of Blu-Rays, even if not all of them.

In the end, however, the situation can only deteriorate as time passes and new AACS keys and BD+ certificates are released – even if you bypass the removed DNS A records of key.slysoft.com and access one of the key servers by resolving the IP address locally (via your hosts file). Thing is, nobody can tell when SlySoft will be forced to make their services inaccessible by more drastic means, like simply switching off the machines themselves.
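If you want to try the hosts file route, the entry would look something like this – the address below is just a documentation placeholder, you’d have to substitute the key server’s real IP yourself, which I won’t post here:

# Windows: %SystemRoot%\System32\drivers\etc\hosts – UNIX: /etc/hosts
203.0.113.10    key.slysoft.com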

But even if they stay online for years to come, no new keys or certificates are going to be added, so it’s probably safe to say that the red fox is truly dead.

Addendum: Just to be clear for those of you who are scared of even accessing any SlySoft machines via their real IPs any longer: according to a SlySoft employee (you can read it in their forums), all of the servers are still 100% under SlySoft’s physical control, and their storage backends are encrypted. They were not raided or anything. So it seems you do not have to fear “somebody else listening” on SlySoft’s key servers.

PS.: A sad day if you ask me, a victorious one if you ask the movie industry. Maybe somebody should just walk over and tell them that cheaper, DRM-free media actually work a lot better on the market than jailing users into black boxes “trusted” (by them), with forced software updates and closed software. Yeah, I actually want to play my BD movies on the PC (legally!!), and on systems based on free software like Linux and BSD UNIX as well, not on some black-boxed HW player, so go suck it down, Hollywood. I mean, I’m even BUYING your shit, for Christ’s sake…

Oh, by the way, China is actually sitting on that copyright watchlist (I mean, obviously), and they gave us DVDFab. Also, there are MakeMKV and [others as well]. We’ll see whether the AACS-LA can hunt them all down… and even if they can, will it really make them more money? Debatable at best…

Red Fox logoUpdate: Those guys work fast! While SlySoft is gone, several of the developers have grabbed the software and moved the servers to Belize. The discussion forums have already been migrated, and a new version 7.6.9.1 of AnyDVD HD has been released, including new keys and reconfigured to access the new key servers as well. The company is now called “Red Fox”, and the forums can be reached via [forum.redfox.bz].

For now, AnyDVD HD respects the old licenses as well, and this will stay that way for the transition period. Ultimately, however, according to posts on the forums, people will have to buy new licenses, even if they had a lifetime license before. They also said they’ll cook up “something nice” for people who bought licenses just recently – some kind of discount, I presume.

Still, if I may quote one of the developers: “SlySoft is dead, long live RedFox!”

Feb 202016
 

H.265/HEVC logoRecently, after [successfully compiling] the next-generation x265 H.265/HEVC video encoder on Windows, Linux and FreeBSD, I decided to ask for guidance when it comes to compressing Anime (live action will follow at a later time) in the Doom9 forums, [see here]. Thing is, I didn’t understand all of the knobs x265 has to offer, and some of the convenient presets of x264 don’t exist there (like --tune film and --tune animation). So for a newbie it can be quite hard to make x265 perform well without sacrificing far too much CPU power, as x265 is significantly more taxing on the processor than x264.

Thanks to [Asmodian] and [MeteorRain]/[LittlePox], I got rid of x265’s blurring issues, and I took their settings and turned them up to achieve more quality while staying within sane encoding times. My goal was to be able to encode 1080p ~24fps videos on an Intel Xeon X5690 hexa-core @ 3.6GHz all-core boost clock at >=1fps for a target bitrate of 2.5Mbit.

In this post, I’d like to compare 7 scenes from the highly opulent Anime [The Garden of Words] (言の葉の庭) by [Makoto Shinkai] (新海 誠) at three different average bitrates: 1Mbit, 2.5Mbit (my current x264 default) and 5Mbit. The Blu-Ray source material is H.264/AVC at roughly 28Mbit on average. Also, both encoders are running in 10-bit color depth mode instead of the common 8-bit, meaning that the internal arithmetic precision is boosted from 8- to 16-bit integers as well. While somewhat “non-standard” for H.264/AVC, this is officially supported by H.265/HEVC for Blu-Ray 4K. The mode of operation is 2-pass, to aim for comparable file sizes and bitrates. The encoding speed penalty for switching from x264 to x265 at the given settings is around a factor of 8, give or take.

The screenshots below are losslessly compressed 1920×1080 images. Since this is all about compression, I chose to serve the large versions of the images in WebP format to all browsers which support it (Opera 11+, Chromium-based browsers like Chrome, Iron or Vivaldi, the Android browser, and Pale Moon as the only Gecko-based one). I did this because at maximum level, WebP does lossless compression much more efficiently, so the pictures are smaller – which helps, because my server only has 8Mbit/s upstream. If your browser doesn’t support WebP (like Firefox, IE, Edge, Safari), it’ll be fed lossless PNG instead. All of this happens automatically, you don’t need to do anything!
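In case you’re wondering how that works: the usual trick is to let the web server inspect the browser’s Accept header and transparently rewrite .png requests to their .webp twins where those exist. A minimal sketch for Apache – assuming mod_rewrite and mod_headers are available and every .webp file sits right next to its .png counterpart – could look like this:

RewriteEngine On
# Browser advertises WebP support?
RewriteCond %{HTTP_ACCEPT} image/webp
# A .webp twin of the requested .png exists?
RewriteCond %{DOCUMENT_ROOT}/$1.webp -f
# Then serve it with the proper MIME type
RewriteRule ^(.+)\.png$ $1.webp [T=image/webp,L]
# And keep caches from mixing up the two variants
Header append Vary Accept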

Now, let’s start with the specs.

Specifications:

Here are the source material encoding settings according to the video stream header:

cabac=1 / ref=4 / deblock=1:1:1 / analyse=0x3:0x133 / me=umh / subme=10 / psy=1 / psy_rd=0.40:0.00 /
mixed_ref=1 / me_range=24 / chroma_me=1 / trellis=2 / 8x8dct=1 / cqm=0 / deadzone=21,11 /
fast_pskip=1 / chroma_qp_offset=-2 / threads=12 / lookahead_threads=1 / sliced_threads=0 / slices=4 /
nr=0 / decimate=1 / interlaced=0 / bluray_compat=1 / constrained_intra=0 / bframes=3 / b_pyramid=1 /
b_adapt=2 / b_bias=0 / direct=3 / weightb=1 / open_gop=1 / weightp=1 / keyint=24 / keyint_min=1 /
scenecut=40 / intra_refresh=0 / rc_lookahead=24 / rc=2pass / mbtree=1 / bitrate=28229 /
ratetol=1.0 / qcomp=0.60 / qpmin=0 / qpmax=69 / qpstep=4 / cplxblur=20.0 / qblur=0.5 /
vbv_maxrate=31600 / vbv_bufsize=30000 / nal_hrd=vbr / filler=0 / ip_ratio=1.40 / aq=1:0.60

x264 10-bit encoding settings (pass 1 & pass 2), 2.5Mbit example:

--fps 24000/1001 --preset veryslow --tune animation --open-gop --b-adapt 2 --b-pyramid normal -f -2:0
--bitrate 2500 --aq-mode 1 -p 1 --slow-firstpass --stats v.stats -t 2 --no-fast-pskip --cqm flat
--non-deterministic

--fps 24000/1001 --preset veryslow --tune animation --open-gop --b-adapt 2 --b-pyramid normal -f -2:0
--bitrate 2500 --aq-mode 1 -p 2 --stats v.stats -t 2 --no-fast-pskip --cqm flat --non-deterministic
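For reference, a full pass 1 invocation with the options above would look roughly like this – just a sketch with a placeholder input file name; since my x264 build is linked against libav (more on that below), it can read the .h264 source directly, and the first pass output simply goes to /dev/null:

$ x264 --fps 24000/1001 --preset veryslow --tune animation --open-gop --b-adapt 2 \
  --b-pyramid normal -f -2:0 --bitrate 2500 --aq-mode 1 -p 1 --slow-firstpass \
  --stats v.stats -t 2 --no-fast-pskip --cqm flat --non-deterministic \
  -o /dev/null input.h264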

x265 10-bit encoding settings (pass 1 & pass 2), 2.5Mbit example:

--y4m -D 10 --fps 24000/1001 -p veryslow --open-gop --bframes 16 --b-pyramid --bitrate 2500 --rect
--amp --aq-mode 3 --no-sao --qcomp 0.75 --no-strong-intra-smoothing --psy-rd 1.6 --psy-rdoq 5.0
--rdoq-level 1 --tu-inter-depth 4 --tu-intra-depth 4 --ctu 32 --max-tu-size 16 --pass 1
--slow-firstpass --stats v.stats --sar 1 --range full

--y4m -D 10 --fps 24000/1001 -p veryslow --open-gop --bframes 16 --b-pyramid --bitrate 2500 --rect
--amp --aq-mode 3 --no-sao --qcomp 0.75 --no-strong-intra-smoothing --psy-rd 1.6 --psy-rdoq 5.0
--rdoq-level 1 --tu-inter-depth 4 --tu-intra-depth 4 --ctu 32 --max-tu-size 16 --pass 2
--stats v.stats --sar 1 --range full

Since x265 can only read raw YUV and Y4M, the source video is fed to it via [libav’s] avconv tool, which pipes it into x265. The avconv command line for that looks as follows:

$ avconv -r 24000/1001 -i input.h264 -f yuv4mpegpipe -pix_fmt yuv420p -r 24000/1001 - 2>/dev/null

If you want to do something similar but you don’t like avconv, you can use [ffmpeg] as a replacement; the options are exactly the same. Note that you should always specify the correct frame rate (-r) for input and output, or the encoder’s bitrate setting will be applied wrongly!
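Putting the two together, the complete pass 1 pipeline then looks roughly like this (a sketch with placeholder file names; the lone “-” near the end tells x265 to read from stdin, and the first pass output is discarded):

$ avconv -r 24000/1001 -i input.h264 -f yuv4mpegpipe -pix_fmt yuv420p -r 24000/1001 - 2>/dev/null \
  | x265 --y4m -D 10 --fps 24000/1001 -p veryslow --open-gop --bframes 16 --b-pyramid \
    --bitrate 2500 --rect --amp --aq-mode 3 --no-sao --qcomp 0.75 --no-strong-intra-smoothing \
    --psy-rd 1.6 --psy-rdoq 5.0 --rdoq-level 1 --tu-inter-depth 4 --tu-intra-depth 4 \
    --ctu 32 --max-tu-size 16 --pass 1 --slow-firstpass --stats v.stats --sar 1 --range full \
    - -o /dev/null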

x264, on the other hand, was linked against libav directly, using its decoding capabilities without any workarounds.

Let’s compare:

“The Garden of Words” has a lot of rain. This is a central story element of the 46 minute movie, and it’s hard on any encoder, because a lot of stuff is moving on screen all the time. Let’s take a look at such a scene for our first side-by-side comparison. Each comparison is done in two rows: H.264/AVC first (including the source material), and below that H.265/HEVC, also including the source.

Let’s go:

Scene 1, H.264/AVC encoded by x264 0.148.x:

Scene 1, H.265/HEVC encoded by x265 1.9+15-425b583f25db:

It has been said that x265 performs particularly well at two things: very high resolutions (which we don’t have here) and low bitrates. And yep, it shows. When comparing the 1Mbit shots, it becomes clear pretty quickly that x265 manages to preserve more detail in the parts with lots of motion. x264, on the other hand, starts to wash out the scene pretty severely, smearing out some raindrops, spray water and parts of the foliage. It’s also pretty bad around the outlines, but that’s true for both encoders – you can spot it easily with all the aliasing artifacts on the raindrops.

Moving up a notch, it becomes very hard to distinguish between the two. When zooming in you can still spot a few minor differences (note the kid’s umbrella, even if it’s not marked), but it’s quite negligible. Here I’m already starting to think x265 might not be worth it in all cases. There are still differences between the two 2.5Mbit shots and the original, however – see the red areas of the umbrella and the darkest, lowest-contrast parts of the foliage.

At 5Mbit, I really can’t see any difference anymore. Maybe the colors are a little off or something, but when seen in motion, distinguishing between the two and the original becomes virtually impossible. Given that we just threw a really difficult scene at x264 and x265, this trend should continue throughout the whole test.

Now, even more rain:

Scene 2, H.264/AVC encoded by x264 0.148.x:

Scene 2, H.265/HEVC encoded by x265 1.9+15-425b583f25db:

Now this is extreme at 1Mbit! Looking at H.264, the spray water on top of the cable can’t even be told apart from the cloud in the background anymore. Detail loss all over the scene is catastrophic in comparison to the original. Tons of raindrops are simply gone entirely, and the texture details on the tower and the angled brick wall of the house to the left? Almost completely washed out and smeared.

Now, let’s look at H.265 @ 1Mbit. The spray water is also pretty bad, but it’s amazing how much more detail was preserved overall. Sure, there are still parts of the raindrops missing, but it’s much, much closer to the original. We can now see details on the walls as well, even on the steeply angled one to the left. The only serious issue is the red light bleeding at the tower: there is very little red there in the original, so I’m not sure what happened. x264 does this as well, but x265 is a bit worse.

At the next level, the differences are less pronounced again, but there is still a significant enough improvement when going from x264 to x265 at 2.5Mbit: The spray water on the cable becomes more well-defined, and more rain is being preserved. Also, the textures on the walls are a tiny little bit more detailed and crisp. Once again though, x265 is bleeding too much red light at the tower.

Since it’s still noticeably below the level of the source, let’s look at 5Mbit briefly. x265 is able to preserve a tiny bit more rain detail here, coming extremely close to the original. In motion, you can’t really see the difference, however.

Now, let’s get steamy:

Scene 3, H.264/AVC encoded by x264 0.148.x:

Scene 3, H.265/HEVC encoded by x265 1.9+15-425b583f25db:

1Mbit first again, and let me just say: it’s ugly. x264 pretty much messes up the steam coming from the iron – we get lots of block artifacts now. Some of the low-contrast patterns on the ironing board are smeared out terribly, and the bokeh background partly shows block artifacts and banding. x265 produces quite a lot of banding here itself, but no blocks. Also, outlines and sharp contrasts are more well-defined, and the low-contrast parts are done noticeably better.

At 2.5Mbit, the patterns repeat themselves. The steam is only slightly better with x265, outlines are slightly more well-defined, and the low-contrast patterns are slightly more visible. For some of the blurred parts, x265 seems a tiny bit too prone to banding though; in a very few spots, x264 might be just that 1% better. Overall however, x265 wins this, if only for the better outlines.

At 5Mbit, you really need to zoom in and analyze very small details, e.g. around the outer border of the steam. Yes, x265 does better again, but you wouldn’t really notice it while watching.

How about we go cry a little bit:

Scene 4, H.264/AVC encoded by x264 0.148.x:

Scene 4, H.265/HEVC encoded by x265 1.9+15-425b583f25db:

Cutting onions would be a classic fun moment in a slice-of-life Anime; here, it’s just kitchen work. And quite bad-looking kitchen work for H.264 at 1Mbit. The letters on the knife are partly lost completely, becoming unreadable. The onion parts that fly off are visibly worse than when encoded with x265 at the same bitrate. Also, x264 produced block artifacts in the blurred bokeh areas again that simply aren’t there with x265.

On the next level, the two come much closer to each other. However, x265 simply does the outlines better: fewer artifacts and sharper edges, just like with the writing on the knife’s blade. The issues with the bokeh are nearly gone; what’s left is a negligible amount of blocking for x264 and banding for x265, neither really noticeable.

Finally, at 5Mbit, x265 shows those ever so slightly more well-done outlines. But that’s about it, the rest looks nice for both, and pretty much identical to the source.

Now, please, dig in:

Scene 5, H.264/AVC encoded by x264 0.148.x:

Scene 5, H.265/HEVC encoded by x265 1.9+15-425b583f25db:

Let’s keep this short: x264 does blocking, and bad transitions/outlines. x265 does it better. Plain and simple.

At 2.5Mbit, x265 nearly reaches transparency when compared to the original, something x264 falls just short of. While x265 renders the outlines and the steam quite like in the original frame, x264 rips the outlines apart a bit too much, and slight block artifacts can again be seen in the steam.

At 5Mbit, x264 still shows some blocking artifacts in a part that most lossy image/video compression algorithms traditionally suck at: the reds. While not true for all human beings, most eyes perceive much finer gradients in the greens, then the blues, and do worst with the reds – our eyes have an unequal sensitivity distribution across colors. So image and video codecs try to save bitrate in the reds first, because supposedly it’s less noticeable there. To me, subjectively, x265 achieves transparency here, meaning it looks just like the original. x264 doesn’t quite manage. Close, but not entirely.

Next:

Scene 6, H.264/AVC encoded by x264 0.148.x:

Scene 6, H.265/HEVC encoded by x265 1.9+15-425b583f25db:

This is a highly static scene with only a few moving parts, so there is some rain again, and some shadows cast by raindrops as well. For the static parts, B frames really work wonders here: most detail is preserved by both encoders. What’s supposed to happen is happening – the encoders save bitrate where the human eye can’t easily tell, in the parts where stuff is moving around very quickly. That’s how we lose a lot of raindrop shadows and some drops as well. x264 seems to have trouble splitting the scene into small enough macroblocks though? Not sure if that’s the reason, but a lot of mesh detail in the basket on the balcony at the top right is lost – x265 does better there! Maybe x264 couldn’t distinguish the moving drops from the static background so well anymore?

At 2.5Mbit, the scenes become almost indistinguishable. The more static content we have, the easier it gets of course, so the transparency threshold becomes lower. And if you ask me, both of them reach perfect quality at 5Mbit.

Let’s throw another hard one at them for the last round:

Scene 7, H.264/AVC encoded by x264 0.148.x:

Scene 7, H.265/HEVC encoded by x265 1.9+15-425b583f25db:

Enough rain already? Pffh! Here we have a lot of foliage and low contrast added to the mix. x264 smears it a lot: rain detail is lost, and the fine details of the bushes turn into green mud. x265 also loses too much detail here (I mean, 1Mbit really is NOT much), but again, it fares quite a bit better.

At 2.5Mbit, both encoders do very well. Somehow, this scene doesn’t seem to penalize x264 that much at the medium level; you’d really need your magnifying glass to find the spots where x265 still does better, which surprises me a bit for this scene. And finally, at 5Mbit – if you ask me – visual transparency is reached by both x264 and x265.

Final thoughts:

Clearly, it’s true what a lot of people have been saying: x265 rocks at low bitrates, if configured correctly. But that still isn’t going to give me perfect quality. Yeah, it sucks less – much less – than x264 in that department, but at the higher 2.5Mbit, where both start looking quite decent and x265 has just a slight edge, it becomes hard to justify, simply because it’s that much slower to run at decent settings.

Also, you need to take device compatibility into account. Sure, a powerful PC can always play the stuff, no matter whether it runs some UNIX, Linux, MacOS X or Windows. But how about video game consoles? Older TVs? That kind of thing – most of those can only play H.264/AVC. Of course, if you’re only using your PC and you have a lot of time and electricity to burn, then hey – why not?

But I’ll have to think really long and really hard about whether I want to replace x264 with x265 at this point in time. Overall, it might not be practical enough on my current hardware yet. Maybe I’d need an AVX/AVX2-capable processor, as x265 has tons of optimizations for those instruction set extensions. But I’m gonna stay on my Xeon X5690 for quite a while, so SSE 4.2 is the latest I have.

I’d say: if you can live with some quality degradation, then x265 might be the way to go, as it can give you much smaller files at low bitrates for only a slight visual penalty.

If you’re aiming for high bitrates and quality, it might not be worth it right now, at least for 1080p. It’s been said the tables turn once more when going up to 4K and UHD, but I haven’t tested that yet, as all my content – both Anime and live action movies – is “low resolution” *cough* 1080p or 720p.

Addendum:

Screenshots were taken using the latest stable mplayer 1.3.0 on Linux. Thank god they’re bundling it with ffmpeg now, making things much easier. I chose mplayer because it can easily grab screenshots from specific spots in a video in an automated fashion. I used the framestep filter for this, like so:

$ mplayer ./TEST-hevc-1mbit.mkv -nosound -vf framestep=24 \
 -vo png:z=9:outdir=./screenshots/1mbit/hevc/:prefix=hevc-1mbit-

This will decode every 24th frame of the video file TEST-hevc-1mbit.mkv and grab it into a .png file with the maximum lossless compression supported by mplayer. The .png files will be prefixed with a user-defined string and numbered, like hevc-1mbit-00000001.png, hevc-1mbit-00000002.png and so on, until the end of the file is reached.
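If you want to grab the shots for all bitrates in one go, a small shell loop over the file naming scheme above does the trick (a sketch; the mkdir just makes sure the output directories exist):

$ for rate in 1mbit 2.5mbit 5mbit; do
    mkdir -p ./screenshots/${rate}/hevc
    mplayer ./TEST-hevc-${rate}.mkv -nosound -vf framestep=24 \
      -vo png:z=9:outdir=./screenshots/${rate}/hevc/:prefix=hevc-${rate}-
  done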

To encode the full size screenshots to WebP, the most recent [libwebp-0.5.0], or rather one of its companion tools – cwebp – was used as follows:

$ cwebp -z 9 -lossless -q 100 -noalpha -mt ./input.png -o ./output.webp
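And to convert a whole directory tree of .png screenshots in one go, something like this will do, reusing the exact same cwebp call (a sketch; adjust the path to taste):

$ find ./screenshots -name '*.png' -exec sh -c \
  'cwebp -z 9 -lossless -q 100 -noalpha -mt "$0" -o "${0%.png}.webp"' {} \;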

Now… somebody wanna grant me remote access to some quad socket Haswell-EX Xeon box for free?

No?

Meh… :roll: