Jan 152015
 

4Kn logoWhile I’ve been planning to build myself a new RAID-6 array for some time (more space, more speed), I got interested in the latest and greatest of hard drive innovations, which is the 4Kn Advanced Format. Now you may now classic hard drives with 512 byte sectors and the regular Advanced Format also known as 512e, which uses 4kiB physical sector sizes, but emulates 512 byte sectors for compatibility reasons. Interestingly, [Microsoft themselves state], that “real” 4Kn harddrives, which expose their sector size to the operating system with no glue layers in between are only supported in Windows 8 and above. So even Windows 7 has no official support.

On top of that, Intel [has stated], that their SATA controller drivers do not support 4Kn, so hooking one such drive up to your Intel chipsets’ I/O controller hub (ICH) or platform controller hub (PCH) will not work. Quote:

“Intel® Rapid Storage Technology (Intel® RST) version 9.6 and newer supports 4k sector disks if the device supports 512 byte emulation (512e). Intel® RST does not support 4k native sector size devices.”

For clarity, to make 4Kn work in a clean fashion, it must be supported on three levels, from lowest to highest:

  1. The firmware: For mainboards, this means your system BIOS/UEFI. For dedicated storage controllers, the controller BIOS itself.
  2. The kernel driver of the storage controller, so that’s your SATA AHCI/RAID drivers or SAS drivers.
  3. Any applications above it performing raw disk access, whether kernel or user space. File system drivers, disk cloning software, low level benchmarks, etc.

Granted, 4Kn drives are extremely new and still very rare. There is basically only the 6TB Seagate enterprise drives available ([see here]) and then some Toshiba drives, also enterprise class. But, to protect my future investments in that RAID-6, I got myself a [Toshiba MG04ACA300A] 3TB drive, which was the only barely affordable 4Kn disk I could get, basically also the only one available right now besides the super expensive 6TB Seagates. That way I can check for 4Kn compatibility relatively cheaply (click to enlarge images):

If you look closely, you can spot the nice 4Kn logo right there. In case you ask yourselves “Why 4Kn?”, well, mostly cost and efficiency. 4kiB sectors are 8 times as large as classic 512 byte ones. Thus, for the same data payload you need 8 times less sector gaps, 8 times less synchronization markers and 8 times less address markers. Also, a stronger checksum can be used for data integrity. See this picture from [Wikipedia]:

Sectors

Sector size comparison (Image is © Dougolsen under the CC-BY 3.0 unported license)

Now this efficiency is already there with 512e drives. 512e Advanced Format was supposedly invented, because more than half the programs working with raw disks out there can’t handle variable sector sizes and are hardcoded for 512n. That also includes system firmwares, so your mainboards’ BIOS/UEFI. To solve those issues, they used 4kiB sectors, then let a fast ARM processor translate them into 512 byte sectors on the fly to give legacy software something it could understand.

4Kn on the other hand is the purist, “right” approach. No more emulation, no more cheating. No more 1GHz ARM dual core processor in your hard drive just to be able to serve data fast enough.

Now we already know that Intel controllers won’t work. For fun, I hooked it up to my ASUS P6T Deluxes’ secondary SATA controller though, a Marvell 88SE6120. Plus, I gave the controller the latest possible driver, the quite hard-to-get version 1.2.0.8400. You can download that [here] for x86 and x64.  To forestall the result: It doesn’t work. At all. This is what the systems’ log has to say about it (click to enlarge):

So that’s a complete failure right there. Even after the “plugged out” message, the timeouts would still continue to happen roughly every 30 seconds, accompanied by the whole operating system freezing for 1-2 seconds every time. I cannot say for any other controllers like the Marvell 9128 or Silicon Image chips and others, but I do get the feeling that none of them will be able to handle 4Kn.

Luckily, I do already have the controller for my future RAID-6 right here, an Areca ARC-1883ix-12, the latest and greatest tech with PCIe x8 3.0 and SAS/12Gbit ports with SATA/6Gbit encapsulation. Its firmware and driver supports 4Kn fully as you can see in Arecas [specifications]. The controller features an out-of-band management system via its own ethernet port and integrated web server for browser-based administration, even if the system doesn’t even have any OS booted up. All that needs to be installed on the OS then is a very tiny driver (click to enlarge):

Plus, Areca gives us one small driver for many Windows operating systems. Only for the Windows XP 32-Bit NT5.1 kernel you’ll get a SCSI Miniport driver exclusively, while all newer systems (WinXP x64, Windows Vista, 7, 8) get a more efficient StorPort driver. So, plugged the controller in, installed the driver, hooked up the disk, and it seems we’re good to go:

The 4Kn drive is being recognized

The 4Kn drive is being recognized (click to enlarge)

Now, any legacy master boot record (MBR) partition table has a 32-bit address field. That means, it can address 232 elements. With each element being 512 bytes large, you reach 2TiB. So that’s where the 2TiB limit comes from. With 4Kn however, the smallest addressable atom is now eight times as large: 4096 bytes! So we should be able to reach 16TiB due to the larger sector size. Supposedly, some USB hard drive manufacturers have used this trick (by emulating 4Kn) to make their larger drives work easily on Windows XP. When trying to partition the Toshiba drive however, I hit a wall, as it seems Windows disk management is about as stupid as was the FAT32 formatter on Windows 98:

MBR initialization failed

MBR initialization failed (click to enlarge)

That gets me thinking. On XP x64, I can still just switch from MBR to the GPT partitioning scheme to be able to partition huge block devices. But what about Windows XP 32-bit? I don’t know how the USB drive manufacturers do it, so I can only presume they ship the drives pre-partitioned if its one of those that don’t come with a special mapping tool for XP. In my case, I just switch to GPT and carry on (click to enlarge):

Now I guess I am the first person in the world to be able to look at this, and potentially the last too:

fsutil.exe showing a 4Kn drive on XP x64

fsutil.exe showing a native SATA 4Kn drive on XP x64, encapsulated in SAS. Windows 7 would show the physical and logical sector size separately due to its official 512e support. Windows XP always reports the logical sector size (click to enlarge)

So far so good. The very first and most simple test? Just copy a file onto the newly formatted file system. I picked the 4k (no pun intended) version of the movie “Big Buck Bunny”:

Copying a first file onto the 4Kn disks NTFS file system

Copying a first file onto the 4Kn disks NTFS file system

Hidden files and folders are shown here, but Windows doesn’t seem to want to create a System Volume Information\ folder for whatever reason. Other than that it’s very fast and seems to work just nicely. Since the speed is affected by the RAID controllers write back cache, I thought I’d try HD Tune 2.55 for a quick sequential benchmark. Or in other words: “Let’s hit our second legacy software wall” (click to enlarge):

Yeah, so… HD Tune never detects anything above 2TiB, but this? At first glance, 375GB might sound quite strange for a 3TB drive. But consider this: 375 × 8 = 3000. What happened here is that HD Tune got the correct sector count of the drive, but misinterpreted each sectors’ size as 512 bytes. Thus, it reports the devices’ size as eight times as small. Reportedly, this is also the exact way how Intels RST drivers fail when trying to address a 4Kn drive. HD Tune 2.55 is thus clearly hardcoded for 512n. There is no way to make this work. Let’s try the paid version of the tool which is usually quite ahead of its free and legacy counterpart (click to enlarge):

Indeed, HD Tune Pro 5.00 works just as it should when accessing the raw drive. Users who don’t want to pay are left dead in the water here. Next, I tried HDTach, also an older tool. HDTach however reads from a formatted file system instead of from a raw block device. The file system abstracts the device to a higher level, so HDTach doesn’t know and doesn’t need to know anything about sectors. As a result, it also just works:

HD Tach benchmarking NTFS on a 4Kn drive

HD Tach benchmarking NTFS on a 4Kn drive (click to enlarge)

Next, let’s try an ancient benchmark, that again accesses drives on the sector level: The ATTO disk benchmark. It is here where we learn that 4Kn, or generally variable sector sizes aren’t space magic. This tool was written well before the times of 512e or 4Kn, and look at that (click to enlarge):

Now what does that tell us? It tells us, that hardware developers feared the chaotic ecosystem of tools and software that accesses disks at low levels. Some might be cleanly programmed, where most may not. That doesn’t just include operating systems’ built-in toolsets, but also 3rd party software, independently from the operating system itself. Maybe it also affects disk cloning software like from Acronis? Volume shadow copies? Bitlocker? Who knows. Thing is, to be sure, you need to test that stuff. And I presume that to go as far as hard drive manufacturers did with 512e, they likely found one abhorrent hell of crappy software during their tests. Nothing else will justify ARM processors at high clock rates on hard drives just to translate sector sizes plus all the massive work that went into defining the 512e Advanced Format standard before 4Kn Advanced Format.

Windows 8 might now fully support 4Kn, but that doesn’t say anything about the 3rd party software you’re going to run on that OS either. So we still live in a Windows world where a lot of fail is waiting for us. Naturally, Linux and certain UNIX systems have adapted much earlier or have never even made the mistake of hardcoding sector sizes into their kernels and tools.

But now, to the final piece of my preliminary tests: Truecrypt. A disk encryption software I still trust despite the project having been shut down. Still being audited without any terrible security hole discoveries so far, it’s my main choice for cross-platform disk encryption, working cleanly on at least Windows, MacOS X and Linux.

Now, 4Kn is disabled for MacOS X in Truecrypts source code, but seemingly, this [can be fixed]. I also discovered that TC will refuse to use anything other than 512n on Linux if Linux kernel crypto is unavailable or disabled by the user in TC, see this part of Truecrypts’ CoreUnix.cpp:

#if defined (TC_LINUX)
if (volume->GetSectorSize() != TC_SECTOR_SIZE_LEGACY)
{
  if (options.Protection == VolumeProtection::HiddenVolumeReadOnly)
    throw UnsupportedSectorSizeHiddenVolumeProtection();
 
  if (options.NoKernelCrypto)
    throw UnsupportedSectorSizeNoKernelCrypto();
}
#endif

Given that TC_SECTOR_SIZE_LEGACY equals 512, it becomes clear that hidden volumes are unavailable as a whole with 4Kn on Linux, and encryption is completely unavailable altogether if kernel crypto isn’t there. So I checked out the Windows specific parts of the code, but couldn’t find anything suspicious in the source for data volume encryption. It seems 4Kn is not allowed for bootable system volumes (lots of “512’s” there), but for data volumes it seems TC is fully capable of working with variable sector sizes.

Now this code has probably never been run before on an actual SATA 4Kn drive, so let’s just give it a shot (click to enlarge):

Amazingly, Truecrypt, another software written and even abandoned by its original developers before the advent of 4Kn works just fine. This time, Windows does create the System Volume Information\ folder on the file system within the Truecrypt container, and fsutil.exe once again reports a sector size of 4096 bytes. This shows clearly that TC understands 4Kn and passes the sector size on to any layers above itself in the kernel I/O stack flawlessly (The layer beneath it should be either the NT I/O scheduler or maybe the storage controller driver directly and the layer above it the NTFS file system driver, if my assumptions are correct).

Two final tests for data integrities’ sake:

Both a binary diff and SHA512 checksums prove, that the data copied from a 512n medium to the 4Kn one is still intact

Both a binary diff and SHA512 checksums prove, that the data copied from a 512n medium to the 4Kn one is still intact

So, my final conclusion? Anything that needs to work with a raw block device on a sector-by-sector level needs to be checked out before investing serious money in such hard drives and storage arrays. It might be cleanly programmed, with some foresight. It also might not.

Anything that sits above the file system layer though (anything that reads and writes folders and files instead of raw sectors) will always work nicely, as such software does not need to know anything about sectors.

Given the possibly enormous amount of software with hardcoded routines for 512 byte sectors, my assumption would be that the migration to 4Kn will be quite a sluggish one. We can see that the enterprise sector is adapting first, clearly because Linux and UNIX systems adapt much faster. The consumer market however might not see 4Kn drives anytime soon, given 512 byte sectors have been around for about 60 years (!) now.

Update 2014-01-16 (Linux): I just couldn’t let it go, so I took the Toshiba 4Kn drive to work with me, and hot plugged it into an Intel ICH10R. So that’s the same chipset as the one I ran the Windows tests on, an Intel X58. Only difference is, that now we’re on CentOS 6.6 Linux running the 2.6.32-504.1.3.el6.x86_64 kernel.  This is what dmesg had to say about my hotplugging:

ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata3.00: ATA-8: TOSHIBA MG04ACA300A, FP2A, max UDMA/100
ata3.00: 732566646 sectors, multi 2: LBA48 NCQ (depth 31/32), AA
ata3.00: configured for UDMA/100
ata3: EH complete
scsi 2:0:0:0: Direct-Access     ATA      TOSHIBA MG04ACA3 FP2A PQ: 0 ANSI: 5
sd 2:0:0:0: Attached scsi generic sg7 type 0
sd 2:0:0:0: [sdf] 732566646 4096-byte logical blocks: (3.00 TB/2.72 TiB)
sd 2:0:0:0: [sdf] Write Protect is off
sd 2:0:0:0: [sdf] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 2:0:0:0: [sdf] 732566646 4096-byte logical blocks: (3.00 TB/2.72 TiB)
 sdf:
sd 2:0:0:0: [sdf] 732566646 4096-byte logical blocks: (3.00 TB/2.72 TiB)
sd 2:0:0:0: [sdf] Attached SCSI disk

Looking good so far, also the Linux kernel typically cares rather less about the systems BIOS, bypassing whatever crap it’s trying to tell the kernel. Which is usually a good thing. Let’s verify with fdisk:

Note: sector size is 4096 (not 512)
 
WARNING: The size of this disk is 3.0 TB (3000592982016 bytes).
DOS partition table format can not be used on drives for volumes
larger than (17592186040320 bytes) for 4096-byte sectors. Use parted(1) and GUID 
partition table format (GPT).

Now that’s more like it! fdisk is warning me, that it will be limited to addressing 16TiB on this disk. A regular 512n or 512e drive would be limited to 2TiB as we know. Awesome. So, I created a classic MBR style partition on it, formatted it using the EXT4 file system, and mounted it. And what we get is this:

Filesystem            Size  Used Avail Use% Mounted on
/dev/sdf1             2.7T   73M  2.6T   1% /mnt/sdf1

And Intel is telling us that they don’t manage to give us any Windows drivers that can do 4Kn? Marvell doesn’t even comment on their inabilities? Well, suck this: Linux’ free driver for an Intel ICH10R south bridge (or any other that has a driver coming with the Linux kernel for that matter) seems to have no issues with that whatsoever. I bet it’s the same with BSD. Just weak, Intel. And Marvell. And all you guys who had so much time to prepare and yet did nothing!

Update 2014-01-20 (Windows XP 32-Bit): So what about regular 32-Bit Windows XP? There are stories going around that some USB drives with 3-4TB capacity would use a 4Kn emulation (or real 4Kn, bypassing the 512e layer by telling the drive firmware to do so?), specifically to enable XP compatibility without having to resort to special mapping tools.

Today, I had the time to install XP SP3 on a spare AMD machine (FX9590, 990FX), which is pretty fast thanks to a small, unused testing SSD I still had lying around. Before that I wiped all GPT partition tables from the 4Kn drive, both the one at the start as well as the backup copy at the end of the drive using dd. Again, for this test, the Areca ARC-1883ix-12 was used, now with its SCSI miniport driver, since XP 32-Bit does not support StorPort.

Please note, that this is a German installation of Windows XP SP3. I hope the screenshots are still understandable enough for English speakers.

Recognition and MBR initialization seems to work just fine this time, unlike on XP x64:

The 4Kn Toshiba as detected by Windows XP Pro 32-Bit SP3, again on an Areca ARC-1883ix-12

The 4Kn Toshiba as detected by Windows XP Pro 32-Bit SP3, again on an Areca ARC-1883ix-12 (click to enlarge)

Let’s try to partition it:

Partitioning the drive once more, MBR style

Partitioning the drive once more, MBR style

Sure looks good! And then, we get this:

A Master Boot Record, Windows XP and 4Kn: It does work after all

A Master Boot Record, Windows XP and 4Kn: It does work after all (click to enlarge)

So why does XP x64 not allow for initialization and partitioning of a 4Kn drive using MBR? Maybe because it’s got GPT for that? So in any case, it’s usable on both systems, the older NT 5.1 (XP 32-Bit) as well as the newer NT 5.2 (XP x64, Server 2003). Again, fsutil.exe confirms proper recognition of our 4Kn drive:

fsutil.exe reporting a 4kiB sector size, just like on XP x64

fsutil.exe reporting a 4kiB sector size, just like on XP x64

So all you need – just like on XP x64 – is a proper controller with proper firmware and drivers!

There is one hard limit here though that XP 32-Bit users absolutely need to keep in mind; Huge RAID volumes using LUN carving/splitting and software JBOD/disk spanning using Microsofts Dynamic Volumes are no longer possible when using 4Kn drives. Previously, you could tell certain RAID controllers to just serve huge arrays to the OS in 2TiB LUN slices (e.g. best practice for 3ware controllers on XP 32-Bit). Then, in Windows, you’d just make those slices Dynamic Volumes and span a single NTFS file system over all of them, thus pseudo-breaking the 2TiB barrier.

This can no longer be done, as Dynamic Volumes seemingly do not work with 4Kn drives on Microsoft operating systems before Windows 8, or at least not on XP 32-Bit. The option for converting the volume from MBR to GPT is simply greyed out in Windows disk management.

That means that the absolute maximum volume size using 4Kn disks on 32-Bit Windows XP is 16TiB! On XP x64 – thanks to GPT – it’s just a few blocks short of 256TiB, a limit imposed on us by the NTFS file systems’ 32-bit address field and 64kiB clusters, as 232 * 64KiB × 1024 × 1024 × 1024 = 256TiB.

And that concludes my tests, unless I have time and an actual machine to try FreeBSD or OpenBSD UNIX. Or maybe Windows 7. The likelihood for that is not too high at the moment though.

Jan 312012
 

Vaio LogoNever, ever press the F10 key while booting a Sony Vaio notebook, especially not when you have already crippled the recovery partition when mirroring the system to that shiny new SSD a few years back. What happens is, the system will make some weird change to the master boot record (MBR) to make it point to some binary on the recovery partition. That’s however broken entirely, as that partition has been moved and resized in the meantime. And all I wanted was to get into the BIOS (Hint: that’s actually F2, not F10!). So it gets stuck, plain and simple. I have already re-written the MBR using ms-sys from a [Knoppix] CD, but no luck. Now I’m currently running [GParted] from yet another live CD, deleting the recovery partition, removing (and now even aligning to +1MB) the system partition, all in the faint hope of being able to get this damn thing to boot. After this, I will need to write another MBR and maybe modify the boot.ini to point to a different partition number. Oh, and if you’re wondering why I don’t just boot from a WinXP CD and try some fixboot/fixmbr: Well, the XP CD boots into a nice BSOD, so… At least I can try my WordPress commentary function to report news regarding this fuckup..

Bottom line: Sony Vaio. An F10 key. Just don’t press it! ;-)

Update:

It is fixed now after being shocked by several errors like Operating system not found or [this one], that I have never seen before. Thing is, I deleted the crippled recovery partition, but in Linux, the system partition remained as number 2, namely /dev/sda2. So I thought, that the partition number was still 2. Wrong. Linux somehow enumerates partitions differently from Windows, so a change in boot.ini needed to be done. Here are all the steps that I needed to undertake:

  • Boot GParted, delete the recovery partition.
  • Resize the system partition (an opportunity to re-align it to SSD page boundaries, like to +1MB).
  • Flag the system partition as bootable.
  • Boot Knoppix, mount the /dev/sda2 partition, edit the boot.ini with your favorite editor like vi.
  • Change the following lines (the thing I needed to change was the partition number, shown in underlined bold:
    • default=multi(0)disk(0)rdisk(0)partition(2)\WINDOWS
    • multi(0)disk(0)rdisk(0)partition(2)\WINDOWS=”Microsoft Windows XP Professional” /noexecute=optin /fastdetect

So, after changing the partition number from 2 to 1, all is back to how it once was, only that my SSD is now even faster since the partition and all data on it have been re-aligned to 1MB boundaries!