Why I love Linux


It was a dark and stormy night…

Well not really but that’s how all the classics start isn’t it? This one begins with that most fearsome of technological bugs — user error. Fortunately the perpetrator (that’s me btw) redeems himself and his data in the end all thanks to the friendly penguin. I suppose then this is really a love story.

Now in my job I tend to install all sorts of crap on my system both Windows and Linux. I have partitions set aside for the sole purpose of being whipping boys to whatever software I install there. But I also have a good portion of my work data backed up at times (key words there) to both a gateway server and removable storage.

I have three drives in my system: a boot drive and two drives in RAID 0, because disks are simply way too slow these days. These are two 10k RPM drives and yes, that’s still too slow. Bring on the solid states.

The RAID drives are actually RAIDed partitions rather than whole disks, because that’s simply better than fakeraid controllers, and it lets Windows and Linux each have their own stripes.

I have Vista and two distros of Linux (Ubuntu and Gentoo) in regular use. Vista, bless it, brutally murdered its XP predecessor during install, and so I had been forced to migrate sooner. As it happens, the question of why Vista killed XP all those months ago was answered through this experience.

So, here I was installing yet another Linux distro for review, and I foolishly let GRUB install to the boot sector of one of the RAID drives (/dev/sdb instead of /dev/sda, where it should have gone). This doesn’t particularly bother Linux, but Vista was another story. Booting up, it had ‘lost’ one of the drives and thus the dynamic partitions that made up the Windows stripe. And with it a nice chunk of my data. Trying to use Vista’s tools to restore the stripe in Disk Management didn’t help: it had determined the drive was ‘absent’, even though a ‘new’ identical drive with partitions intact had appeared in its place. No matter what I did, Vista couldn’t activate it or return it to a dynamic disk status.

Not to worry, I thought. I know Linux can mount Windows stripes, so I’ll boot into Linux, mount the dynamic partitions, copy the data across and re-make the RAID stripes in Vista. Pain in the bum, but I’ve mounted NTFS RAID stripes in the past.

But oddly, once in Linux I couldn’t see the dynamic partitions (only the container partition, which fdisk sees as ‘SPS’). Something had changed. Now in Linux the Windows dynamic disk support is enabled through the LDM driver, which had happily identified the dynamic disk RAID partitions XP had made. But it couldn’t see the Vista ones.

And suddenly my forced migration from XP to Vista became clear — even though I installed Vista as a dual-boot with XP and it cheerfully recognised the XP dynamic disks it also at the same time decided to commandeer them and re-write the LDM database for the drives. A format which funnily enough XP itself couldn’t read.

And here too the Linux LDM driver couldn’t interpret Vista’s new LDM format. Nothing like backwards compatibility eh?

A quick search on the Web revealed it was a known problem and Linux-NTFS maintainer Anton Altaparmakov had submitted a patch to the kernel tree for it.

So I grabbed it patched my kernel (2.6.22-rc4 with ck patches if you must know) and booted up.

Still no go the kernel couldn’t identify the LDM partitions.

Even though I had backups for most of the data on the stripes, it had quickly become a matter of principle and a technical challenge. Like the time when I was a young geek (we’re talking back when Mosaic was the world’s only browser and gopher was popular) and I had left my phone cord at a friend’s but just had to get onto the BBS that night with the modem, and so hacked up another cord, tested the wires for signal from the modem and ended up blu-tacking them to the modem’s PCB. Which worked just fine as long as you didn’t bump the cord.

So I don’t like being stopped. And right here, right now, the striped partitions and all the data were still there; it’s just that Windows and Linux couldn’t tell where in that large dynamic disk container partition the individual LDM partitions started and finished. Solve this and the partitions become readable.

For Windows it’s a lost cause either way: even if I could determine this, there’s no way of telling Windows to create a stripe out of a set range of sectors. As usual, everything works fine in Windows as long as you don’t step outside the very tight boundaries it defines. When you do, there isn’t even a creek, let alone boat and paddle.

But Linux can and so it’s at this point I hatched a Crazy Plan. It went something like this:

Crazy Plan ™

1: Use md to build a striped md array from the dynamic partition containers Linux can see
2: Use dd to copy the entire drive as a file image to another partition
3: Split the image file at the partition boundary based on size
4: Mount the file as a loopback filesystem

If you’re not a Linux nerd, here’s the English translation:

1: Use the Linux ‘md’ RAID driver to RAID the drives anyway
2: Use ‘dd’, the ultimate copy tool, to mirror the array bit for bit to a file
3: Split the file into the partition boundaries I know based on size
4: Mount the split files as a loopback device (Linux can treat a file as a drive)

At about this time I had got in contact with Anton to ensure the LDM patch he had submitted to the kernel was the latest just in case my Crazy Plan didn’t need to go ahead. I also proudly told Anton of my Crazy Plan perhaps to show just how crazy it really could be or perhaps because I wanted assurance it wasn’t too crazy.

Anton immediately told me I was crazy. Proving himself to be a far greater geek than I, he said that while my plan was indeed (let’s avoid the word for a moment) innovative, it wasn’t the way to go. He shot it down and in its place suggested a most brilliant solution:

1: Make two loopback devices out of the known sector ranges on each drive
2: Use md to RAID the two loopback devices
3: Mount the RAID

And if using loopbacks wasn’t going to work for raw mappings against a physical drive, use dd to copy the sector range to files first, then loopback mount and RAID the files. In commands it looks something like this:

losetup -o OFFSET /dev/loop0 /dev/[drive]
losetup -o OFFSET /dev/loop1 /dev/[drive]
mdadm --build /dev/md0 --chunk=64 --level=raid0 --raid-devices=2 /dev/loop0 /dev/loop1
mount -t ntfs /dev/md0 /mnt/ntfs
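
The dd fallback mentioned above could be sketched roughly like this. It is only an illustration under assumptions: the helper name copy_sector_range, the image path and the START and COUNT values are all hypothetical, sectors are assumed to be 512 bytes, and reading a real drive needs root.

```shell
# Hypothetical helper: copy COUNT 512-byte sectors, starting at sector START,
# from a drive (or any file) into an image file.
copy_sector_range() {
    src=$1; start=$2; count=$3; out=$4
    # skip= and count= are measured in blocks of bs bytes
    dd if="$src" of="$out" bs=512 skip="$start" count="$count" 2>/dev/null
}

# Possible usage (placeholders, not values from my system):
#   copy_sector_range /dev/[drive] START COUNT /scratch/stripe0.img
#   losetup /dev/loop0 /scratch/stripe0.img
```

The image files then stand in for the raw sector ranges, at the cost of needing enough spare space to hold a full copy of each stripe member.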

Again for the visitors: in essence, creating custom-sized virtual drives out of real drives on the fly, then RAIDing the virtual drives and mounting the RAID.

Niiice.

Don’t tell me I’m the only one who gets hot under the collar over this. Not only is this just plain cool no matter how you cut it, it also shows off the Linux approach of using many small tools to solve big problems. Here three separate programs that each do one job really well can be pieced together like a jigsaw to solve a bigger puzzle. In this case, recovering data from Windows partitions that Windows itself can’t read.

But that’s the journey; we didn’t have the prize yet. The key, of course, is getting those sector offsets for the partitions. Fortunately, even though the LDM driver for Linux was still being updated for Vista (and here I gave Anton more work, because it seems my partitions were the first to not work with the new driver), we could at least read the offsets directly from the LDM database still present on the drives.

So with ‘ldmdump’ in hand Anton gracefully perused the LDM structure of my drives and we pumped in some values:

losetup -o 15042977280 /dev/loop0 /dev/[drive]
losetup -o 15042977280 /dev/loop1 /dev/[drive]

Made a RAID and mounted as NTFS. No go though. Something was missing. Double-checking size to sector translations and trying a few other values didn’t help. I suggested that perhaps if Windows was having trouble with the LDM database then it might be corrupt — but that we could still find the partition boundaries by searching for the ‘NTFS’ signature on the drives byte for byte.
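
A search like that can be roughed out with standard tools. To be clear, this is my own sketch and not the program Anton wrote: the function name find_ntfs_candidates is made up, it relies on GNU grep, and it assumes 512-byte sectors. It leans on the fact that an NTFS boot sector carries the text ‘NTFS’ followed by four spaces, three bytes in from its start.

```shell
# Hypothetical helper: print byte offsets where an NTFS boot sector may begin.
# $1 is a block device or image file (reading a device needs root).
find_ntfs_candidates() {
    # -a: treat binary as text, -o: print only the match, -b: print its byte offset
    LC_ALL=C grep -a -o -b 'NTFS    ' "$1" |
    while IFS=: read -r pos rest; do
        start=$((pos - 3))    # the signature sits 3 bytes into the boot sector
        # keep only hits that land on a 512-byte sector boundary
        if [ "$start" -ge 0 ] && [ $((start % 512)) -eq 0 ]; then
            echo "$start"
        fi
    done
}
```

Any offset it prints is only a candidate; the same bytes can turn up inside ordinary file data, so each hit still has to be tried with losetup.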

So Anton wrote a program to do just this and mailed it to me (how great is this guy?). In no time at all we had the exact sector offsets for the NTFS stripes on the RAID drives. Turns out the LDM database wasn’t wrong; we had just forgotten to add a 63-sector offset. Now this is my assumption: the first partition on a disk generally starts at an offset of 63 sectors, and it would appear the partitions created by Windows dynamic disks mirror this within the LDM container partition itself. So regardless of where the LDM partitions start on a drive, there’s another 63-sector offset.
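
The arithmetic behind that correction is easy to check. A small sketch, assuming 512-byte sectors (an assumption, but one that fits the two offsets used in this post); the variable names are mine:

```shell
SECTOR_SIZE=512                # assumed bytes per sector
ldm_offset=15042977280         # byte offset taken from the LDM database
extra_sectors=63               # the offset we had forgotten

# 63 * 512 = 32256 bytes on top of the LDM value
corrected_offset=$((ldm_offset + extra_sectors * SECTOR_SIZE))
echo "$corrected_offset"       # prints 15043009536
```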

Bingo! In four simple commands I had my Vista NTFS software stripe RAID partitions mounted in Linux:

losetup -o 15043009536 /dev/loop0 /dev/[drive]
losetup -o 15043009536 /dev/loop1 /dev/[drive]
mdadm --build /dev/md0 --chunk=64 --level=raid0 --raid-devices=2 /dev/loop0 /dev/loop1
mount -t ntfs /dev/md0 /mnt/ntfs

And with that I copied all the data I wanted across to another partition on /dev/sda, formatted and re-built the Windows arrays in Vista, then copied the data back.

All was sweet again. Almost. In Vista, even though I elected not to assign drive letters to the ‘raw’ partitions I would use for Linux, on the very next bootup Vista happily (read: annoyingly) assigned them drive letters anyway. No problem, I thought: I’ll remove them again, as it’s rather silly having an Explorer populated with six named partitions I’m never going to use clogging up the view.

It’s worth noting that with the Vista dynamic disks created anew, the Linux LDM driver could see the Vista LDM partitions. Finally everything seemed back in place. But after removing Vista’s drive letter assignments to the ‘raw’ partitions, I booted to a Linux that suddenly could no longer see the LDM partitions.

I have a stress ball on my desk that’s seen years of abuse. It seems to get much more frequent attention in and around the use of Windows. I eyed it now with malice.

Why would altering drive letter assignments in Vista on partitions it’s not even using cause Linux to no longer recognise the LDM format? Under (the now extinct) XP boot this never caused a problem.

So, discussing it with Anton, it seems there was still more work to be done on the LDM driver for Vista’s changes to the format. We worked through a series of debugging patches to find out just what was going on, with my system as the guinea pig, and he solved it in just a few days.

And so if you’re having trouble reading Vista LDM partitions, there’s a new kernel patch on its way upstream which may help.

And all this — from the potency of Linux and its widespread interoperability to the community itself and developers like Anton — is why I love Linux.

It comes down to the most basic principle for the design of any operating system: it must enable you.


Linux to the rescue: the blessed output of top — showing Midnight Commander copying files from the NTFS mounted RAIDed loopback devices.