Data juggling

Letting yesterday pass by in memories, I wonder how I was able to function even a bit for the last two years or so -- information-wise.

While I was still at the Uni Zurich, I found myself almost constantly swamped with so much work that I (a) never got around to doing any real research, and (b) turned my digital data into a complete mess. Forget about (a), it's (b) that concerns me right now.

The pattern was usually:

  1. My main machine was becoming unstable and dying frequently, either on disk I/O or CPU-intensive tasks.
  2. Under heavy time pressure, I was forced to move my data to a new machine. Two or three times that was accomplished e.g. by copying oldmachine:~/ to newmachine:~/lost+found/oldmachine (I am, unfortunately, not kidding. For the past half year, I was working out of lapse:~/wing/lost+found/cirrus/lost+found/diamond. wing, cirrus, and diamond are all deceased machines; lapse is my laptop).
  3. Repeat (1) and (2) about three times and add to the mix that I also had a machine at home and a laptop, no synchronisation strategy really, no real backups, and no version control system.

To be fair: my important data was certainly backed up, and most of my work was on F/OSS or projects in our lab's version control system anyway. I am mainly talking about scripts, ideas, correspondence, photos, ...

The chaos that built up had me busy all of yesterday. I brought all of my machines together, as well as the hard drives that were still lying around, hooked it all up, and started the endeavour. I set up a RAID 5 on one machine, made of four 250GB disks, and with 750GB of space, I was ready to go.
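In case the 750 out of 1000 looks odd: RAID 5 spends one disk's worth of space on parity, so the usable capacity is (number of disks - 1) times the disk size. A minimal sketch of that arithmetic follows; the function name raid5_usable is made up for illustration, and filesystem overhead is ignored.

```python
def raid5_usable(disks: int, disk_size_gb: int) -> int:
    """Usable capacity of a RAID 5 array in GB.

    One disk's worth of space is consumed by parity, spread across all disks.
    """
    if disks < 3:
        raise ValueError("RAID 5 needs at least three disks")
    return (disks - 1) * disk_size_gb


print(raid5_usable(4, 250))  # prints 750
```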

Using any combination(s) of dd | nc, tar, cpio, scp, rsync, and unison, it took me several hours to bring all the data onto one machine. Usually, I would try to purge unwanted crap from ~, such as .thumbnails, or .ccache (!), once again wishing for widely-used ~/.var and ~/.etc.

So today I have to consolidate about 680GB of data. Most of it can be purged (I found a linux-2.2.7 directory somewhere in there), but often, I will have to look at two or three versions of a file and decide which one to keep. Fortunately, I am fairly certain it's usually the newest one.
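The "keep the newest one" rule could be sketched roughly like this. It is only an illustration, not the tool actually used: it assumes each old home directory was copied into its own directory, that versions of a file share the same path relative to that directory, and that modification times survived the copying (which is not a given with every copy method). The function name newest_versions is invented here. It only reports and never deletes anything.

```python
import os
from collections import defaultdict
from pathlib import Path


def newest_versions(roots):
    """Map each relative path to the copy with the latest modification time.

    `roots` is a list of directories, each holding one old home directory.
    Symlinks and non-regular files are skipped.
    """
    candidates = defaultdict(list)
    for root in map(Path, roots):
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                full = Path(dirpath) / name
                if full.is_symlink() or not full.is_file():
                    continue
                # Group copies by their path relative to the tree they came from.
                candidates[full.relative_to(root)].append(full)
    # For every relative path, pick the copy that was modified most recently.
    return {
        rel: max(copies, key=lambda p: p.stat().st_mtime)
        for rel, copies in candidates.items()
    }
```

Calling newest_versions with the list of copied home directories returns a dictionary that can be looked over by hand before anything gets moved or purged. Files that exist in only one tree simply map to themselves.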

My intention is to move mostly everything to under the control of bzr, and then to use unison to synchronise ~ between all machines. However, there are still some unresolved points:

I am still looking for the right tool to do the task of consolidation. mc works okay, but it's just a little too much typing without tab completion. konqueror is just fat, and nautilus currently won't install on amd64. Again, if you have a suggestion, I'd love to hear it.

Data juggling is fun, or not. For sure, there's quite some adrenaline flowing around as you issue bulk commands, knowing that it's basically you and a scalpel, and all your digital life on the table.

NP: Porcupine Tree, "The Sky Moves Sideways"