Letting yesterday pass by in memories, I wonder how I was able to function even a bit for the last two years or so — information-wise.
While I was still at the Uni Zurich, I found myself almost constantly swamped with so much work that I (a) never got around to doing any real research, and (b) turned my digital data into a complete mess. Forget about (a), it’s (b) that concerns me right now.
The pattern was usually:
1. My main machine was becoming unstable and dying frequently, either on disk I/O or CPU-intensive tasks.
2. With much time pressure, I was forced to move my data to a new machine. Two or three times that was accomplished e.g. by copying oldmachine:~/ to newmachine:~/lost+found/oldmachine (I am, unfortunately, not kidding. For the past half a year, I was working out of lapse:~/wing/lost+found/cirrus/lost+found/diamond. wing, cirrus, and diamond are all deceased machines; lapse is my laptop).
3. Repeat (1) and (2) about three times and add to the mix that I also had a machine at home and a laptop, no synchronisation strategy really, no real backups, and no version control system.
To be fair: my important data was certainly backed up, and most of my work was on F/OSS or projects in our lab’s version control system anyway. I am mainly talking about scripts, ideas, correspondence, photos, …
The chaos that built up had me busy all of yesterday. I brought all of my machines together, as well as the hard drives that were still lying around, hooked it all up, and started the endeavour. I set up a RAID 5 on one machine, made of four 250GB disks, and with 750GB of space, I was ready to go.
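For anyone checking the arithmetic: RAID 5 spends one member's worth of space on parity, so four 250GB disks leave three disks' worth usable. A minimal sketch of that calculation (the function name is made up for illustration, and it ignores filesystem and metadata overhead):

```python
def raid5_usable_gb(disk_sizes_gb):
    """Usable capacity of a RAID 5 array, in the same unit as the input."""
    if len(disk_sizes_gb) < 3:
        raise ValueError("RAID 5 needs at least three disks")
    # Every member contributes only as much as the smallest disk,
    # and one member's worth of space holds parity.
    return (len(disk_sizes_gb) - 1) * min(disk_sizes_gb)


print(raid5_usable_gb([250, 250, 250, 250]))  # 750
```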
Using any combination(s) of dd | nc,
tar, cpio, scp,
rsync, and unison, it took me several
hours to bring all the data onto one machine. Usually, I would try
to purge unwanted crap from ~, such as
.thumbnails, or .ccache (!), once again
wishing for widely-used ~/.var and
~/.etc.
So today I have to consolidate about 680GB of data. Most of it
can be purged (I found a linux-2.2.7 directory
somewhere in there), but often, I will have to look at two or three
versions of a file and decide which one to keep. Fortunately, I am
fairly certain it’s usually the newest one.
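Part of that "which version do I keep" decision can be automated. Here is a rough sketch, assuming each old home directory has been copied into its own directory and that the same relative path means "the same file". The function names are hypothetical. It groups files by relative path, collapses byte-identical copies, and lists the remaining versions newest first by modification time. It only reports and never deletes anything.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's content, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_conflicts(roots):
    """Map relative path -> list of (mtime, full_path), newest first.

    Only paths that exist with differing content in more than one place
    are returned; identical copies need no decision.
    """
    by_relpath = defaultdict(dict)  # relpath -> {digest: (mtime, full_path)}
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                full = os.path.join(dirpath, name)
                if os.path.islink(full) or not os.path.isfile(full):
                    continue  # skip symlinks, sockets, and the like
                rel = os.path.relpath(full, root)
                candidate = (os.path.getmtime(full), full)
                seen = by_relpath[rel]
                digest = file_digest(full)
                # Among byte-identical copies, remember the newest one.
                if digest not in seen or candidate > seen[digest]:
                    seen[digest] = candidate
    return {
        rel: sorted(copies.values(), reverse=True)
        for rel, copies in by_relpath.items()
        if len(copies) > 1
    }
```

Calling find_conflicts() with the list of copied home directories yields, for every contested path, the candidate versions with the newest at index 0. Since modification times can lie after careless copying, the result is a list to review, not a verdict.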
My intention is to move mostly everything to under the control
of bzr, and then to use unison to
synchronise ~ between all machines. However, there are
still some unresolved points:
- unison is for pairs; to keep three or more machines in sync, it’s beneficial to have a central machine. I don’t really want a central machine.
- csync2 looks like unison for three or more, but it’s a service, with listening daemon and all, and I thus cannot use it to synchronise with my home directories on machines where I don’t have root or cannot install csync2.
- I don’t want everything synchronised. When it comes to documents and data, I cannot imagine any exceptions, but dot-directories in the home directory are a problem. E.g.:
  - ~/.gnupg: I don’t want my GPG key to be synchronised to remote machines. However, I would not mind synchronising the configuration or the public keyring.
  - ~/.ssh: Stuff like config and known_hosts can/should be synchronised. However, authorized_keys must not be.
  - ~/.mozilla/firefox: Would be nice, but almost impossible because of Firefox’s braindead approach. For instance, you cannot write a new bookmarks.html file while Firefox is running: it will dump its in-memory copy when quitting.
  - Most of my accounts don’t need to store dot-directories for X programmes.
  - My shell or windowmanager configuration may have to be different between machines to support e.g. different defaults, Xinerama, relevant shell variables, etc.

  etc. etc.
- The ideal approach to the challenge of ~/.etc (I symlink into there for programmes not supporting non-default config file locations) would be a version control system. Since I don’t want to depend on a single machine, bzr, git, or the other decentralised systems seem appropriate: keep a base for e.g. ~/.ssh and branch for each machine.

  However, then I could not use unison for ~/.etc, and just synchronising all .bzr/ subdirectories unfortunately does not work (as empirically determined). And I have not found an acceptable approach to keep the branches in sync; I surely don’t want to have to push changes to every other machine before quitting the day, nor do I want to pull in changes every morning from all over the place.

  I may have to look at git, which IIRC stores every change/commit in a separate file, so then synchronisation of .git/ should do what I want.

  If you have any input on this matter, I’d be delighted to hear it.
- mc works okay, but it’s just a little too much typing without tab completion. konqueror is just fat, and nautilus currently won’t install on amd64. Again, if you have a suggestion, I’d love to hear it.
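The dot-directory exceptions listed above boil down to a small rule table. Below is a sketch of what such a filter might look like, with rules taken from the examples in this post. The rule list, the function name, and the "first match wins" ordering are my own assumptions, not a feature of unison or csync2. So is the choice to keep everything else in ~/.ssh and ~/.gnupg local: that covers private keys, but it is stricter than what is spelled out above.

```python
from fnmatch import fnmatchcase

# (pattern, sync?) pairs, relative to ~ and checked in order: first match wins.
# Note that "*" in fnmatch patterns also matches "/" and so covers whole subtrees.
RULES = [
    (".gnupg/gpg.conf", True),        # configuration is fine to share
    (".gnupg/pubring.*", True),       # public keyring is fine to share
    (".gnupg/*", False),              # assumption: the rest (secret keys!) stays local
    (".ssh/config", True),
    (".ssh/known_hosts", True),
    (".ssh/authorized_keys", False),  # must differ per machine
    (".ssh/*", False),                # assumption: private keys etc. stay local
    (".thumbnails/*", False),         # regenerable cruft
    (".ccache/*", False),
]


def should_sync(relpath, default=True):
    """Decide whether a path (relative to ~) should be synchronised.

    Documents and data fall through to the default, which is to sync.
    """
    for pattern, decision in RULES:
        if fnmatchcase(relpath, pattern):
            return decision
    return default


for path in (".ssh/config", ".ssh/authorized_keys", ".gnupg/pubring.gpg"):
    print(path, should_sync(path))
```

A table like this would still have to be translated into whatever ignore syntax the synchronisation tool understands, and it does nothing for the per-machine differences in shell or window manager configuration.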
Data juggling is fun, or not. For sure, there’s quite some adrenaline flowing around as you issue bulk commands, knowing that it’s basically you and a scalpel, and all your digital life on the table.
NP: Porcupine Tree, “The Sky Moves Sideways”

