madduck's droppings - blogs previously filed under the geek category

This page exists to ease the transition after I migrated my blog to new software. If you are looking for the posts previously filed in the “geek” category, they are listed below.

My new blog can be found at http://madduck.net/blog. Future articles, which would have been filed as “geek”, are going to show up here as well. However, please watch this space as these transitional pages may disappear at some point.

Dealing with SpamAssassin and users authenticating to a mail relay

This problem related to SpamAssassin and SASL-authenticated road warriors has bugged me for a while. Micah has a patch that alleviates the problem, but it’s not in upstream, and even though upstream does plan to address the problem he raised, that’s down the road.

Today I figured out a solution, based on the smtpd_sasl_authenticated_header configuration option available in Postfix 2.3 and later, using header_checks to rewrite headers. In the following PCRE map, lines are folded for readability. The regular expression needs to be on one line, with newlines and spaces at the beginning of the line removed:

/^Received: from ([-._[:alnum:]]+ \([-._[:alnum:]]+ \[[.[:digit:]]{7,15}\]\))
    [[:space:]]+\(Authenticated sender: ([^)]+)\)
    [[:space:]]+by (seamus\.madduck\.net) \(([^)]+)\)
    [[:space:]]+with (E?SMTP) id ([A-F[:digit:]]+)
    [[:space:]]+for <([^>]+)>; (.*)/

  REPLACE Received: from auth sender $2 by $3 ($4) with $5 id $6 for <$7>; $8

Note how the host name is left in to ensure that the right header is replaced. This will replace the problem-causing Received line with the following single line — I have not found out how to insert line-breaks, but that’s not a problem, really:

Received: from auth sender madduck@madduck.net by
  seamus.madduck.net (postfix) with ESMTP id B4CD640343F for
  <madduck@madduck.net>; Fri, 30 Jun 2006 17:50:02 +0200 (CEST)

Now SpamAssassin mentions UNPARSEABLE_RELAY, but apparently that’s worth 0 points, and if I wanted, I could fix that too.
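
For readers who want to try the substitution without touching a live Postfix, here is a minimal sketch that replays it with sed -E. It is an approximation, not the map itself: the pattern is rewritten as a POSIX extended regular expression, the function name rewrite_received is made up for illustration, and the whitespace before "with" is matched explicitly.

```shell
#!/bin/sh
# Dry run of the Received: rewrite outside Postfix (illustrative sketch).
# The pattern is assembled piece by piece, mirroring the folded layout above.
rewrite_received() {
    re='^Received: from ([-._[:alnum:]]+ \([-._[:alnum:]]+ \[[[:digit:].]{7,15}\]\))'
    re="$re"'[[:space:]]+\(Authenticated sender: ([^)]+)\)'
    re="$re"'[[:space:]]+by (seamus\.madduck\.net) \(([^)]+)\)'
    re="$re"'[[:space:]]+with (E?SMTP) id ([A-F[:digit:]]+)'
    re="$re"'[[:space:]]+for <([^>]+)>; (.*)$'
    # \2..\8 correspond to $2..$8 in the REPLACE action; group 1 is dropped.
    sed -E "s/$re/Received: from auth sender \\2 by \\3 (\\4) with \\5 id \\6 for <\\7>; \\8/"
}
```

Feeding it a header that has been unfolded onto one line (for example with printf '%s\n' "$header" | rewrite_received) prints the rewritten header; lines that do not match pass through unchanged.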

Now I would like to be able to do the same for clients that authenticate with TLS certificates, but I cannot figure out how.

Update: this wiki page pretty much discusses my problem.

Update: On the issue of not being able to use newlines in the replacement string, Michael Schaefer suggests matching a newline with the regular expression and then using the back-reference to insert it into the replacement string.

Posted Wed 02 Dec 2009 21:16:32 CET Tags: geek

Firefox handing mailto links to mutt

If you like the mutt email client and have to use Mozilla Firefox, you might like to be able to click mailto: links and have mutt handle them.

There are various extensions that promise to take care of this, but none of them worked for me. So I wrote a little script to handle the interfacing. Instructions on how to tie it into Firefox are included in the header comments.

Enjoy! Comments, patches, and suggestions welcome.

NP: Dream Theater / Awake

Update: thanks to the suggestion by Nelson A. de Oliveira, the script now supports the setting of arbitrary headers, including In-Reply-To. This means you can now use it to answer list mail from the Debian list archive pages (but see #406866 and #406867).

Update: Debian’s mutt package now includes the handler since bug #406850 was fixed.

Posted Sat 25 Jul 2009 13:32:21 CEST Tags: geek

Aftereffects of the keysigning experiment

The experiment I conducted at the last keysigning party caused this thread (cross-posted to here). While the discussion has long gone way off-topic, some interesting points have been raised. I also took the opportunity to clarify my point of view a bit on the issue over the previous blog post:

The Debian project heavily relies on keysigning for much of its work. However, I think the question of what the signing of a key actually accomplishes has not been properly addressed. In my opinion, from the point of view of the Debian project, a person’s actual identity (as in the name on your birth certificate) matters very little; the Debian project does not actively interfere with a person’s real life in such a way as to require the birth certificate identity (legal cases, liability issues, etc.).

Moreover, it’s rather trivial in several countries of this world to change your official name. In this context, even the claim that in the case of a trust abuse, your reputation throughout the FLOSS community (and the rest of the Internet) should be properly tarnished, does not stand, IMHO.

From within the project, what matters is that everything you do within the project can be attributed to one and the same person: the same person that went through our NM process. The GPG key is one technical measure to allow for this form of identification. Its purpose is not, as Micah Anderson states, a means to confirm the validity of a government-issued ID.

This brings me to a point which Andreas Schuldei nicely stated at the beginning of the thread (as did others throughout):

I do not need an ID to identify martin, so i dont need to rely on his (forged or real) passport or other id from him in order to sign his key. If you did not know him before you should not sign his key (if your judgement was based on the unofficial ID).

When Andreas signs my key, he voices his trust that I am who I claim to be, and he does so not because I presented him with an ID bearing the claimed name, but because we’ve interacted many times before. Along those lines, Gunnar’s point stands:

Maybe we should just drop holding KSPs, and fall back to the traditional method of “Hey, nice dinner we had yesterday. Say, now that you know me, my family and my history, would you like to sign my key as well?” - Signing for people you actually know, not just linking

In my eyes, this is exactly what a keysigning is and should be all about: a statement of familiarity with a person, nothing more and nothing less. And as a project, we should either accept that, or find a better way to identify our developers.

So what to do in this very situation? Should you revoke your signature from my key (or not even sign it in the first place)? Should you revoke or refuse signatures to all participants, because some claim the keysigning party to have been subverted? I think the answer to both cases should be: no, unless you have not previously known the person whose key you wish to sign. That’s exactly what makes this decision very subjective, and a public call such as the original post rather unnecessary and missing the point.

If you do not care to read the entire thread, here are some of the better replies (in no particular order):

One question that arose while reading this thread is whether Debian could actually prosecute one of its members for computer fraud/sabotage/whatever on an international level. And if so, would the real identity really help that much, given that we’ll have countless IP addresses to go by? I know it would make things easier (despite it being only a name, not an identity, as there is no birthplace or birthdate), but is it worth the hassle?

Posted Wed 06 May 2009 10:51:38 CEST Tags: geek gpg identity keysigning

Companies and attachments

I have a good friend over at Compuware, and I would send her email from time to time, usually asking her to join us at a party or the like. And usually, she would join.

Until about three weeks ago, when my emails started bouncing with “No such user” error messages. Had she left? Without telling me? A quick phone call confirmed this was not true. In fact, she told me, she’s been getting emails just fine.

So I try again, but keep getting told there wasn’t any user by her login name. Then, in a flash of genius, I decide to send an unsigned mail, which promptly arrives.

Please, postmasters of this earth, get a grip. If you deem it necessary to block standard attachments, at least make your system spout out the right error message. Just because I send my emails digitally signed does not mean that my correspondent isn’t a user at your system, whether you know what digital signatures are or not.

Posted Thu 26 Mar 2009 12:55:40 CET Tags: geek

Loop-mounting partitions from a disk image

Update: it seems that kpartx pretty much does all of the below. Thanks to Faidon Liambotis for the pointer.

Every now and then, I have a disk image (as produced by cat, pv, or dd) and I need to access separate partitions. Unfortunately, the patch allowing partitions on loop devices to be accessed via their own device nodes does not appear to be in the latest (Debian) 2.6.18 kernels — the loop module does not have a max_part parameter, according to modinfo.

So this time I sat down to come up with a recipe on how to access the partitions, and after some arithmetic and much swearing at disk manufacturers, and especially the designers of the msdos partition table type, I think I have found the solution, and the urge to document it for posterity.

It’s all about the -o parameter to losetup, which specifies how many bytes into the disk a given partition starts. Getting this number isn’t straightforward. Well, it is, if you know how, which is why I am writing this.

Let’s take a look at a partition table, with sectors as units:

$ /sbin/fdisk -lu disk.img
You must set cylinders.
You can do this from the extra functions menu.

Disk disk.img: 0 MB, 0 bytes
255 heads, 63 sectors/track, 0 cylinders, total 0 sectors
Units = sectors of 1 * 512 = 512 bytes

      Device Boot      Start         End      Blocks   Id  System
disk.imgp1   *          63       96389       48163+  83  Linux
disk.imgp2           96390     2056319      979965   82  Linux swap / Solaris
disk.imgp3         2056320    78140159    38041920    5  Extended
disk.imgp5         2056383     3052349      497983+  83  Linux
disk.imgp6         3052413    10859939     3903763+  83  Linux
disk.imgp7        10860003    68372639    28756318+  83  Linux
disk.imgp8        68372703    76180229     3903763+  83  Linux
disk.imgp9        76180293    78140159      979933+  83  Linux

The first few lines are fdisk complaining about not being able to extract the number of cylinders, since it has to operate on a file, which does not provide an ioctl interface.

The first important data are the units, which are stated to be 512 bytes per sector. We take note of this value as the factor for use in the next operation.

Let’s say we want to access the 7th partition, which is 10860003 sectors into the disk, according to the fdisk output. We know that each sector is 512 bytes, so:

10860003 * 512 = 5560321536

Passing this number to losetup produces the desired result:

# losetup /dev/loop0 disk.img -o $((10860003 * 512))
# file -s /dev/loop0
/dev/loop0: Linux rev 1.0 ext3 filesystem data
# mount /dev/loop0 /mnt
[...]
# umount /mnt
# losetup -d /dev/loop0

If the partition really holds a normal filesystem, you can also let mount set up the loop device, and manage it automatically:

# mount -o loop,offset=$((10860003 * 512)) disk.img /mnt
[...]
# umount /mnt
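
The offset arithmetic can also be scripted. The following is a minimal sketch, assuming fdisk prints the old-style layout shown above (a "Units = ..." line and device names ending in pN); the function name partition_offset is hypothetical. It does the multiplication in awk and prints with %.0f, so the result does not depend on how wide the shell's integers are.

```shell
#!/bin/sh
# Print the byte offset of partition number $1.
# Reads `fdisk -lu` output (old-style layout) on standard input.
partition_offset() {
    awk -v part="$1" '
        $1 == "Units" { sector = $(NF - 1) }     # "... = 512 bytes" -> 512
        $1 ~ ("p" part "$") {
            start = ($2 == "*") ? $3 : $2        # skip the boot flag column
            printf "%.0f\n", start * sector      # avoid integer overflow
        }
    '
}
```

It could then be combined with the mount call from above, along the lines of: mount -o loop,offset="$(/sbin/fdisk -lu disk.img | partition_offset 7)" disk.img /mnt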

And since there’s apparently no means to automate the whole process for an entire disk, I hacked up plosetup. Enjoy:

# plosetup lapse.hda .
I: partition 1 of lapse.hda will become ./lapse.hda_p1 (/dev/loop0)...
I: plosetup: skipping partition 2 of type 82...
I: plosetup: skipping partition 3 of type 5...
I: partition 5 of lapse.hda will become ./lapse.hda_p5 (/dev/loop1)...
I: partition 6 of lapse.hda will become ./lapse.hda_p6 (/dev/loop2)...
I: partition 7 of lapse.hda will become ./lapse.hda_p7 (/dev/loop3)...
I: partition 8 of lapse.hda will become ./lapse.hda_p8 (/dev/loop4)...
I: partition 9 of lapse.hda will become ./lapse.hda_p9 (/dev/loop5)...
# ls -l
total 0
lrwxrwxrwx 1 root root 10 2006-10-20 13:25 lapse.hda_p1 -> /dev/loop0
lrwxrwxrwx 1 root root 10 2006-10-20 13:25 lapse.hda_p5 -> /dev/loop1
lrwxrwxrwx 1 root root 10 2006-10-20 13:25 lapse.hda_p6 -> /dev/loop2
lrwxrwxrwx 1 root root 10 2006-10-20 13:25 lapse.hda_p7 -> /dev/loop3
lrwxrwxrwx 1 root root 10 2006-10-20 13:25 lapse.hda_p8 -> /dev/loop4
lrwxrwxrwx 1 root root 10 2006-10-20 13:25 lapse.hda_p9 -> /dev/loop5
# plosetup -c .
# ls -l
total 0

(this post is dedicated to Penny for no other reason than the tunes I am listening to right now)

NP: Fly My Pretties / The Return of Fly My Pretties

Update: Be careful about the $((...)) style arithmetic. dash manages to overflow at 32 bits. zsh and bash seem to get it right. If in doubt, use perl or a calculator.

Posted Wed 11 Feb 2009 09:40:35 CET Tags: geek

Down and up and down and up

You are probably using DHCP on the machine currently in front of you. The “Dynamic Host Configuration Protocol” is a way for your computer to obtain an Internet address from a pool of available addresses, and to return it to the pool when you no longer need it. Basically every Internet service provider uses DHCP, or something similar.

As a network operating system, Linux has DHCP support (and has had it for ages). In true Unix fashion, Debian sports at least four DHCP clients. Debian’s default is dhcp3-client, also known as dhclient.

The theory is that the client requests an address lease from the server and periodically renews it. This process yields a number of events, to which the operating system can react. For instance, initially, the client issues a PREINIT event to get the interface into a state where it can talk on the network, and a BOUND event as soon as it has acquired a lease, or FAIL if it, uh, failed.

After a certain period of time, the client tries to renew the lease. If it succeeds, it issues a RENEW event; if it fails, it yields EXPIRE.

So much for the theory.

It seems that dhclient is rather stupid, which I tried to document in bug #459813 — it does things differently: given a lease, after a certain period of time, it just issues an EXPIRE event, which causes the operating system to deconfigure the interface and take down connectivity. Then, the client spits out a PREINIT event, followed by BOUND or FAIL, as appropriate.

I have not quite investigated what all this means, but this much is for sure: periodically, your machine goes offline, only to come back online a second later. If this were Windows, one would probably knock on wood and be glad that it works at all. But we’re on Linux here, Debian even, so this cannot be.

I’d love to be proven wrong, so if you have a minute, please try to verify. One way of doing so is to insert

echo "$(date) got $reason" >> /tmp/dhclient-script.reasons

towards the top of /sbin/dhclient-script and monitor the output file. Once your client renews, it should read:

Wed Jan  9 11:53:27 CET 2008 got RENEW

but instead you’ll see

Wed Jan  9 11:53:27 CET 2008 got EXPIRE
Wed Jan  9 11:53:28 CET 2008 got PREINIT
Wed Jan  9 11:53:33 CET 2008 got BOUND

and if you look closely enough, your interface will be unconfigured for those seconds between EXPIRE and BOUND.
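
To save staring at the file, the log can be summarised mechanically. This is a minimal sketch, assuming the line format produced by the echo command above (the reason is the last word on each line); the function name count_renewals is made up for illustration. It counts proper RENEW events and, separately, every EXPIRE that is immediately followed by PREINIT, which is the drop-and-reconnect pattern described here.

```shell
#!/bin/sh
# Summarise a dhclient-script.reasons log read from standard input.
count_renewals() {
    awk '
        { reason = $NF }                                   # last field is the event
        reason == "RENEW" { renew++ }
        reason == "PREINIT" && prev == "EXPIRE" { drops++ } # lease dropped, then re-acquired
        { prev = reason }
        END { printf "renewed=%d dropped=%d\n", renew, drops }
    '
}
```

Usage would be something like: count_renewals < /tmp/dhclient-script.reasons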

NP: The Flower Kings: @Live Recording, Uppsala City Theatre, Sweden, 10 February 2003

Posted Fri 22 Aug 2008 07:25:01 CEST Tags: dhclient geek

Delaying mail delivery

My current mailfilter has two features which increase my day-to-day productivity:

  1. a “tickler,” which is a reminder system inspired by the tickler file component of David Allen’s Getting Things Done action management method: I can submit emails and notes to the tickler along with a timestamp in the future, and the tickler delivers the mail (or note) to my inbox when the timestamp has passed.

  2. the ability to delay certain types of mail (e.g. Debian mail is held until the weekend, while news items and the like are only delivered at night).

My current implementations work alright, but they’re brittle and crap. What’s worse is that the two features could be combined and handled by one and the same tool, but I implemented them differently for now:

The tickler consists of a Maildir, to which I can submit mails, either by mailing a note to a specific email address (currently broken, thus not linked from here), or by adding a tickle stamp (X-Tickle header) to a message with this script, saving it to the local tickler mailbox and asking offlineimap to shove it to the server.

On this server, the tickler queue is regularly scanned by a script, which resubmits mail whose timestamp has expired to my mailfilter, where it is treated somewhat specially as a resubmitted mail.
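
The scanning step might look roughly like the sketch below. It is not the script mentioned above: it assumes, purely for illustration, that the X-Tickle header holds a Unix timestamp in seconds, and the function name due_messages is hypothetical. It only lists the messages that are due; resubmitting them to the mailfilter is left out.

```shell
#!/bin/sh
# List messages in Maildir $1 whose X-Tickle stamp is at or before epoch $2.
# Assumption: X-Tickle carries plain epoch seconds (the real format may differ).
due_messages() {
    for f in "$1"/new/* "$1"/cur/*; do
        [ -f "$f" ] || continue
        # Look at the header block only: stop at the first empty line.
        stamp=$(sed -n -e '/^$/q' -e 's/^X-Tickle:[[:space:]]*//p' "$f" | head -n 1)
        case $stamp in
            ''|*[!0-9]*) continue ;;    # no stamp, or not a plain number
        esac
        [ "$stamp" -le "$2" ] && printf '%s\n' "$f"
    done
    return 0
}
```

A cron job could call it as due_messages ~/Maildir/.tickler "$(date +%s)", where the Maildir path is again just an example.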

After almost three months with this setup, I can identify the following shortcomings:

When I implemented the delay queue, I knew of these problems and went a different way: delayed mail is stored in a Maildir and a (msgid, timestamp, filename) tuple is inserted into an sqlite3 database on the mail server. This script regularly processes the queue.

If I just sent you the chills, at least we have the same taste. There are numerous problems with this approach, the foremost being that Maildir filenames are not guaranteed to be constant: mails jump between the new and cur directories, and tags, such as seen, are encoded in the filename (thus, symlinks also cannot be used). My script now uses an ugly heuristic, which at least makes it work. I should investigate whether inodes could be used instead as I think those wouldn’t change throughout the lifetime of a mail, at least while it’s not moved between folders.
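
The inode idea is easy to check on a scratch Maildir. The sketch below is only an experiment under stated assumptions: the message filename is invented, the helper names inode_of and demo_inode_stability are hypothetical, and the result only holds while new and cur live on the same filesystem and the message is renamed rather than copied and deleted.

```shell
#!/bin/sh
# Print the inode number of file $1 (`ls -i` prints "<inode> <name>").
inode_of() {
    ls -i -- "$1" | awk '{ print $1 }'
}

# Succeeds if a message keeps its inode when it moves from new/ to cur/
# and gains the "seen" flag in its filename.
demo_inode_stability() {
    maildir=$(mktemp -d) || return 1
    mkdir "$maildir/new" "$maildir/cur"
    echo 'Subject: test' > "$maildir/new/1215767766.M1P1.host"
    before=$(inode_of "$maildir/new/1215767766.M1P1.host")
    # What a mail client does on reading: rename into cur/ and append flags.
    mv "$maildir/new/1215767766.M1P1.host" "$maildir/cur/1215767766.M1P1.host:2,S"
    after=$(inode_of "$maildir/cur/1215767766.M1P1.host:2,S")
    rm -rf "$maildir"
    [ "$before" = "$after" ]
}
```

Running demo_inode_stability && echo stable shows whether the rename kept the inode on the filesystem in question.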

I initially considered just dumping messages to files and encoding the timestamp in the files’ mtime, but then I would not be able to access the queue with mutt in case I needed to fetch a delayed mail prematurely, or if I wanted to synchronise the queue with offlineimap as well.

The past few days, I’ve been condensing experiences from both approaches and am working out a new technique to combine both features. In essence, I think the database/index approach is the best, if I can figure out a way to uniquely identify mail message files, ideally across folders. Assuming I can use inodes for that, delayed mail would then be stored into the delayed Maildir and an (inode, timestamp) tuple saved into the database. Tickler mail would be stored along with all other mail in my store Maildir and would get a similar entry in the database.

This approach solves some problems and leaves others. Assuming I synchronise the store Maildir remotely (which I do), then I can easily fathom making modifications via IMAP which cause orphan records in the database (if only IMAP would allow me to store key=value pairs for mails…). Furthermore, I’d have to submit mails to the tickler by bouncing them to an email address, and deleting the local copy, unless I want duplicates. If now the mail is somehow dropped, I’ve lost mail.

Still unhappy about all of this, still searching for a better implementation, I’d appreciate any feedback!

NP: Luminous Flesh Giants: Duma I Upadek

Posted Fri 11 Jul 2008 11:21:06 CEST Tags: geek

Surveys on the console

As part of my research, I may have to conduct a survey among Debian contributors. The word “survey” usually elicits frowns because surveys are often misconducted. MJ has taken the time to draft up some advice to surveyors.

Problems with surveys generally fall into one of two categories: content and presentation. I’ll refrain from making statements about content (Wikipedia has some stuff on questionnaire construction) and instead concentrate on presentation in the following.

Commonly in the digital age, surveys are administered via a web page or e-mail. In my recent Ph.D. transfer report, I identified a number of shortcomings with these approaches:

Asking Debian contributors to click radio buttons on a web page is a bit like expecting a mountain biking champion to ride a tricycle across a paddock: painful, if not offensive. Furthermore, web surveys can only be taken while on-line, when most of us have better things to do.

E-mail surveys address some of these problems, but create new ones: answers cannot be constrained to a domain (think multiple-choice), character set and formatting issues make evaluation difficult, and it’s impossible to prevent users from attaching comments or modifying responses.

In thinking about the issue, I came up with a third means to administer a survey: a console tool. Think of a Debian package which provides a console application controlled by a study-specific data file. The data file specifies the questions and their answer domains, and the tool presents those to the participant. Since most of Debian happens on the console anyway, such an approach to surveys seems more appropriate.

Interaction with the survey tool would be as easy as pressing the 2 or 4 keys to select one of the multiple choices, and the tool would immediately move on to the next question (and not wait for the user to hit enter). Obviously, n and p should allow navigation back and forth across the set, and c would spawn a text editor to give the user a chance to attach a comment to his/her current response, in which s/he might criticise the question or provide additional information. Finally, the tool should be able to pick up where it left off, should the user choose to exit/suspend the survey for now. Integration with debconf or another interface abstraction is also worth consideration.
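
To make the key handling concrete, here is a minimal sketch of the dispatch step only. Nothing like this exists yet: the function name dispatch_key and the action words it prints are invented for illustration, and questions are assumed to have at most nine choices so that one digit key is enough.

```shell
#!/bin/sh
# Map one keypress ($2) to an action for a question with $1 choices (1-9).
dispatch_key() {
    choices=$1 key=$2
    case $key in
        n) echo next ;;                  # go to the next question
        p) echo prev ;;                  # go back to the previous question
        c) echo comment ;;               # caller would spawn the text editor
        [1-9])
            if [ "$key" -le "$choices" ]; then
                echo "answer $key"       # record and move on immediately
            else
                echo ignore              # digit outside the answer domain
            fi ;;
        *) echo ignore ;;
    esac
}
```

An interactive front end would read a single key without waiting for enter (in bash, for instance, with read -rsn1 key) and pass it to dispatch_key together with the number of choices from the data file.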

There is more to it: people change their minds and should thus be able to amend responses. With their consent, it might be valuable to track such changes and inquire about their motivations. As I was thinking about how to realise this, I suddenly arrived at version control: use Git as a backend storage. The set of cool features this would enable seems to be endless: it works off-line and can be used to track aforementioned changes, but also offers the possibility to create a squashed result in case the participant prefers to submit only the final result. Furthermore, it’s a trivial change between anonymous submissions, and submissions authenticated by a GPG signature.

In addition, the survey tool should be able to display questions according to previous responses (control flow). For instance, if the survey determines that a given user is a contributor to the bug tracking system, but not a project member, it wouldn’t make sense to ask when s/he received his/her Debian account. Furthermore, questions could be dynamically creatable from context, so that the survey can drill into depth depending on previous responses, rather than asking the same questions to all participants.

I am currently applying for funding to outsource the development of such a tool. If you are interested in coding it up and getting paid for it, speak to me. Here are some more specifications to keep in mind before jumping on:

These are likely to be incomplete, but should convey the basic picture. Feedback is always welcome!

NP: Oceansize: Frames

Update: James Andrewartha pointed me to purity, which asks multiple-choice questions on the console. It has the kind of interface which I envision.

Also, Chris Lamb suggested this personality survey as a base line. Well, actually he just suggested I look into it.

Posted Fri 11 Jul 2008 11:21:06 CEST Tags: geek

Iceweasel/Firefox brings you the Windows experience!

For a while now, Iceweasel/Firefox has come with built-in phishing protection, which is undoubtedly a good thing given the number of idiots using the Web these days.

Screenshot of iceweasel's phishing protection warning

But the implementation is crap. If you surf to a phishing site, such as this test site, the page loads as you would expect and then the first strange things happen:

I get similar performance problems on the StaTravel website, and on map.search.ch, it almost always crashes.

Thank you, Firefox developers, for bringing the joys of the Windows world to Linux!

NP: Dream Theater: Metropolis Pt 2: Scenes from a Memory

Update: several people have responded that they cannot reproduce the problem. I thus created a new profile, uninstalled all system-wide extensions, removed all plugins such that about:plugins was empty and tried again. The problems persist, although I think the lags are not quite as long as before and the memory consumption is obviously down. Maybe this is amd64-related?

Update: It’s not amd64-related, as several users have pointed out.

Update: still no luck, but James Andrewartha pointed me to this bug report, which might improve things a bit:

“The new protocol specifies a single lookup algorithm for all tables, rather than having per-table logic. This lookup logic was moved in to the db service from the javascript.

URL canonicalization was moved completely into C++ too. The DB service can now handle a query from a raw URI, which will be needed for malware blocking.”

Posted Fri 11 Jul 2008 11:21:05 CEST Tags: geek

If you procmail, read this

I just had a hard time finding this excellent procmail resource on the web. I am thus blogging it for posterity, in case anyone is looking for procmail documentation, tips, tricks, a how-to, or anything else related to procmail.

And if you procmail and have not read the document, I suggest you do. It’s truly outstanding.

NP: A Silver Mt. Zion: He Has Left us Alone, but Shafts of Light Sometimes Grace the Corner of our Rooms

Posted Fri 11 Jul 2008 11:21:05 CEST Tags: geek