This page exists to ease the transition now that I have migrated my blog to new software. You are interested in the posts previously filed in the “geek” category, which are listed below.
My new blog can be found at http://madduck.net/blog. Future articles, which would have been filed as “geek”, are going to show up here as well. However, please watch this space as these transitional pages may disappear at some point.
This problem related to SpamAssassin and SASL-authenticated road warriors has bugged me for a while. Micah has a patch that alleviates the problem, but it’s not in upstream, and even though upstream does plan to address the problem he raised, that’s down the road.
Today I figured out a solution, based on the
smtpd_sasl_authenticated_header configuration option available
in Postfix 2.3 and later,
using header_checks to
rewrite headers. In the following PCRE map, lines are
folded for readability. The regular expression needs to be on one
line, with newlines and spaces at the beginning of the line
removed:
/^Received: from ([-._[:alnum:]]+ \([-._[:alnum:]]+ \[[.[:digit:]]{7,15}\]\))
[[:space:]]+\(Authenticated sender: ([^)]+)\)
[[:space:]]+by (seamus\.madduck\.net) \(([^)]+)\)
with (E?SMTP) id ([A-F[:digit:]]+)
[[:space:]]+for <([^>]+)>; (.*)/
REPLACE Received: from auth sender $2 by $3 ($4) with $5 id $6 for <$7>; $8
Note how the host name is left in to ensure that the right
header is replaced. This will replace the problem-causing
Received line with the following single line — I have
not found out how to insert line-breaks, but that’s not a problem,
really:
Received: from auth sender madduck@madduck.net by
seamus.madduck.net (postfix) with ESMTP id B4CD640343F for
<madduck@madduck.net>; Fri, 30 Jun 2006 17:50:02 +0200 (CEST)
Now SpamAssassin
mentions UNPARSEABLE_RELAY, but apparently that’s
worth 0 points, and if I wanted, I could fix that too.
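The rewrite can be sanity-checked outside Postfix. What follows is a minimal sketch, not the map itself: the pattern is translated to Python's `re` syntax (which has no POSIX classes such as `[[:alnum:]]`), the literal `[` around the client address is escaped, and a single space before `with` is assumed. The client name `laptop.example.org` and the address `192.0.2.10` in the sample header are made up for illustration.

```python
import re

# Python translation of the PCRE map. POSIX classes are spelled out, and the
# single space before "with" is an assumption about the unfolded expression.
RECEIVED_AUTH = re.compile(
    r"^Received: from ([-._A-Za-z0-9]+ \([-._A-Za-z0-9]+ \[[.0-9]{7,15}\]\))"
    r"\s+\(Authenticated sender: ([^)]+)\)"
    r"\s+by (seamus\.madduck\.net) \(([^)]+)\)"
    r" with (E?SMTP) id ([A-F0-9]+)"
    r"\s+for <([^>]+)>; (.*)"
)

# Same replacement as the REPLACE action, using Python's \N group references.
REPLACEMENT = (
    r"Received: from auth sender \2 by \3 (\4) with \5 id \6 for <\7>; \8"
)


def rewrite_received(header):
    """Collapse an authenticated-sender Received header into one line."""
    return RECEIVED_AUTH.sub(REPLACEMENT, header, count=1)


# Hypothetical input header; the client name and address are invented.
sample = (
    "Received: from laptop.example.org (unknown [192.0.2.10])\n"
    "\t(Authenticated sender: madduck@madduck.net)\n"
    "\tby seamus.madduck.net (postfix) with ESMTP id B4CD640343F\n"
    "\tfor <madduck@madduck.net>; Fri, 30 Jun 2006 17:50:02 +0200 (CEST)"
)
print(rewrite_received(sample))
```

Headers that do not carry the "Authenticated sender" clause are returned unchanged, which mirrors how a non-matching header_checks rule leaves a header alone.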
Now I would like to be able to do the same for clients that authenticate with TLS certificates, but I cannot figure out how.
Update: this wiki page pretty much discusses my problem.
Update: On the issue of not being able to use newlines in the replacement string, Michael Schaefer suggests matching a newline with the regular expression and then using the captured reference to insert it into the replacement string.
Posted Wed 02 Dec 2009 21:16:32 CET
If you like the mutt email
client and have to use Mozilla Firefox, you
might like to be able to click mailto: links and have
mutt handle them.
There are various extensions that promise to take care of this, but none of them worked for me. So I wrote a little script to handle the interfacing. Instructions on how to tie it in with Firefox are included in the header comments.
Enjoy! Comments, patches, and suggestions welcome.
NP: Dream Theater / Awake
Update: thanks to the suggestion by Nelson A.
de Oliveira, the script now supports the setting of arbitrary
headers, including In-Reply-To. This means you can now
use it to answer list mail from the Debian list archive pages (but see
#406866 and #406867).
Update: Debian’s mutt package now includes the handler since bug #406850 was fixed.
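For readers curious what such a handler has to do, here is a minimal sketch. It is not the script mentioned above, only an illustration of the parsing step: it splits a mailto: URL into recipients and arbitrary headers (including In-Reply-To) and assembles a draft that could be handed to `mutt -H`. The function names and the example address are assumptions.

```python
from urllib.parse import unquote


def parse_mailto(url):
    """Split a mailto: URL into (recipients, headers)."""
    if not url.lower().startswith("mailto:"):
        raise ValueError("not a mailto: URL: %r" % url)
    address_part, _, query = url[len("mailto:"):].partition("?")
    recipients = [unquote(a) for a in address_part.split(",") if a]
    headers = {}
    for field in query.split("&"):
        if not field:
            continue
        name, _, value = field.partition("=")
        # unquote() leaves "+" alone, which is what mailto: URLs expect.
        headers[unquote(name)] = unquote(value)
    return recipients, headers


def draft_from_mailto(url):
    """Build draft text (headers, blank line, body) from a mailto: URL."""
    recipients, headers = parse_mailto(url)
    body = headers.pop("body", "")
    lines = ["To: " + ", ".join(recipients)]
    for name, value in headers.items():
        # Refuse embedded line breaks so a link cannot smuggle in headers.
        if "\n" in name + value or "\r" in name + value:
            raise ValueError("line break in header %r" % name)
        lines.append("%s: %s" % (name, value))
    return "\n".join(lines) + "\n\n" + body


# Hypothetical link of the kind found on list archive pages.
link = ("mailto:debian-devel@lists.debian.org"
        "?subject=Re%3A%20hello&In-Reply-To=%3C123%40example.org%3E")
print(draft_from_mailto(link))
```

Writing the draft to a temporary file and starting mutt on it in a terminal is left out here, since that part depends on the local setup.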
Posted Sat 25 Jul 2009 13:32:21 CEST
The experiment I conducted at the last keysigning party caused this thread (cross-posted to here). While the discussion has long gone way off-topic, some interesting points have been raised. I also took the opportunity to clarify my point of view a bit on the issue over the previous blog post:
The Debian project relies heavily on keysigning for much of its work. However, I think the question of what the signing of a key actually accomplishes has not been properly addressed. In my opinion, from the point of view of the Debian project, a person’s actual identity (as in the name on your birth certificate) matters very little; the Debian project does not actively interfere with a person’s real life in such a way as to require the birth certificate identity (legal cases, liability issues, etc.).
Moreover, it’s rather trivial in several countries of this world to change your official name. In this context, even the claim that in the case of a trust abuse, your reputation throughout the FLOSS community (and the rest of the Internet) should be properly tarnished, does not stand, IMHO.
From within the project, what matters is that everything you do within the project can be attributed to one and the same person: the same person that went through our NM process. The GPG key is one technical measure to allow for this form of identification. Its purpose is not, as Micah Anderson states, a means to confirm the validity of a government-issued ID.
This brings me to a point which Andreas Schuldei nicely stated at the beginning of the thread (as did others throughout):
I do not need an ID to identify martin, so i dont need to rely on his (forged or real) passport or other id from him in order to sign his key. If you did not know him before you should not sign his key (if your judgement was based on the unofficial ID).
When Andreas signs my ID, he voices his trust in that I am who I claim to be, and he does so not because I presented him with an ID with the claimed name, but because we’ve interacted many times before. In that line, Gunnar’s point stands:
Maybe we should just drop holding KSPs, and fall back to the traditional method of “Hey, nice dinner we had yesterday. Say, now that you know me, my family and my history, would you like to sign my key as well?” - Signing for people you actually know, not just linking
In my eyes, this is exactly what a keysigning is and should be all about: a statement of familiarity with a person, nothing more and nothing less. And as a project, we should either accept that, or find a better way to identify our developers.
So what to do in this very situation? Should you revoke your signature from my key (or not even sign it in the first place)? Should you revoke or refuse signatures to all participants, because some claim the keysigning party to have been subverted? I think the answer to both cases should be: no, unless you have not previously known the person whose key you wish to sign. That’s exactly what makes this decision very subjective, and a public call such as the original post rather unnecessary and missing the point.
If you do not care to read the entire thread, here are some of the better replies (in no particular order):
- http://lists.debian.org/debian-devel/2006/05/msg01416.html
- http://lists.debconf.org/lurker/message/20060525.092124.f59d8c57.en.html
- http://lists.debian.org/debian-devel/2006/05/msg01447.html
- http://lists.debian.org/debian-devel/2006/05/msg01471.html
- http://lists.debian.org/debian-devel/2006/05/msg01585.html
- http://lists.debian.org/debian-devel/2006/05/msg01463.html
- http://lists.debian.org/debian-devel/2006/05/msg01464.html
One question that arose while reading this thread is whether Debian could actually prosecute one of its members for computer fraud/sabotage/whatever on an international level. And if so, would the real identity really help that much, given that we’ll have countless IP addresses to go by? I know it would make things easier (despite it being only a name, not an identity, as there is no birthplace or birthdate), but is it worth the hassle?
Posted Wed 06 May 2009 10:51:38 CEST
I have a good friend over at Compuware, and I would send her email from time to time, usually asking her to join us at a party or the like. And usually, she would join.
Until about three weeks ago, when my emails started bouncing with “No such user” error messages. Had she left? Without telling me? A quick phone call confirmed this was not true. In fact, she told me, she’s been getting emails just fine.
So I try again, but keep getting told there wasn’t any user by her login name. Then, in a flash of genius, I decide to send an unsigned mail, which promptly arrives.
Please, postmasters of this earth, get a grip. If you deem it necessary to block standard attachments, at least make your system spout out the right error message. Just because I send my emails digitally signed does not mean that my correspondent isn’t a user at your system, whether you know what digital signatures are or not.
Posted Thu 26 Mar 2009 12:55:40 CET
Update: it seems that kpartx pretty much does all of the below. Thanks to Faidon Liambotis for the pointer.
Every now and then, I have a disk image (as produced by
cat, pv, or dd) and I need
to access separate partitions. Unfortunately, the patch allowing partitions
on loop devices to be accessed via their own device nodes does
not appear to be in the latest (Debian) 2.6.18 kernels — the
loop module does not have a max_part
parameter, according to modinfo.
So this time I sat down to come up with a recipe on how to
access the partitions, and after some arithmetic and much swearing
at disk manufacturers, and especially the designers of the
msdos partition table type, I think I have found the
solution, and the urge to document it for posterity.
It’s all about the -o parameter to
losetup, which specifies how many bytes into
the disk a given partition starts. Getting this number isn’t
straightforward. Well, it is, if you know how, which is why I am
writing this.
Let’s take a look at a partition table, with sectors as units:
$ /sbin/fdisk -lu disk.img
You must set cylinders.
You can do this from the extra functions menu.
Disk disk.img: 0 MB, 0 bytes
255 heads, 63 sectors/track, 0 cylinders, total 0 sectors
Units = sectors of 1 * 512 = 512 bytes
Device Boot Start End Blocks Id System
disk.imgp1 * 63 96389 48163+ 83 Linux
disk.imgp2 96390 2056319 979965 82 Linux swap / Solaris
disk.imgp3 2056320 78140159 38041920 5 Extended
disk.imgp5 2056383 3052349 497983+ 83 Linux
disk.imgp6 3052413 10859939 3903763+ 83 Linux
disk.imgp7 10860003 68372639 28756318+ 83 Linux
disk.imgp8 68372703 76180229 3903763+ 83 Linux
disk.imgp9 76180293 78140159 979933+ 83 Linux
The first few lines are fdisk complaining about not being
able to extract the number of cylinders, since it has to operate on
a file, which does not provide an ioctl interface.
The first important data are the units, which are stated to be 512 bytes per sector. We take note of this value as the factor for use in the next operation.
Let’s say we want to access the 7th partition, which is 10860003
sectors into the disk, according to the fdisk output.
We know that each sector is 512 bytes, so:
10860003 * 512 = 5560321536
Passing this number to losetup produces the desired
result:
# losetup /dev/loop0 disk.img -o $((10860003 * 512))
# file -s /dev/loop0
/dev/loop0: Linux rev 1.0 ext3 filesystem data
# mount /dev/loop0 /mnt
[...]
# umount /mnt
# losetup -d /dev/loop0
If the partition really holds a normal filesystem, you can also
let mount set up the loop device, and manage it
automatically:
# mount -o loop,offset=$((10860003 * 512)) disk.img /mnt
[...]
# umount /mnt
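The arithmetic is the same for every partition, so it can be scripted. The following is a minimal sketch, assuming the `fdisk -lu` layout shown earlier and a sector size of 512 bytes as stated in its Units line; the function name is made up, and this is not how plosetup works internally.

```python
def partition_offsets(fdisk_output, sector_size=512):
    """Map each partition name to its byte offset into the image."""
    offsets = {}
    for line in fdisk_output.splitlines():
        fields = line.split()
        # Drop the boot flag so that the start sector is always field 1.
        if len(fields) > 1 and fields[1] == "*":
            del fields[1]
        # Partition rows are the only ones with numeric start and end sectors.
        if len(fields) >= 3 and fields[1].isdigit() and fields[2].isdigit():
            offsets[fields[0]] = int(fields[1]) * sector_size
    return offsets


# A few rows copied from the fdisk listing above.
listing = """\
Units = sectors of 1 * 512 = 512 bytes
Device Boot Start End Blocks Id System
disk.imgp1 * 63 96389 48163+ 83 Linux
disk.imgp6 3052413 10859939 3903763+ 83 Linux
disk.imgp7 10860003 68372639 28756318+ 83 Linux
"""
offsets = partition_offsets(listing)
print(offsets["disk.imgp7"])  # the value to pass to losetup -o
```

Python integers do not overflow, so the result is safe to paste into a losetup or mount command line regardless of how large the image is.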
And since there’s apparently no means to automate the whole process for an entire disk, I hacked up plosetup. Enjoy:
# plosetup lapse.hda .
I: partition 1 of lapse.hda will become ./lapse.hda_p1 (/dev/loop0)...
I: plosetup: skipping partition 2 of type 82...
I: plosetup: skipping partition 3 of type 5...
I: partition 5 of lapse.hda will become ./lapse.hda_p5 (/dev/loop1)...
I: partition 6 of lapse.hda will become ./lapse.hda_p6 (/dev/loop2)...
I: partition 7 of lapse.hda will become ./lapse.hda_p7 (/dev/loop3)...
I: partition 8 of lapse.hda will become ./lapse.hda_p8 (/dev/loop4)...
I: partition 9 of lapse.hda will become ./lapse.hda_p9 (/dev/loop5)...
# ls -l
total 0
lrwxrwxrwx 1 root root 10 2006-10-20 13:25 lapse.hda_p1 -> /dev/loop0
lrwxrwxrwx 1 root root 10 2006-10-20 13:25 lapse.hda_p5 -> /dev/loop1
lrwxrwxrwx 1 root root 10 2006-10-20 13:25 lapse.hda_p6 -> /dev/loop2
lrwxrwxrwx 1 root root 10 2006-10-20 13:25 lapse.hda_p7 -> /dev/loop3
lrwxrwxrwx 1 root root 10 2006-10-20 13:25 lapse.hda_p8 -> /dev/loop4
lrwxrwxrwx 1 root root 10 2006-10-20 13:25 lapse.hda_p9 -> /dev/loop5
# plosetup -c .
# ls -l
total 0
(this post is dedicated to Penny for no other reason than the tunes I am listening to right now)
NP: Fly My Pretties / The Return of Fly My Pretties
Update: Be careful about the
$((...)) style arithmetic. dash manages
to overflow at 32bit. zsh and bash seem
to get it right. If in doubt, use perl or a calculator.
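To see which partitions are affected, it is enough to compare the byte offset with the largest signed 32-bit value. This is a small sketch under the assumption that the shell's arithmetic is limited to signed 32-bit integers; whether an affected shell wraps around or fails in some other way is not something the note above states, so the wrapped value is illustrative only.

```python
INT32_MAX = 2**31 - 1


def overflows_int32(start_sector, sector_size=512):
    """True if the byte offset does not fit into a signed 32-bit integer."""
    return start_sector * sector_size > INT32_MAX


def wrapped_int32(value):
    """Value a signed 32-bit integer would hold after wrapping around."""
    return (value + 2**31) % 2**32 - 2**31


# Start sectors of partitions 6 and 7 from the fdisk listing.
print(overflows_int32(3052413))            # partition 6
print(overflows_int32(10860003))           # partition 7
print(wrapped_int32(10860003 * 512))       # what wraparound would produce
```

With 512-byte sectors, every partition starting at sector 4194304 or later is beyond the 32-bit limit.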
You are probably using DHCP on the machine currently in front of you. The “Dynamic Host Configuration Protocol” is a way for your computer to obtain an Internet address from a pool of available addresses, and to return it to the pool when you no longer need it. Basically every Internet service provider uses DHCP, or something similar.
As a network operating system, Linux has DHCP support (and has had
it for ages). In true Unix fashion, Debian sports at least four DHCP clients.
Debian’s default is dhcp3-client, also
known as dhclient.
The theory is that the client requests an address lease from the
server and periodically renews it. This process yields a number of
events, to which the operating system can react. For instance,
initially, the client issues a PREINIT event to get
the interface into a state where it can talk on the network, and a
BOUND event as soon as it has acquired a lease, or
FAIL if it, uh, failed.
After a certain period of time, the client tries to renew the
lease. If it succeeds, it issues a RENEW event; if it
fails, it yields EXPIRE.
So much for the theory.
It seems that dhclient is rather stupid, which I
tried to document in bug
#459813 — it does things differently: given a lease, after a
certain period of time, it just issues an EXPIRE
event, which causes the operating system to deconfigure
the interface and take down connectivity. Then, the client spits
out a PREINIT event, followed by BOUND or
FAIL, as appropriate.
I have not quite investigated what all this means, but this much is for sure: periodically, your machine goes offline, only to come back online a second later. If this were Windows, one would probably knock on wood and be glad that it works at all. But we’re on Linux here, Debian even, so this cannot be.
I’d love to be proven wrong, so if you have a minute, please try to verify. One way of doing so is to insert
echo "$(date) got $reason" >> /tmp/dhclient-script.reasons
towards the top of /sbin/dhclient-script and
monitor the output file. Once your client renews, it should
read:
Wed Jan 9 11:53:27 CET 2008 got RENEW
but instead you’ll see
Wed Jan 9 11:53:27 CET 2008 got EXPIRE
Wed Jan 9 11:53:28 CET 2008 got PREINIT
Wed Jan 9 11:53:33 CET 2008 got BOUND
and if you look closely enough, your interface will be
unconfigured those seconds between EXPIRE and
BOUND.
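If you collect the log for a few days, a short script can add up how long the interface was down. The following is a minimal sketch, assuming the log was written by the echo line above with `date` in an English or C locale (six fields: weekday, month, day, time, zone, year); the function names are made up.

```python
from datetime import datetime


def parse_line(line):
    """Return (timestamp, reason) for one line of the reasons log."""
    stamp, _, reason = line.strip().rpartition(" got ")
    parts = stamp.split()  # e.g. Wed Jan 9 11:53:27 CET 2008
    # Skip the weekday and the zone name, which strptime handles unreliably.
    text = "%s %s %s %s" % (parts[1], parts[2], parts[5], parts[3])
    return datetime.strptime(text, "%b %d %Y %H:%M:%S"), reason


def outages(lines):
    """Seconds between each EXPIRE and the next BOUND."""
    gaps = []
    expired_at = None
    for line in lines:
        if " got " not in line:
            continue
        when, reason = parse_line(line)
        if reason == "EXPIRE":
            expired_at = when
        elif reason == "BOUND" and expired_at is not None:
            gaps.append((when - expired_at).total_seconds())
            expired_at = None
    return gaps


# The three lines shown above.
log = [
    "Wed Jan 9 11:53:27 CET 2008 got EXPIRE",
    "Wed Jan 9 11:53:28 CET 2008 got PREINIT",
    "Wed Jan 9 11:53:33 CET 2008 got BOUND",
]
print(outages(log))
```

A client that renews properly only ever logs RENEW after the initial BOUND, so the list stays empty.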
NP: The Flower Kings: @Live Recording, Uppsala City Theatre, Sweden, 10 February 2003
Posted Fri 22 Aug 2008 07:25:01 CEST
My current mailfilter has two features which increase my day-to-day productivity:
- a “tickler,” which is a reminder system inspired by the tickler file component of David Allen’s Getting Things Done action management method: I can submit emails and notes to the tickler along with a timestamp in the future, and the tickler delivers the mail (or note) to my inbox when the timestamp has passed.
- the ability to delay certain types of mail (e.g. Debian mail is held until the weekend, while news items and the like are only delivered at night).
My current implementations work alright, but they’re brittle and crap. What’s worse is that the two features could be combined and handled by one and the same tool, but I implemented them differently for now:
The tickler consists of a Maildir, to which I can submit mails,
either by mailing a note to a specific email address (currently
broken, thus not linked from here), or by adding a tickle stamp
(X-Tickle header) to a message with
this script, saving it to the local tickler mailbox and asking
offlineimap
to shove it to the server.
On this server, the tickler queue is regularly scanned by a script, which resubmits mail whose timestamp has expired to my mailfilter, where it is treated somewhat specially as a resubmitted mail.
After almost three months with this setup, I can identify the following shortcomings:
- Since the queue is only processed on the server, I cannot register a mail with the tickler without going online and synchronising my mail before the timestamp expires. So if I am on my way to a conference on Monday and tell the tickler to resubmit a given message on the next day, but I don’t connect before Wednesday, I won’t get the mail on Tuesday (because the mail does not actually reach the tickler until Wednesday, when it’s immediately resubmitted).
  I have not found an algorithm that would let me run a queue processor on the server and my laptop without the two causing mail duplication and potentially interfering with each other.
- I keep all my mail in a single Maildir to make searches easy. Since mail registered with the tickler is stored in a separate Maildir, I often need to search two Maildirs if I am looking for a specific message. This is awkward.
- The queue processor has to iterate the entire queue, which is fine for a hundred messages, but it does not scale.
- I have not found a way to ask mutt to tickle me about a message I send without going to the sent messages folder and scheduling the message for tickling manually. One approach would be record=no and a custom sendmail script, which would store mail to the store or tickler Maildirs, depending on whether the X-Tickle header is present, but I want to avoid a custom sendmail script for reasons unknown. Another approach might be to set bcc to the appropriate tickle address, but this would result in duplicate mails, unless I set record=no individually.
When I implemented the delay queue, I knew of these problems and went a different way: delayed mail is stored in a Maildir and a (msgid, timestamp, filename) tuple is inserted into an sqlite3 database on the mail server. This script regularly processes the queue.
If I just sent you the chills, at least we have the same taste.
There are numerous problems with this approach, the foremost being
that Maildir filenames are not guaranteed to be constant: mails
jump between the new and cur directories,
and tags, such as seen, are encoded in the filename
(thus, symlinks also cannot be used). My script now uses an ugly
heuristic, which at least makes it work. I should investigate
whether inodes could be used instead as I think those
wouldn’t change throughout the lifetime of a mail, at least while
it’s not moved between folders.
I initially considered just dumping messages to files and
encoding the timestamp in the files’ mtime, but then I
would not be able to access the queue with mutt in
case I needed to fetch a delayed mail prematurely, or if I wanted
to synchronise the queue with offlineimap as well.
The past few days, I’ve been condensing experiences from both
approaches and am working out a new technique to combine both
features. In essence, I think the database/index approach is the
best, if I can figure out a way to uniquely identify mail message
files, ideally across folders. Assuming I can use
inodes for that, delayed mail would then be stored
into the delayed Maildir and an (inode, timestamp)
tuple saved into the database. Tickler mail would be stored along
with all other mail in my store Maildir and would get
a similar entry in the database.
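To make the idea concrete, here is a minimal sketch of such an index. It rests on the same unverified assumption as the paragraph above, namely that a message keeps its inode while it moves between new and cur and has flags appended to its name. The table layout, the function names, and the Maildir file name are all hypothetical.

```python
import os
import sqlite3
import tempfile
import time


def open_queue(path=":memory:"):
    """Open (or create) the index of (inode, due timestamp) tuples."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS queue "
               "(inode INTEGER PRIMARY KEY, due REAL NOT NULL)")
    return db


def register(db, filename, due):
    """Remember that the message in filename is due at the given time."""
    inode = os.stat(filename).st_ino
    db.execute("INSERT OR REPLACE INTO queue (inode, due) VALUES (?, ?)",
               (inode, due))
    db.commit()
    return inode


def pop_due(db, now=None):
    """Return and forget the inodes whose timestamp has passed."""
    now = time.time() if now is None else now
    rows = db.execute("SELECT inode FROM queue WHERE due <= ? ORDER BY due",
                      (now,)).fetchall()
    db.execute("DELETE FROM queue WHERE due <= ?", (now,))
    db.commit()
    return [inode for (inode,) in rows]


def find_by_inode(maildir, inode):
    """Locate a message by inode, whatever its current name and flags."""
    for sub in ("new", "cur"):
        folder = os.path.join(maildir, sub)
        if not os.path.isdir(folder):
            continue
        with os.scandir(folder) as entries:
            for entry in entries:
                if entry.inode() == inode:
                    return entry.path
    return None


# Demonstration on a throw-away Maildir.
with tempfile.TemporaryDirectory() as maildir:
    for sub in ("new", "cur"):
        os.makedirs(os.path.join(maildir, sub))
    path = os.path.join(maildir, "new", "1215767000.M1P1.host")
    open(path, "w").close()
    db = open_queue()
    inode = register(db, path, due=time.time() - 1)
    # Simulate a mail client marking the message as seen.
    moved = os.path.join(maildir, "cur", "1215767000.M1P1.host:2,S")
    os.rename(path, moved)
    due_inodes = pop_due(db)
    still_found = find_by_inode(maildir, due_inodes[0]) == moved
```

Looking up a message by inode means scanning the folder, which is linear in the number of messages, but the queue itself is answered by a single indexed query. Inodes are only unique within one filesystem, so the index cannot travel with the mail when it is synchronised to another machine.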
This approach solves some problems and leaves others. Assuming I
synchronise the store Maildir remotely (which I do),
then I can easily fathom making modifications via IMAP which cause
orphan records in the database (if only IMAP would allow me to
store key=value pairs for mails…). Furthermore, I’d have to submit
mails to the tickler by bouncing them to an email address, and
deleting the local copy, unless I want duplicates. If now the mail
is somehow dropped, I’ve lost mail.
Still unhappy about all of this, still searching for a better implementation, I’d appreciate any feedback!
NP: Luminous Flesh Giants: Duma I Upadek
Posted Fri 11 Jul 2008 11:21:06 CEST
As part of my research, I may have to conduct a survey among Debian contributors. The word “survey” usually elicits frowns because surveys are often misconducted. MJ has taken the time to draft up some advice to surveyors.
Problems with surveys generally fall into one of two categories: content and presentation. I’ll refrain from making statements about content (Wikipedia has some stuff on questionnaire construction) and instead concentrate on presentation in the following.
Commonly in the digital age, surveys are administered via a web page or e-mail. In my recent Ph.D. transfer report, I identified a number of shortcomings with these approaches:
Asking Debian contributors to click radio buttons on a web page is a bit like expecting a mountain biking champion to ride a tricycle across a paddock: painful, if not offensive. Furthermore, web surveys can only be taken while on-line, when most of us have better things to do.
E-mail surveys address some of these problems, but create new ones: answers cannot be constrained to a domain (think multiple-choice), character set and formatting issues make evaluation difficult, and it’s impossible to prevent users from attaching comments or modifying responses.
In thinking about the issue, I came up with a third means to administer a survey: a console tool. Think of a Debian package which provides a console application controlled by a study-specific data file. The data file specifies the questions and their answer domains, and the tool presents those to the participant. Since most of Debian happens on the console anyway, such an approach to surveys seems more appropriate.
Interaction with the survey tool would be as easy as pressing
the 2 or 4 keys to select one of the
multiple choices, and the tool would immediately move on to the
next question (and not wait for the user to hit
enter). Obviously, n and p
should allow navigation back and forth across the set, and
c would spawn a text editor to give the user a chance
to attach a comment to his/her current response, in which s/he
might criticise the question or provide additional information.
Finally, the tool should be able to pick up where it left off,
should the user choose to exit/suspend the survey for now.
Integration with debconf or another
interface abstraction is also worth consideration.
There is more to it: people change their minds and should thus
be able to amend responses. With their consent, it might be
valuable to track such changes and inquire about their motivations.
As I was thinking about how to realise this, I suddenly arrived at
version control: use Git as a
backend storage. The set of cool features this would enable seems
to be endless: it works off-line and can be used to track
aforementioned changes, but also offers the possibility to create a
squashed result in case the participant prefers to submit only the
final result. Furthermore, it’s a trivial change between anonymous
submissions, and submissions authenticated by a GPG
signature.
In addition, the survey tool should be able to display questions according to previous responses (control flow). For instance, if the survey determines that a given user is a contributor to the bug tracking system, but not a project member, it wouldn’t make sense to ask when s/he received his/her Debian account. Furthermore, questions could be dynamically creatable from context, so that the survey can drill into depth depending on previous responses, rather than asking the same questions to all participants.
I am currently applying for funding to outsource the development of such a tool. If you are interested in coding it up and getting paid for it, speak to me. Here are some more specifications to keep in mind before jumping on:
- the result must be released under a Free licence.
- the tool should be implemented in Python and use PyGit. Missing Git bindings should be implemented in and contributed to PyGit.
- the logic should be in a reusable module, and the application a thin layer on top of that.
- data files must be able to specify at least multiple-choice, Likert-scale and free-form-answer questions.
- data files should be able to encode conditional flow.
- data files should encode whether submissions can/must/mustn’t be anonymous.
- data files should encode policy whether changes can/must/mustn’t be tracked.
- it would be nice if data files could encode dynamic questions assembled at run-time.
- questions and answers must be translatable using standard .po files.
- in addition to the console interface, a debconf interface would be nice.
- every response results in a Git commit object, and commit messages may be automated or queried from the user, depending on context and configuration.
- data files should provide parent/seed SHA-1 hashes such that every participant essentially commits to a branch off the same parent, using e.g. uuencoded bundles.
- submission should take place via git-push or git-send-email, depending on whether a net connection exists or not.
These are likely to be incomplete, but should convey the basic picture. Feedback is always welcome!
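To illustrate the data-file and control-flow parts of the specification, here is a minimal sketch. Nothing in it is a settled format: the dictionary keys (`id`, `type`, `choices`, `scale`, `only_if`) and the question texts are assumptions made up for this example, and the Git backend and the console interface are left out entirely.

```python
# Hypothetical study data: one multiple-choice, one conditional free-form,
# and one Likert-scale question.
QUESTIONS = [
    {"id": "role", "type": "choice",
     "text": "How do you contribute to Debian?",
     "choices": {"1": "project member", "2": "bug tracking system"}},
    {"id": "account", "type": "free",
     "text": "When did you receive your Debian account?",
     "only_if": {"role": "1"}},
    {"id": "tools", "type": "likert", "scale": 5,
     "text": "The console is my preferred interface."},
]


def applies(question, responses):
    """Conditional flow: ask only if all earlier answers match."""
    conditions = question.get("only_if", {})
    return all(responses.get(qid) == answer
               for qid, answer in conditions.items())


def next_question(questions, responses):
    """First unanswered question that applies, or None when finished."""
    for question in questions:
        if question["id"] not in responses and applies(question, responses):
            return question
    return None


def valid_answer(question, answer):
    """Constrain answers to the domain given in the data file."""
    if question["type"] == "choice":
        return answer in question["choices"]
    if question["type"] == "likert":
        return answer.isdigit() and 1 <= int(answer) <= question["scale"]
    return True  # free-form answers are unconstrained


print(next_question(QUESTIONS, {"role": "2"})["id"])
```

Because the next question is derived purely from the stored responses, resuming a suspended survey needs no extra state: loading the responses and calling `next_question` again picks up where the participant left off.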
NP: Oceansize: Frames
Update: James Andrewartha pointed me to purity, which asks multiple-choice questions on the console. It has the kind of interface which I envision.
Also, Chris Lamb suggested this personality survey as a base line. Well, actually he just suggested I look into it.
Posted Fri 11 Jul 2008 11:21:06 CEST
For a while now, Iceweasel/Firefox has come with built-in phishing protection, which is undoubtedly a good thing given the number of idiots using the Web these days.
But the implementation is crap. If you surf to a phishing site, such as this test site, the page loads as you would expect and then the first strange things happen:
- the mouse movement becomes sluggish as the browser increases its virtual memory needs from 466Mb to a measly 912Mb, while the RSS jumps from 134Mb to 423Mb. The entire computer hangs for 2-3 seconds whenever I drag the mouse pointer off the window. I am not saying that my window manager (fluxbox) or Xorg or anything else on this machine is flawless, but this problem only ever appears with Iceweasel/Firefox. Anyway, it’s impossible to follow any links on the page; the mouse pointer does not even change.
- suddenly, an icon and a little triangle appear at the right side of the URL bar.
- about 7 seconds later, the rest of the balloon appears, alerting me to the dangers of phishing attacks. Did people think those things are so extraordinarily beautiful that they had to be copied from Windows (or Apple, or whatever)?
- another 5 seconds later, the webpage darkens.
- with an entirely unresponsive browser, I have no other choice but to try to leave the page, which is easier said than done: hitting Ctrl-W to close the tab does not have any effect… for almost 25 seconds; then the page was finally gone and the system restored itself to normal.
I get similar performance problems on the StaTravel website, and on map.search.ch, it almost always crashes.
Thank you, Firefox developers, for bringing the joys of the Windows world to Linux!
NP: Dream Theater: Metropolis Pt 2: Scenes from a Memory
Update: several people have responded that they
cannot reproduce the problem. I thus created a new profile,
uninstalled all system-wide extensions, removed all plugins such
that about:plugins was empty and tried again. The
problems persist, although I think the lags are not quite as long
as before and the memory consumption is obviously down. Maybe this
is amd64-related?
Update: It’s not amd64-related, as
several users have pointed out.
Update: still no luck, but James Andrewartha pointed me to this bug report, which might improve things a bit:
“The new protocol specifies a single lookup algorithm for all tables, rather than having per-table logic. This lookup logic was moved in to the db service from the javascript.
URL canonicalization was moved completely into C++ too. The DB service can now handle a query from a raw URI, which will be needed for malware blocking.”
Posted Fri 11 Jul 2008 11:21:05 CEST
I just had a hard time finding this excellent procmail resource on the web. I am thus blogging it for posterity, in case anyone is looking for procmail documentation, tips, tricks, a how-to, or anything else related to procmail.
And if you use procmail and have not read the document, I suggest you do. It’s truly outstanding.
NP: A Silver Mt. Zion: He Has Left us Alone, but Shafts of Light Sometimes Grace the Corner of our Rooms
Posted Fri 11 Jul 2008 11:21:05 CEST

