The relation between cipher blocks and physical blocks

Many people have responded to my post on encrypted filesystems and power losses; thank you! To clear up the confusion, I decided to write a new entry rather than updating the previous one...

I learnt two important things about encrypted block devices, which lead me to conclude that adding encryption does not put your filesystem in jeopardy any more than if it lived directly on the medium.

First of all, AES works on blocks of 128 bits (16 bytes), not 64 bytes, but that's mostly a cosmetic issue with respect to the original blog post. What's more important is that a power loss while writing out a cipher block is effectively the same as a power loss while writing out a physical block to the storage medium: the cipher block is usually smaller than the disk block, and an incompletely written disk block is considered to be invalid. So any incomplete cipher block results in an incomplete disk block, which in turn causes the filesystem to consider the block invalid and try to recover from the journal.
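To put numbers on the size relation, here is a tiny sketch. The 128-bit AES block size is a fact; the 512-byte disk block is only an assumption for illustration, since the actual size depends on the medium.

```python
AES_BLOCK_BITS = 128   # AES block size
SECTOR_BYTES = 512     # assumed physical block size, for illustration only

aes_block_bytes = AES_BLOCK_BITS // 8
blocks_per_sector = SECTOR_BYTES // aes_block_bytes

print(aes_block_bytes, blocks_per_sector)  # 16 32
```

Under that assumption, one physical block holds 32 cipher blocks, so a torn write of any one of them is also a torn write of the physical block.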

What confused me for a long time was the use of cipher block chaining (see the excellent Wikipedia article for more info). Essentially, block chaining adds security to encryption by making each (cipher) block depend on its predecessor. Great, I thought, so if the first block on disk is corrupted, the damage will percolate all the way through the disk and render it all unusable...

Well, fortunately, this is not the case. If it were, a write to the first block would require all remaining blocks on the disk to be re-encrypted.

It is true that cipher blocks are chained, but only within a physical block. This means that a read error on the medium may corrupt a cipher block, which then renders all following cipher blocks contained in the physical block unusable (for they cannot be decrypted anymore). But since the filesystem would already consider the entire block invalid, this does not add to the problem.
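The difference between chaining across the whole disk and chaining only within a physical block can be sketched in a few lines of Python. Everything here is a toy under stated assumptions: E is a small hash-based Feistel construction standing in for AES (it is not secure), the 512-byte sector size is assumed, and deriving the IV from the sector number is a simplification of what real implementations do. The names whole_disk, per_sector and changed_sectors are made up for this example.

```python
import hashlib

BLOCK = 16    # cipher block size in bytes (same as AES)
SECTOR = 512  # assumed physical block size

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def E(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel block cipher; a stand-in for AES, NOT secure."""
    left, right = block[:8], block[8:]
    for i in range(4):
        f = hashlib.sha256(key + bytes([i]) + right).digest()[:8]
        left, right = right, xor(left, f)
    return left + right

def cbc_encrypt(key: bytes, iv: bytes, data: bytes) -> bytes:
    """C(i) = E[ M(i) + C(i-1) ], with C(0) = IV."""
    prev, out = iv, b""
    for i in range(0, len(data), BLOCK):
        prev = E(key, xor(data[i:i + BLOCK], prev))
        out += prev
    return out

def whole_disk(key, sectors):
    # Hypothetical scheme: one single chain running across all sectors.
    return cbc_encrypt(key, bytes(BLOCK), b"".join(sectors))

def per_sector(key, sectors):
    # Chain restarts in every sector; the IV is just the sector number
    # (an assumption made for this sketch).
    return b"".join(
        cbc_encrypt(key, n.to_bytes(BLOCK, "little"), s)
        for n, s in enumerate(sectors)
    )

def changed_sectors(a: bytes, b: bytes):
    """Sector numbers whose ciphertext differs between two disk images."""
    return [n for n in range(len(a) // SECTOR)
            if a[n * SECTOR:(n + 1) * SECTOR] != b[n * SECTOR:(n + 1) * SECTOR]]

key = b"toy key"
old = [bytes([n]) * SECTOR for n in range(4)]
new = [b"\xff" * SECTOR] + old[1:]   # rewrite only sector 0

whole = changed_sectors(whole_disk(key, old), whole_disk(key, new))
local = changed_sectors(per_sector(key, old), per_sector(key, new))
print(whole)  # [0, 1, 2, 3]
print(local)  # [0]
```

With a single chain, rewriting sector 0 forces every later sector to be re-encrypted; with per-sector chaining, only sector 0 changes on disk.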

PS: Thanks to Peter Samuelson for (passively) leading me to the light.

Update: Mrten disagrees with the last paragraph and offers the following analysis, which seems sound to me:

suppose C(i) is the encrypted version of message block M(i), D[] is the decryption function, E[] is the encryption function, IV is the initialisation vector, and + denotes XOR:

normally this holds true for CBC:

C(1) = E[ M(1) + IV ]
C(2) = E[ M(2) + C(1) ]
C(i) = E[ M(i) + C(i-1) ]
...

and, the reverse operation:

M(1) = D[ C(1) ] + IV
M(2) = D[ C(2) ] + C(1)
M(i) = D[ C(i) ] + C(i-1)
...

now suppose Q is the error pattern introduced by a read error in block C(1), so that C(1) + Q is read instead of C(1); then the following holds for the decryption:

M(1) = D [ C(1) + Q ] + IV      < error
M(2) = D [ C(2) ] + C(1) + Q    < error
M(3) = D [ C(3) ] + C(2)        < OK!
M(i) = D [ C(i) ] + C(i-1)

of course, for this to have any consequence, you have to know which specific blocks had the read-error :)
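Mrten's analysis can be checked with a short Python sketch. As before, this is a toy under stated assumptions: E and D form a small hash-based Feistel cipher standing in for AES (not secure), and the key, IV, messages and error pattern Q are arbitrary values picked for the demonstration.

```python
import hashlib

BLOCK = 16  # cipher block size in bytes (same as AES)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _f(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy Feistel cipher.
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]

def E(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel encryption; a stand-in for AES, NOT secure."""
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, xor(left, _f(key, i, right))
    return left + right

def D(key: bytes, block: bytes) -> bytes:
    """Inverse of E: undo the rounds in reverse order."""
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = xor(right, _f(key, i, left)), left
    return left + right

def cbc_encrypt(key, iv, blocks):
    out, prev = [], iv
    for m in blocks:
        prev = E(key, xor(m, prev))      # C(i) = E[ M(i) + C(i-1) ]
        out.append(prev)
    return out

def cbc_decrypt(key, iv, blocks):
    out, prev = [], iv
    for c in blocks:
        out.append(xor(D(key, c), prev))  # M(i) = D[ C(i) ] + C(i-1)
        prev = c
    return out

key = b"toy key"
iv = bytes(BLOCK)
M = [bytes([i]) * BLOCK for i in range(1, 5)]   # M(1)..M(4)
C = cbc_encrypt(key, iv, M)

Q = b"\x01" + bytes(BLOCK - 1)                  # single-bit read error
damaged = [xor(C[0], Q)] + C[1:]                # C(1) is read as C(1) + Q
P = cbc_decrypt(key, iv, damaged)

status = [p == m for p, m in zip(P, M)]
print(status)                  # [False, False, True, True]
print(P[1] == xor(M[1], Q))    # True
```

The damage stops after two blocks: the first is garbled completely, the second differs from M(2) by exactly Q, and everything from the third block on decrypts correctly.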