-
-
Notifications
You must be signed in to change notification settings - Fork 4.2k
journal file hash table hardening + zstd support #16096
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
9c35e15 to
2051792
Compare
|
Force pushed new version with @dnicolodi's point fixed. |
|
I only skimmed the changes and nothing looked obviously terrible. I don't like suggesting rotations if they can just be done automagically when appropriate, but didn't spend enough time groking the changes to appreciate why it might make the most sense. |
2051792 to
9390fca
Compare
|
Force pushed new version, only changes are the suggested ones. PTAL. |
bde34f3 to
68ca418
Compare
| error(p, "Objected with double compression"); | ||
| if (!!(o->object.flags & OBJECT_COMPRESSED_XZ) + | ||
| !!(o->object.flags & OBJECT_COMPRESSED_LZ4) + | ||
| !!(o->object.flags & OBJECT_COMPRESSED_ZSTD) > 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could use o->object.flags & OBJECT_COMPRESSION_MASK && !isPow2(o->object.flags & OBJECT_COMPRESSION_MASK).
I believe you have such a function or macro, if not:
bool isPow2(unsigned uVal) { return (uVal & ~(uVal - 1)) == uVal; }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is that really more readable though? not convinced...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No contest if you add a function/macro for it.
static bool has_one_bit_set(unsigned long uVal) {
return uVal && (uVal & ~(uVal - 1)) == uVal;
}
if(!has_one_bit_set(o->object.flags & OBJECT_COMPRESSION_MASK))
error...There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, we could use __builtin_popcount() too for this if we wanted, but there too I am not sure it's that more readable
let's simplify the checks for ZSTD/LZ4/XZ As suggested: systemd#16096 (comment)
68ca418 to
864c8b2
Compare
|
Force pushed a new version, with (most) points addressed, see above. ptal. |
let's simplify the checks for ZSTD/LZ4/XZ As suggested: systemd#16096 (comment)
864c8b2 to
823c85f
Compare
let's simplify the checks for ZSTD/LZ4/XZ As suggested: systemd#16096 (comment)
823c85f to
623c668
Compare
let's simplify the checks for ZSTD/LZ4/XZ As suggested: systemd#16096 (comment)
623c668 to
8b8e12f
Compare
|
Force pushed a new version with the suggested changes |
Good point. I have another suggestion - just drop the check for hash chain length completely. Even with access to the file ID used to key the hash, an attacker will still have to redo the work of generating new colliding messages for every file rather than relying on a precomputed table. This consumes more resources than are used for inserting the colliding messages in the hash table. |
The field hash table is much smaller than the data hash table, which is desirable in the normal use case. Is there an attack vector here where a user logs many different fields to fill up the field hash table and trigger early rotation? (No need for any hash collision, just maxing out the hash table fill fraction.) |
So I had been thinking about this, and I think we can address this later too, but I think this should be independent of this PR. I mean, let me underline this: remote DoS is really important to fix, local DoS sucks and is something to fix, but there are so many others (think dbus) that I don't think it's as pressing Specifically, what I'd propose in the long run is that we split the IO part of journald into per-UID processes. i.e. journald collects all incoming messages as before, augments them with metadata, does ratelimiting, yadda, yadda but then instead of calling journal_file_append() directly to write the stuff we got into a file we do so out-of-process from a per-client-UID process that we either fork off or that systemd forks off for us, gets the ready-made data and just writes it to disk. This process would drop privs entirely (possibly even setresuid() to the client's UID), and there would be one instance for each client UID that recently wrote stuff. This would mean metadata collection (the stuff we need privs for) and the IO stuff would be split. it would also mean that clients that manage to trigger hash table collisions will slow their own logging down, but nothing else. i.e. they can shoot themselves in the foot without affecting anything else. Such a change would wouldn't even affect our global ordering guarantees, since the separations between journal files and between process would be the same here, and all ordering guarantees and absences would be maintained 1:1. And it wouldn't be a complex patch I figure, writing the iovec[] that makes up the entry to a socket instead of journal_file_append() is easy...
Well, but that just means all access goes from O(1) to O(n), so you basically make the effect of the DoS the default, who is helped by that? I think as long as people can only fuck their own user up the ability to trigger hash collisions is not a problem. But all that is independent of this patch. The remote DoS should definitely be dealt with independently, and any protection on top of that against slowing down/DoSing other users can then come on top of that. Note that right now we actually have a pretty effective DoS protection in place already: per-user ratelimiting. Hence what I propose above isn't even necessary to even make local behaviour safe. It's still a wise thing to do so, because ratelimiting is an optional feature, and even with very high limits we should try to not fall over. |
Another point here is that "local" log messages do not always come from local users. One example is the SSH server which logs failed login attempts with a message that can be partially controlled by an attacker (with sufficient control to give hash table collisions). However, the remote attacker has no way of reading the file ID.. |
The object flags field is a bitmask, hence don't sloppily define _OBJECT_COMPRESSED_MAX as one mor than the previous flag. That worked OK as long as we only had two flags, but will fall apart as soon as we have three. Let's fix this. (It's kinda sloppy how the string table is built here, as it will be quite sparse as soon as we have more enum entries, but let's keep it for now.)
We keep converting forth and back though we never need it in LE. Let's stop doing those conversions hence.
Let's clean this up a bit, following our usual nomenclature to name return parameters ret-xyz. This is mostly a bit of renaming, but there's also some minor other changes: if we return a pointer to a mmap'ed object plus its offset, in almost all cases we are happy if either parameter is NULL in case the caller is not interested in it. Let's fix the remaining case to do this too, to minimize surprises.
Let's prefix this with "jenkins_" since it wraps the jenkins hash. We want to add support for other hash functions to journald soon, hence better be clear with what this is. In particular as all other symbols defined by lookup3.h actually are prefixed "jenkins_".
This adds a new (incompatible) feature to journal files: if enabled the hash function used for the hash tables is no longer jenkins hash with a zero key, but siphash keyed by the file uuid that is included in the file header anyway. This should make our hash tables more robust against collision attacks, as long as the attacker has no read access to the journal files. We switch from jenkins to siphash simply because it's more well-known and we standardize for the rest of our codebase onto it. This is hardening in order to make collision attacks harder for clients that can forge log messages but have no read access to the logs. It has no effect on clients that have read access.
Even with the new keyed hash table journal feature: if an attacker manages to get access to the journal file id it could synthesize records that result in hash collisions. Let's rotate automatically when we notice that, so that a new journal file ID is generated, our performance is restored and the attacker has to guess a new file ID before being able to trigger the issue again. That said, untrusted peers should never get access to journal files in the first case...
Just an import, with no textual changes (some fixed URLs however)
let's simplify the checks for ZSTD/LZ4/XZ As suggested: systemd#16096 (comment)
8b8e12f to
5c0102f
Compare
|
Force pushed a new version with the points @keszybz pointed out addressed, no other fixes... (And no fix that adds per-UID journald writer processes, as dicussed, will work on that eventually though. Added to the TODO list in this PR, for now) |
Agreed.
journald stays O(1). Clients which follow the journal might become worse (but don't have to, if they do some hashing on their own side).
Yep. |
|
I made a quick re-review of the changed parts and everything lgtm. |
let's simplify the checks for ZSTD/LZ4/XZ As suggested: systemd/systemd#16096 (comment)
No description provided.