
Conversation

@poettering
Member

No description provided.

@poettering
Member Author

Force pushed a new version with @dnicolodi's point fixed.

@vcaputo
Member

vcaputo commented Jun 8, 2020

I only skimmed the changes and nothing looked obviously terrible. I don't like suggesting rotations if they can just be done automagically when appropriate, but I didn't spend enough time grokking the changes to appreciate why this might make the most sense.

@poettering
Member Author

Force pushed a new version; the only changes are the suggested ones. PTAL.

if (!!(o->object.flags & OBJECT_COMPRESSED_XZ) +
    !!(o->object.flags & OBJECT_COMPRESSED_LZ4) +
    !!(o->object.flags & OBJECT_COMPRESSED_ZSTD) > 1) {
        error(p, "Objected with double compression");
Contributor


Could use o->object.flags & OBJECT_COMPRESSION_MASK && !isPow2(o->object.flags & OBJECT_COMPRESSION_MASK).
I believe you have such a function or macro already; if not:

bool isPow2(unsigned uVal) { return (uVal & ~(uVal - 1)) == uVal; } /* uVal & ~(uVal - 1) isolates the lowest set bit; note this also returns true for 0, hence the mask check in front */

Member Author


Is that really more readable though? Not convinced...

Contributor


No contest if you add a function/macro for it.

static bool has_one_bit_set(unsigned long uVal) {
	/* uVal & ~(uVal - 1) isolates the lowest set bit; equal to uVal means exactly one bit is set */
	return uVal && (uVal & ~(uVal - 1)) == uVal;
}

/* guard with the mask so that uncompressed objects (no bits set) don't trigger the error */
if ((o->object.flags & OBJECT_COMPRESSION_MASK) &&
    !has_one_bit_set(o->object.flags & OBJECT_COMPRESSION_MASK))
	error...

Member Author


I mean, we could use __builtin_popcount() for this too if we wanted, but there as well I'm not sure it's that much more readable.
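For illustration, a minimal sketch of the popcount variant (GCC/clang builtin; this assumes OBJECT_COMPRESSION_MASK, as suggested above, covers exactly the three compression bits):

/* Sketch: count the compression bits directly; more than one bit set means
 * the object claims multiple compression formats at once. */
if (__builtin_popcount(o->object.flags & OBJECT_COMPRESSION_MASK) > 1)
        error(p, "Objected with double compression");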

poettering added a commit to poettering/systemd that referenced this pull request Jun 24, 2020
let's simplify the checks for ZSTD/LZ4/XZ

As suggested:

systemd#16096 (comment)
@poettering
Member Author

Force pushed a new version, with (most) points addressed, see above. PTAL.

@poettering
Member Author

Force pushed a new version with the suggested changes.

@vuzdemav
Contributor

vuzdemav commented Jun 25, 2020

I'm trying to wrap my head around how effective this mitigation is.
It works fine in the case of systemd-journal-remote, so for that alone it seems worth merging.
OTOH, the local mitigation does not seem to be effective.
A local user who has access to journal files can read the hash key off the file and produce output that generates hash table collisions.
In the previous version of the PR, the max chain length was proportional to the table size, so the attacker would be able to slow journald down significantly while it does O(n²) iteration on those chains.
In the latest version of the PR, the max chain length is fixed to a constant, so the attacker can force journald to rotate files. This also creates load, and in addition allows the attacker to flush out log entries.
Note that rate limiting only helps a little bit here, because we allow many data fields in a single message. So in principle it should be fairly easy to generate a large number of collisions with just one or a few journal entries.

It seems that the only real solution would be to not write the hash key at all until the file is closed. Any readers would have to read the whole file and figure out where objects are stored on their own. This also means that the key should not be based on the file ID, but be separate. Before the file is archived, the actual hash table could be written. To preserve operation across journald restarts, the "private" hash key and table could be stored in a sidecar file with no read permissions.

Good point.

I have another suggestion: just drop the check for hash chain length completely. Even with access to the file ID used to key the hash, an attacker would still have to redo the work of generating new colliding messages for every file, rather than relying on a precomputed table. This consumes more resources than journald uses for inserting the colliding messages into the hash table.

@vuzdemav
Contributor

> Note that rate limiting only helps a little bit here, because we allow many data fields in a single message.

The field hash table is much smaller than the data hash table, which is desirable in the normal use case. Is there an attack vector here where a user logs many different fields to fill up the field hash table and trigger early rotation? (No need for any hash collision, just maxing out the hash table fill fraction.)

@poettering
Member Author

> I'm trying to wrap my head around how effective this mitigation is.
> It works fine in the case of systemd-journal-remote, so for that alone it seems worth merging.
> OTOH, the local mitigation does not seem to be effective.
> A local user who has access to journal files can read the hash key off the file and produce output that generates hash table collisions.
> In the previous version of the PR, the max chain length was proportional to the table size, so the attacker would be able to slow journald down significantly while it does O(n²) iteration on those chains.
> In the latest version of the PR, the max chain length is fixed to a constant, so the attacker can force journald to rotate files. This also creates load, and in addition allows the attacker to flush out log entries.
> Note that rate limiting only helps a little bit here, because we allow many data fields in a single message. So in principle it should be fairly easy to generate a large number of collisions with just one or a few journal entries.

So I had been thinking about this, and I think we can address it later; it should be independent of this PR. Let me underline this: a remote DoS is really important to fix; a local DoS sucks and is something to fix too, but there are so many others (think dbus) that I don't think it's as pressing.

Specifically, what I'd propose in the long run is that we split the IO part of journald into per-UID processes. I.e. journald collects all incoming messages as before, augments them with metadata, does ratelimiting, yadda yadda, but then, instead of calling journal_file_append() directly to write the stuff we got into a file, we do so out-of-process, from a per-client-UID process that we either fork off ourselves or that systemd forks off for us, which gets the ready-made data and just writes it to disk. This process would drop privs entirely (possibly even setresuid() to the client's UID), and there would be one instance for each client UID that recently wrote stuff. This would mean metadata collection (the stuff we need privs for) and the IO stuff would be split. It would also mean that clients that manage to trigger hash table collisions only slow down their own logging, but nothing else, i.e. they can shoot themselves in the foot without affecting anything else.

Such a change wouldn't even affect our global ordering guarantees, since the separations between journal files and between processes would be the same here, and all ordering guarantees (and the absence thereof) would be maintained 1:1.

And it wouldn't be a complex patch, I figure; writing the iovec[] that makes up the entry to a socket instead of calling journal_file_append() is easy...
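For illustration, a rough sketch of that hand-off under the assumptions above (the writer socket fd and the helper name are hypothetical, not existing journald code):

#include <errno.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical sketch: hand a ready-made journal entry (its iovec[] array)
 * to a per-UID writer process over a connected socket, instead of calling
 * journal_file_append() in-process. */
static int forward_entry_to_writer(int writer_fd, struct iovec *iov, size_t n_iov) {
        struct msghdr mh = {
                .msg_iov = iov,
                .msg_iovlen = n_iov,
        };

        if (sendmsg(writer_fd, &mh, MSG_NOSIGNAL) < 0)
                return -errno;

        return 0;
}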

> It seems that the only real solution would be to not write the hash key at all until the file is closed. Any readers would have to read the whole file and figure out where objects are stored on their own. This also means that the key should not be based on the file ID, but be separate. Before the file is archived, the actual hash table could be written. To preserve operation across journald restarts, the "private" hash key and table could be stored in a sidecar file with no read permissions.

Well, but that just means all access goes from O(1) to O(n), so you'd basically make the effect of the DoS the default. Who is helped by that?

I think as long as people can only fuck up their own user, the ability to trigger hash collisions is not a problem.

But all that is independent of this patch. The remote DoS should definitely be dealt with independently, and any protection against slowing down/DoSing other users can then come on top of that.

Note that right now we actually have a pretty effective DoS protection in place already: per-user ratelimiting. Hence what I propose above isn't even necessary to make local behaviour safe. It's still a wise thing to do, because ratelimiting is an optional feature, and even with very high limits we should try to not fall over.

@vuzdemav
Contributor

> OTOH, the local mitigation does not seem to be effective.

Another point here is that "local" log messages do not always come from local users. One example is the SSH server, which logs failed login attempts with a message that can be partially controlled by an attacker (with sufficient control to generate hash table collisions). However, the remote attacker has no way of reading the file ID.

The object flags field is a bitmask, hence don't sloppily define
_OBJECT_COMPRESSED_MAX as one more than the previous flag. That worked OK
as long as we only had two flags, but will fall apart as soon as we have
three. Let's fix this.

(It's kinda sloppy how the string table is built here, as it will be
quite sparse as soon as we have more enum entries, but let's keep it for
now.)
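For illustration, a sketch of the bitmask layout this implies (the concrete bit values and the mask constant are assumptions here, not a quote of the actual header):

/* Sketch: each compression flag gets its own bit; the mask is the OR of all
 * flags, not "previous flag + 1". */
enum {
        OBJECT_COMPRESSED_XZ    = 1 << 0,
        OBJECT_COMPRESSED_LZ4   = 1 << 1,
        OBJECT_COMPRESSED_ZSTD  = 1 << 2,
        OBJECT_COMPRESSION_MASK = OBJECT_COMPRESSED_XZ | OBJECT_COMPRESSED_LZ4 | OBJECT_COMPRESSED_ZSTD,
};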
We keep converting back and forth even though we never need the value in LE.
Hence, let's stop doing those conversions.
Let's clean this up a bit, following our usual nomenclature of naming
return parameters ret_xyz.

This is mostly a bit of renaming, but there are also some minor other
changes: if we return a pointer to an mmap'ed object plus its offset, in
almost all cases we are happy if either parameter is NULL in case the
caller is not interested in it. Let's fix the remaining case to do this
too, to minimize surprises.
Let's prefix this with "jenkins_", since it wraps the Jenkins hash. We
want to add support for other hash functions to journald soon, hence we
better be clear about what this is. In particular as all other symbols
defined by lookup3.h are actually prefixed "jenkins_".
This adds a new (incompatible) feature to journal files: if enabled, the
hash function used for the hash tables is no longer the Jenkins hash with a
zero key, but siphash keyed by the file uuid that is included in the
file header anyway. This should make our hash tables more robust against
collision attacks, as long as the attacker has no read access to the
journal files. We switch from Jenkins to siphash simply because it's
better known and we are standardizing on it for the rest of our codebase.

This is hardening to make collision attacks harder for clients
that can forge log messages but have no read access to the logs. It has
no effect on clients that do have read access.
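A rough sketch of the keyed-hash idea using systemd's siphash24() helper (the wrapper name and the way the file ID is passed in are illustrative, not the actual journal code):

#include <stddef.h>
#include <stdint.h>
#include "siphash24.h"

/* Illustrative: hash data keyed by the 128-bit file ID stored in the file
 * header, instead of with a fixed zero key. An attacker who cannot read
 * the file cannot predict which inputs collide. */
static uint64_t keyed_data_hash(const uint8_t file_id[16], const void *data, size_t sz) {
        return siphash24(data, sz, file_id);
}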
Even with the new keyed hash table journal feature: if an attacker
manages to get access to the journal file ID, they can synthesize records
that result in hash collisions. Let's rotate automatically when we
notice that, so that a new journal file ID is generated, our performance
is restored, and the attacker has to guess a new file ID before being
able to trigger the issue again.

That said, untrusted peers should never get access to journal files in
the first place...
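A minimal sketch of what such a trigger could look like (the depth constant, chain counter, and rotation flag are hypothetical names, not the actual implementation):

/* Hypothetical sketch: an over-long hash chain suggests a collision attack;
 * request rotation so that a fresh file ID (and hence a fresh hash key) is
 * picked. */
#define CHAIN_DEPTH_MAX 100U

if (chain_depth > CHAIN_DEPTH_MAX) {
        log_debug("Too long hash chain of depth %zu, suggesting rotation.", chain_depth);
        f->rotate_suggested = true;
}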
Just an import, with no textual changes (some URLs fixed, however).
let's simplify the checks for ZSTD/LZ4/XZ

As suggested:

systemd#16096 (comment)
@poettering
Member Author

Force pushed a new version with @keszybz's points addressed, no other fixes...

(And no fix that adds per-UID journald writer processes, as discussed; will work on that eventually though. Added to the TODO list in this PR for now.)

@keszybz
Member

keszybz commented Jun 25, 2020

> So I had been thinking about this, and I think we can address it later; it should be independent of this PR.
> Let me underline this: a remote DoS is really important to fix; a local DoS sucks and is something to fix too

Agreed.

> Well, but that just means all access goes from O(1) to O(n), so you'd basically make the effect of the DoS the default. Who is helped by that?

journald stays O(1). Clients which follow the journal might get slower (but don't have to, if they do some hashing on their own side).

> Note that right now we actually have a pretty effective DoS protection in place already: per-user ratelimiting. Hence what I propose above isn't even necessary to make local behaviour safe. It's still a wise thing to do, because ratelimiting is an optional feature, and even with very high limits we should try to not fall over.

Yep.

@keszybz
Member

keszybz commented Jun 25, 2020

I did a quick re-review of the changed parts and everything LGTM.

@keszybz added and then removed the good-to-merge/waiting-for-ci label (👍 PR is good to merge, but CI hasn't passed at time of review. Please merge if you see CI has passed) Jun 25, 2020
@keszybz merged commit 12d14b7 into systemd:master Jun 26, 2020
Yamakuzure pushed a commit to elogind/elogind that referenced this pull request Aug 31, 2020
let's simplify the checks for ZSTD/LZ4/XZ

As suggested:

systemd/systemd#16096 (comment)