Skip to content

Conversation

@eklitzke
Copy link
Contributor

@eklitzke eklitzke commented Feb 20, 2018

I've been working in another branch on speeding up IBD/reindexing, and have been looking at how to improve page cache hits on Linux. This is a minor (but safe) improvement I've discovered during that work.

The change here uses posix_fadvise() on systems that support it (Linux, BSD, macOS) such that:

  • When opening a block file for processing, it advises the kernel that the block will be read immediately, and that the read will be sequential.
  • When closing a block file, it advises the kernel that it can discard entries in the page cache associated with that block file, which potentially leaves a bit more page cache room for things like chainstate files when chainstate reindexing happens.

There's also a minor/pedantic fix to related to an existing posix_fallocate() call to make sure it doesn't redefine _POSIX_C_SOURCE (which could potentially affect the behavior of header files included after the redefinition).

I don't expect this to be a huge performance win, but it's safe and well contained. The new util functions are potentially useful elsewhere. The speedup here is potentially bigger by using these methods selectively on CAutoFile, as that's how block files are loaded during the chainstate reindexing phase. I'm working on that as well, but that's more delicate since CAutoFile is used all over the place (compared to CBufferedFile which is just used when reindexing block files).

This primarily affects blocks when bitcoin is launched with -reindex, as
that causes the block files to be loaded as CBufferedFile objects one at
a time as the reindex progresses.
Copy link
Contributor

@jamesob jamesob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK 5259c72

Cool! Seems like a good idea.

int fd = fileno(file);
if (fd == -1)
goto close;
end = lseek(fd, 0, SEEK_END);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

off_t end = ... and remove line 840?

@jamesob
Copy link
Contributor

jamesob commented Feb 20, 2018

Note for future reviewers: the only usage of CBufferedFile I can find is in LoadExternalBLockFile.

@theuni
Copy link
Member

theuni commented Feb 21, 2018

Concept ACK on advising, but we'll definitely want some data first.

posix_fadvise(fd, start, end - start, POSIX_FADV_WILLNEED);
posix_fadvise(fd, start, end - start, POSIX_FADV_SEQUENTIAL);
}
lseek(fd, start, SEEK_SET);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If SEEK_END didn't fail, we need to be sure this doesn't fail either.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree this should be fixed. Presumably the seek shouldn't fail, but if it did, the result could be confusing.

off_t end = lseek(fd, 0, SEEK_END);
if (end != -1) {
posix_fadvise(fd, start, end - start, POSIX_FADV_WILLNEED);
posix_fadvise(fd, start, end - start, POSIX_FADV_SEQUENTIAL);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think POSIX_FADV_NOREUSE may be useful here also (although my manpage says it is currently a no-op).

posix_fadvise(fd, 0, end, POSIX_FADV_DONTNEED);
}
#endif
close:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused labels might be a warning in some compiler versions; move inside #if block?

posix_fadvise(fd, start, end - start, POSIX_FADV_WILLNEED);
posix_fadvise(fd, start, end - start, POSIX_FADV_SEQUENTIAL);
}
lseek(fd, start, SEEK_SET);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree this should be fixed. Presumably the seek shouldn't fail, but if it did, the result could be confusing.

ftruncate(fileno(file), fst.fst_length);
#elif defined(__linux__)
// Version using posix_fallocate
#elif _POSIX_C_SOURCE >= 200112L
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @laanwj's comment about _POSIX_C_SOURCE at #12495 (comment) applies here as well, so this check is not right.

bool LockDirectory(const fs::path& directory, const std::string lockfile_name, bool probe_only=false);

//! Return the original FILE* unchanged. On POSIX systems that support it,
//! also advise the kernel that the file will be accessed sequentially.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to mention also that this advises WILLNEED in addition to just SEQUENTIAL.

@eklitzke eklitzke closed this Mar 10, 2018
@laanwj
Copy link
Member

laanwj commented Mar 10, 2018

Why close?

@ryanofsky
Copy link
Contributor

Why close?

I'm also curious.

@jamesob
Copy link
Contributor

jamesob commented Mar 27, 2018

Also curious why this was closed.

@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants