Skip to content

Memory improvements, claimtrie revamp#276

Merged
BrannonKing merged 1 commit intomasterfrom
memory_improvements
Jul 1, 2019
Merged

Memory improvements, claimtrie revamp#276
BrannonKing merged 1 commit intomasterfrom
memory_improvements

Conversation

@BrannonKing
Copy link
Copy Markdown
Member

@BrannonKing BrannonKing commented May 11, 2019

This PR addresses the following issues:

Fixes #106 (Reimplement claimtrie without so many empty nodes, save RAM)
Fixes #108 (Separate data structure and data operations)
Fixes #133 (Trie serialization failure)
Fixes #138 (Rename nCurrentHeight)
Fixes #145 (Rename CClaimTrieCache)
Fixes #158 (Properly delete CClaimTrieNode)

It also includes these enhancements:

  1. BLOCK_HASH is no longer needlessly stored in the claimtrie DB. I had a theory that this was masking places that we should have been comparing the claimTrieHash, which theory proved correct.
  2. An issue with takeover height computation was detected and (temporarily) worked around.
  3. cachedTakeoverHeights was removed; we now use the takeoverHeight field on the nodes in the cache.
  4. The getQueue* functions now re-use code.
  5. effectiveAmount is updated when reading from disk (we should now ensure that it's not being needlessly computed elsewhere).

Although the commits have been squashed, much of this work was done by @bvbfan .

We experimented with keeping empty nodes truly empty -- allocating claim space for them as needed. This saved less than 10MB of RAM and complicated the code a bit; we ditched it. We also experimented with a similar approach where empty node hashes were kept in a separate map. This also was too complicated to get right.

@BrannonKing BrannonKing requested review from bvbfan and lbrynaut May 11, 2019 02:52
@lbry-bot lbry-bot assigned bvbfan and lbrynaut and unassigned BrannonKing May 11, 2019
@BrannonKing BrannonKing self-assigned this May 11, 2019
Comment thread src/logging.h Outdated

static const bool DEFAULT_LOGTIMEMICROS = false;
static const bool DEFAULT_LOGIPS = false;
static const bool DEFAULT_LOGIPS = true;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lbrynaut Is this meant to be here?

Copy link
Copy Markdown
Member Author

@BrannonKing BrannonKing May 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad you brought this up. I'm the one that changed the default and checked it in. I found it frustrating during our last hard fork that none of the log files had IP addresses in them. I suppose this is there because some people may be uncomfortable exposing their internal IP addresses. I think that concern is generally a thing of the past, though, and they aren't forced to send me their logs. I'm certainly open to arguments the other way. I think we should enable IP logging on all instances that we manage.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. If we find it useful we should enable it, but maybe disable it by default for privacy.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd leave it disabled as well.

Comment thread src/prefixtrie.cpp
using iterator = Trie::iterator;
using const_iterator = Trie::const_iterator;

template class CPrefixTrie<Key, Data>;
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and the lines below it are a c++ nuance. You have to instantiate a specific type of a template or you need the full definition in the header file. As a native C# programmer, I prefer the latter. @bvbfan prefers the former (as it is here).

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some project is used *.impl file ext (not *.cpp) and gets include where it's needed, i'm ok with that too. But all in header makes things to look redundant and hard to read.

Comment thread src/prefixtrie.cpp
}

template <typename TKey, typename TData>
template <typename TDataUni>
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TDataUni allows us to have both lvalue and rvalue data inputs on this method. The alternative is to have two overloads of this method.

Comment thread src/prefixtrie.cpp
auto name = it.key();
name.insert(name.end(), key.begin(), key.end());
auto& node = insert(key, shared);
copy = iterator{name, node};
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a true forward iterator -- that would require the full stack. This is a children-only iterator. It's true of all the places we create the iterator here. We are specifically choosing to not initialize them fully, and we're using the weak_ptr workaround for the place where we need the parent function but don't have it.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's by design, iterator is forward through children, i mean this is not an exception. Thus we don't have ability to obtain parent, that's nodes stand for. We can go for full iterator but i'm afraid we will fill iterator stack for nothing i.e. we will not use it.

@BrannonKing BrannonKing changed the title Memory improvements, claimtrie rewrite Memory improvements, claimtrie revamp May 11, 2019
int validHeight = 0;
CClaimValue claimVal(claimOutPoint, claimId, amount, height, validHeight);
ctc.insertClaimIntoTrie("test", claimVal);
ctc.insertClaimIntoTrie("test", claimVal, true);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding true everywhere doesn't help me understand what is being passed in here. Shouldn't we just assign this to a variable like the others?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

claimtrie.h::insertClaimIntoTrie utilizes a default parameter value even though it's a virtual method. This is non-standard c++, and whether or not inheritors use that default is undefined. I had intended to remove it, but I apparently only made it part way down that road.

Comment thread src/test/claimtriecache_tests.cpp Outdated
BOOST_CHECK_EQUAL(count, 2);

count = 0;
// we cannot use range for on nodes whitout data
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

without data

Copy link
Copy Markdown
Collaborator

@bvbfan bvbfan May 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is old, it should be removed at all, we have data on all nodes so it's safe to be used range-based for loop.

Comment thread src/validation.h Outdated
static const unsigned int NODE_NETWORK_LIMITED_MIN_BLOCKS = 288;

static const signed int DEFAULT_CHECKBLOCKS = 6;
static const signed int DEFAULT_CHECKBLOCKS = 24;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be higher for lbrycrd? blocks are just files on a disk at risk of malware manipulation. Bitcoin is 6 blocks but also a lot harder based on computational power requirements at 6. Maybe we should have something much more than 24 as a default? Just a thought since our hash power is minimal in comparison.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is probably a very low priority attack vector, but 6 blocks with lbrycrd is not a lot to overwrite with malware. Most exchanges have min confirmations at 200 for LBC. Validating the top 200 blocks is probably a good default for right now. Again the likelihood that enough people catch the LBC malware that does this is probably near zero...just a thought though after researching the default value.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly unrelated to this PR, but that's a good point. Bitcoin core did backtrack this from 288 to 6, but it's likely not appropriate for us. Reverting to 288 is probably more sane.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

24 is an hour -- same as bitcoin. Why would we need more than that?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think time is the intention here. The intention is to confirm enough blocks such that it is statistically painful to overwrite like with a big reorg for a double spend attack. 6 blocks of bitcoin is a standard for confirmation not meant to be a time equivalence. Admittedly, it is nice that it evenly breaks down to an hour, however arbitrary, but should not be the correlation between cryptocurrencies. However, I could be misinterpreting it.

Comment thread src/txdb.h
static const int64_t nMinDbCache = 4;
//! Max memory allocated to block tree DB specific cache, if no -txindex (MiB)
static const int64_t nMaxBlockDBCache = 2;
static const int64_t nMaxBlockDBCache = 8;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going to 8GB max is an important note for the release that contains this. Should be in the change log. Also I am interested in the mindbcache purpose.

@bvbfan bvbfan force-pushed the memory_improvements branch from 2b25d73 to 9b32062 Compare May 15, 2019 12:17
@bvbfan bvbfan force-pushed the memory_improvements branch from 9b32062 to 3aa6911 Compare May 15, 2019 13:16
Comment thread src/init.cpp
}

CClaimTrieCache trieCache(pclaimTrie);
if (!trieCache.ReadFromDisk(chainActive.Tip()))
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still feel like this method fits better on the base class.

Comment thread src/init.cpp

const Consensus::Params& consensusParams = Params().GetConsensus();
pclaimTrie->setExpirationTime(consensusParams.GetExpirationTime(chain_active_height));
CClaimTrieCache(pclaimTrie).setExpirationTime(consensusParams.GetExpirationTime(chain_active_height));
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method also fits better on the base class.

Comment thread src/validation.h Outdated
static const unsigned int NODE_NETWORK_LIMITED_MIN_BLOCKS = 288;

static const signed int DEFAULT_CHECKBLOCKS = 6;
static const signed int DEFAULT_CHECKBLOCKS = 24;
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

24 is an hour -- same as bitcoin. Why would we need more than that?

@BrannonKing BrannonKing force-pushed the memory_improvements branch 4 times, most recently from c6288ae to 755e213 Compare May 24, 2019 16:37
Comment thread src/rpc/claimtrie.cpp Outdated
static UniValue getclaimsintrie(const JSONRPCRequest& request)
{
if (request.fHelp || request.params.size() > 1)
if (request.fHelp || !validParams(request.params, 0, 1))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not that I am well informed on whats going on here, but @BrannonKing wouldn't it be better to handle this sort of thing in the processing of the request rather than modifying so many lines of code? Seems like a DRY violation.

@BrannonKing BrannonKing force-pushed the memory_improvements branch 3 times, most recently from 1746953 to 9df2868 Compare June 19, 2019 20:16
@BrannonKing BrannonKing force-pushed the memory_improvements branch from 8fefa75 to fd39f05 Compare July 1, 2019 18:42
put getclaimsintrie back as deprecated


added test for adding a lot of data to the claimtrie


updated unit test to dodge expiration fork
@BrannonKing BrannonKing force-pushed the memory_improvements branch from fd39f05 to 8b48206 Compare July 1, 2019 18:44
@BrannonKing BrannonKing merged commit e6bb8ba into master Jul 1, 2019
@BrannonKing BrannonKing deleted the memory_improvements branch July 1, 2019 18:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

4 participants