Memory improvements, claimtrie revamp by BrannonKing · Pull Request #276 · lbryio/lbrycrd

BrannonKing · 2019-05-11T02:51:30Z

This PR addresses the following issues:

Fixes #106 (Reimplement claimtrie without so many empty nodes, save RAM)
Fixes #108 (Separate data structure and data operations)
Fixes #133 (Trie serialization failure)
Fixes #138 (Rename nCurrentHeight)
Fixes #145 (Rename CClaimTrieCache)
Fixes #158 (Properly delete CClaimTrieNode)

It also includes these enhancements:

BLOCK_HASH is no longer needlessly stored in the claimtrie DB. I had a theory that this was masking places that we should have been comparing the claimTrieHash, which theory proved correct.
An issue with takeover height computation was detected and (temporarily) worked around.
cachedTakeoverHeights was removed; we now use the takeoverHeight field on the nodes in the cache.
The getQueue* functions now re-use code.
effectiveAmount is updated when reading from disk (we should now ensure that it's not being needlessly computed elsewhere).

Although the commits have been squashed, much of this work was done by @bvbfan .

We experimented with keeping empty nodes truly empty -- allocating claim space for them as needed. This saved less than 10MB of RAM and complicated the code a bit; we ditched it. We also experimented with a similar approach where empty node hashes were kept in a separate map. This also was too complicated to get right.

tiger5226 · 2019-05-11T03:27:07Z


 static const bool DEFAULT_LOGTIMEMICROS = false;
-static const bool DEFAULT_LOGIPS        = false;
+static const bool DEFAULT_LOGIPS        = true;


@lbrynaut Is this meant to be here?

I'm glad you brought this up. I'm the one that changed the default and checked it in. I found it frustrating during our last hard fork that none of the log files had IP addresses in them. I suppose this is there because some people may be uncomfortable exposing their internal IP addresses. I think that concern is generally a thing of the past, though, and they aren't forced to send me their logs. I'm certainly open to arguments the other way. I think we should enable IP logging on all instances that we manage.

Agreed. If we find it useful we should enable it, but maybe disable it by default for privacy.

I'd leave it disabled as well.

BrannonKing · 2019-05-11T04:22:09Z

+using iterator = Trie::iterator;
+using const_iterator = Trie::const_iterator;
+
+template class CPrefixTrie<Key, Data>;


This and the lines below it are a c++ nuance. You have to instantiate a specific type of a template or you need the full definition in the header file. As a native C# programmer, I prefer the latter. @bvbfan prefers the former (as it is here).

In some project is used *.impl file ext (not *.cpp) and gets include where it's needed, i'm ok with that too. But all in header makes things to look redundant and hard to read.

BrannonKing · 2019-05-11T04:24:13Z

+}
+
+template <typename TKey, typename TData>
+template <typename TDataUni>


TDataUni allows us to have both lvalue and rvalue data inputs on this method. The alternative is to have two overloads of this method.

BrannonKing · 2019-05-11T04:26:11Z

+        auto name = it.key();
+        name.insert(name.end(), key.begin(), key.end());
+        auto& node = insert(key, shared);
+        copy = iterator{name, node};


Not a true forward iterator -- that would require the full stack. This is a children-only iterator. It's true of all the places we create the iterator here. We are specifically choosing to not initialize them fully, and we're using the weak_ptr workaround for the place where we need the parent function but don't have it.

That's by design, iterator is forward through children, i mean this is not an exception. Thus we don't have ability to obtain parent, that's nodes stand for. We can go for full iterator but i'm afraid we will fill iterator stack for nothing i.e. we will not use it.

tiger5226 · 2019-05-11T03:30:15Z

    int validHeight = 0;
    CClaimValue claimVal(claimOutPoint, claimId, amount, height, validHeight);
-    ctc.insertClaimIntoTrie("test", claimVal);
+    ctc.insertClaimIntoTrie("test", claimVal, true);


adding true everywhere doesn't help me understand what is being passed in here. Shouldn't we just assign this to a variable like the others?

claimtrie.h::insertClaimIntoTrie utilizes a default parameter value even though it's a virtual method. This is non-standard c++, and whether or not inheritors use that default is undefined. I had intended to remove it, but I apparently only made it part way down that road.

tiger5226 · 2019-05-11T03:32:13Z

+    BOOST_CHECK_EQUAL(count, 2);
+
+    count = 0;
+    // we cannot use range for on nodes whitout data


without data

The comment is old, it should be removed at all, we have data on all nodes so it's safe to be used range-based for loop.

tiger5226 · 2019-05-11T03:54:27Z

 static const unsigned int NODE_NETWORK_LIMITED_MIN_BLOCKS = 288;

-static const signed int DEFAULT_CHECKBLOCKS = 6;
+static const signed int DEFAULT_CHECKBLOCKS = 24;


should this be higher for lbrycrd? blocks are just files on a disk at risk of malware manipulation. Bitcoin is 6 blocks but also a lot harder based on computational power requirements at 6. Maybe we should have something much more than 24 as a default? Just a thought since our hash power is minimal in comparison.

this is probably a very low priority attack vector, but 6 blocks with lbrycrd is not a lot to overwrite with malware. Most exchanges have min confirmations at 200 for LBC. Validating the top 200 blocks is probably a good default for right now. Again the likelihood that enough people catch the LBC malware that does this is probably near zero...just a thought though after researching the default value.

Possibly unrelated to this PR, but that's a good point. Bitcoin core did backtrack this from 288 to 6, but it's likely not appropriate for us. Reverting to 288 is probably more sane.

24 is an hour -- same as bitcoin. Why would we need more than that?

I don't think time is the intention here. The intention is to confirm enough blocks such that it is statistically painful to overwrite like with a big reorg for a double spend attack. 6 blocks of bitcoin is a standard for confirmation not meant to be a time equivalence. Admittedly, it is nice that it evenly breaks down to an hour, however arbitrary, but should not be the correlation between cryptocurrencies. However, I could be misinterpreting it.

tiger5226 · 2019-05-11T03:58:36Z

 static const int64_t nMinDbCache = 4;
 //! Max memory allocated to block tree DB specific cache, if no -txindex (MiB)
-static const int64_t nMaxBlockDBCache = 2;
+static const int64_t nMaxBlockDBCache = 8;


Going to 8GB max is an important note for the release that contains this. Should be in the change log. Also I am interested in the mindbcache purpose.

BrannonKing · 2019-05-15T16:21:33Z

                }

+                CClaimTrieCache trieCache(pclaimTrie);
+                if (!trieCache.ReadFromDisk(chainActive.Tip()))


I still feel like this method fits better on the base class.

BrannonKing · 2019-05-15T16:21:51Z


    const Consensus::Params& consensusParams = Params().GetConsensus();
-    pclaimTrie->setExpirationTime(consensusParams.GetExpirationTime(chain_active_height));
+    CClaimTrieCache(pclaimTrie).setExpirationTime(consensusParams.GetExpirationTime(chain_active_height));


This method also fits better on the base class.

BrannonKing · 2019-05-15T16:25:57Z

 static const unsigned int NODE_NETWORK_LIMITED_MIN_BLOCKS = 288;

-static const signed int DEFAULT_CHECKBLOCKS = 6;
+static const signed int DEFAULT_CHECKBLOCKS = 24;


24 is an hour -- same as bitcoin. Why would we need more than that?

tiger5226 · 2019-06-01T01:09:21Z

+static UniValue getclaimsintrie(const JSONRPCRequest& request)
 {
-    if (request.fHelp || request.params.size() > 1)
+    if (request.fHelp || !validParams(request.params, 0, 1))


not that I am well informed on whats going on here, but @BrannonKing wouldn't it be better to handle this sort of thing in the processing of the request rather than modifying so many lines of code? Seems like a DRY violation.

put getclaimsintrie back as deprecated added test for adding a lot of data to the claimtrie updated unit test to dodge expiration fork

lbry-bot assigned BrannonKing May 11, 2019

BrannonKing requested review from bvbfan and lbrynaut May 11, 2019 02:52

lbry-bot assigned bvbfan and lbrynaut and unassigned BrannonKing May 11, 2019

BrannonKing self-assigned this May 11, 2019

tiger5226 reviewed May 11, 2019

View reviewed changes

lbry-bot unassigned BrannonKing May 11, 2019

BrannonKing commented May 11, 2019

View reviewed changes

BrannonKing changed the title ~~Memory improvements, claimtrie rewrite~~ Memory improvements, claimtrie revamp May 11, 2019

tiger5226 reviewed May 11, 2019

View reviewed changes

lbry-bot unassigned bvbfan May 13, 2019

bvbfan force-pushed the memory_improvements branch from 2b25d73 to 9b32062 Compare May 15, 2019 12:17

lbry-bot unassigned lbrynaut May 15, 2019

bvbfan force-pushed the memory_improvements branch from 9b32062 to 3aa6911 Compare May 15, 2019 13:16

BrannonKing commented May 15, 2019

View reviewed changes

bvbfan mentioned this pull request May 19, 2019

Slow responses and out of memory issues #277

Closed

BrannonKing force-pushed the memory_improvements branch 4 times, most recently from c6288ae to 755e213 Compare May 24, 2019 16:37

tiger5226 reviewed Jun 1, 2019

View reviewed changes

bvbfan approved these changes Jun 3, 2019

View reviewed changes

BrannonKing force-pushed the memory_improvements branch 3 times, most recently from 1746953 to 9df2868 Compare June 19, 2019 20:16

BrannonKing force-pushed the memory_improvements branch from 8fefa75 to fd39f05 Compare July 1, 2019 18:42

flattening prefix trie work

8b48206

put getclaimsintrie back as deprecated added test for adding a lot of data to the claimtrie updated unit test to dodge expiration fork

BrannonKing force-pushed the memory_improvements branch from fd39f05 to 8b48206 Compare July 1, 2019 18:44

BrannonKing merged commit e6bb8ba into master Jul 1, 2019

BrannonKing deleted the memory_improvements branch July 1, 2019 18:45

Conversation

BrannonKing commented May 11, 2019 • edited by lyoshenka Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

BrannonKing May 11, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bvbfan May 13, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

BrannonKing commented May 11, 2019 •

edited by lyoshenka

Loading

BrannonKing May 11, 2019 •

edited

Loading

bvbfan May 13, 2019 •

edited

Loading