Skip to content

Add a claim index for fast reference by id#110

Merged
kaykurokawa merged 3 commits intolbryio:masterfrom
lbrynaut:claimindex
May 18, 2018
Merged

Add a claim index for fast reference by id#110
kaykurokawa merged 3 commits intolbryio:masterfrom
lbrynaut:claimindex

Conversation

@lbrynaut
Copy link
Copy Markdown
Contributor

No description provided.

}
current->children[name[name.size()-1]] = node;

// NOTE: add all claims if we want non-winning claims in the claim
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all claims should go into the index, even nonwinning ones. When I run getclaimbyid, it doesn't care about whether that claim is winning.

@lbrynaut
Copy link
Copy Markdown
Contributor Author

Updated

@tzarebczan tzarebczan requested a review from lyoshenka March 28, 2018 13:49
@lbry-bot lbry-bot assigned lyoshenka and unassigned lbrynaut Mar 28, 2018
@lyoshenka lyoshenka requested review from kaykurokawa and removed request for lyoshenka March 28, 2018 13:58
@lbry-bot lbry-bot assigned kaykurokawa and unassigned lyoshenka Mar 28, 2018
@tzarebczan
Copy link
Copy Markdown
Contributor

tzarebczan commented Apr 3, 2018

@kaykurokawa can you please review at your earliest convenience, thanks! (Reached out on DM via Slack too)

@kaykurokawa
Copy link
Copy Markdown

Hmm, kind of getting bizarre results , for example winning claim for "test" is a607349ce83d3a86bac87a967ce7f9647e1ba736 , try: ./lbrycrd-cli getvalueforname test

And when I do : ./lbrycrd-cli getclaimbyid a607349ce83d3a86bac87a967ce7f9647e1ba736

{
"name": "testy",
"value": "{"ver": "0.0.3", "description": "test", "license": "Creative Commons Attribution 4.0 International", "author": "test", "title": "test", "language": "en", "sources": {"lbry_sd_hash": "29ad218c61c599499b22c17228371b5fe9a6e725edc9ef691a7819b3e7406500467852920629fbcddbcacf13d31579d4"}, "content_type": "image/png", "nsfw": false}",
"claimId": "a607349ce83d3a86bac87a967ce7f9647e1ba736",
"txid": "a84cfc89c6169da6e0b63fe2b0adeed71464566bb1dab71f737a78d7711b2526",
"n": 1,
"amount": 1000000000,
"effective amount": 0,
"height": 94633
}

Observer that the name above is "testy" not "test"

Also I checked the debug log for "addToClaimIndex:" and the name that's being added seems to be off ? Seems like some names have an extra junk letter added to the end..

The code looks correct so maybe the serialization of CClaimIndexElement.name isn't handled properly? Not sure.

@lbrynaut
Copy link
Copy Markdown
Contributor Author

lbrynaut commented Apr 5, 2018

Interesting, will take a look.

@lbrynaut
Copy link
Copy Markdown
Contributor Author

lbrynaut commented Apr 5, 2018

@kaykurokawa Can you give another example? I don't see this issue and the above passes correctly here. For now, do a fresh sync for testing (i.e. rm -rf ~/.lbrycrd, assuming no meaningful wallet) to make sure the index is built properly.

@lbrynaut
Copy link
Copy Markdown
Contributor Author

lbrynaut commented Apr 5, 2018

@kaykurokawa Ok, scratch that. I can now see the issue, after a restart. I see why you suspect the serialization.

@kaykurokawa
Copy link
Copy Markdown

Ok cool, I tried the command again after reindex-ing and the above problem went away btw.

@lbrynaut
Copy link
Copy Markdown
Contributor Author

lbrynaut commented Apr 5, 2018

@kaykurokawa Fixed. Please re-test, across restarts as well.

@kaykurokawa
Copy link
Copy Markdown

I see the fix you made ( by getting the substring of name..) but I don't understand why that is necessary to fix it , do you have any insight into this?

@lbrynaut
Copy link
Copy Markdown
Contributor Author

If you look at that method, everything referring to the name truncates/ignores the last character, including the main iteration loop. Why that is, I'm not sure yet, but that's how it comes back from disk. Whatever it is (intentional or not), it appears to have been there a long time.

@kaykurokawa
Copy link
Copy Markdown

I will look over this again and merge

@shyba
Copy link
Copy Markdown
Member

shyba commented Apr 20, 2018

I think this solves #74

@lyoshenka
Copy link
Copy Markdown
Member

it does. @kaykurokawa please take a look and merge :-)

@shyba
Copy link
Copy Markdown
Member

shyba commented Apr 21, 2018

I think (not yet sure) I've found another bug:
This is how dd622f534fb79af737e4edb8737a732ce74c45ff appears on getclaimsforname test:

    {                                                                                                                                                                                                              
      "claimId": "dd622f534fb79af737e4edb8737a732ce74c45ff",                                                                                                                                                       
      "txid": "e7857e5b96ae3ed836b7616dc8f454a4dc07bed5f64fd0eb939ebaa4bafe51ad",                                                                                                                                  
      "n": 0,                                                                                                                                                                                                      
      "nHeight": 193440,                                                                                                                                                                                           
      "nValidAtHeight": 196523,                                                                                                                                                                                    
      "nAmount": 10000000,                                                                                                                                                                                         
      "value": "\b\u0001\u0010\u0001\u001ae\b\u0001\u0012\u001e\b\u0004\u0010\u0001\u001a\nTEST VIDEO\"\u0000*\u00002\u00008\u0000J\u0000R\u0000Z\u0000\u001aA\b\u0001\u0010\u0001\u001a0\u00c3\u00b51d\tG\u00c2\u00be7\u00c2\u00af\u00c2\u00a5\u00c3\u008e\u00c3\u00b5\u00c2\u008a`\u00c3\u0081\u00c3\u00ae\u00c3\u009c\u00c2\u00bf\u0013\u0017\u00c3\u00abQ\u00c2\u00b6\u00c2\u009a\u00c2\u0092\u0013E\u00c3\u0093)\u00c3\u009a\u00c3\u0092I\u0018q\u00c2\u00ab\u00c3\u009a\u00c3\u0081\b\u001b\u0003g\u00c3\u00b6\u00c3\u00a1\u00c3\u0086\u00c3\u009f\u0001Y\u001e\u0016\"\tvideo/mp4*\\\b\u0001\u0010\u0003\u001a@0\u00c3\u00adz\u00c3\u009c\u00c2\u00beh2\u00c3\u008c\u00c3\u0086\u00c2\u009d\u00c2\u00b4\u000e\u00c2\u0097`\u00c3\u0098dX=Ho\u00c3\u0086<U\u00c2\u00bdDZ\u00c2\u0085\b\u00c2\u0080\u00c2\u008e=\u00c2\u0092\u00c3\u00ba\u00c3\u00ab\u00c2\u008b%\u00c3\u00a6%\u0011\u00c2\u0095\u00c2\u009a\u00c3\u008a\u00c2\u0092\u00c3\u0099\u00c2\u00a7\u00c3\u00a9\u00c3\u009d\u00c2\u0089\u0015\u00c2\u00b51\u00c3\u008e2\u0003\u00c3\u00aeWm\u00c3\u0096\u0013O\u00c3\u008c\u00c2\u009c\b\u00c2\u008b\"\u0014F\u00c3\u00ba\u00c3\u009c\u00c2\u00a3H\u0000\u00c3\u00aa<L\u00c2\u009a1CC\u00c2\u00a6\u00c3\u00a1\u00c2\u00bf\u00c2\u009f\u0019\u00c2\u0096\u00c2\u00a3",                               
      "nEffectiveAmount": 10000000,                                                                                                                                                                                
      "supports": [                                                                                                                                                                                                
      ]                                                                                                                                                                                                            
    }   

Now being returned by getclaimbyid on this branch rebased on master:

{
  "name": "test",
  "value": "\b\u0001\u0010\u0001\u001ae\b\u0001\u0012\u001e\b\u0004\u0010\u0001\u001a\nTEST VIDEO\"\u0000*\u00002\u00008\u0000J\u0000R\u0000Z\u0000\u001aA\b\u0001\u0010\u0001\u001a0\u00c3\u00b51d\tG\u00c2\u00be7\u00c2\u00af\u00c2\u00a5\u00c3\u008e\u00c3\u00b5\u00c2\u008a`\u00c3\u0081\u00c3\u00ae\u00c3\u009c\u00c2\u00bf\u0013\u0017\u00c3\u00abQ\u00c2\u00b6\u00c2\u009a\u00c2\u0092\u0013E\u00c3\u0093)\u00c3\u009a\u00c3\u0092I\u0018q\u00c2\u00ab\u00c3\u009a\u00c3\u0081\b\u001b\u0003g\u00c3\u00b6\u00c3\u00a1\u00c3\u0086\u00c3\u009f\u0001Y\u001e\u0016\"\tvideo/mp4*\\\b\u0001\u0010\u0003\u001a@0\u00c3\u00adz\u00c3\u009c\u00c2\u00beh2\u00c3\u008c\u00c3\u0086\u00c2\u009d\u00c2\u00b4\u000e\u00c2\u0097`\u00c3\u0098dX=Ho\u00c3\u0086<U\u00c2\u00bdDZ\u00c2\u0085\b\u00c2\u0080\u00c2\u008e=\u00c2\u0092\u00c3\u00ba\u00c3\u00ab\u00c2\u008b%\u00c3\u00a6%\u0011\u00c2\u0095\u00c2\u009a\u00c3\u008a\u00c2\u0092\u00c3\u0099\u00c2\u00a7\u00c3\u00a9\u00c3\u009d\u00c2\u0089\u0015\u00c2\u00b51\u00c3\u008e2\u0003\u00c3\u00aeWm\u00c3\u0096\u0013O\u00c3\u008c\u00c2\u009c\b\u00c2\u008b\"\u0014F\u00c3\u00ba\u00c3\u009c\u00c2\u00a3H\u0000\u00c3\u00aa<L\u00c2\u009a1CC\u00c2\u00a6\u00c3\u00a1\u00c2\u00bf\u00c2\u009f\u0019\u00c2\u0096\u00c2\u00a3",
  "claimId": "dd622f534fb79af737e4edb8737a732ce74c45ff",
  "txid": "e7857e5b96ae3ed836b7616dc8f454a4dc07bed5f64fd0eb939ebaa4bafe51ad",
  "n": 0,
  "amount": 10000000,
  "effective amount": 0,
  "height": 193440
}

As you can see, effective amount isn't considering the claim amount itself. While getclaimsforname considers it.

However, this also happens on the current getclaimbyid implementation:

{
  "name": "test",
  "value": "\b\u0001\u0010\u0001\u001ae\b\u0001\u0012\u001e\b\u0004\u0010\u0001\u001a\nTEST VIDEO\"\u0000*\u00002\u00008\u0000J\u0000R\u0000Z\u0000\u001aA\b\u0001\u0010\u0001\u001a0\u00c3\u00b51d\tG\u00c2\u00be7\u00c2\u00af\u00c2\u00a5\u00c3\u008e\u00c3\u00b5\u00c2\u008a`\u00c3\u0081\u00c3\u00ae\u00c3\u009c\u00c2\u00bf\u0013\u0017\u00c3\u00abQ\u00c2\u00b6\u00c2\u009a\u00c2\u0092\u0013E\u00c3\u0093)\u00c3\u009a\u00c3\u0092I\u0018q\u00c2\u00ab\u00c3\u009a\u00c3\u0081\b\u001b\u0003g\u00c3\u00b6\u00c3\u00a1\u00c3\u0086\u00c3\u009f\u0001Y\u001e\u0016\"\tvideo/mp4*\\\b\u0001\u0010\u0003\u001a@0\u00c3\u00adz\u00c3\u009c\u00c2\u00beh2\u00c3\u008c\u00c3\u0086\u00c2\u009d\u00c2\u00b4\u000e\u00c2\u0097`\u00c3\u0098dX=Ho\u00c3\u0086<U\u00c2\u00bdDZ\u00c2\u0085\b\u00c2\u0080\u00c2\u008e=\u00c2\u0092\u00c3\u00ba\u00c3\u00ab\u00c2\u008b%\u00c3\u00a6%\u0011\u00c2\u0095\u00c2\u009a\u00c3\u008a\u00c2\u0092\u00c3\u0099\u00c2\u00a7\u00c3\u00a9\u00c3\u009d\u00c2\u0089\u0015\u00c2\u00b51\u00c3\u008e2\u0003\u00c3\u00aeWm\u00c3\u0096\u0013O\u00c3\u008c\u00c2\u009c\b\u00c2\u008b\"\u0014F\u00c3\u00ba\u00c3\u009c\u00c2\u00a3H\u0000\u00c3\u00aa<L\u00c2\u009a1CC\u00c2\u00a6\u00c3\u00a1\u00c2\u00bf\u00c2\u009f\u0019\u00c2\u0096\u00c2\u00a3",
  "claimId": "dd622f534fb79af737e4edb8737a732ce74c45ff",
  "txid": "e7857e5b96ae3ed836b7616dc8f454a4dc07bed5f64fd0eb939ebaa4bafe51ad",
  "n": 0,
  "amount": 10000000,
  "effective amount": 0,
  "height": 193440
}

Maybe the fix for #33 wasn't applied here?

@shyba
Copy link
Copy Markdown
Member

shyba commented Apr 21, 2018

Should be fixed in #123 and isn't related to code touched by this PR.

@lbrynaut
Copy link
Copy Markdown
Contributor Author

@shyba Not sure I'm following. This code replaces the existing getclaimbyid in a way that improves performance by using a claim index that's built and maintained by this (PR) code. Are you seeing a discrepancy between the master branch and this one?

@shyba
Copy link
Copy Markdown
Member

shyba commented Apr 23, 2018

@lbrynaut No. At first I thought it was a discrepancy, then I found a bug and fixed in #123 but it turned out to be unrelated to this PR. Sorry for the confusion.

claim = itClaim->second.claim;
return true;
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, the part that comes after this, where it checks the db does not seem necessary..

Correct me if I'm wrong but there should never be a case where if it's not found in claimIndex but its found in leveldb.

@lbry-bot lbry-bot assigned lbrynaut and unassigned kaykurokawa Apr 25, 2018
@kaykurokawa
Copy link
Copy Markdown

kaykurokawa commented Apr 26, 2018

Other than the one comment I left above, I think this is good to merge.

Also, reminder that this feature will require a reindex for it to work, this should be noted once a release is made.

@kaykurokawa
Copy link
Copy Markdown

kaykurokawa commented Apr 26, 2018

Actually two more thing:

Is it even necessary to keep and maintain claimIndex ? We could just query leveldb directly and we wouldn't have to worry about claimIndex growing unbounded. Since its just used for RPC calls, maybe its ok that its slow?

When we call BatchWriteClaimIndex, we are not deleting abandoned claims from leveldb even though we are deleting them from claimIndex using removeFromClaimIndex. Is that the behavior we want? Victor pointed out that before this PR, getclaimbyid would not return abandoned claims, but now it does, because it is stored in leveldb.

@lbrynaut
Copy link
Copy Markdown
Contributor Author

@kaykurokawa Both good points. I'll compare the performance and see if it's acceptable. As for the abandoned claims, either way. I'll remove since that's how the previous code worked.

@lbrynaut
Copy link
Copy Markdown
Contributor Author

lbrynaut commented May 7, 2018

Updated

@lbrynaut
Copy link
Copy Markdown
Contributor Author

Rebased, updated, added unit test. @kaykurokawa Merge if ready.

@tzarebczan tzarebczan requested a review from kaykurokawa May 16, 2018 03:06
@lbry-bot lbry-bot assigned kaykurokawa and unassigned lbrynaut May 16, 2018
@kaykurokawa
Copy link
Copy Markdown

This LGTM

@kaykurokawa kaykurokawa merged commit 7930840 into lbryio:master May 18, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants