-
Notifications
You must be signed in to change notification settings - Fork 38.7k
contrib: utxo_to_sqlite.py: add option to store txid/spk as BLOBs #32621
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code Coverage & BenchmarksFor details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32621. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. LLM Linter (✨ experimental)Possible typos and grammar issues:
No other typos were found. drahtbot_id_4_m |
I wonder if separate options would be better, eg
I think it will be surprising if Here's a patch: ajtowns@d1fc5df Could consider allowing
Seems like you can get addons for this, for whatever that's worth. |
77a79cf to
d6fe040
Compare
|
@ajtowns: Thanks for your suggestions, I like the idea of having more flexibility and mostly agree with the naming. Took your patch (TIL about argparse
Pretty neat ideas, and potentially useful for e.g. dust (and other "spam") or output script type analysis. Will consider for further follow-ups. |
7ad3829 to
aaf1173
Compare
|
🚧 At least one of the CI tasks failed. HintsTry to run the tests locally, according to the documentation. However, a CI failure may still
Leave a comment here, if you need help tracking down a confusing failure. |
w0xlt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK aaf1173
Maybe |
aaf1173 to
1b10e88
Compare
Concept ACK. I think there is room for some bike-shedding on details, like e.g. do we split up p2pk into uncompressed and compressed (or, god forbid, even hybrid?) or whether bare multisig should also be a category (if at all, then probably only with a very weak detection like "first byte is in range OP_1...OP_16, last byte is OP_CHECKMULTISIG", as implementing the full detection as used in Solver seems overblown for a conversion script). But, something to discuss in another PR anyways. |
w0xlt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reACK 1b10e88
Co-authored-by: Anthony Towns <[email protected]>
1b10e88 to
7378f27
Compare
w0xlt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reACK 7378f27
|
Conflicts with #32116 ; let's get that merged first? Approach ACK |
This PR is a late follow-up to #27432, introducing an option for the utxo-to-sqlite script to store the txid/scriptPubKey columns as bytes (=
BLOBstorage class in sqlite, see e.g. https://www.sqlite.org/datatype3.html in sqlite) rather than hex strings. This was proposed in earlier reviews (#27432 (comment), #27432 (comment)) and has the obvious advantage of a significantly smaller size of the resulting database (and with that, faster conversion) and the avoidance of hex-to-bytes conversion for further processing of the data [1]. The rationale on why hex strings were chosen back then (and still stays the default, if only for compatibility reasons) is laid out in #27432 (comment) [2].The approach taken is introducing new parameters
--spkand--txidwhich can either have the values "hex", "raw" (for scriptpubkey) and "hex", "raw", "rawle" (for txid). Thanks to ajtowns for providing this suggestion. Happy to take further inputs on naming and thoughts on future extensibility etc.[1] For a concrete example, I found that having these columns as bytes would be nice while working on a SwiftSync hints generator tool (https://github.com/theStack/swiftsync-hints-gen), which takes the result of the utxo-to-sqlite tool as input.
[2] note that in contrast what I wrote back then, I think there is no ambiguity on byte-string-serialization of txids; they are ultimately just hash results and hence, they should be stored as such, and adding a big/little endian knob wouldn't make much sense. The drawback of not being able to immediately show txid-strings (as one would need to do the bytes-reversal step first, which is not possible in sqlite, see e.g. #24952 (comment)) still remains though.