Skip to content

Conversation

@sstone
Copy link
Contributor

@sstone sstone commented Mar 11, 2022

This PR adds a new "tx output spender" index, which allows users to query which tx spent a given outpoint with the gettxspendingprevout RPC call that was added by #24408.

Such an index would be extremely useful for Lightning, and probably for most layer-2 protocols that rely on chains of unpublished transactions.

UPDATE: this PR is ready for review and issues have been addressed:

We use a composite key with 2 parts (suggested by @romanz): hash(spent outpoint) and tx position, with an empty value. Average composite key size is 15 bytes.

The spending tx can optionally be returned by gettxspendingprevout (even it -txindex is not set).

@sstone sstone force-pushed the add-txospender-index branch from 3f7cdb6 to aa9f963 Compare March 11, 2022 21:08
@ryanofsky
Copy link
Contributor

Concept ACK. It would be good know more about specific use-cases for this, if you can provide some links or more description. But the implementation seems very simple, so I don't think adding this would be a maintenance burden.

@maflcko
Copy link
Member

maflcko commented Mar 12, 2022

@ ryan, I think the idea is that there is one coin that is owned by more than one entity. There may be several (conflicting) pre-signed txs spending it and as soon as it is spent, all entities with ownership in it want to check which pre-signed tx spent the coin.
If they use Bitcoin Core to monitor the coin, they may use:

  • A watch-only wallet (which requires a file on disk and BDB/Sqlite)
  • Walk the raw chain and raw mempool themselves (According to rpc: add rpc to get mempool txs spending specific prevouts #24408 (comment) this is doable for the chain, but may not be for the mempool)
  • Use an index. A blockfilterindex might speed up scanning the chain, whereas this index directly gives the answer.

@sstone
Copy link
Contributor Author

sstone commented Mar 12, 2022

Lightning channels are basically trees of unpublished transactions:

  • a commit tx that spends a shared onchain "funding" UTXO
  • HTLC txs that spend the commit tx (there can be competing HTLC txs that spend the same commit tx output: one for successful payments, one for timed out payments)

There is also a penalty scheme built into these txs: if you cheat, you need to wait before you can spend your outputs but your channel peer can spend them immediately.

And LN is also transitioning to a model where these txs pay minimum fees and can be "bumped" with RBF/CPFP.

=> it's very useful for LN nodes to detect when transactions have been spent, but also how they've been spent and react accordingly (by extracting info from the spending txs for example), and walk up chains of txs that spend each other.

As described above, there are tools in bitcoin core that we could use and would help a bit, and Lightning nodes could also create their own indexes independently or use electrum servers, block explorers, but this would make running LN nodes connected to your own bitcoin node much easier.

@michaelfolkson
Copy link

Concept ACK

I think the use case is clear and if others think the maintenance burden is reasonable there doesn't seem to be an obvious downside.

This PR can be viewed as an "extension" to #24408

Seems like review should be focused on #24408 first but assuming that gets merged, Concept ACK for this.

@DrahtBot
Copy link
Contributor

DrahtBot commented Mar 12, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/24539.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK TheCharlatan
Concept ACK ryanofsky, michaelfolkson, mzumsande, glozow, aureleoules, achow101, hodlinator, willcl-ark
Approach ACK w0xlt

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

No conflicts as of last run.

Copy link
Contributor

@mzumsande mzumsande left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK - I can see why this could be helpful for lightning, the code is very similar to the txindex code.

Some general thoughts:

  • getmempooltxspendingprevout in #24408 takes a list of outpoints, gettxospender takes a single outpoint. Would it make sense to have the same format here (I'd guess a common use case would be to check both RPCs with the same outpoint(s)?)
  • Taking this one step further, getrawtransaction queries transaction data from the mempool or from blocks (if txindex is enabled) in one RPC. Would it make sense to have something similar for outpoints, instead of two separate RPCs?
  • I think that the indexed data will not be deleted in case of a reorg (but possibly overwritten if an outpoint would be spent in a different tx as a result of a reorg). That means that a result from gettxospender could be stale. Would it make sense to delete entries or should we at least mention this possibility in the rpc doc?
  • What's the reason for the dependency on -txindex? As far as I can see it's not technical, so would knowing the fact that an outpoint was used in a tx (even if we can't lookup the details of that tx further) be sufficient for some use cases?

@sstone sstone force-pushed the add-txospender-index branch from aa9f963 to 1b82b7e Compare March 14, 2022 10:27
@sstone
Copy link
Contributor Author

sstone commented Mar 14, 2022

* `getmempooltxspendingprevout` in #24408 takes a list of outpoints, `gettxospender` takes a single outpoint. Would it make sense to have the same format here (I'd guess a common use case would be to check both RPCs with the same outpoint(s)?)

Bitcoin supports batched RPC request so from a functional p.o.v it's equivalent to passing a list of outpoints but it's a bit easier to reason about imho.
I think that the use case for getmempooltxspendingprevout is a bit different: you want to bump fees for a set of txs that are related to the same channel and want to understand who is spending what. You may also have txs with multiple inputs, and would pass them all together to getmempooltxspendingprevout.

For gettxospender the use case is trying to understand why channels have been closed and react if needed, if we could pass in multiple outpoints they would likely be unrelated.

* Taking this one step further, `getrawtransaction` queries transaction data from the mempool or from blocks (if txindex is enabled) in one RPC. Would it make sense to have something similar for outpoints, instead of two separate RPCs?

Maybe, let's see where the discussion in #24408 but to me it feels cleaner to have separate calls.

* I think that the indexed data will not be deleted in case of a reorg (but possibly overwritten if an outpoint would be spent in a different tx as a result of a reorg). That means that a result from `gettxospender` could be stale. Would it make sense to delete entries or should we at least mention this possibility in the rpc doc?

* What's the reason for the dependency on -txindex? As far as I can see it's not technical, so would knowing the fact that an outpoint was used in a tx (even if we can't lookup the details of that tx further) be sufficient for some use cases?

We would almost always want to get the actual tx that spends our outpoint. My reasoning was that relying on -txindex makes gettxospender robust and consistent in the case of reorgs and is also simpler and easier to understand that storing block positions instead of txids.

@glozow
Copy link
Member

glozow commented Apr 1, 2022

Concept ACK, nice

@sstone
Copy link
Contributor Author

sstone commented Apr 25, 2022

Rebased on #24408

@valentinewallace
Copy link

@sstone Question -- In LDK, when we confirm transactions, we require the transaction's index-in-block to be provided, for short channel ID construction. I'm curious how this API works for you without this information?

@sstone
Copy link
Contributor Author

sstone commented Nov 12, 2025

@sstone Question -- In LDK, when we confirm transactions, we require the transaction's index-in-block to be provided, for short channel ID construction. I'm curious how this API works for you without this information?

This index would be used to find if and how channels have been spent (after a restart for example) and would be much more efficient that walking down the blockchain from the tip.

To compute the short channel Id we simply call getrawtransaction which gives us the block hash for the block that confirmed the tx, then load that block to get the tx position in that block.

@willcl-ark
Copy link
Member

Concept ACK.

I like the idea of us providing (useful) optional indexes.

I haven't done an in-depth code review yet, but at a high level I feel that this feature would be worthy of a release note? I have reindexed with this enabled and see that the index currently takes about 75GB of space, which could be worth mentioning in such a note so users know what they need before enabling.

❯ du -h .
74G     ./txospenderindex/db
74G     ./txospenderindex

@achow101
Copy link
Member

To compute the short channel Id we simply call getrawtransaction which gives us the block hash for the block that confirmed the tx, then load that block to get the tx position in that block.

So you need txindex and make 3 RPC calls? ISTM storing the position in this new index would make it much more efficient and more useful to other people.

@sstone
Copy link
Contributor Author

sstone commented Nov 19, 2025

To compute the short channel Id we simply call getrawtransaction which gives us the block hash for the block that confirmed the tx, then load that block to get the tx position in that block.

So you need txindex and make 3 RPC calls? ISTM storing the position in this new index would make it much more efficient and more useful to other people.

I could easily return the block hash along with the spending tx, txindex would not be needed anymore but users would still need to call getblock.

@DrahtBot
Copy link
Contributor

🚧 At least one of the CI tasks failed.
Task test each commit: https://github.com/bitcoin/bitcoin/actions/runs/19615986599/job/56168637751
LLM reason (✨ experimental): Compilation failed due to incorrect return types in src/index/txospenderindex.cpp (static_cast to bool and returning void), leading to a build error.

Hints

Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:

  • Possibly due to a silent merge conflict (the changes in this pull request being
    incompatible with the current code in the target branch). If so, make sure to rebase on the latest
    commit of the target branch.

  • A sanitizer issue, which can only be found by compiling with the sanitizer and running the
    affected test.

  • An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

@sstone sstone force-pushed the add-txospender-index branch 3 times, most recently from bb0cb11 to c7ba5de Compare November 24, 2025 09:46
Copy link
Contributor

@sedited sedited left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Block hash addition looks good to me. Just needs a small iwyu fix to get the CI to pass again.

private:
std::unique_ptr<BaseIndex::DB> m_db;
std::pair<uint64_t, uint64_t> m_siphash_key;
bool AllowPrune() const override { return true; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the index stores the tx position in disk, how is this correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed in 38f2f53

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. It would be good to add a test case for it.

@sstone sstone force-pushed the add-txospender-index branch 2 times, most recently from acf4fba to 08167df Compare November 24, 2025 16:30
Adds an outpoint -> txid index, which can be used to find which transactions spent a given output.
We use a composite key with 2 parts (suggested by @romanz): hash(spent outpoint) and tx position, with an empty value.
To find the spending tx for a given outpoint, we do a prefix search (prefix being the hash of the provided outpoint), and for all keys that match this prefix
we load the tx at the position specified in the key and return it, along with the block hash, if does spend the provided outpoint.
To handle reorgs we just erase the keys computed from the removed block.

This index is extremely useful for Lightning and more generally for layer-2 protocols that rely on chains of unpublished transactions.
If enabled, this index will be used by `gettxspendingprevout` when it does not find a spending transaction in the mempool.
@sstone sstone force-pushed the add-txospender-index branch from 08167df to 38f2f53 Compare November 24, 2025 16:51
Copy link
Contributor

@sedited sedited left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK 38f2f53

@DrahtBot DrahtBot requested a review from willcl-ark November 25, 2025 08:47
@sedited sedited requested review from fjahr, hodlinator and w0xlt and removed request for aureleoules, hodlinator and w0xlt January 12, 2026 13:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.