Skip to content

Conversation

@pinheadmz
Copy link
Owner

No description provided.

pinheadmz and others added 30 commits April 7, 2025 21:55
lightning/bolts#1245 highlights that invoice
writers must minimally encode Bolt11 features (and other fields).
We were already correctly doing that, but tests were lacking.

lightning/bolts#1242 added a test vector for
invoices without payment secrets, which we're now verifying.
Apply same fix to Bob and Carol used for Alice.

Race can result in `ExternalChannelSpent` being received by nodes after validating the channel update for a splice received from their peer. This will not happen in the wild.

I checked that both paths are exercised for all three nodes.
We previously included support for spending the remote anchor output.
This created some confusion: is this the anchor output using our key
but in the remote commitment? Or is it the anchor output using our
peer's key in our local commitment?

It is actually the former: the anchor output that uses our peers' key.
There is no reason to claim this output: it cannot be claimed while the
commitment tx is unconfirmed, and even after it is confirmed, we must
add wallet inputs to be able to claim this 330 sat output, which is
something we've never implemented (since it would be an economic loss).

It is only useful to spend those outputs when they pollute the utxo set
after commit txs have confirmed. For it to be economical, it only makes
sense when we're batching a lot of those outputs. This isn't something
we support, so we should simply remove that code for simplicity.
Our previous architecture for channel key management was designed when
we wanted the ability to use an external signer that would protect the
private keys used by `eclair`: this meant we couldn't directly deal
with private keys and forced us to have a more complex, indirect
architecture which made it harder to understand how key derivation is
done for commitment keys and forced us to partially derive some of the
keys in many different places, which was very error-prone (when should
we provide a `payment_key` instead of a `delayed_payment_key`?).

Since then, we've spent years working on lightning-specific HSMs before
VLS was even a thing, and ended up concluding that this was a dead end
and it made much more sense to put the whole lightning node inside a
secure enclave: https://acinq.co/blog/securing-a-100M-lightning-node

But we haven't refactored our key management code since then. Now that
we're working on Taproot, which adds more complexity to transactions,
it's a good time to re-work that part of the code and make it simpler.
This refactoring took a while and required a few iterations to get
right. I think the result is much better than what we previously had:

- all of the key derivation is contained in `ChannelKeys`, which makes
  it easier to grasp
- each channel has its own `ChannelKeys` instance, containing only what
  it needs
- from a `ChannelKeys` instance and a per-commitment point, we can
  derive a set of `CommitmentKeys` that allow creating local/remote
  commitment and HTLC transactions
- our force-close logic (in `Helpers.scala`) does *not* need to derive
  keys manually and can instead rely on a `CommitmentKeys` instance
- script and witness creation also take a `CommitmentKeys` instance
  instead of specific public keys, which was extremely error-prone
  because we relied on naming instead of types to ensure that the
  right public key was provided

Importantly, this is only a first step, that will allow us to also
greatly simplify our whole `TransactionWithInputInfo` architecture,
which is also starting to show its age.
We have a lot of duplicated code where we skip creating 2nd-stage and
3rd-stage transactions when their output would be below dust. We now
introduce a helper function to factorize this shared behavior.
Instead of having multiple overloads for `addSigs`, we move each one of
them inside the corresponding `TransactionWithInputInfo`. This is part
of an effort to group the logic belonging to each type of transaction
in the same place to make it easier to review and use.

This is a purely mechanical refactoring, where code has simply been cut
and pasted. The only functional change is for `HtlcPenaltyTx`, where we
incorrectly used the witness for a `MainPenaltyTx`, which was actually
smaller. We were thus under-estimating the fees we had to pay to reach
the desired feerate, which isn't much of an issue, but it's better to
use the correct witness for consistency.

We also remove the unused `minRelayFee` field.
We previously split the creation of HTLC-penalty transactions between
`Helpers.scala` and `Transactions.scala`. It makes more sense to group
everything in `Transactions.scala` and is a pre-requisite for the next
wave of transaction refactoring.
We have this legacy class in the `ClaimHtlcTx` hierarchy for legacy
cases that didn't store the `paymentHash` with the transaction. This
was introduced to ensure that nodes who upgraded while a force-close
was happening would still be able to finish force-closing the channel
and get their funds back.

This was introduced in ACINQ#2101 which was released in eclair v0.7.0 on
01/02/2022. Since we've made several important bug fixes since then,
it should be safe to assume that we don't care about node operators
who update from this very old version while a channel is closing, the
remote commit got confirmed, and some HTLCs must be claimed with the
preimage on-chain.
Most of our instances of `TransactionWithInputInfo` only contained the
input (`InputInfo` instance) and the transaction. This worked because
we stored the (segwit v0) redeem script in the `InputInfo` instance.
But we can actually get rid of that if we store with each transaction
the parameters that allow re-creating the scripts at runtime.

We don't need many parameters, so this is quite simple to add:

- delayed txs simply need the `to_self_delay` value used
- HTLC txs simply need the `payment_hash` and `htlc_expiry`

Based on that we can trivially re-build scripts and redeem info at
runtime using `ChannelKeys`. This ensures that we will only need the
`commitmentFormat` to know the exact script that is being used, instead
of having to encode different redeem information in `InputInfo`. This
will also reduce the size of our channel data, because storing redeem
info can be removed.

We also somewhat heavily re-design `Transactions.scala`. The
main change is that we stop exposing the default `sign` function that
was too abstract for callers, and have each `TransactionWithInputInfo`
expose its dedicated `sign` function, with simpler arguments depending
on each transaction type.

One very important thing to note is that there is a heavy distinction
between transactions that require both a local and a remote signature
(commit-tx, splice-tx, closing-tx and htlc-tx) and transactions that
only need a local signature. This means we can completely remove the
`addSigs` step for transactions that only need a local signature, and
directly create signed signatures, which simplifies the calling code.

This also lets us better encapsulate the `extraUtxos` parameter, which
only makes sense for txs for which we add wallet inputs, which is only
a small subset of our transactions.

This makes the calling code much less error-prone and removes the need
to store the redeem script in the `InputInfo` class, which wouldn't
have worked for taproot transactions. Note that for simplicity's sake,
we currently keep the field (to avoid a painful codecs migration) since
we will introduce v5 codecs when implementing taproot channels, which
will let us clean up this inconsistency.

We also introduce a `RedeemInfo` trait which makes the code forward
compatible with taproot channels.

We can actually use pre-computed weights for every type of force-close
transaction, which simplifies the code and ensures that the tests use
deterministic fees, regardless of the signature size (since ECDSA sigs
are der-encoded, they have a variable size which slightly impacts the
witness weight).

Note that we use per-transaction constants, even though some have the
same weight: this is more future-proof since future commitment formats
may use different script combinations.

---------

Co-authored-by: Fabrice Drouin <[email protected]>
We increase the delay after which we disconnect peers that don't send
their revocation (which can indicate malicious behavior or a known lnd
bug). The previous 20 seconds delay was too aggressive for Tor nodes,
or nodes that have large blocking file backups.

Fixes ACINQ#3081
We previously included confirmation targets directly in HTLC and anchor
transactions, which didn't really belong there. We now instead set
confirmation targets at publication time, based on the current state of
the force-close attempt.

We also extract witness data for HTLCs (preimage and remote signature)
when creating the command to publish the transaction, instead of doing
it inside the publisher actors.

We improve the checks done in the pre-publisher actors, which now take
into account the min-depth for commit txs before aborting anchor txs.
The `ReplaceableTxFunder` is also simpler now, thanks to a better
encapsulation of the transaction's contents.

We didn't watch our anchor output on the remote commit, which means we
weren't recording in the `AuditDB` the fees we paid to get a remote
commit tx confirmed by spending our anchor output. We only had that
code for the local commit, but we've recently started more aggressively
trying to get the *remote* commitment confirmed, so it's important to
correctly record those fees.
This change enables custom signets by allowing the user to specify the txid of a transaction to check or "" to skip this check for signet. See issue ACINQ#3078.
* Move `confirmationTarget` to closing helper function

We move functions that compute the confirmation target outside of
`LocalCommitPublished` and `RemoteCommitPublished` to helper functions
in the `Closing` object.

We also slightly refactor `doPublish` functions to remove the pattern
matching on the commitment format, which was actually unnecessary since
we introduced the `redeemableHtlcTxs` function.

* Refactor anchor tx creation

This is a trivial refactoring where we explicitly create a single anchor
transaction instead of replacing an empty list. We also clean up
comments and variable names in force-close helpers.

* Refactor upstream trimmed or timed out HTLCs

We stop relying on `LocalCommitPublished` and `RemoteCommitPublished` to
identify when HTLCs should be failed upstream because they are either
trimmed or their timeout transaction is confirmed. We instead rely on
commitment data for local commitments, and re-create the commit outputs
on-the-fly for remote commitments.

* Publish 3rd-stage penalty txs on tx confirmation

In case of a revoked close, we previously published 3rd-stage penalty
txs as soon as we saw our peer's HTLC transactions in our mempool.
This isn't necessary and may actually help them get confirmed since
it is essentially a CPFP. We now instead wait for the revoked HTLC
transaction to be confirmed before publishing a penalty transaction
claiming its outputs.
We add a new commitment format for simple taproot channels,
and implement creating and verifying musig2 and taproot sigs
for these channels.

Wire format and channel creation/update workflows have not
been modified, this will be done in another PR. This is currently
dead code, but is ready to be used and extensively tested!
We introduce a `CommitSigBatch` class to group `commit_sig` messages
when splice transactions are pending. We use this class to ensure that
all the `commit_sig` messages in the batch are sent together to our
peer, without any other messages in-between.

We move the incoming `commit_sig` batching logic outside of the channel
and into the `PeerConnection` instead. This slightly simplifies the
channel FSM and its tests, since the `PeerConnection` actor is simpler.

We unfortunately cannot easily do this in the `TransportHandler` because
of our buffered read of the encrypted messages, which may split batches
and make it more complex to correctly group messages.
Payment splitting is randomized which can lead to flaky tests.
Tests requiring MPP used `maxAttempts = 3` which which fails in a bit less than 1% of runs, increasing to `maxAttempts = 4` should reduce the failure rate to less than 1 per thousand.
We also increase the timeout for propagating channel updates.
When an HTLC settles downstream while the upstream channel is closing,
this allows us to either publish HTLC-success transactions for which we
were missing the preimage (if the downstream HTLC was fulfilled) or stop
watching the HTLC output (if the downstream HTLC was failed).

We previously re-computed every closing transaction in that case, which
was confusing and useless. We now explicitly handle those two cases and
only republish the HTLC-success transactions that become available, if
any.

We also change the default feerate used for `claim-htlc-txs`: we used a
high feerate in the channel actor, which meant we would skip small HTLCs
that weren't economical to spend at that high feerate. But the feerate
is actually set inside the tx-publisher actor based on the HTLC expiry,
which may happen many blocks after the beginning of the force-close, in
which case the feerate may have changed a lot. We now use the minimum
feerate in the channel actor to ensure we don't skip HTLCs and let the
tx-publisher actor handle RBF.
We previously immediately watched for confirmation of transactions that
our peer couldn't double-spend, which had a few issues:

- if we RBF-ed those transactions after a restart and a previous version
  confirmed, we wouldn't detect it and wouldn't move to the CLOSED state
- if the transaction had a long CSV delay, we were wasting performance
  in the watcher while that CSV isn't complete

We change that behavior and instead watch all outputs of the commitment
transaction that we may spend. We only watch for confirmations after we
detect that the output has been spent (either in the mempool or in a
block). This ensures that RBF attempts are correctly handled, and that
we don't watch transactions until they've been published (and thus CSV
delays are satisfied).

We also explicitly split 2nd-stage and 3rd-stage transactions. It is a
bit verbose and awkward for now, but it will become cleaner once we
stop storing unconfirmed transactions and instead re-compute them when
restarting. It also makes testing easier: we took this opportunity to
ensure that closing tests cover all scenarios and use better assertions
on watches and transactions.
We re-work the channel balance computation (`CheckBalance.scala`) used
to continuously monitor the overall balance of the node. We previously
assumed that all pending HTLCs would timeout on-chain, and monitored
our mempool to deduplicate unconfirmed transactions and RBF attempts.
But it is actually simpler to assume that all pending HTLCs will be
fulfilled, and keep counting them in our off-chain balance until the
spending transaction reaches `min-depth`.

We introduce a distinction between recently confirmed transactions and
deeply confirmed transactions in our on-chain balance: there will be
cases where we don't know yet whether we'll be able to gets funds back
and the information of pending on-chain funds lets us verify that our
overall balance is looking good. It always eventually converges to be
in our on-chain balance.

Fixes ACINQ#3085
Adds and updates channel_reestablish tests based on tests added to splice_tests.md of the splice spec as a result of interop testing.

For tests correspond to tests in splice_tests.md I added the corresponding test name from splice_tests.md and test flow diagram to the matching spec tests.
The `payoffer` API requires an amount but the user has currently no way to read the amount from the offer.
We add `parseoffer` so that the user can read the offer amount (among other things).
Since the user needs to specify the amount they want to pay, we allow paying offers with a currency (other than BTC).
Attribution data is added to both failed and fulfilled HTLCs
lightning/bolts#1044
When a channel was force-closed, we previously stored partially created
closing transactions for each output of the commitment transaction. In
some cases those transactions didn't include any on-chain fee (HTLC txs
with 0-fee anchors) and weren't signed. The fee is set in the publisher
actors which then sign those transactions.

This was an issue because we used those partial transactions as if they
were the final transaction that would eventually confirm: this made it
look like RBF wasn't an option to callers, and was thus easy to misuse.

We now change our data model to only store the outputs of the commit tx
that we may spend, without any actual spending transaction. We only
store spending transactions once they are confirmed, at which point they
can safely be used as closing transactions by callers such as our
balance checks.

This lets us get rid of some codec migration code related to closing
transactions, and is more future-proof because we now don't need to
encode closing transactions and can thus change their format easily.
This also reduces the size of our channel state during force-close.

We add a `max-closing-feerate` config parameter to cap the feerate used
for closing transactions which aren't at risk of being double-spent.
Node operators should generally use a low value here, and update it when
they need to reclaim liquidity if needed.
We previously stored the commitment transaction and HTLC transactions
in our channel data along with the remote signatures, without our local
signatures. It is unnecessary to store those transactions, as we have
all the data we need to recompute them on-the-fly if we need to force
close.

Storing those transactions can easily use a few hundreds of kB per
channel, which adds up to the post-restart latency when channels need
to be read from the DB.

By not storing those transactions in our commitments, we also remove
the need to support encoding them, which gives us greater flexibility
to change their format in the future.

It also made it confusing in tests, where we sometimes use unsigned
transactions as if they were signed.
Hold times in attribution data now use decaseconds instead of milliseconds
We previously waited for 12 blocks before removing spent channels from
our graph: the spec updates this delay to 72 blocks to allow more time
for splice announcement in lightning/bolts#1270.

We also make this configurable, which can simplify testing and allows
nodes to wait even longer if they wish.
The trampoline node must unwrap the attribution data with its shared secrets and use what's remaining as the attribution data from the next trampoline node.
This commit does not yet relay the attribution data but makes it available.
thomash-acinq and others added 28 commits September 22, 2025 16:40
Offers or invoices where the fields `offer_chains`, `offer_paths`, `invoice_paths`, `invoice_blindedpay` are present but empty are considered invalid.
While the spec does not necessarily rejects them explicitly, they can't be paid.
* Add PhoenixZeroReserve feature bit

Add a feature bit for zero-reserve channel. It is used by phoenix wallets and different from the feature bit proposed in lightning/bolts#1140.

* Simplify SimpleTaprootChannelsPhoenix channel type

It does not make sense to have versions with/without scid-alias and zero-conf.
Fix encoding of channel type TLV in splice_init/splice_ack

The channel type codec that we use is a "TLV field codec", wrapping it in a tlvField() codec was wrong
and messed up the encoding.

Co-authored-by: Bastien Teinturier <[email protected]>
We previously allowed 250 parallel incoming connections from nodes with
whom we don't have any channels yet. While this makes sense for wallet
providers, this is too large for standard routing nodes and exposes them
to more DoS risk. We reduce the default value to 64 instead, which can
of course be overridden by node operators in their `eclair.conf`.
When a mutual close transaction is published or recently confirmed, we
don't include the corresponding channel in our off-chain balance to
avoid counting it twice (which would resolve itself after confirmations
but is misleading).

There are degenerate cases where the channel will actually end up being
force-closed, even though we started with a mutual close, but that will
be very infrequent and will resolve itself after confirmations.
And simplify musig2 nonce using Scala classes.
We introduce a `DATA_CLOSED` class with minimal information about a
past channel that has been fully closed. This will let us deprecate
legacy channels without having backwards-compatibility issues with
very old closed channels inside our DB.

When channels have never been properly opened or used, we don't
bother storing them in our DB, as it would open the door to DoS
attacks.

We create a dedicated table to store `DATA_CLOSED`. We migrate the
existing DB and remove the foreign key constraint on `htlc_infos`.
We ensure that tests using the Phoenix taproot commitment format
correctly pass on feature branches without conflicts.

Co-authored-by: pm47 <[email protected]>
When we disconnect after sending `shutdown`, we must re-send `shutdown`.
When using taproot, we must generate a fresh nonce and store the private
nonce locally, otherwise we won't be able to create our `closing_sig`.
We don't need to store the anchor transaction in our channel data when
closing a channel: this isn't used anywhere and unnecessarily uses
space in our DB.

Also, once the commit tx is confirmed, we don't need to watch the
anchor output again on restarts.
Instead of storing every transaction spending the commit tx (which
potentially included remote anchor transactions, which we aren't
interested in), we explicitly store only the transactions that spend
outputs of the commit tx we're interested in. This is more verbose
than the previous code, but avoids unintended side-effects such as
storing remote transactions.
Since we don't have access to channel params in the `Peer` actor, we
don't know the remote `htlc_minimum` when receiving a splice. This may
lead to cases where we later fail because we end up with a negative
funding fee, which doesn't make any sense.

To avoid those failures, we hard-code the `htlc_minimum` value used by
Phoenix (which is the only consumer of this protocol so far) and use
the max with our local `htlc_minimum`.
We update `bitcoin-lib` to v0.45, which changes our JNI bindings for
`secp256k1` and adds support for P2A outputs.
When addres type or change type is not specified, bitcoind will
now start with addresstype=bech32m and changetype=bech32m.

We take this opportunity to fix feerate tests that failed because
of the following reasons:

- first of all, we had a bug where we didn't take into account the
  anchor amount in our fee calculation, so we ended up always adding
  `330 sats` to the on-chain fees we paid, which was hidden by our
  tolerance interval, but started appearing with smaller p2tr inputs
- then we fix the remaining tests that need manual tweaking of the
  utxos available in the test wallet, because they end up creating
  transactions where we don't have a change output (and overpay fees
  slightly, but not enough to make it worth adding a change output),
  these tests simply needed to be tweaked to accomodate p2tr weights

Co-authored by @sstone
We improve the connection logic when a proxy is configured, but Tor shouldn't
be used for IPv4 or IPv6 remote addresses.
After the v0.13.1 release.
We require closed channels to be migrated to the closed channels table
introduced in ACINQ#3170 before starting `eclair`. This ensures that we
will not lose channel data when removing support for non-anchor channels
in the next release.

Node operators will have to:

- run the v0.13.0 release to migrate their channel data to v5
- run the v0.13.1 release to migrate their closed channels

Afterwards, they'll be able to update to the (future) v0.14.x release
once all of their pre-anchor channels have been closed.
We remove support for `static_remotekey` channels and `default`
channels, as advertised in the v0.13 release. This lets us remove
some code related to feerate management and simplifies the test
matrix.

Node operators that still have such channels must not run this
version of `eclair`, which will otherwise fail to start.

Note that for now, we keep sending `update_fee` whenever necessary.
We could remove that now that package relay allows 1p1c packages to
propagate even when the parent is below the mempool minimum feerate,
but we defer that to a later PR for simplicity.
We stop sending `update_fee` and set the feerate to `1 sat/byte` for
channels with mobile wallet users. This removes edge cases around
`update_fee` handling in tricky cases (splicing, shutdown, etc) while
still allowing channels to force-close thanks to package relay.

Note that mobile wallets that don't have an on-chain wallet to use
CPFP on the commit transaction may not be able to get their commit
tx confirmed, but that was already the case before that change since
the LSP decides the commit feerate. This will get better with v3 txs
and https://delvingbitcoin.org/t/zero-fee-commitments-for-mobile-wallets/1453
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants