Commit d457cee
authored
Don't create child lookup if parent is faulty (#7118)
Issue discovered on PeerDAS devnet (node `lighthouse-geth-2.peerdas-devnet-5.ethpandaops.io`). Summary:
- A lookup is created for block root `0x28299de15843970c8ea4f95f11f07f75e76a690f9a8af31d354c38505eebbe12`
- That block or a parent is faulty and `0x28299de15843970c8ea4f95f11f07f75e76a690f9a8af31d354c38505eebbe12` is added to the failed chains cache
- We later receive a block that is a child of a child of `0x28299de15843970c8ea4f95f11f07f75e76a690f9a8af31d354c38505eebbe12`
- We create a lookup, which attempts to process the child of `0x28299de15843970c8ea4f95f11f07f75e76a690f9a8af31d354c38505eebbe12` and hit a processor error `UnknownParent`, hitting this line
https://github.com/sigp/lighthouse/blob/bf955c7543dac8911a6f6c334b5b3ca4ef728d9c/beacon_node/network/src/sync/block_lookups/mod.rs#L686-L688
`search_parent_of_child` does not create a parent lookup because the parent root is in the failed chain cache. However, we have **already** marked the child as awaiting the parent. This results in an inconsistent state of lookup sync, as there's a lookup awaiting a parent that doesn't exist.
Now we have a lookup (the child of `0x28299de15843970c8ea4f95f11f07f75e76a690f9a8af31d354c38505eebbe12`) that is awaiting a parent lookup that doesn't exist: hence stuck.
### Impact
This bug can affect Mainnet as well as PeerDAS devnets.
This bug may stall lookup sync for a few minutes (up to `LOOKUP_MAX_DURATION_STUCK_SECS = 15 min`) until the stuck prune routine deletes it. By that time the root will be cleared from the failed chain cache and sync should succeed. During that time the user will see a lot of `WARN` logs when attempting to add each peer to the inconsistent lookup. We may also sync the block through range sync if we fall behind by more than 2 epochs. We may also create the parent lookup successfully after the failed cache clears and complete the child lookup.
This bug is triggered if:
- We have a lookup that fails and its root is added to the failed chain cache (much more likely to happen in PeerDAS networks)
- We receive a block that builds on a child of the block added to the failed chain cache
Ensure that we never create (or leave existing) a lookup that references a non-existing parent.
I added `must_use` lints to the functions that create lookups. To fix the specific bug we must recursively drop the child lookup if the parent is not created. So if `search_parent_of_child` returns `false` now return `LookupRequestError::Failed` instead of `LookupResult::Pending`.
As a bonus I have a added more logging and reason strings to the errors1 parent 9a49720 commit d457cee
File tree
4 files changed
+119
-25
lines changed- beacon_node/network/src/sync
- block_lookups
- tests
4 files changed
+119
-25
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
106 | 106 | | |
107 | 107 | | |
108 | 108 | | |
109 | | - | |
| 109 | + | |
110 | 110 | | |
111 | 111 | | |
112 | 112 | | |
| |||
194 | 194 | | |
195 | 195 | | |
196 | 196 | | |
| 197 | + | |
| 198 | + | |
197 | 199 | | |
198 | 200 | | |
199 | 201 | | |
200 | 202 | | |
201 | 203 | | |
202 | 204 | | |
| 205 | + | |
203 | 206 | | |
204 | 207 | | |
205 | 208 | | |
206 | 209 | | |
207 | 210 | | |
208 | 211 | | |
209 | | - | |
| 212 | + | |
210 | 213 | | |
211 | 214 | | |
212 | 215 | | |
| |||
223 | 226 | | |
224 | 227 | | |
225 | 228 | | |
226 | | - | |
| 229 | + | |
| 230 | + | |
| 231 | + | |
227 | 232 | | |
228 | 233 | | |
229 | 234 | | |
230 | 235 | | |
| 236 | + | |
231 | 237 | | |
232 | 238 | | |
233 | 239 | | |
234 | 240 | | |
235 | 241 | | |
236 | 242 | | |
237 | 243 | | |
| 244 | + | |
238 | 245 | | |
239 | 246 | | |
240 | 247 | | |
241 | 248 | | |
242 | 249 | | |
243 | | - | |
244 | | - | |
| 250 | + | |
| 251 | + | |
245 | 252 | | |
246 | 253 | | |
247 | 254 | | |
| |||
256 | 263 | | |
257 | 264 | | |
258 | 265 | | |
| 266 | + | |
259 | 267 | | |
260 | 268 | | |
261 | 269 | | |
| |||
363 | 371 | | |
364 | 372 | | |
365 | 373 | | |
| 374 | + | |
366 | 375 | | |
367 | 376 | | |
368 | 377 | | |
| |||
656 | 665 | | |
657 | 666 | | |
658 | 667 | | |
659 | | - | |
| 668 | + | |
660 | 669 | | |
661 | 670 | | |
662 | 671 | | |
| |||
665 | 674 | | |
666 | 675 | | |
667 | 676 | | |
668 | | - | |
| 677 | + | |
669 | 678 | | |
670 | 679 | | |
671 | 680 | | |
672 | 681 | | |
673 | 682 | | |
674 | 683 | | |
675 | | - | |
| 684 | + | |
676 | 685 | | |
677 | 686 | | |
678 | 687 | | |
| |||
691 | 700 | | |
692 | 701 | | |
693 | 702 | | |
694 | | - | |
| 703 | + | |
695 | 704 | | |
696 | 705 | | |
697 | 706 | | |
| |||
703 | 712 | | |
704 | 713 | | |
705 | 714 | | |
706 | | - | |
| 715 | + | |
707 | 716 | | |
708 | 717 | | |
709 | 718 | | |
| |||
757 | 766 | | |
758 | 767 | | |
759 | 768 | | |
| 769 | + | |
| 770 | + | |
760 | 771 | | |
761 | | - | |
762 | | - | |
763 | | - | |
764 | | - | |
765 | | - | |
766 | | - | |
767 | | - | |
768 | | - | |
| 772 | + | |
| 773 | + | |
| 774 | + | |
| 775 | + | |
| 776 | + | |
| 777 | + | |
| 778 | + | |
| 779 | + | |
| 780 | + | |
| 781 | + | |
| 782 | + | |
| 783 | + | |
| 784 | + | |
| 785 | + | |
| 786 | + | |
| 787 | + | |
| 788 | + | |
| 789 | + | |
| 790 | + | |
769 | 791 | | |
770 | | - | |
| 792 | + | |
771 | 793 | | |
772 | | - | |
| 794 | + | |
773 | 795 | | |
774 | 796 | | |
775 | 797 | | |
| |||
Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
40 | 40 | | |
41 | 41 | | |
42 | 42 | | |
43 | | - | |
| 43 | + | |
44 | 44 | | |
45 | 45 | | |
46 | 46 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
929 | 929 | | |
930 | 930 | | |
931 | 931 | | |
932 | | - | |
| 932 | + | |
933 | 933 | | |
934 | 934 | | |
935 | 935 | | |
936 | 936 | | |
937 | | - | |
| 937 | + | |
| 938 | + | |
| 939 | + | |
| 940 | + | |
| 941 | + | |
| 942 | + | |
| 943 | + | |
| 944 | + | |
| 945 | + | |
938 | 946 | | |
939 | 947 | | |
940 | 948 | | |
| |||
945 | 953 | | |
946 | 954 | | |
947 | 955 | | |
948 | | - | |
949 | | - | |
| 956 | + | |
| 957 | + | |
| 958 | + | |
| 959 | + | |
| 960 | + | |
| 961 | + | |
| 962 | + | |
| 963 | + | |
| 964 | + | |
950 | 965 | | |
951 | 966 | | |
952 | 967 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1725 | 1725 | | |
1726 | 1726 | | |
1727 | 1727 | | |
| 1728 | + | |
| 1729 | + | |
| 1730 | + | |
| 1731 | + | |
| 1732 | + | |
| 1733 | + | |
| 1734 | + | |
| 1735 | + | |
| 1736 | + | |
| 1737 | + | |
| 1738 | + | |
| 1739 | + | |
| 1740 | + | |
| 1741 | + | |
| 1742 | + | |
| 1743 | + | |
| 1744 | + | |
| 1745 | + | |
| 1746 | + | |
| 1747 | + | |
| 1748 | + | |
| 1749 | + | |
| 1750 | + | |
| 1751 | + | |
| 1752 | + | |
| 1753 | + | |
| 1754 | + | |
| 1755 | + | |
| 1756 | + | |
| 1757 | + | |
| 1758 | + | |
| 1759 | + | |
| 1760 | + | |
| 1761 | + | |
| 1762 | + | |
| 1763 | + | |
| 1764 | + | |
| 1765 | + | |
| 1766 | + | |
| 1767 | + | |
| 1768 | + | |
| 1769 | + | |
| 1770 | + | |
| 1771 | + | |
| 1772 | + | |
| 1773 | + | |
| 1774 | + | |
| 1775 | + | |
| 1776 | + | |
| 1777 | + | |
| 1778 | + | |
| 1779 | + | |
| 1780 | + | |
| 1781 | + | |
| 1782 | + | |
| 1783 | + | |
| 1784 | + | |
1728 | 1785 | | |
1729 | 1786 | | |
1730 | 1787 | | |
| |||
0 commit comments