Further e2e fixes for reliability #198

Merged
ikatson merged 2 commits into main from further-e2e-fixes
Aug 19, 2024

@ikatson (Owner) commented Aug 19, 2024

I found the root cause of e2e failures.

It was around this line: debug!("nothing left to do, disconnecting peer"); The server was disconnecting the peer whenever the server itself had the full torrent.
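To make the faulty condition concrete, here is a minimal sketch rather than the project's actual code. The names `PeerView`, `should_disconnect_buggy`, and `should_disconnect_guarded` are hypothetical, and the guarded variant is only one plausible way to avoid dropping a peer that still needs data from a seeding server; it is not necessarily what this PR does.

```rust
/// Hypothetical snapshot of what the server knows about one connection.
#[derive(Debug, Clone, Copy)]
struct PeerView {
    /// Number of pieces we could still request from this peer.
    pieces_we_need_from_peer: usize,
    /// True if the peer is still missing pieces and may request them from us.
    peer_may_need_pieces_from_us: bool,
}

/// The behavior described above: disconnect as soon as we have nothing to download,
/// which is always the case when we already hold the full torrent.
fn should_disconnect_buggy(p: PeerView) -> bool {
    p.pieces_we_need_from_peer == 0
}

/// Assumed alternative: keep the connection while the peer may still download from us.
fn should_disconnect_guarded(p: PeerView) -> bool {
    p.pieces_we_need_from_peer == 0 && !p.peer_may_need_pieces_from_us
}
```

For a downloading peer connected to a seeding server, the first predicate says "disconnect" while the second keeps the connection open.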

However, the reason it worked at all is that "peer_chunk_requester" was stuck in "wait_for_bitfield" forever if the peer never sent a bitfield in the first place, so it never reached the buggy line.
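The "stuck forever" part can be illustrated with a bounded wait. This is a rough sketch under stated assumptions: it uses a standard-library channel in place of the real peer connection, `PeerMessage` is a hypothetical type, and this `wait_for_bitfield` only shares its name with the function mentioned above. It is not a description of how the PR resolves the issue.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Hypothetical message type; only the variant relevant here is modeled.
#[derive(Debug, PartialEq)]
enum PeerMessage {
    Bitfield(Vec<u8>),
}

/// Waits for a bitfield, but gives up after `timeout` instead of blocking forever.
/// Returns `None` when the peer stays silent, which a peer with no pieces may do.
fn wait_for_bitfield(rx: &Receiver<PeerMessage>, timeout: Duration) -> Option<Vec<u8>> {
    match rx.recv_timeout(timeout) {
        Ok(PeerMessage::Bitfield(bits)) => Some(bits),
        Err(RecvTimeoutError::Timeout) | Err(RecvTimeoutError::Disconnected) => None,
    }
}
```

With an unbounded `recv()` in place of `recv_timeout`, a peer that never sends a bitfield would park the caller indefinitely, which matches the behavior described above.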

Sequence:

  1. The server starts seeding.
  2. The first peers connect. They don't send a bitfield because they have nothing.
  3. The server's "peer_chunk_requester" blocks forever in "wait_for_bitfield" and never reaches the line that disconnects the peer.
  4. The (intentionally) bad test peers take too long, send garbage, etc.
  5. The good peers run out of pieces to request and hit the "sleep 10s" line.
  6. The server's rwtimeout is also set to 10 seconds, so it disconnects the good peers because they did nothing for 10 seconds.
  7. The good peers reconnect. By that time they already have a bitfield to send.
  8. The moment they send the bitfield, the server hits the "nothing left to do" line and disconnects the peer again.
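The sequence above can be condensed into a small toy model. Everything here is hypothetical (the `Outcome` enum, the `connection_outcome` function, and its parameters); it only encodes the rules stated in the steps, not the server's real control flow.

```rust
use std::time::Duration;

/// Hypothetical result of one connection attempt by a good peer.
#[derive(Debug, PartialEq)]
enum Outcome {
    StaysConnected,
    DroppedIdle,
    DroppedNothingToDo,
}

/// Toy model of steps 3, 6 and 8, as seen from the server.
fn connection_outcome(
    server_has_full_torrent: bool,
    peer_sent_bitfield: bool,
    peer_idle: Duration,
    rw_timeout: Duration,
) -> Outcome {
    // Step 8: once a bitfield arrives, the requester proceeds and reaches the
    // "nothing left to do" line if the server already has everything.
    if peer_sent_bitfield && server_has_full_torrent {
        return Outcome::DroppedNothingToDo;
    }
    // Step 6: otherwise the connection lives until the read/write timeout fires.
    if peer_idle >= rw_timeout {
        return Outcome::DroppedIdle;
    }
    // Step 3: no bitfield yet, so the disconnecting line is never reached.
    Outcome::StaysConnected
}
```

In this model a first connection without a bitfield survives until the peer's 10 second sleep meets the 10 second timeout, and the reconnect with a bitfield is dropped immediately.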

@ikatson ikatson marked this pull request as ready for review August 19, 2024 15:39
@ikatson ikatson merged commit e3ab7e2 into main Aug 19, 2024