More testing with 16k NIC buffer size revealed that
With this PR the EtherExpress16 driver enters a completed state. Several serious bugs have been weeded out, shared memory support has been added and the code has been significantly cleaned up. The new version is fast, reliable and supports important bootopts configuration flags. 8-bit bus support is still missing - a relatively easy addition when/if the need arises because the scaffolding is already in place.

Summary of key enhancements since the previous PR:
- /bootopts, shared memory is used, otherwise pio. With more practical experience it may be desirable to force pio-mode when the buffer is 16k, like other drivers do. However, at this point there is little evidence to support the claim that the 16k/shmem combination is unstable.
- bootopts if available, regardless of card settings. In verbose mode, the card settings are reported at boot time. The card's shared memory settings are read but currently ignored.
- ktcp permitting. A real kludge, a 600ms delay, has been added to the read select routine in the driver. The delay speeds up outgoing file transfers by 300% on average by avoiding a read select_wait call. A similar method may speed up other drivers as well. Looking into whether ktcp may be adjusted to avoid this kludge is on the list. See discussion below.
- ((bufsize in k)>>3)&7, which means 2@16k, 4@32/64k. The current implementation of ktcp does not enable practical use of more than 2 tx-buffers, so the latter case is a waste - for now. Total number of NIC packet buffers is 10/20/40 @ 16/32/64k.

The driver will (still) happily and silently overwrite unprocessed packets in the rx-queue under heavy load. Changing the code to handle this differently has been deemed not worthwhile thus far - possibly because most testing has been done on a 32k buffer NIC (in other words, with ample buffer space). The only case in which this issue may (!) become visible is when launching a flood ping with large packets while running a listing in an outgoing telnet connection. Flood pings and ftp transfers work well. As a matter of fact, ftp transfers into TLVC are just barely affected by a running flood ping, indicating that there is more performance to be gained from tuning/optimizing ktcp.
Which brings us back to the
TCP speed kludge
This 'trick' was accidentally discovered when removing printks (actually kputchars) turned out to kill the performance of outgoing FTPs completely, from 70+k to ~25k. Keeping the kputchars in lasttxstatus(), which is part of tx interrupt processing, brought the performance back. The kputchars were eventually replaced by udelay() calls and moved to the _select function, where quite a bit of experimenting led to what seems to be a reasonable delay value.

This value will be different on a machine of a different speed, so - while in use - it should be calculated using a speed index, like the machine's BOGOMIPS value - which currently does not exist, but may be coming.
As to why this delay helps: what seems to be the case is that, in the course of pushing packets and ticking off incoming ACKs, the delay is just long enough to avoid a select_wait()/wake_up() cycle when receiving the ACK immediately following a transmitted data packet. The delay optimizes the rhythm of the exchange, so to speak. It is not obvious that there is an easy fix for this in ktcp, but given the size of the improvement, it seems worth a discussion.
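For discussion, here is a minimal sketch of how the delay could be derived from a speed index instead of being hard-coded. Everything in it is hypothetical: the names REF_DELAY, REF_SPEED and scale_delay() do not exist in the driver, the speed index itself is not available yet, and the sketch assumes the delay argument behaves like a busy-wait count, so that a faster machine needs a proportionally larger value. If the delay routine is already calibrated to wall-clock time, the scaling would have to be reconsidered.

```c
#include <stdint.h>

/* Hypothetical constants: the delay value tuned by experiment and the
 * speed index of the machine it was tuned on. Neither exists in the driver. */
#define REF_DELAY  600u
#define REF_SPEED  100u

/* Scale the tuned delay linearly by the ratio of speed indexes.
 * Assumption: the delay is a busy-wait count, so a faster machine
 * (higher index) needs a proportionally larger value. */
static uint16_t scale_delay(uint16_t ref_delay, uint16_t ref_speed, uint16_t speed)
{
    uint32_t d;

    if (ref_speed == 0 || speed == 0)
        return ref_delay;       /* no usable index: keep the tuned value */

    /* 32-bit intermediate avoids overflow; add half the divisor to round */
    d = ((uint32_t)ref_delay * speed + ref_speed / 2) / ref_speed;
    if (d > UINT16_MAX)
        d = UINT16_MAX;         /* clamp to the 16-bit range */
    return (uint16_t)d;
}
```

With these placeholder values, scale_delay(REF_DELAY, REF_SPEED, 200) gives 1200 and scale_delay(REF_DELAY, REF_SPEED, 50) gives 300. The result would be computed once at driver init and then passed to the delay call in the select routine.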