Skip to content

Implement support for PROXY Protocol#246

Merged
pepoluan merged 139 commits intoaio-libs:masterfrom
pepoluan:proxy-protocol
Feb 21, 2021
Merged

Implement support for PROXY Protocol#246
pepoluan merged 139 commits intoaio-libs:masterfrom
pepoluan:proxy-protocol

Conversation

@pepoluan
Copy link
Copy Markdown
Collaborator

@pepoluan pepoluan commented Feb 11, 2021

What do these changes do?

Implements support for the PROXY Protocol, notably the version as documented in v2.3.0 of the HAProxy program.

Are there changes in behavior for the user?

Should not affect existing users.

Related issue number

Closes #174

Checklist

  • I think the code is well written
  • Unit tests for the changes exist
  • tox testenvs have been executed in the following environments:
    • Windows 10 (via PyCharm tox runner): (ALL)
    • Windows 10 (via PSCore 7.1.1): (ALL)
    • Windows 10 (via Cygwin): qa,py-nocov
    • Ubuntu 18.04 on WSL 1.0: (ALL) + pypy3-{nocov,cov,diffcov}
    • FreeBSD 12.2 on VBox: (ALL) + pypy3-{nocov,cov,diffcov}
    • OpenSUSE Leap 15.2 on VBox: (ALL) + pypy3-{nocov,cov,diffcov}
  • Documentation reflects the changes
  • Add a news fragment into the NEWS.rst file

@pepoluan pepoluan added this to the 1.4 milestone Feb 11, 2021
@pepoluan pepoluan changed the title Proxy protocol Implement support for PROXY Protocol Feb 11, 2021
@codecov
Copy link
Copy Markdown

codecov bot commented Feb 11, 2021

Codecov Report

Merging #246 (61681e6) into master (cfd7cf6) will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##            master      #246    +/-   ##
==========================================
  Coverage   100.00%   100.00%            
==========================================
  Files            6         7     +1     
  Lines         1206      1542   +336     
  Branches       220       285    +65     
==========================================
+ Hits          1206      1542   +336     
Impacted Files Coverage Δ
aiosmtpd/__init__.py 100.00% <100.00%> (ø)
aiosmtpd/main.py 100.00% <100.00%> (ø)
aiosmtpd/proxy_protocol.py 100.00% <100.00%> (ø)
aiosmtpd/smtp.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cfd7cf6...61681e6. Read the comment docs.

@lgtm-com
Copy link
Copy Markdown
Contributor

lgtm-com bot commented Feb 11, 2021

This pull request introduces 1 alert when merging bd5a997 into 6404ab2 - view on LGTM.com

new alerts:

  • 1 for __eq__ not overridden when adding attributes

@lgtm-com
Copy link
Copy Markdown
Contributor

lgtm-com bot commented Feb 11, 2021

This pull request introduces 1 alert when merging 64111e5 into 6404ab2 - view on LGTM.com

new alerts:

  • 1 for `__eq__` not overridden when adding attributes

@pepoluan
Copy link
Copy Markdown
Collaborator Author

This pull request introduces 1 alert when merging 64111e5 into 6404ab2 - view on LGTM.com

new alerts:

* 1 for `__eq__` not overridden when adding attributes

Oh for cryin' out loud!

IT WAS INTENTIONAL. DOESN'T THE "SUPPRESSION" INDICATE THAT ALREADY

@TimWolla
Copy link
Copy Markdown
Contributor

@pepoluan Feel free to ping me for review of this one, once it's good. It appears a proper rebase should be done?

@pepoluan
Copy link
Copy Markdown
Collaborator Author

pepoluan commented Feb 11, 2021

@pepoluan Feel free to ping me for review of this one, once it's good. It appears a proper rebase should be done?

Actually, I've been merging back from master for every change, so no additional rebase is necessary.

Even if there's a conflict, I prefer to merge from master then pushing back.

If there's no conflict, we can go direct to squamerging and all should be well.

Edit: At this moment, I'm just waiting for the checks to go all-green, then I'll undraft this and it's ready for review.

@TimWolla
Copy link
Copy Markdown
Contributor

Actually, I've been merging back from master for every change, so no additional rebase is necessary.

Looking at the diff and the huge number of more than 400 to-be-merged commits it looks like something went wrong. This PR as of right now appears to include unrelated changes.

@pepoluan
Copy link
Copy Markdown
Collaborator Author

Actually, I've been merging back from master for every change, so no additional rebase is necessary.

Looking at the diff and the huge number of more than 400 to-be-merged commits it looks like something went wrong. This PR as of right now appears to include unrelated changes.

It was due to all the rebasing that messes up the git history. That is why I don't want to rebase anymore.

(When squamerged it will all play out, no worry about that.)

If you check "Files changed", you will also see that there are lots of ".rst" files being changed. This is because this PR is finalized after #245 was finalized, and I cannot make a clean change to master that also applies cleanly to #245 due to the numer & shape of changes. So I merged from #245.

If you want to review the changes, your best bet is to use the "Files changed" tab and either (1) focus only on ".py" files ignoring others, or (2) wait until #245 gets merged, which should prune the number of files changed greatly.

@pepoluan pepoluan marked this pull request as ready for review February 11, 2021 19:49
@waynew
Copy link
Copy Markdown
Collaborator

waynew commented Feb 12, 2021

FWIW sometimes a rebase will collect more commits than necessary - use git rebase --interactive and make sure you only have the commits you're interested in.

Changes:
* Use internal func to get a field piecemeal using read() instead of
 readexactly(). This allows rapid response to Connection Lost instead of
  waiting for timeout. Also, introduces more points where cooperative
  concurrency context switching can take place.
* Pepper the func with lots of log.debug to help troubleshooting.
* Rename "rest" to "tail_part"

Also add some log.debug()s in get_proxy()
We now use "stringfield splitting" instead of regex.

Reasons:
- Regex is hard to maintain. Obligatory xkcd: https://xkcd.com/1171/
- With regex, it's all-or-none. Can't alert piecemeal unless you
  break the regex into lots of small groups. See above.
- If you're going to break the regex into small groups anyways, why
  not do stringfield splitting instead? Will be faster.
- Also with stringfield splitting, we can gradually validate the
  fields, emitting exact error.

Also:
- Pepper new parsing procedure with log.debug()s
- Introduce several cooperative context switching points (await's)

Finally, we add a slew of test cases to test the new implementation. And
also modified some existing ones to reflect changes.
Also modify the related test case
Originally the delay was set to 0.01. Not really a hiccup, but setting
it longer causes Connection Error.

After changing from readexactly() to piecemeal read()-and-accumulate,
hiccups no longer cause Connection Error.
warn() is much too noisy and cluttery.

log.warning() provides subtle warning visible in the logs
Artefact left over from since it was warnings.warn()

Called during ``NOOP``.

.. method:: handle_PROXY(server, session, envelope, proxy_data)
Copy link
Copy Markdown
Contributor

@TimWolla TimWolla Feb 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this method signature correct? I had no success with a dedicated proxy_data parameter, but could grab the information from session.proxy_data.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, my bad. It should be as the documentation indicated, but apparently I forgot to change the implementation.

I'll fix it in the next commitpush.

else:
return
else:
rslt[typ_name] = val
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

val is a bytearray here. It might be my lack of Python expertise, but I think that a bytes object might be more useful here, because one usually does not need to handle individual characters, but instead process e.g. the UNIQUE_ID as-is.

Copy link
Copy Markdown
Collaborator Author

@pepoluan pepoluan Feb 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bytes and bytearray are practically the same in Python. They are comparable, slice-able, decode-able, etc.

For example:

>>> b = b"123"
>>> ba = bytearray(b"123")
>>> b == ba
True
>>> b[0:2] == ba[0:2]
True

The only difference is that the latter is mutable, while the former is not.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood, thank you for the clarification. Then this is good and I'll have to recheck what I did wrong in my test 😄

Copy link
Copy Markdown
Collaborator Author

@pepoluan pepoluan Feb 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just remember you can't directly compare bytes to str. So the following

b"123" == "123"

will result in an error, as str is "Unicode codepoints". You need to either decode the bytes, or encode the str, like so:

(b"123").decode() == "123"
b"123" == "123".encode()

or if comparing a variable to a literal, don't forget the b prefix, like so:

proxy_data.tlv.UNIQUE_ID = b"some_unique_id"

Copy link
Copy Markdown
Contributor

@TimWolla TimWolla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a functional test with our in-house handler and HAProxy in front of it. I tested both with PROXYv1 and PROXYv2, this is working fine.

  • I only have the above remark regarding the documentation.
  • And the remark regarding bytearray vs bytes.

As described in the documentation.
Probably dupe of something else since this one does not increase
coverage. But MOAR TEST is GOOD
Sweeter and more illustrative of their purposes
Copy link
Copy Markdown
Contributor

@TimWolla TimWolla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did not do another functional test after the recent adjustments. I trust that you know what you are doing here. As my remarks have been resolved this LGTM now. Thank you for taking the time to tackle this and also for taking all my suggestions / complaints into account 😃

@pepoluan
Copy link
Copy Markdown
Collaborator Author

Did not do another functional test after the recent adjustments. I trust that you know what you are doing here. As my remarks have been resolved this LGTM now. Thank you for taking the time to tackle this and also for taking all my suggestions / complaints into account 😃

No problem!

It has been a good exercise for me 😉

This will get merged no later than next Wednesday in 1.4.0.

Currently am focusing on releasing 1.3.1 because 1.3.0 broke people's servers 😓

# Conflicts:
#	aiosmtpd/docs/NEWS.rst
#	aiosmtpd/smtp.py
# Conflicts:
#	aiosmtpd/__init__.py
@pepoluan
Copy link
Copy Markdown
Collaborator Author

pepoluan commented Feb 20, 2021

Final checks. If all these (and the next GA) succeeds, then I'll squamerge.

  • Windows 10 (via PyCharm tox runner): (ALL)
  • Windows 10 (via PSCore 7.1.2): (ALL)
  • Windows 10 (via Cygwin): qa,docs,py{36,38}-{nocov,cov}
  • Ubuntu 18.04 on WSL 1.0: (ALL) + static + pypy3-{nocov,cov,diffcov}
  • FreeBSD 12,2 on VBox: (ALL) + static + pypy3-{nocov,cov,diffcov}
  • OpenSUSE Leap 15.2 on VBox: (ALL) + static + pypy3-{nocov,cov,diffcov}

Local testing system done. As soon as it passes all repochecks, I'll squamerge.

@pepoluan pepoluan merged commit 4336a05 into aio-libs:master Feb 21, 2021
@pepoluan pepoluan deleted the proxy-protocol branch February 21, 2021 09:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

HAProxy proxy protocol

3 participants