Skip to content

Conversation

@ShakaibKhan
Copy link
Contributor

@ShakaibKhan ShakaibKhan commented Sep 27, 2021

closes: #17962

Added warning banner message for when /robot.txt is hit with on/off config

@boring-cyborg boring-cyborg bot added the area:webserver Webserver related Issues label Sep 27, 2021
@ShakaibKhan ShakaibKhan mentioned this pull request Sep 27, 2021
2 tasks
@potiuk potiuk closed this Sep 28, 2021
@potiuk potiuk reopened this Sep 28, 2021
@potiuk
Copy link
Member

potiuk commented Sep 28, 2021

There were some delays in night which caused waitign for image to timeout. Close/reopen to rebuild

Copy link
Contributor

@SamWheating SamWheating Sep 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this message maybe be extended with some more specific or useful information?

Maybe something like:

Recent requests to /robots.txt indicate that this deployment may be accessible to the public internet.

This warning can be disabled by setting webserver.warn_deployment_exposure=False in the configuration.

We could also link to a page in the docs with more information on webserver security.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flask uses cookie-based sessions. This warning will therefore only be displayed to clients who have visited /robots.txt, in our case bot crawlers. I think we should notify the administrator about this situation, not the bot.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what would be the best way to display the message for admin team? is there some way to post the message only for admin sessions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you need to write it down to the database as a new table or to the event log table.

@ShakaibKhan ShakaibKhan force-pushed the warn_public_deployment branch 2 times, most recently from 3e41c64 to 001298d Compare October 20, 2021 04:48
@ShakaibKhan ShakaibKhan force-pushed the warn_public_deployment branch from 001298d to f584371 Compare October 25, 2021 18:21
@ShakaibKhan
Copy link
Contributor Author

quarantined test are failing but having trouble running them locally to fix

@mik-laj
Copy link
Member

mik-laj commented Nov 2, 2021

@ShakaibKhan please ignore quarantined tests.

@potiuk
Copy link
Member

potiuk commented Nov 8, 2021

LGTM. @mik-laj @SamWheating ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be limited to the last week or something? The warning message explicitly states that Recent requests have been made to /robots.txt but this could be triggered by very old requests.

There's also no way to clear this entry without deleting the log event so it might persist for a long time, even if the vulnerability is fixed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if it would be good enough to clear requests older than a week and only display the message on recent requests

Copy link
Contributor

@SamWheating SamWheating Nov 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we also just query for events within one week old? Something like:

robots_file_access_count = (
    session.query(Log)
    .filter(Log.event == "robots")
    .filter(Log.dttm > (utcnow() - timedelta(days=7)))
    .count()
)

Copy link
Contributor

@SamWheating SamWheating Nov 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this warning can be disabled? I don't see the value of warn_deployment_exposure actually being referenced anywhere other than the config.

Copy link
Contributor Author

@ShakaibKhan ShakaibKhan Nov 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed it during a merge, I will put it back in the if statement

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In one sentence, we are talking about a configuration option, and then we link to a documentation that does not contain a description of that option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add the config to documentation

@ShakaibKhan ShakaibKhan force-pushed the warn_public_deployment branch from f584371 to aafffd5 Compare November 10, 2021 23:21
@ShakaibKhan ShakaibKhan requested a review from kaxil as a code owner November 10, 2021 23:21
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This query should only be run if webserver.warn_deployment_exposure is enabled. Otherwise its putting additional load on the database for no reason, since the warning isn't going to be displayed anyways.

@kaxil kaxil requested a review from mik-laj December 8, 2021 23:44
@kaxil kaxil requested a review from uranusjr December 8, 2021 23:46
@github-actions
Copy link

github-actions bot commented Dec 8, 2021

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Dec 8, 2021
@kaxil
Copy link
Member

kaxil commented Dec 8, 2021

Can you review this again @SamWheating to see if your comments are addressed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use the get_docs_url function? Thanks to it, the link will be pinned to the version.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably:

Suggested change
'"https://airflow.apache.org/docs/apache-airflow/stable/security/webserver.html">'
f'"{get_docs_url('security/webserver.html')}">'

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
version_added: 2.2.0
version_added: 2.2.3

I think that this value will need to be updated, but if this is merged in the next week or two it will probably be included in the 2.2.3 release.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make it 2.3.0 please -- this won't be included in 2.2.3

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
version_added: 2.2.0
version_added: 2.3.0

Comment on lines 34 to 35
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Disable Deployment Exposure Warning
------------------------------------------------------
Disable Deployment Exposure Warning
---------------------------------------

Copy link
Member

@kaxil kaxil Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Airflow warns when recent requests are made to /robot.txt. To disable this warning set the below:
Airflow warns when recent requests are made to `/robot.txt`. To disable this warning set ``warn_deployment_exposure`` to ``False`` as below:

@ShakaibKhan ShakaibKhan force-pushed the warn_public_deployment branch from d70a162 to 8dfca34 Compare December 11, 2021 21:05
@ShakaibKhan ShakaibKhan requested a review from potiuk as a code owner December 11, 2021 21:05
@potiuk
Copy link
Member

potiuk commented Dec 18, 2021

Needs conflict resolving :(

@ShakaibKhan ShakaibKhan force-pushed the warn_public_deployment branch from ffc8ade to a3225b1 Compare December 18, 2021 17:53
@uranusjr uranusjr changed the title Warning of public exposure of deployment in UI with on/off config Add config to warn public deployment exposure in UI Dec 24, 2021
@uranusjr uranusjr merged commit 24a53fb into apache:main Dec 24, 2021
@ephraimbuddy ephraimbuddy added the type:new-feature Changelog: New Features label Apr 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:webserver Webserver related Issues full tests needed We need to run full set of tests for this PR to merge type:new-feature Changelog: New Features

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Warn if robots.txt is accessed

8 participants