Replies: 4 comments
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise a PR to address this issue, please do so; no need to wait for approval.
See CVE-2022-40954. Please use GitHub Discussions for questions next time; the issue tracker is for bug reports.
You can also make a PR to allow spark3-submit as well, and we will include it in the next release of the provider.
Thank you!
Apache Airflow Provider(s)
apache-spark
Versions of Apache Airflow Providers
4.0.0
Apache Airflow version
2.5.1
Operating System
Debian GNU/Linux 10 (buster)
Deployment
Other
Deployment details
No response
What happened
In airflow-providers-apache-spark 4.0.0, the value of spark_binary
was hardcoded to be restricted to only 'spark-submit' or 'spark2-submit'.
What was the reason for this? At the Wikimedia Foundation, we install the
Spark 3 binary as 'spark3-submit'. This change in provider 4.0.0 has broken
some of our DAGs, forcing us to resort to workarounds.
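For readers unfamiliar with the change, the failure mode is an allow-list check on the binary name. The sketch below is illustrative only, not the exact provider source; the function name, list name, and error message are our assumptions about what the 4.0.0 restriction roughly looks like.

```python
# Hedged sketch of the kind of allow-list check introduced in
# provider 4.0.0 (names and message are illustrative, not the
# actual provider source).
ALLOWED_SPARK_BINARIES = ["spark-submit", "spark2-submit"]

def validate_spark_binary(spark_binary: str) -> str:
    """Raise if the binary name falls outside the hardcoded allow-list."""
    if spark_binary not in ALLOWED_SPARK_BINARIES:
        raise RuntimeError(
            f"spark_binary must be one of {ALLOWED_SPARK_BINARIES}, "
            f"got {spark_binary!r}"
        )
    return spark_binary
```

Under a check like this, any site that installs Spark 3 as 'spark3-submit' fails at runtime even though the binary works fine when invoked directly.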
What you think should happen instead
We'd submit a patch to expand the restriction list to include 'spark3-submit', but we aren't sure why this was done in the first place. I understand the reasoning for removing
spark_home, but it seems strange to have a spark_binary parameter and restrict it to these two values. Can we undo this? If not, should we submit a patch to add spark3-submit to the list?
How to reproduce
Set spark_binary to 'spark3-submit'.
Anything else
No response
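For concreteness, the binary name is typically supplied either through the operator's spark_binary argument or through the Spark connection's extra field. A hedged sketch of a connection extra that would trigger the error under 4.0.0 follows; the spark-binary key name is our assumption about how the hook reads the connection, not a confirmed detail.

```json
{"spark-binary": "spark3-submit"}
```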
Are you willing to submit PR?