SPARK-1254. Supplemental fix for HTTPS on Maven Central #209
Conversation

Merged build triggered.

Merged build started.

Merged build finished.

All automated tests passed.

Hey @srowen - I'm guessing that the Maven hostname delegates to a mirror/CDN network, and maybe some of the mirrors support HTTPS while others don't. It seems fine to just fall back to HTTP in that case.

Merged
## What changes were proposed in this pull request?

This patch introduces advanced query pushdown to Redshift and is largely based on the work done by the Snowflake people: https://github.com/snowflakedb/spark-snowflake

Supported operators:
- Filter, Project, Sort, Limit

Supported expressions (see PR apache#221 for more info):
- most boolean logic operators
- comparisons
- basic arithmetic operations
- numeric and string casts
- most string functions
- (uncorrelated) scalar subqueries

Note: No support for `date` and `timestamp` yet.

New feature flag:
- `spark.databricks.redshift.pushdown` - enabled by default

Future TODOs:
- enable support for more complex expressions and operators (e.g. dates and timestamps, Aggr, Joins) (SC-5768)
- integrate TPC-H testing suite (SC-5717)

## How was this patch tested?

* pre-existing Redshift unit tests and redshift-integration-tests
* adapted a large part of pre-existing integration tests to check both the old and the new code paths
* three new test suites: {`Filter`,`Advanced`,`Randomized`}`PushdownIntegrationSuite`

Author: Adrian Ionescu <[email protected]>
Author: Juliusz Sompolski <[email protected]>

Closes apache#209 from adrian-ionescu/redshift-pushdown.
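The core idea of query pushdown - compiling supported operators into a single SQL query that the database executes, instead of pulling whole tables into Spark - can be sketched outside the connector. The `Plan` ADT and `toSql` function below are illustrative stand-ins, not the connector's actual classes:

```scala
// Illustrative mini logical plan covering the supported operators.
// These are NOT the connector's real classes, just a sketch of the idea.
sealed trait Plan
case class Relation(table: String, columns: Seq[String]) extends Plan
case class Project(columns: Seq[String], child: Plan) extends Plan
case class Filter(condition: String, child: Plan) extends Plan
case class Sort(column: String, child: Plan) extends Plan
case class Limit(n: Int, child: Plan) extends Plan

// Compile the plan into one SQL string that the database can run.
// Each operator wraps its child as an aliased subquery.
def toSql(plan: Plan): String = plan match {
  case Relation(t, cols)    => s"SELECT ${cols.mkString(", ")} FROM $t"
  case Project(cols, child) => s"SELECT ${cols.mkString(", ")} FROM (${toSql(child)}) sq"
  case Filter(cond, child)  => s"SELECT * FROM (${toSql(child)}) sq WHERE $cond"
  case Sort(col, child)     => s"SELECT * FROM (${toSql(child)}) sq ORDER BY $col"
  case Limit(n, child)      => s"SELECT * FROM (${toSql(child)}) sq LIMIT $n"
}

val plan = Limit(10,
  Sort("id",
    Filter("price > 100",
      Relation("sales", Seq("id", "price")))))
// toSql(plan) nests three subqueries and ends with "LIMIT 10".
```

A real implementation additionally has to verify that every expression in the plan is translatable (hence the whitelist of supported expressions above) and bail out to the non-pushdown path otherwise.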
…constraint
### What changes were proposed in this pull request?
This PR adds support for inferring constraints from cast equality constraints. For example:
```scala
scala> spark.sql("create table spark_29231_1(c1 bigint, c2 bigint)")
res0: org.apache.spark.sql.DataFrame = []
scala> spark.sql("create table spark_29231_2(c1 int, c2 bigint)")
res1: org.apache.spark.sql.DataFrame = []
scala> spark.sql("select t1.* from spark_29231_1 t1 join spark_29231_2 t2 on (t1.c1 = t2.c1 and t1.c1 = 1)").explain
== Physical Plan ==
*(2) Project [c1#5L, c2#6L]
+- *(2) BroadcastHashJoin [c1#5L], [cast(c1#7 as bigint)], Inner, BuildRight
:- *(2) Project [c1#5L, c2#6L]
: +- *(2) Filter (isnotnull(c1#5L) AND (c1#5L = 1))
: +- *(2) ColumnarToRow
: +- FileScan parquet default.spark_29231_1[c1#5L,c2#6L] Batched: true, DataFilters: [isnotnull(c1#5L), (c1#5L = 1)], Format: Parquet, Location: InMemoryFileIndex[file:/root/spark-3.0.0-preview2-bin-hadoop2.7/spark-warehouse/spark_29231_1], PartitionFilters: [], PushedFilters: [IsNotNull(c1), EqualTo(c1,1)], ReadSchema: struct<c1:bigint,c2:bigint>
+- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint))), [id=#209]
+- *(1) Project [c1#7]
+- *(1) Filter isnotnull(c1#7)
+- *(1) ColumnarToRow
+- FileScan parquet default.spark_29231_2[c1#7] Batched: true, DataFilters: [isnotnull(c1#7)], Format: Parquet, Location: InMemoryFileIndex[file:/root/spark-3.0.0-preview2-bin-hadoop2.7/spark-warehouse/spark_29231_2], PartitionFilters: [], PushedFilters: [IsNotNull(c1)], ReadSchema: struct<c1:int>
```
After this PR:
```scala
scala> spark.sql("select t1.* from spark_29231_1 t1 join spark_29231_2 t2 on (t1.c1 = t2.c1 and t1.c1 = 1)").explain
== Physical Plan ==
*(2) Project [c1#0L, c2#1L]
+- *(2) BroadcastHashJoin [c1#0L], [cast(c1#2 as bigint)], Inner, BuildRight
:- *(2) Project [c1#0L, c2#1L]
: +- *(2) Filter (isnotnull(c1#0L) AND (c1#0L = 1))
: +- *(2) ColumnarToRow
: +- FileScan parquet default.spark_29231_1[c1#0L,c2#1L] Batched: true, DataFilters: [isnotnull(c1#0L), (c1#0L = 1)], Format: Parquet, Location: InMemoryFileIndex[file:/root/opensource/spark/spark-warehouse/spark_29231_1], PartitionFilters: [], PushedFilters: [IsNotNull(c1), EqualTo(c1,1)], ReadSchema: struct<c1:bigint,c2:bigint>
+- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint))), [id=#99]
+- *(1) Project [c1#2]
+- *(1) Filter ((cast(c1#2 as bigint) = 1) AND isnotnull(c1#2))
+- *(1) ColumnarToRow
+- FileScan parquet default.spark_29231_2[c1#2] Batched: true, DataFilters: [(cast(c1#2 as bigint) = 1), isnotnull(c1#2)], Format: Parquet, Location: InMemoryFileIndex[file:/root/opensource/spark/spark-warehouse/spark_29231_2], PartitionFilters: [], PushedFilters: [IsNotNull(c1)], ReadSchema: struct<c1:int>
```
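The inference step behind the new filter on the build side can be sketched in isolation: from the join condition `t1.c1 = cast(t2.c1 as bigint)` and the filter `t1.c1 = 1`, we can derive `cast(t2.c1 as bigint) = 1` and push it to the other side of the join. The `Expr` ADT and `inferCastConstraints` helper below are illustrative only, not Spark's Catalyst classes:

```scala
// Minimal expression ADT, illustrative only (not Catalyst's Expression tree).
sealed trait Expr
case class Col(name: String) extends Expr
case class Lit(value: Long) extends Expr
case class Cast(child: Expr) extends Expr              // e.g. cast(c1 as bigint)
case class EqualTo(left: Expr, right: Expr) extends Expr

// From (a = cast(b)) and (a = literal), derive (cast(b) = literal).
def inferCastConstraints(constraints: Set[Expr]): Set[Expr] = {
  val castEqs = constraints.collect { case EqualTo(a: Col, c: Cast) => (a, c) }
  val litEqs  = constraints.collect { case EqualTo(a: Col, l: Lit)  => (a, l) }
  val derived = for {
    (a, c) <- castEqs
    (b, l) <- litEqs
    if a == b
  } yield (EqualTo(c, l): Expr)
  constraints ++ derived
}

val constraints: Set[Expr] = Set(
  EqualTo(Col("t1.c1"), Cast(Col("t2.c1"))),  // join key: t1.c1 = cast(t2.c1 as bigint)
  EqualTo(Col("t1.c1"), Lit(1))               // filter:   t1.c1 = 1
)
val inferred = inferCastConstraints(constraints)
// inferred additionally contains EqualTo(Cast(Col("t2.c1")), Lit(1)),
// which corresponds to the new (cast(c1#2 as bigint) = 1) filter in the plan.
```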
### Why are the changes needed?
Improve query performance.
### Does this PR introduce any user-facing change?
No.
### How was this patch tested?
Unit test.
Closes #27252 from wangyum/SPARK-29231.
Authored-by: Yuming Wang <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
It seems that HTTPS does not necessarily work on Maven Central - at least it does not today. Back to HTTP. Both builds work from a clean repo.
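The fallback idea discussed above can be sketched as follows. This is a hedged illustration, not Spark's actual build code: the `reachable` probe and the candidate URLs are assumptions for the sake of the example.

```scala
import java.net.{HttpURLConnection, URL}
import scala.util.Try

// Probe a URL with a HEAD request; any exception (no HTTPS support,
// handshake failure, timeout) counts as unreachable.
def reachable(url: String): Boolean = Try {
  val conn = new URL(url).openConnection().asInstanceOf[HttpURLConnection]
  conn.setRequestMethod("HEAD")
  conn.setConnectTimeout(5000)
  try conn.getResponseCode < 400 finally conn.disconnect()
}.getOrElse(false)

// Prefer the first candidate that responds; fall back to the last one.
def firstWorking(candidates: Seq[String], ok: String => Boolean): String =
  candidates.find(ok).getOrElse(candidates.last)

// Try HTTPS first, fall back to plain HTTP when the mirror behind
// Maven Central does not serve HTTPS.
val repo = firstWorking(
  Seq("https://repo.maven.apache.org/maven2/",
      "http://repo.maven.apache.org/maven2/"),
  reachable)
```

Injecting the predicate keeps the selection logic testable without touching the network.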