@markhamstra

No description provided.

yinxusen and others added 18 commits January 27, 2016 00:32
…vaList

Backport of SPARK-12834 for branch-1.6

Original PR: apache#10772

Original commit message:
We use `SerDe.dumps()` to serialize `JavaArray` and `JavaList` in `PythonMLLibAPI`, then deserialize them with `PickleSerializer` on the Python side. However, there is no need to transform them in such an inefficient way. Instead, we can convert them with a plain type conversion, e.g. `list(JavaArray)` or `list(JavaList)`. What's more, there is an issue with serializing and deserializing Scala Arrays, as I described in https://issues.apache.org/jira/browse/SPARK-12780

Author: Xusen Yin <[email protected]>

Closes apache#10941 from jkbradley/yinxusen-SPARK-12834-1.6.
…ith `None` triggers cryptic failure

The error message is now changed from "Do not support type class scala.Tuple2." to "Do not support type class org.json4s.JsonAST$JNull$" to be more informative about what is not supported. Also, StructType metadata now handles JNull correctly, i.e., {'a': None}. test_metadata_null is added to tests.py to show the fix works.

Author: Jason Lee <[email protected]>

Closes apache#8969 from jasoncl/SPARK-10847.

(cherry picked from commit edd4737)
Signed-off-by: Yin Huai <[email protected]>
…#10559 to branch-1.6

SPARK-13082 was actually fixed by apache#10559. However, that is a big PR and it was not backported to 1.6. This PR backports only the fix for `read.json(rdd)` to branch-1.6.

Author: Shixiong Zhu <[email protected]>

Closes apache#10988 from zsxwing/json-rdd.
Apparently Chrome removed `SVGElement.prototype.getTransformToElement`, which our JS library dagre-d3 uses when creating edges. The real diff can be found here: andrewor14/dagre-d3@7d6c000, which is taken from the fix in the main repo: cpettitt/dagre-d3@1ef067f

Upstream issue: https://github.com/cpettitt/dagre-d3/issues/202

Author: Andrew Or <[email protected]>

Closes apache#10986 from andrewor14/fix-dag-viz.

(cherry picked from commit 70e69fc)
Signed-off-by: Andrew Or <[email protected]>
…uildPartitionedTableScan

Hello Michael & All:

We had some issues submitting the new code in the other PR (apache#10299), so we closed that PR and opened this one with the fix.

The reason for the previous failure is that the projection for the scan when there is a filter that is not pushed down (the "left-over" filter) could be different, in elements or ordering, from the original projection.

With the new code, the approach to solving this problem is:

Insert a new Project if the "left-over" filter is nonempty and (the original projection is not empty and the projection for the scan has more than one element, which could otherwise cause a different ordering in the projection).

We added 3 test cases to cover the cases that would otherwise fail.
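
The condition described above can be sketched as a plain predicate. This is only an illustration of the stated rule, not the code from the patch: `needsExtraProject` and its parameters are hypothetical names, and column-name strings stand in for Catalyst expressions.

```scala
object ProjectCondition {
  // Hypothetical helper: returns true when the rule described in the commit
  // message says an extra Project should be inserted above the scan.
  // Strings stand in for Catalyst expressions (an assumption for brevity).
  def needsExtraProject(
      leftOverFilters: Seq[String],
      originalProjection: Seq[String],
      scanProjection: Seq[String]): Boolean =
    leftOverFilters.nonEmpty &&       // some filter was not pushed down
      originalProjection.nonEmpty &&  // the query asked for specific columns
      scanProjection.size > 1         // ordering can only differ with 2+ columns

  def main(args: Array[String]): Unit = {
    // A left-over filter on "b" forces the scan to read both "a" and "b".
    println(needsExtraProject(Seq("b > 1"), Seq("a"), Seq("b", "a")))
    // No left-over filter: the scan projection already matches the original.
    println(needsExtraProject(Seq.empty, Seq("a"), Seq("a")))
  }
}
```

The single-column case is excluded because a one-element projection cannot be reordered.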

Author: Kevin Yu <[email protected]>

Closes apache#10388 from kevinyu98/spark-12231.

(cherry picked from commit fd50df4)
Signed-off-by: Cheng Lian <[email protected]>
JIRA: https://issues.apache.org/jira/browse/SPARK-12989

In the rule `ExtractWindowExpressions`, we simply replace each alias with the corresponding attribute. However, this causes an issue that is exposed by the following case:

```scala
val data = Seq(("a", "b", "c", 3), ("c", "b", "a", 3)).toDF("A", "B", "C", "num")
  .withColumn("Data", struct("A", "B", "C"))
  .drop("A")
  .drop("B")
  .drop("C")

val winSpec = Window.partitionBy("Data.A", "Data.B").orderBy($"num".desc)
data.select($"*", max("num").over(winSpec) as "max").explain(true)
```
In this case, both `Data.A` and `Data.B` are aliases in `WindowSpecDefinition`. If we replace these alias expressions with their alias names, we can no longer tell what they refer to, since they are not put in `missingExpr` either.

Author: gatorsmile <[email protected]>
Author: xiaoli <[email protected]>
Author: Xiao Li <[email protected]>

Closes apache#10963 from gatorsmile/seletStarAfterColDrop.

(cherry picked from commit 33c8a49)
Signed-off-by: Michael Armbrust <[email protected]>
ISTM `lib` is better because `datanucleus` jars are located in `lib` for release builds.

Author: Takeshi YAMAMURO <[email protected]>

Closes apache#10901 from maropu/DocFix.

(cherry picked from commit da9146c)
Signed-off-by: Michael Armbrust <[email protected]>
Same change as apache#10635, with the target changed to branch-1.6.

Author: Takeshi YAMAMURO <[email protected]>

Closes apache#10915 from maropu/pr9935-v3.
It is not valid to call `toAttribute` on a `NamedExpression` unless we know for sure that the child produced that `NamedExpression`.  The current code worked fine when the grouping expressions were simple, but when they were a derived value this blew up at execution time.

Author: Michael Armbrust <[email protected]>

Closes apache#11011 from marmbrus/groupByFunction.
Author: Michael Armbrust <[email protected]>

Closes apache#11014 from marmbrus/seqEncoders.

(cherry picked from commit 29d9218)
Signed-off-by: Michael Armbrust <[email protected]>
…ML python models' properties

Backport of [SPARK-12780] for branch-1.6

Original PR for master: apache#10724

This fixes StringIndexerModel.labels in pyspark.

Author: Xusen Yin <[email protected]>

Closes apache#10950 from jkbradley/yinxusen-spark-12780-backport.
I've tried to solve some of the issues mentioned in: https://issues.apache.org/jira/browse/SPARK-12629
Please let me know what you think.
Thanks!

Author: Narine Kokhlikyan <[email protected]>

Closes apache#10580 from NarineK/sparkrSavaAsRable.

(cherry picked from commit 8a88e12)
Signed-off-by: Shivaram Venkataraman <[email protected]>
The Java `mapWithState` that takes a `Function3` has the wrong conversion between Java `Optional` and Scala `Option`. The fixed code uses the same conversion as the `mapWithState` call that takes a `Function4` as input. `Optional.fromNullable(v.get)` fails if `v` is `None`; it is better to use `JavaUtils.optionToOptional(v)` instead.
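
A minimal sketch of why the `v.get` pattern fails, under stated assumptions: it uses `java.util.Optional` from the JDK in place of the Guava `Optional` that Spark 1.6 uses, and `toOptional` is a hypothetical stand-in for `JavaUtils.optionToOptional`, not its actual implementation.

```scala
import java.util.Optional
import scala.util.Try

object OptionConversion {
  // Unsafe: `v.get` throws NoSuchElementException when `v` is None,
  // before the Optional factory method is ever reached.
  def unsafeToOptional[T](v: Option[T]): Optional[T] =
    Optional.ofNullable(v.get)

  // Safe (hypothetical stand-in for JavaUtils.optionToOptional):
  // None maps to an empty Optional instead of throwing.
  def toOptional[T](v: Option[T]): Optional[T] = v match {
    case Some(value) => Optional.ofNullable(value)
    case None        => Optional.empty[T]()
  }

  def main(args: Array[String]): Unit = {
    println(toOptional(Some("state")).isPresent)                  // true
    println(toOptional(None: Option[String]).isPresent)           // false
    println(Try(unsafeToOptional(None: Option[String])).isFailure) // true
  }
}
```

Matching on the `Option` keeps the empty case explicit, which is the situation a state-mapping function hits for a key that has no new value.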

Author: Gabriele Nizzoli <[email protected]>

Closes apache#11007 from gabrielenizzoli/branch-1.6.
…lumn name duplication

Fixes the problem and verifies the fix with the test suite.
Also adds an optional parameter, `nullable` (Boolean), to `SchemaUtils.appendColumn`
and deduplicates the `SchemaUtils.appendColumn` functions.

Author: Grzegorz Chilkiewicz <[email protected]>

Closes apache#10741 from grzegorz-chilkiewicz/master.

(cherry picked from commit b1835d7)
Signed-off-by: Joseph K. Bradley <[email protected]>
Jira:
https://issues.apache.org/jira/browse/SPARK-13056

Create a map like
{ "a": "somestring", "b": null}
and run a query like
SELECT col["b"] FROM t1;
An NPE is thrown.

Author: Daoyuan Wang <[email protected]>

Closes apache#10964 from adrian-wang/npewriter.

(cherry picked from commit 358300c)
Signed-off-by: Michael Armbrust <[email protected]>

Conflicts:
	sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
…es missing in the child plan"

This reverts commit 2e99064.
markhamstra added a commit that referenced this pull request Feb 2, 2016
Revert SPARK-13087 workaround and merge upstream fixes
@markhamstra markhamstra merged commit 6a526bf into alteryx:csd-1.6 Feb 2, 2016