Skip to content

Conversation

@gdiet
Copy link
Contributor

@gdiet gdiet commented May 14, 2018

This PR is continuing the idea from PR #303 as follows:

When using kryo/chill for long term storage, it is necessary that classes are serialized in a compatible way when updating to a newer Kryo or Scala version. To ensure this, I have added a spec ensuring that the binary representation of serialized classes is not changed unintendedly due to changes in Kryo or Scala. If changes to the binary representation can't be avoided, at least a compatibility note should be added to the release notes so projects using the library know beforehand whether it's safe to upgrade or not.

The spec I have added is still incomplete and fails for Scala 2.10. This PR is to discuss the spec and its findings:

  1. Does this spec in general make sense to you? Otherwise, I would not continue to work on it, would be wasted time.

  2. For Scala 2.10, this spec fails, because (see lines 47ff) the binary representation of Wrappers$IteratorWrapper has changed between Scala 2.10.7 and Scala 2.11.12. The binary representation in line 51 was created with Scala 2.12.4 and can't be decoded with Scala 2.10.7, a forward compatibility problem. (Note that I have not yet been able to find any backward compatibility problem up to now.) For this specific case, I could write the test in a way that this example is skipped when running on Scala < 2.11. Would that be OK from your point of view?

Please comment ...

@johnynek
Copy link
Collaborator

Thanks for the PR. This looks like a fine thing to add to me.

That said, using kryo for long term storage but depending on a remote library to set the order of the tokens seems really dangerous.

I wouldn't use kryo for long term storage full stop....

But I am happy to take a PR so we don't needlessly break people.

@johnynek
Copy link
Collaborator

Thanks! this looks good.

@johnynek johnynek merged commit 84ab358 into twitter:develop Jun 18, 2018
@gdiet
Copy link
Contributor Author

gdiet commented Jun 19, 2018

Here are some findings of this PR - for future reference. Currently, there are already some known issues when using kryo/chill for long term persisting data:

  1. Serializing SimpleDateFormat("") yields about 40.000 bytes of binary representation, and creating another instance and serializing it yields a slightly different binary representation - there might be more issues hidden there...

  2. The serialized form of the following classes differs between Scala 2.12.x and earlier Scala versions:

Compard to Scala 2.10:

scala.collection.convert.Wrappers$IteratorWrapper
Example: JavaConverters.asJavaIteratorConverter(Iterator(2)).asJava

scala.collection.immutable.Range$Inclusive
Example: new Range.Inclusive(3, 6, 1)

scala.collection.immutable.Range
Example: collection.immutable.Range(1, 3)

Compard to Scala 2.11:

scala.collection.immutable.Range$Inclusive
Example: new Range.Inclusive(3, 6, 1)

scala.collection.immutable.Range
Example: collection.immutable.Range(1, 3)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants