cleanup msgpack str/bytes type mess #968

@ThomasWaldmann

old msgpack

  • pack with use_bin_type=False: str -> raw, bytes -> raw
  • unpack with raw=True: raw -> bytes (so we never get str back automatically)
  • used by borg 0.x .. 1.2 (and attic)

new msgpack (2.0 spec)

  • pack with use_bin_type=True: str -> text, bytes -> bin
  • unpack with raw=False: text -> str, bin -> bytes
  • text is identical to raw (same encoding on the wire)!
  • encoding_errors defaults to strict for both pack and unpack, but I guess it could also be surrogateescape.
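For reference, the difference between the strict and surrogateescape error handlers can be seen with plain Python str/bytes conversion, independent of msgpack. This is only a sketch; the sample byte string is made up for illustration.

```python
# A byte string that is not valid UTF-8 (latin-1 encoded "cafe" with an
# accented e); made up for illustration.
raw_name = b'caf\xe9'

# strict: decoding fails on the invalid byte.
try:
    raw_name.decode('utf-8', errors='strict')
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape: the invalid byte is mapped to a lone surrogate (U+DCE9) ...
text = raw_name.decode('utf-8', errors='surrogateescape')

# ... and encoding with the same handler restores the original bytes.
roundtrip = text.encode('utf-8', errors='surrogateescape')
```

With surrogateescape, arbitrary bytes survive a bytes -> str -> bytes roundtrip, which is why it might be an option for data such as file names that are not guaranteed to be valid UTF-8.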

msgpack and borg

using the old msgpack format is the cause of the ugly bytes-type keys in the item dict (item[b'source']), see also #926.

by using an Item class it got a bit prettier a while ago, see #981.

borg 1.2 added the borg.helpers.msgpack compatibility wrapper.

Breaking:

  • obviously, serializer output can not be the same as before (repo, manifest, archives, items, key files, RPC stream)
  • old code can't deal with new output
  • the PR "WIP: implement msgpack wrapper so that bytes and str roundtrip" (#973, not merged) could not deal with old repos
  • if the same new method is implemented for key files and the RPC stream, the new code might not be able to deal with old key files and an old remote borg.

are we lucky?

new packb serializes 'foo' in the same way as old packb serialized b'foo', so we can just drop the b everywhere.

>>> packb('string', use_bin_type=False) # old
b'\xa6string'

>>> packb(b'bytes!', use_bin_type=False) # old
b'\xa6bytes!'

>>> packb('string', use_bin_type=True) # new
b'\xa6string'

>>> packb(b'bytes!', use_bin_type=True) # new
b'\xc4\x06bytes!'
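The byte values above follow from the msgpack format: a short text (and old raw) value starts with a "fixstr" tag byte 0xa0 | length, while a short bin value starts with 0xc4 followed by a one-byte length. A minimal pure-Python sketch of just these two cases (the function names are made up here, and longer payloads are deliberately not handled):

```python
def pack_text(s: str) -> bytes:
    """Encode a short str as msgpack text (fixstr), the same bytes as old raw."""
    data = s.encode('utf-8')
    if len(data) > 31:
        raise ValueError('sketch only handles fixstr (up to 31 bytes)')
    # tag byte 101xxxxx, the low 5 bits hold the length
    return bytes([0xa0 | len(data)]) + data


def pack_bin(b: bytes) -> bytes:
    """Encode short bytes as msgpack bin 8, which only exists in the new spec."""
    if len(b) > 255:
        raise ValueError('sketch only handles bin 8 (up to 255 bytes)')
    # tag byte 0xc4, then a one-byte length
    return b'\xc4' + bytes([len(b)]) + b


text_packed = pack_text('string')   # b'\xa6string'
bin_packed = pack_bin(b'bytes!')    # b'\xc4\x06bytes!'
```

Since 0xa6 is 0xa0 | 6, the old output for a 6-byte str or bytes value and the new output for a 6-character ASCII str are the same bytes.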

we just need to be careful:

  • new packb with bytes: generates different output.
  • new unpackb used with old data: would try to decode to str, which isn't wanted for binary data.

>>> m = packb(b'bytes!', use_bin_type=False)  # m == old bytes data (same for str)

>>> unpackb(m, raw=False) # new, "usual way" -> decodes to str
'bytes!'

>>> unpackb(m, raw=True) # new, "unusual way" -> does not decode, keeps bytes
b'bytes!'
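Why raw=False is a problem for old binary data can be sketched with a tiny pure-Python decoder for the two tag types involved. This is only an illustration (the name unpack_one is made up, and only fixstr and bin 8 are handled); it is not how msgpack is implemented.

```python
def unpack_one(data: bytes, raw: bool):
    """Decode a single fixstr/raw or bin 8 value from msgpack bytes."""
    tag = data[0]
    if 0xa0 <= tag <= 0xbf:
        # fixstr (new spec) or fixraw (old spec): low 5 bits are the length
        n = tag & 0x1f
        payload = data[1:1 + n]
        # raw=True keeps bytes, raw=False decodes to str (strict UTF-8)
        return payload if raw else payload.decode('utf-8')
    if tag == 0xc4:
        # bin 8: always bytes, independent of raw
        n = data[1]
        return data[2:2 + n]
    raise ValueError('sketch only handles fixstr and bin 8')


old_binary = b'\xa2\xff\xfe'  # old-style raw holding non-UTF-8 binary data

kept = unpack_one(old_binary, raw=True)  # stays bytes
try:
    unpack_one(old_binary, raw=False)
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
```

Old data carries no hint whether a raw value was meant as text or as binary, so only the caller can decide whether decoding is appropriate.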

borg 1.3

  • pack with use_bin_type=True to generate new data in the future format
  • unpack with raw=True; we know what we want and decode to str manually if desired
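The "decode manually" step could look roughly like the sketch below. It assumes a dict as returned by unpacking with raw=True; the helper name and the set of text keys are hypothetical and not borg's actual code or key list.

```python
# Hypothetical set of keys whose values are text; not borg's real list.
TEXT_KEYS = {'path', 'source', 'user', 'group'}


def decode_item(d: dict) -> dict:
    """Turn bytes keys into str and decode only the values known to be text."""
    out = {}
    for k, v in d.items():
        key = k.decode('utf-8') if isinstance(k, bytes) else k
        if key in TEXT_KEYS and isinstance(v, bytes):
            # surrogateescape is an assumption here, so odd file names roundtrip
            v = v.decode('utf-8', errors='surrogateescape')
        out[key] = v
    return out


item = decode_item({b'path': b'home/user', b'chunks': b'\x00\x01'})
```

Binary values stay bytes, while text values become str, regardless of whether they were packed as old raw or as new text.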

borg 2.0

  • we only have new data, change to raw=False
