Skip to content

rename "ndjson" format to "json" for -f and -i#3555

Merged
nwt merged 4 commits intomainfrom
rename-ndjson-to-json
Feb 3, 2022
Merged

rename "ndjson" format to "json" for -f and -i#3555
nwt merged 4 commits intomainfrom
rename-ndjson-to-json

Conversation

@nwt
Copy link
Member

@nwt nwt commented Feb 2, 2022

The -f and -i format selection flags for Zed tools recognize both "json"
and "ndjson". While the exact behavior for the "json" format is not
easy to describe, "-f json" is roughly equivalent to "-f ndjson"
together with "summarize collect(this) | yield this" while "-i json" is
roughly equivalent to "-i ndjson" together with "over this". These Zed
language constructs weren't available when the "json" was originally
implemented, but now that they are, its specialzed behavior unneeded.

Simplify by renaming the "ndjson" format to "json". More specifically,

  1. Give "json" the current behavior of "ndjson" for the -f, -i, and
    zio/anyio.

  2. Remove "ndjson" as a format recognized by -f, -i, and zio/anyio.

  3. Remove zio/ndjsonio.

  4. Replace zio/jsonio.WriterOpts and its ForceArray option with
    zio/zjsonio.ArrayWriter (for application/json HTTP responses).

The -f and -i format selection flags for Zed tools recognize both "json"
and "ndjson".  While the exact behavior for the "json" format is not
easy to describe, "-f json" is roughly equivalent to "-f ndjson"
together with "summarize collect(this) | yield this" while "-i json" is
roughly equivalent to "-i ndjson" together with "over this".  These Zed
language constructs weren't available when the "json" was originally
implemented, but now that they are, its specialzed behavior unneeded.

Simplify by renaming the "ndjson" format to "json".  More specifically,

1. Give "json" the current behavior of "ndjson" for the -f, -i, and
   zio/anyio.

2. Remove "ndjson" as a format recognized by -f, -i, and zio/anyio.

3. Remove zio/ndjsonio.

4. Replace zio/jsonio.WriterOpts and its ForceArray option with
   zio/zjsonio.ArrayWriter (for application/json HTTP responses).
@nwt nwt requested a review from a team February 2, 2022 22:32
@nwt nwt merged commit 4706509 into main Feb 3, 2022
@nwt nwt deleted the rename-ndjson-to-json branch February 3, 2022 20:40
brim-bot added a commit to brimdata/brimcap that referenced this pull request Feb 3, 2022
…y nwt

This is an auto-generated commit with a Zed dependency update. The Zed PR
brimdata/super#3555, authored by @nwt,
has been merged.

rename "ndjson" format to "json" for -f and -i

The -f and -i format selection flags for Zed tools recognize both "json"
and "ndjson".  While the exact behavior for the "json" format is not
easy to describe, "-f json" is roughly equivalent to "-f ndjson"
together with "summarize collect(this) | yield this" while "-i json" is
roughly equivalent to "-i ndjson" together with "over this".  These Zed
language constructs weren't available when the "json" was originally
implemented, but now that they are, its specialzed behavior unneeded.

Simplify by renaming the "ndjson" format to "json".  More specifically,

1. Give "json" the current behavior of "ndjson" for the -f, -i, and
   zio/anyio.

2. Remove "ndjson" as a format recognized by -f, -i, and zio/anyio.

3. Remove zio/ndjsonio.

4. Replace zio/jsonio.WriterOpts and its ForceArray option with
   zio/zjsonio.ArrayWriter (for application/json HTTP responses).
brim-bot added a commit to brimdata/brimcap that referenced this pull request Feb 3, 2022
This is an auto-generated commit with a Zed dependency update. The Zed PR
brimdata/super#3558, authored by @nwt,
has been merged.

speed up zio/jsonio.Reader with a builder

zio/jsonio.Reader is slow for a few reasons.

1. To normalize record field order, it unmarshals JSON to Go values and
   then marshals back to JSON (via encoding/json).

2. To unmarshal JSON to Zed values, it uses a zio/zsonio.Reader, which
   is not optimized.

3. Each call to its Read method calls zsonio.NewReader, which does a
   number of allocations.

Speed it up by replacing all of that with an optimized builder that
minimizes the number of allocations (though encoding/json.Decoder.Token
still does too many).

Note that unmarshaling via encoding/json can discard precision for
numbers because it unmarshals them into a float64.  Removing this step
accounts for the output change in zio/jsonio/ztests/reader.yaml.

Reading NDJSON with `-f json` is about three times faster for me with this branch compared to main.

Depends on brimdata/super#3555.
brim-bot added a commit to brimdata/brimcap that referenced this pull request Feb 3, 2022
This is an auto-generated commit with a Zed dependency update. The Zed PR
brimdata/super#3558, authored by @nwt,
has been merged.

speed up zio/jsonio.Reader with a builder

zio/jsonio.Reader is slow for a few reasons.

1. To normalize record field order, it unmarshals JSON to Go values and
   then marshals back to JSON (via encoding/json).

2. To unmarshal JSON to Zed values, it uses a zio/zsonio.Reader, which
   is not optimized.

3. Each call to its Read method calls zsonio.NewReader, which does a
   number of allocations.

Speed it up by replacing all of that with an optimized builder that
minimizes the number of allocations (though encoding/json.Decoder.Token
still does too many).

Note that unmarshaling via encoding/json can discard precision for
numbers because it unmarshals them into a float64.  Removing this step
accounts for the output change in zio/jsonio/ztests/reader.yaml.

Reading NDJSON with `-f json` is about three times faster for me with this branch compared to main.

Depends on brimdata/super#3555.
nwt added a commit that referenced this pull request Aug 29, 2022
#3300 moved zio/anyio/ztests/json-array.yaml to zio/anyio/array.yaml,
disabling it.  Move it back to zio/anyio/ztests under the old name and
update it to reflect the change in #3555 to the handling of top-level
arrays in the JSON reader.

#3824 moved compiler/ztests/is-proc-field.yaml and
compiler/ztests/is-proc.yaml to is-field.yaml and is.yaml, disabling
them.  Move them back into compiler/ztests but keep the new names.
nwt added a commit that referenced this pull request Aug 29, 2022
#3300 moved zio/anyio/ztests/json-array.yaml to zio/anyio/array.yaml,
disabling it.  Move it back to zio/anyio/ztests under the old name and
update it to reflect the change in #3555 to the handling of top-level
arrays in the JSON reader.

#3824 moved compiler/ztests/is-proc-field.yaml and
compiler/ztests/is-proc.yaml to is-field.yaml and is.yaml, disabling
them.  Move them back into compiler/ztests but keep the new names.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants