well formed unicode string (#147)#151
Conversation
mcollina
left a comment
There was a problem hiding this comment.
How much does this impact our benchmarks?
Overall this sounds correct, even if we don’t have the fixed V8 yet.
|
In $asStringSmall there is one for that loops over input string. I have put surrogate detector condition in this loop. So I think it will has no impact on benchmarks. |
|
Just run https://github.com/fastify/fast-json-stringify/blob/master/bench.js before and after your change. |
|
befor: |
|
after: |
|
You were likely running something else on your machines, as in the “after” all JSON.stringify benchmarks went down ad well. Can you rerun? |
| // magically escape strings for json | ||
| // relying on their charCodeAt | ||
| // everything below 32 needs JSON.stringify() | ||
| // every string that contain surrogate needs JSON.stringify() |
There was a problem hiding this comment.
This is not accurate. Every lone surrogate needs escaping. Surrogates that appear in valid surrogate pairs must not be escaped.
There was a problem hiding this comment.
I know @mathiasbynens
I read your Implementation In c. But I think implementing it in JavaScript would be heavy.
The approach of @mcollina is an heuristic approach not deterministic.
Using of surrogate in keys of object in js is not common.
So I decided to use standard stringify when we encounter such character
There was a problem hiding this comment.
I think we should clarify that comment and write down why we use that heuristics: doing the full algorithm here would be too expensive. However the penalty to try this for lower length strings and falling back to JSON.stringify is ok.
There was a problem hiding this comment.
FWIW, I think this approach makes total sense. Lone surrogates are an edge case after all.
|
Hi @mcollina |
test/surrogate.test.js
Outdated
| const validate = validator(schema) | ||
| const stringify = build(schema) | ||
| const output = stringify('\uDF06\uD834') | ||
| // t.equal(output, '"\\udf06\\ud834"') |
There was a problem hiding this comment.
It would be great to use a mirror test for this.
t.equal(output, JSON.stringify('\uDF06\uD834'))There was a problem hiding this comment.
What do you mean by mirror test?(Excuse me if this question is simple :)) )
There was a problem hiding this comment.
What I put above. Basically you have a target system A. You want to verify that B calls A and returns that output. You call B() and then A() and you compare results. In this way any change of output in A() will be automatically reflected in B() output.
There was a problem hiding this comment.
Thanks. I got it.
|
It seems we are losing a bit: ~/repositories/fast-json-stringify (master) |
I changed $asStringSmall function so It can detect surrogate character.
But in surrogate.test.js if you uncomment line 50 and 65 , test will fail.
I am using node 11.12.0.
Is JSON.Stringify() is up to date or what @mathiasbynens has said in issue 147 is in test phase?