Commit 6fa3fe8

URL: upstream some WebKit IDNA tests
I wanted to add these because I don't think U+1E9E (ẞ), for instance, is currently covered. At least one output is aligned with Unicode 16. It's a tiny bit unclear whether they all are; if not, this will be resolved soonish.
1 parent a9fe2e7 commit 6fa3fe8
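
As a rough illustration of the U+1E9E case from the commit message: the new entries expect `\u1E9E.com` to become `xn--zca.com`, and `zca` is the Punycode encoding of ß (U+00DF). The sketch below uses Python's built-in `punycode` codec, with `str.lower()` as a stand-in for the UTS #46 mapping step. That stand-in is an assumption made purely for illustration, and the helper names `naive_label_to_ascii` and `naive_domain_to_ascii` are hypothetical. It is not a conforming ToASCII implementation.

```python
def naive_label_to_ascii(label: str) -> str:
    """Lowercase a label and Punycode-encode it if it is not pure ASCII.

    Assumption: str.lower() approximates the UTS #46 mapping for the
    inputs shown here. No normalization, validity checks, or length
    limits are applied.
    """
    mapped = label.lower()
    if mapped.isascii():
        return mapped
    return "xn--" + mapped.encode("punycode").decode("ascii")


def naive_domain_to_ascii(domain: str) -> str:
    """Apply the naive per-label conversion to a dot-separated domain."""
    return ".".join(naive_label_to_ascii(label) for label in domain.split("."))


print(naive_domain_to_ascii("\u1E9E.com"))      # xn--zca.com
print(naive_domain_to_ascii("fa\u00DF.de"))     # xn--fa-hia.de
print(naive_domain_to_ascii("B\u00FCcher.de"))  # xn--bcher-kva.de
```

These three results match the expected outputs in the diff, but most other entries (ignored code points, disallowed code points, `null` outputs) need a real UTS #46 implementation.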

1 file changed: +172 -1 lines changed

url/resources/toascii.json

@@ -1,5 +1,6 @@
 [
-  "This resource is focused on highlighting issues with UTS #46 ToASCII",
+  "This contains assorted IDNA tests that IdnaTestV2 might not cover.",
+  "Feel free to deduplicate with a clear commit message.",
   {
     "comment": "Label with hyphens in 3rd and 4th position",
     "input": "aa--",
@@ -198,5 +199,175 @@
   {
     "input": ">\u00AD\u0338",
     "output": "xn--hdh"
+  },
+  "Tests below are from WebKit (fast/url/idna2003.html & fast/url/idna2008.html; contributed by Chris Weber back in 2011).",
+  {
+    "input": "fa\u00DF.de",
+    "output": "xn--fa-hia.de"
+  },
+  {
+    "input": "\u03B2\u03CC\u03BB\u03BF\u03C2.com",
+    "output": "xn--nxasmm1c.com"
+  },
+  {
+    "input": "\u0DC1\u0DCA\u200D\u0DBB\u0DD3.com",
+    "output": "xn--10cl1a0b660p.com"
+  },
+  {
+    "input": "\u0646\u0627\u0645\u0647\u200C\u0627\u06CC.com",
+    "output": "xn--mgba3gch31f060k.com"
+  },
+  {
+    "input": "www.loo\u0138out.net",
+    "output": "www.xn--looout-5bb.net"
+  },
+  {
+    "input": "\u15EF\u15EF\u15EF.lookout.net",
+    "output": "xn--1qeaa.lookout.net"
+  },
+  {
+    "input": "www.lookout.\u0441\u043E\u043C",
+    "output": "www.lookout.xn--l1adi"
+  },
+  {
+    "input": "www\u2025lookout.net",
+    "output": null
+  },
+  {
+    "input": "www.lookout\u2027net",
+    "output": "www.xn--lookoutnet-406e"
+  },
+  {
+    "input": "www.lookout.net\u2A7480",
+    "output": null
+  },
+  {
+    "input": "www\u00A0.lookout.net",
+    "output": null
+  },
+  {
+    "input": "\u1680lookout.net",
+    "output": null
+  },
+  {
+    "input": "\u001flookout.net",
+    "output": null
+  },
+  {
+    "input": "look\u06DDout.net",
+    "output": null
+  },
+  {
+    "input": "look\u180Eout.net",
+    "output": null
+  },
+  {
+    "input": "look\u2060out.net",
+    "output": "lookout.net"
+  },
+  {
+    "input": "look\uFEFFout.net",
+    "output": "lookout.net"
+  },
+  {
+    "input": "look\uD83F\uDFFEout.net",
+    "output": null
+  },
+  {
+    "input": "look\uFFFAout.net",
+    "output": null
+  },
+  {
+    "input": "look\u2FF0out.net",
+    "output": null
+  },
+  {
+    "input": "look\u0341out.net",
+    "output": "xn--looout-kp7b.net"
+  },
+  {
+    "input": "look\u202Eout.net",
+    "output": null
+  },
+  {
+    "input": "look\u206Bout.net",
+    "output": null
+  },
+  {
+    "input": "look\uDB40\uDC01out.net",
+    "output": null
+  },
+  {
+    "input": "look\uDB40\uDC20out.net",
+    "output": null
+  },
+  {
+    "input": "look\u05BEout.net",
+    "output": null
+  },
+  {
+    "input": "B\u00FCcher.de",
+    "output": "xn--bcher-kva.de"
+  },
+  {
+    "input": "\u2665.net",
+    "output": "xn--g6h.net"
+  },
+  {
+    "input": "\u0378.net",
+    "output": null
+  },
+  {
+    "input": "\u04C0.com",
+    "output": null
+  },
+  {
+    "comment": "This is U+2F868 (which is mapped to U+36FC starting with Unicode 16.0)",
+    "input": "\uD87E\uDC68.com",
+    "output": "xn--snl.com"
+  },
+  {
+    "input": "\u2183.com",
+    "output": null
+  },
+  {
+    "input": "look\u034Fout.net",
+    "output": "lookout.net"
+  },
+  {
+    "input": "gOoGle.com",
+    "output": "google.com"
+  },
+  {
+    "input": "\u09dc.com",
+    "output": "xn--15b8c.com"
+  },
+  {
+    "input": "\u1E9E.com",
+    "output": "xn--zca.com"
+  },
+  {
+    "input": "\u1E9E.foo.com",
+    "output": "xn--zca.foo.com"
+  },
+  {
+    "input": "-foo.bar.com",
+    "output": "-foo.bar.com"
+  },
+  {
+    "input": "foo-.bar.com",
+    "output": "foo-.bar.com"
+  },
+  {
+    "input": "ab--cd.com",
+    "output": "ab--cd.com"
+  },
+  {
+    "input": "xn--0.com",
+    "output": null
+  },
+  {
+    "input": "foo\u0300.bar.com",
+    "output": "xn--fo-3ja.bar.com"
   }
 ]
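
One way to consume a file shaped like this is sketched below. It assumes, based only on the entries visible in the diff, that bare strings in the top-level array are comments, that every object has an `input` and an `output`, and that `"output": null` means the conversion is expected to fail. `iter_cases`, `run_cases`, and `toy_to_ascii` are hypothetical names. This is not the actual web-platform-tests harness, and the toy converter exists only to show how a mismatch is reported.

```python
import json
from typing import Callable, Iterator, List, Optional, Tuple

# A tiny sample in the same shape as the file in the diff.
SAMPLE = r'''
[
  "Bare strings in the array are treated as comments.",
  {"input": "gOoGle.com", "output": "google.com"},
  {"input": "www\u00A0.lookout.net", "output": null},
  {"input": "xn--0.com", "output": null}
]
'''


def iter_cases(entries: list) -> Iterator[Tuple[str, Optional[str]]]:
    """Yield (input, expected) pairs, skipping string comment entries."""
    for entry in entries:
        if isinstance(entry, str):
            continue  # assumption: plain strings are comments
        yield entry["input"], entry["output"]


def run_cases(
    entries: list, to_ascii: Callable[[str], Optional[str]]
) -> List[Tuple[str, Optional[str], Optional[str]]]:
    """Return (input, expected, actual) for every mismatching case."""
    failures = []
    for source, expected in iter_cases(entries):
        actual = to_ascii(source)
        if actual != expected:
            failures.append((source, expected, actual))
    return failures


def toy_to_ascii(domain: str) -> Optional[str]:
    """Toy converter: lowercases ASCII input, fails on anything else."""
    return domain.lower() if domain.isascii() else None


entries = json.loads(SAMPLE)
for source, expected, actual in run_cases(entries, toy_to_ascii):
    print(f"{source!r}: expected {expected!r}, got {actual!r}")
```

With the toy converter, only `xn--0.com` is reported. The file expects failure (`null`) for that input, but the toy passes it through unchanged because it never validates the `xn--` label.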
