
> To fix the ambiguity, brackets were introduced

literals were introduced because the order of parsing for an email host is first "Domain" for any non literal, then literal which defaults to IPv4 [127.0.0.1], then a literal prefix was added for IPv6 and any future registered protocol "[IPv6:::]"

the order for parsing for a URI is:

// host = IP-literal / IPv4address / reg-name

// IP-literal = "[" ( IPv6address / IPvFuture ) "]"

ipv6 just happens to use a colon which conflicts with the port delimiter from authority in a URI so it's a literal and not a registered name

// [ userinfo "@" ] host [ ":" port ]
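as a rough illustration of the colon conflict, here is a minimal Python sketch using the standard library's `urllib.parse.urlsplit`; the URLs and the names `parts` and `plain` are made up for the example

```python
from urllib.parse import urlsplit

# Bracketed IPv6 literal: the colons inside "[...]" belong to the host,
# and only the colon after "]" introduces the port.
parts = urlsplit("http://user@[2001:db8::1]:8080/path")
host, port = parts.hostname, parts.port

# A registered name (or a dotted IPv4 address) contains no colon,
# so it needs no brackets before the port delimiter.
plain = urlsplit("http://example.com:8080/path")
```

`hostname` returns the address without the brackets, so the literal syntax only exists at the URI level.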

> why not re-use the dot from IPv4 notation

because you have conflicts from "0.0 -> 0.0.0.0" to "255.16777215 -> 255.255.255.255"

0-9 conflicts with an IPv4 decimal

a-f conflicts with GTLDs

the only reason your blobs don't have a conflict with an IPv4 Historic is because hexadecimal notation starts with 0x
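to make the "0.0" and "255.16777215" cases concrete, here is a small Python sketch of the historic `inet_aton`-style shorthand, written from scratch so it doesn't depend on the platform's C library; `parse_legacy_ipv4` and `_parse_part` are hypothetical names, and the validation is deliberately minimal

```python
def _parse_part(part: str) -> int:
    """Parse one dot-separated part: 0x prefix is hex, leading 0 is octal."""
    if part[:2].lower() == "0x":
        return int(part[2:], 16)
    if not part.isdigit():
        raise ValueError(f"not a number: {part!r}")
    return int(part, 8 if len(part) > 1 and part[0] == "0" else 10)


def parse_legacy_ipv4(text: str) -> str:
    """Expand a 1 to 4 part historic IPv4 form into dotted-quad notation."""
    parts = text.split(".")
    if not 1 <= len(parts) <= 4:
        raise ValueError(f"expected 1 to 4 parts, got {len(parts)}")
    *leading, last = [_parse_part(p) for p in parts]
    # Every part except the last is a single byte; the last part fills
    # all of the remaining bytes.
    remaining = 4 - len(leading)
    if any(not 0 <= v <= 0xFF for v in leading):
        raise ValueError("leading parts must fit in one byte")
    if not 0 <= last < 256 ** remaining:
        raise ValueError("last part is too large")
    number = last
    for i, value in enumerate(leading):
        number |= value << (8 * (3 - i))
    return ".".join(str((number >> s) & 0xFF) for s in (24, 16, 8, 0))
```

with this, `parse_legacy_ipv4("0.0")` gives `"0.0.0.0"` and `parse_legacy_ipv4("255.16777215")` gives `"255.255.255.255"`, which is why a dotted string of digits can't be reused for another address family without a prefix.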

> try double clicking on those

try double clicking on any of these valid characters from "reg-name"

// unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"

// sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="

or these from IPvFuture

// IPvFuture = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )

// unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"

// sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
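for reference, the IPvFuture rule can be approximated with a regular expression; this is a Python sketch, the names `IPVFUTURE` and `is_ipvfuture` are hypothetical, and it checks the text between the brackets only

```python
import re

# Character sets from RFC 3986, written as the inside of a regex class.
UNRESERVED = r"A-Za-z0-9\-._~"
SUB_DELIMS = r"!$&'()*+,;="

# "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
# ABNF string literals are case-insensitive, hence [vV].
IPVFUTURE = re.compile(rf"[vV][0-9A-Fa-f]+\.[{UNRESERVED}{SUB_DELIMS}:]+")


def is_ipvfuture(text: str) -> bool:
    """Return True if text (without the surrounding brackets) fits IPvFuture."""
    return IPVFUTURE.fullmatch(text) is not None
```

for example, `is_ipvfuture("v1.fe80::a+en1")` is true, while a plain `1.2.3.4` is not, because the version prefix is mandatory.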

if you want to develop your own "standard" either use the literal IPvFuture, or use a Registered Name

non literals IPv4, IPv4Historic, and Domain names are valid registered names, but domain names aren't even part of the URI standard

the only reason you would have conflicts with domain names is because they're de facto parsed after an IP, so a double dot would probably be discarded as invalid, which is why punycode exists for unicode

if at that point you didn't have any conflicts it would be a registered name, but you wouldn't have any way to resolve them

lastly, if you want to fix the nonissue of double clicking, use a registered name; if you choose to use underscores you may have conflicts with dns

edit trying to figure out newline parsing



> ipv6 just happens to use a colon which conflicts with the port delimiter from authority in a URI

This is exactly what the article means by

> To fix the ambiguity, brackets were introduced

The addition of brackets disambiguates the grammar.

> the order for parsing for a URI is:

> // host = IP-literal / IPv4address / reg-name

No, that's a part of the grammar; it only means that a host is either an IP-literal, an IPv4address, or a reg-name; it does not imply any sort of ordering to those rules. Normally, the grammar should be unambiguous. Unfortunately here, the rules for IPv4address and reg-name actually are ambiguous; I'll get to that.

> the only reason you would have conflicts with domain names is because they're de facto parsed after an IP

It's not de facto. It's in the same standard (RFC 3986, section 3.2.2):

> The syntax rule for host is ambiguous because it does not completely distinguish between an IPv4address and a reg-name. In order to disambiguate the syntax, we apply the "first-match-wins" algorithm: If host matches the rule for IPv4address, then it should be considered an IPv4 address literal and not a reg-name.
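A minimal Python sketch of that "first-match-wins" rule, under some assumptions: `classify_host` is a hypothetical helper, the regular expressions are my transcription of the `IPv4address` and `reg-name` rules, and the contents of a bracketed literal are not validated at all.

```python
import re

# dec-octet: 0-255 with no leading zeros.
DEC_OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"
IPV4ADDRESS = re.compile(rf"{DEC_OCTET}(?:\.{DEC_OCTET}){{3}}")
# reg-name = *( unreserved / pct-encoded / sub-delims )
REG_NAME = re.compile(r"(?:[A-Za-z0-9\-._~!$&'()*+,;=]|%[0-9A-Fa-f]{2})*")


def classify_host(host: str) -> str:
    """Classify a URI host, preferring IPv4address over reg-name."""
    if host.startswith("[") and host.endswith("]"):
        return "IP-literal"  # assumption: bracket contents are not checked
    if IPV4ADDRESS.fullmatch(host):
        return "IPv4address"  # first match wins, even though reg-name also fits
    if REG_NAME.fullmatch(host):
        return "reg-name"
    raise ValueError(f"not a valid host: {host!r}")
```

Here `127.0.0.1` matches both rules and is classified as `IPv4address`, while `256.0.0.1` fails the dec-octet range and falls through to `reg-name`.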



