Empty parts of a URI and returned strings

Let's say the URI has this pattern:

scheme:// userinfo@ host /path ?query

Each part can be seen individually (separated here by spaces).

We have been instructed to return null if no userinfo is given and null if no query part is given (for getUserinfo and getQuery), but I am unsure what to return in the following examples:

Should getUserinfo return null because there is no info, or "" because the @ is there?

Should getQuery return null because there is no query, or "" because the ? is there?

Should getPath return "" or "/"?

This should return "" if I use getHost and toString.
Would "scheme://0001.001.001.001" return "0001.001.001.001"?

Finally, is the assumption correct that every delimiter is only used for matching and will not appear in any returned string, except for the / delimiter in the path?

The URI scheme://@blabla is valid and so is scheme://blabla. The first userinfo will be the empty string and the second userinfo will be null.

The URI scheme://? is valid and so is scheme://. The first query will be the empty string and the second query will be null.

The path in your example will be /.

On the question of the host: it might matter what type your host is, since the toString methods of Host and IPv4Address might behave differently.

Reading the doc for Host again, it seems a bit fishy. @Marcel.Ullrich2, the Host interface says that toString should:

@return the parsed "host" name or a normalized version for {@link IPv4Address}

Consider this:

Host host = new HostImplementation("");
IPv4Address ip = new IPv4AddressImplementation("");
String s1 = host.toString();
String s2 = ip.toString();

Then s1 should be and s2 should be if the implementation classes follow the specification, right? I.e. no normalization should take place when I call toString on an object of type Host, only on objects of type IPv4Address.


This is correct.

To explain why:
The authority can contain userinfo followed by @, or no @ at all.
Therefore, there is userinfo if and only if there is an @.
The userinfo itself consists only of the Kleene closure of characters or colons, so it can be (and in this case is) empty.

The same holds for the query.
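The rule above can be sketched roughly as follows. This is only a minimal illustration, not the reference solution: the names EmptyPartsSketch, userInfoOf and queryOf are made up for this post and are not part of the assignment's interfaces, and the sketch assumes the authority and the part after it have already been cut out of the URI string.

```java
/** Hypothetical helper for illustration; not part of the assignment's API. */
final class EmptyPartsSketch {

    /** Returns the userinfo of an authority, or null if the authority has no '@'. */
    static String userInfoOf(String authority) {
        int at = authority.indexOf('@');
        // No '@' means the userinfo part is absent, so the result is null.
        // An '@' at index 0 yields substring(0, 0), which is the empty string.
        return at < 0 ? null : authority.substring(0, at);
    }

    /** Returns the text after the first '?', or null if there is no '?'. */
    static String queryOf(String afterAuthority) {
        int q = afterAuthority.indexOf('?');
        // The '?' delimiter itself is never part of the returned string.
        return q < 0 ? null : afterAuthority.substring(q + 1);
    }

    public static void main(String[] args) {
        System.out.println("[" + userInfoOf("@blabla") + "]"); // prints "[]"
        System.out.println("[" + userInfoOf("blabla") + "]");  // prints "[null]"
        System.out.println("[" + queryOf("?") + "]");          // prints "[]"
        System.out.println("[" + queryOf("") + "]");           // prints "[null]"
    }
}
```

The point is that the presence of the delimiter decides between null and a string, and the content after (or before) the delimiter decides whether that string is empty.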

Regarding IPv4:
You can only test through the interfaces, and IPv4 addresses are normalized.
Other hosts are not normalized.
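To make the difference concrete, here is a small sketch of what normalization could look like. It assumes that "normalized" means dropping leading zeros from each octet, which is my reading of the example above and not something stated in this thread; check the assignment's documentation for the exact rule. The class Ipv4NormalizeSketch and its method normalize are hypothetical names, and the sketch does not validate the address.

```java
/** Hypothetical helper for illustration; the real normalization rule may differ. */
final class Ipv4NormalizeSketch {

    /**
     * ASSUMPTION: normalizing means removing leading zeros from every octet.
     * No range or syntax validation is done here.
     */
    static String normalize(String address) {
        // limit -1 keeps trailing empty strings, so malformed input is not silently shortened
        String[] octets = address.split("\\.", -1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < octets.length; i++) {
            if (i > 0) {
                out.append('.');
            }
            // Integer.parseInt drops leading zeros and throws
            // NumberFormatException for an empty or non-numeric octet.
            out.append(Integer.parseInt(octets[i]));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("0001.001.001.001")); // prints "1.1.1.1"
    }
}
```

Under that assumption, toString on an IPv4Address would give the shortened form, while toString on any other Host would give back the parsed text unchanged.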


So, the string representation of the host of this URI scheme:// should be "", i.e. the empty string, right?

But this fails on a correct implementation:


Yes, the host should be the empty string.
Try to expand your tests more to see the exact error.
Also, try to use more expressive assertions like assertEquals.
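To show why assertEquals helps here, the sketch below compares the two styles. It uses hand-written stand-ins for the JUnit methods so that it runs without any test library; the class AssertionSketch is hypothetical, and the failure message only imitates the general shape of what a test framework reports. The null host is a pretend result from a buggy implementation, not the actual failure from this thread.

```java
import java.util.Objects;

/** Stand-ins for test-framework assertions, for illustration only. */
final class AssertionSketch {

    /** Fails without any detail, like a bare boolean assertion. */
    static void assertTrue(boolean condition) {
        if (!condition) {
            throw new AssertionError();
        }
    }

    /** Fails with both values in the message, so the mismatch is visible. */
    static void assertEquals(Object expected, Object actual) {
        if (!Objects.equals(expected, actual)) {
            throw new AssertionError("expected: <" + expected + "> but was: <" + actual + ">");
        }
    }

    public static void main(String[] args) {
        String host = null; // pretend result: null instead of the empty string

        try {
            assertTrue("".equals(host));
        } catch (AssertionError e) {
            // No message is available, so the cause stays hidden.
            System.out.println("assertTrue:   " + e.getMessage());
        }

        try {
            assertEquals("", host);
        } catch (AssertionError e) {
            // prints "assertEquals: expected: <> but was: <null>"
            System.out.println("assertEquals: " + e.getMessage());
        }
    }
}
```

With the second style the failure message already tells you whether you got null, a non-empty string, or something else.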