feature request #9114
closedHandle malformed URIs in user type
Added by Andreas Müller almost 4 years ago. Updated about 2 years ago.
70%
Description
The fix for #9111 should be moved to a new URI user type wrapper that is used instead of the current URI class. Otherwise the workaround needs to be applied whereever URIs are used.
Related issues
Updated by Andreas Müller almost 4 years ago
- Related to feature request #9111: Handle Malformed URIs added
Updated by Andreas Müller over 3 years ago
- Description updated (diff)
- % Done changed from 0 to 40
Updated by Andreas Müller over 3 years ago
- Status changed from New to In Progress
I implemented a new class eu.etaxonomy.cdm.common.URI and use this in model but also at other places like import sources.
Unfortunately, simply subclassing java.net.URI was not possible as it is final.
A method getJavaUri() exists to transform the above class to standard java.net.URI.
Currently there are still 2 open issues:
- httpinvoker: within httpinvoker there is still a ClassCastException and therefore TaxEditor fails to build. First I thought this is because CdmModelCache needs to be build anew, but running the script does not change the model cache file so this does not seem to be the reason
RdfViewTest.testMarshalRdf does not run successfully and therefore I set it to ignore. The children type dc:identifier/ do not seem to be build correctly.
Updated by Andreas Müller over 3 years ago
- Assignee changed from Andreas Müller to Katja Luther
- Target version changed from Release 5.18 to Release 5.19
Katja, can you please check if you can find out what the httpinvoker problem is. I currently can't start the TaxEditor from the IDE for debugging.
Updated by Andreas Müller over 3 years ago
In #9111#note-1 AK suggests to also encode fragment and query parts. This has not been implemented in the original code in #9111 which has been copied to cdmlib.
I think we should also implement it for fragment and query. Probably not for authority and scheme?
Also the handling of backslash needs to be discussed. In the original version backslash was encoded, too. This is probably not wanted, at least not for the path part. Currently a URISyntaxException is thrown if path contains a backslash.
Updated by Andreas Müller about 3 years ago
- Target version changed from Release 5.19 to Release 5.21
Updated by Katja Luther about 3 years ago
- Target version changed from Release 5.21 to Release 5.22
Updated by Andreas Müller about 3 years ago
- Tags set to fast
This ticket seems to be done or needs to be splitted into an open issues ticket. It can be moved back to 5.19(?) afterwards.
Updated by Katja Luther almost 3 years ago
- Status changed from In Progress to Feedback
- Assignee changed from Katja Luther to Andreas Müller
Andreas Müller wrote:
In #9111#note-1 AK suggests to also encode fragment and query parts. This has not been implemented in the original code in #9111 which has been copied to cdmlib.
I think we should also implement it for fragment and query. Probably not for authority and scheme?
Also the handling of backslash needs to be discussed. In the original version backslash was encoded, too. This is probably not wanted, at least not for the path part. Currently a URISyntaxException is thrown if path contains a backslash.
Did you implement this already?
Updated by Andreas Müller almost 3 years ago
- Assignee changed from Andreas Müller to Katja Luther
Katja Luther wrote:
Andreas Müller wrote:
In #9111#note-1 AK suggests to also encode fragment and query parts. This has not been implemented in the original code in #9111 which has been copied to cdmlib.
I think we should also implement it for fragment and query. Probably not for authority and scheme?
Also the handling of backslash needs to be discussed. In the original version backslash was encoded, too. This is probably not wanted, at least not for the path part. Currently a URISyntaxException is thrown if path contains a backslash.
Did you implement this already?
No, nees to be finished
Updated by Katja Luther almost 3 years ago
- Assignee changed from Katja Luther to Andreas Müller
Andreas Müller wrote:
Katja Luther wrote:
Andreas Müller wrote:
In #9111#note-1 AK suggests to also encode fragment and query parts. This has not been implemented in the original code in #9111 which has been copied to cdmlib.
I think we should also implement it for fragment and query. Probably not for authority and scheme?
Also the handling of backslash needs to be discussed. In the original version backslash was encoded, too. This is probably not wanted, at least not for the path part. Currently a URISyntaxException is thrown if path contains a backslash.
Did you implement this already?
No, nees to be finished
Should I do this? Because you worked on this ticket.
Updated by Andreas Müller almost 3 years ago
- Assignee changed from Andreas Müller to Katja Luther
I only moved the original algorithm to the user type. The remaining issues are about the algorithm. As you implemented the algorithm I thought you will do this.
Or at least give hints why e.g. fragments and query parts were not yet encoded and why backslash was encoded. I didn't want to change the algorithm without feedback on these issues.
Maybe also Andreas K. can give feedback on this.
Updated by Andreas Müller almost 3 years ago
- Target version changed from Release 5.22 to Release 5.25
This still needs feedback so I move to current milestone. Once it is fixed we can move to 5.19 where most of the work was done or keep it in current milestone.
Updated by Andreas Müller almost 3 years ago
- Target version changed from Release 5.25 to Release 5.45
Updated by Katja Luther about 2 years ago
- Assignee changed from Katja Luther to Andreas Kohlbecker
Implemented encoding of fragment part, query part was already implemented. @AK could you have a look and give feedback whether this implementation is ok?
Updated by Andreas Kohlbecker about 2 years ago
- Assignee changed from Andreas Kohlbecker to Katja Luther
According to the URI and URL Schemes defined in the corresponding RFCs:
URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
|------------------ Schema-spezifischer Teil ------------------| https://max:muster@www.example.com:8080/index.html?p1=A&p2=B#ressource \___/ \_/ \____/ \_____________/ \__/\_________/ \_______/ \_______/ | | | | | | | | Schema⁺ | Kennwort Host Port Pfad Query Fragment Benutzer
There is always only one fragment. So it makes not much sense to handle multiple fragments as in our code.
A better algorithm would be to split the path+query+fragment string at the first #
character and to tread the
right part as fragment.
BTW: There is no test method in eu.etaxonomy.cdm.common.UriTest
to test URI strings with fragment.
Updated by Katja Luther about 2 years ago
- Status changed from Feedback to Resolved
- Assignee changed from Katja Luther to Andreas Kohlbecker
Thanks for your comment, I adapted the code and added a test.
Please add a testcase with multiple fragments, like https://max:muster@www.example.com:8080/index.html?p1=A&p2=B#ressource&#another-frag?
Updated by Andreas Kohlbecker about 2 years ago
- Status changed from Resolved to Feedback
- Assignee changed from Andreas Kohlbecker to Katja Luther
Updated by Katja Luther about 2 years ago
- Status changed from Feedback to Resolved
- % Done changed from 40 to 70
Applied in changeset cdmlib|09da5a1bd580baae6490435bc52dddd6d2aab353.
Updated by Katja Luther about 2 years ago
- Status changed from Resolved to Closed
Added a testcase for more than one fragment, the rest was already reviewed, so we can close this ticket.
Updated by Andreas Müller about 2 years ago
- Target version changed from Release 5.45 to Release 5.30
This was fixed in 5.30 already