-
Notifications
You must be signed in to change notification settings - Fork 6.1k
8359808: JavaRuntimeURLConnection should only connect to non-directory resources #25871
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
👋 Welcome back david-beaumont! A progress list of the required criteria for merging this PR into |
❗ This change is not yet ready to be integrated. |
@david-beaumont The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Preloading a few explanatory comments.
@@ -45,99 +44,79 @@ | |||
*/ | |||
public class JavaRuntimeURLConnection extends URLConnection { | |||
|
|||
// ImageReader to access resources in jimage | |||
private static final ImageReader reader = ImageReaderFactory.getImageReader(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Examination of code and other users shows that this could never be null. Other users also name it "READER".
private final String name; | ||
|
||
// the Resource when connected | ||
private volatile Resource resource; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Resource API was never needed here and adds uncalled methods such as getURL()
which are then implemented with code that could never be invoked.
throw new MalformedURLException(url + " missing path or /"); | ||
if (path.length() == 1) { | ||
this.module = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no reason for the module==null case, since the only thing caring that this might be null just converts null to the empty string anyway.
throw new IOException(module + "/" + name + " not found"); | ||
} | ||
this.resource = node; | ||
super.connected = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know I don't need to use super here, but it documents the fact that this is not a field of this subclass, without readers having to go check. Mutable protected fields are weird and error prone, so I felt calling it out a bit was worth it. Happy to replace with a comment if people feel that's better though.
} | ||
|
||
@Override | ||
public long getContentLengthLong() { | ||
try { | ||
connect(); | ||
return resource.getContentLength(); | ||
return getResourceNode().size(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having getResourceNode() return the (lazily fetched) node avoids the reader needing to know/reason about how "connect()" has the side-effect of making the "resource" field non-null.
// URL's path part, and there's no requirement for there to be distinct rules | ||
// about percent encoding. However, we choose to treat the module name is if | ||
// it were a URL authority (since Java package/module names are historically | ||
// strongly associated with internet domain names). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"However, we choose to treat the module name is if URL authority" - I don't think we should be put in the comment as it is confusing to suggest the URL authority component. The URL scheme documented in JEP 220 always put the module and resource name in the URL path component. It's just historical that it allowed for encoding of the resource name, wasn't given enough thought in JDK 9 when this url stream handler was added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to accept rewording here. I do want to pull out that there is a conceptual reason for treating module names like domain authorities though, or just make the code treat the whole path the same.
Having unexplained weirdness like this just ends up being a drain on future maintainers otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to accept rewording here. I do want to pull out that there is a conceptual reason for treating module names like domain authorities though, or just make the code treat the whole path the same. Having unexplained weirdness like this just ends up being a drain on future maintainers otherwise.
I don't disagree on the weirdness but I don't want to mislead readers that this has anything to do with the URL authority component (the jrt scheme does not have this component). However to explain the weirdness requires digging into history, probably the jake repo where the changes for JEP 220 were accumulated before the JEP was integrated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The alternative is to permit module names to use percent encoding, since package names (and thus module names) can include non-ASCII characters, and doing so would simplify the code, but it also allows a level of obfuscateability to the URLs which we might not desire (unless we reject any over-encoding of ASCII).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the comment and some TODOs. I'll likely raise a separate PR for it since I suspect we need to end up:
- allowing percent encoding everywhere (to support non-ASCII Unicode).
- disallowing over-encoding (e.g. of '/' or '$', or just any ASCII) to prevent obfuscation.
But as it's a behaviour change, I'd rather separate it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes looks mostly okay, just minor comments on comments, TODOs and naming.
// URL's path part, and there's no requirement for there to be distinct rules | ||
// about percent encoding, and it is likely that any differences between how | ||
// module names and resource names are treated is unintentional. The rules | ||
// about percent encoding may well be tightened up in the future. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to drop this paragraph from the comment, it just begs too many questions.
|
||
// the module and resource name in the URL | ||
// ImageReader to access resources in jimage (never null). | ||
private static final ImageReader READER = ImageReaderFactory.getImageReader(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it simpler if you drop "(never null)".
String path = url.getPath(); | ||
if (path.isEmpty() || path.charAt(0) != '/') | ||
// TODO: Allow percent encoding in module names. | ||
// TODO: Consider rejecting URLs with fragments, queries or authority. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer not include all the TODO messages.
// ImageReader to access resources in jimage (never null). | ||
private static final ImageReader READER = ImageReaderFactory.getImageReader(); | ||
|
||
// The module and resource name in the URL ("jrt:/<module-name>/<resource-name>"). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In JEP 220 we use jrt:/[$MODULE[/$PATH]] to make it clear that the resource is optionally.
if (pathSep == -1) { | ||
// No trailing resource path. This can never "connect" or return a | ||
// resource, but might be useful as a representation to pass around. | ||
// The module name *can* be empty here (e.g. "jrt:/") but not null. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the URL scheme from JEP 220 goes into the first comment then it will be clear that the resource is optionally. This will allow the comment here to be reduced.
// the Resource when connected | ||
private volatile Resource resource; | ||
// The resource node (when connected). | ||
private volatile Node resource; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe better to rename to node, resourceNode, or imageNode here.
Simplifying JavaRuntimeURLConnection to avoid accidentally returning non-resource data to users.
This change has the following distinct parts:
Progress
Issue
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25871/head:pull/25871
$ git checkout pull/25871
Update a local copy of the PR:
$ git checkout pull/25871
$ git pull https://git.openjdk.org/jdk.git pull/25871/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 25871
View PR using the GUI difftool:
$ git pr show -t 25871
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25871.diff
Using Webrev
Link to Webrev Comment