-
Notifications
You must be signed in to change notification settings - Fork 6.2k
8365604: Null pointer dereference in src/hotspot/share/adlc/output_h.cpp ArchDesc::declareClasses() #26798
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…pp ArchDesc::declareClasses() Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Artem Semenov <[email protected]>.
…cpp LibraryCallKit::inline_vector_gather_scatter() Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Artem Semenov <[email protected]>.
…jfrThreadSampler.cpp JfrThreadSampler::set_period() Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Artem Semenov <[email protected]>.
…p MallocSiteTable::malloc_site() Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Artem Semenov <[email protected]>.
…nterval::split() Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Artem Semenov <[email protected]>.
…CICompileState::JVMCICompileState() Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Artem Semenov <[email protected]>.
…eezeThaw.cpp FreezeBase::finalize_freeze() Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Artem Semenov <[email protected]>.
👋 Welcome back asemenov! A progress list of the required criteria for merging this PR into |
❗ This change is not yet ready to be integrated. |
Webrevs
|
/label add hotspot-jfr |
@dholmes-ora |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some alignment nits where you have added additional condition clauses.
Some of these are difficult to evaluate in isolation and will need review from the specific component areas.
index++, head = (MallocSiteHashtableEntry*)head->next()) {} | ||
assert(head != nullptr, "Invalid position index"); | ||
index++, head = ((MallocSiteHashtableEntry*)head->next() == nullptr) ? head : | ||
(MallocSiteHashtableEntry*)head->next()) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't look right to me. We check head != nullptr
in the loop condition so we cannot reach the assignment if it is null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A situation is possible where head becomes nullptr when head->next() returns nullptr on the last iteration. Then, after the loop finishes, assert(head != nullptr) will trigger (only in debug mode), and return head->data() will cause a program error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, is it possible?
Perhaps you could explain how pos_idx is being used in this loop to guard against that happening and why that does not make this safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
head->next()
returns a pointer to _next without any checks.
In turn, the _next pointer is marked as volatile, which means it can be modified at any moment, for example, in another thread.
From this, I conclude that a check in this location is desirable. Moreover, pos_idx is also not being checked. It is quite possible that head->next()
could turn out to be nullptr.
But I don’t mind. If you are sure that there can’t be a nullptr in this place, I will withdraw this patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moreover, pos_idx is also not being checked
I don't know what you mean by this comment. pos_idx
is being checked in the loop test before the call to head->next()
in that same test.
The important question you need to address is why and what that check guarantees. I say you need to address it because you are the one claiming that there is a possible nullptr dereference here without any evidence that it has occurred in practice. If that is based on a correct analysis of the code then you need to explain how we can arrive at a situtation where we hit a null pointer that takes into account the logic of the loop test. So far you have not done so.
n.b. I am not claiming there is no possibility of a nullptr dereference here (although I can form my own opinion). I'm asking you to tell me why I should take your claim that it is possible seriously. Your answers so far are not convincing me that you have understood how this code works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pos_idx receives its value when calling a certain function pos_idx_from_marker(marker), and there is no check before the loop to ensure that it is within the bounds of the _table size.
I mentioned above that I am not insisting on this particular patch. This issue was detected by a static analyzer. After that, based on my limited knowledge, I considered it confirmed… If you have any refutation, please share your thoughts. In that case, I will revert this patch and mark the trigger as “NO FIX REQUIRED”.
As far as I have checked, there are no checks anywhere in the calls to this function to compare the marker with the table or any other entities in any form.
I certainly do not claim to understand this code as well as you or any other member of the hotspot team.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, this leads right to the root of the problem I have with this report. As you say, pos_idx does indeed come out of a marker object. It took me about a minute to identify that this marker object is created in the function that sits right above the one your code assistant flagged as problematic -- even though I am not at all familiar with this code. It looks clear to me that, given the right call sequence for calls that create a marker and then consume it here, the check on pos_idx will ensure that we don't drop off the end of the list with a null pointer. So, it looks very liek this code has been designed so that the presence of a marker with a suitable pos_idx is intended to ensure this loop terminates before that happens. I am sure someone in this project knows whether that is the case but it is not you or your coding assistant.
I'm not suggesting that that calling sequence is actually right and that the check for pos_idx will definitely avoid dropping off the end. Indeed, I would welcome a bug report that proved it to be wrong. However, what is clear that both you and your coding assistant have failed to appreciate how some relatively obvious parts of this design actually operate. That renders your (or your tool's) analysis a shallow and unhelpful distraction; using it as an excuse to raise a purported 'issue' in the absence of any evidence of an actual issue is very much a waste of time for this project's reviewers.
Your error is compounded by the fact that you (or more likely your coding assistant) are suggesting changes which, because they are not founded in a correct understanding of the design, could potentially lead to worse outcomes than the speculative nullptr dereference they are intended to remedy -- as I explained when discussing your change to the control flow logic in the ALDC code. So, not only is this report unhelpful it is potentially harmful.
Ultimately the takeaway here is that the OpenJDK bug system is not here to report, review and add patches to remedy issues that you or your code assistant tool invents on the basis of misinformed assumptions. It is here to report, review and add patches to remedy issues that can be shown to actually affect the correct operation of the JVM and JDK,either by a reproducible test or by well-reasoned argument. So, please do not continue to spam the project with bug reports like this simply because a potentially bogus patch will improve your experience with what is clearly a decidedly fallible tool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, this leads right to the root of the problem I have with this report. As you say, pos_idx does indeed come out of a marker object. It took me about a minute to identify that this marker object is created in the function that sits right above the one your code assistant flagged as problematic -- even though I am not at all familiar with this code. It looks clear to me that, given the right call sequence for calls that create a marker and then consume it here, the check on pos_idx will ensure that we don't drop off the end of the list with a null pointer. So, it looks very liek this code has been designed so that the presence of a marker with a suitable pos_idx is intended to ensure this loop terminates before that happens. I am sure someone in this project knows whether that is the case but it is not you or your coding assistant.
I'm not suggesting that that calling sequence is actually right and that the check for pos_idx will definitely avoid dropping off the end. Indeed, I would welcome a bug report that proved it to be wrong. However, what is clear that both you and your coding assistant have failed to appreciate how some relatively obvious parts of this design actually operate. That renders your (or your tool's) analysis a shallow and unhelpful distraction; using it as an excuse to raise a purported 'issue' in the absence of any evidence of an actual issue is very much a waste of time for this project's reviewers.
Your error is compounded by the fact that you (or more likely your coding assistant) are suggesting changes which, because they are not founded in a correct understanding of the design, could potentially lead to worse outcomes than the speculative nullptr dereference they are intended to remedy -- as I explained when discussing your change to the control flow logic in the ALDC code. So, not only is this report unhelpful it is potentially harmful.
Ultimately the takeaway here is that the OpenJDK bug system is not here to report, review and add patches to remedy issues that you or your code assistant tool invents on the basis of misinformed assumptions. It is here to report, review and add patches to remedy issues that can be shown to actually affect the correct operation of the JVM and JDK,either by a reproducible test or by well-reasoned argument. So, please do not continue to spam the project with bug reports like this simply because a potentially bogus patch will improve your experience with what is clearly a decidedly fallible tool.
I’m sorry to have taken up your time.
if (arr_type == nullptr || (arr_type != nullptr && !elem_consistent_with_arr(elem_bt, arr_type, false))) { | ||
if (arr_type != nullptr && !elem_consistent_with_arr(elem_bt, arr_type, false)) { | ||
log_if_needed(" ** not supported: arity=%d op=%s vlen=%d etype=%s atype=%s ismask=no", | ||
is_scatter, is_scatter ? "scatter" : "gather", | ||
num_elem, type2name(elem_bt), type2name(arr_type->elem()->array_element_basic_type())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a bug here but I'm not sure it is what you think it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addr_type->isa_aryptr();
might return nullptr, while in elem_consistent_with_arr(elem_bt, arr_type, false)
, arr_type is only checked with an assert.
Moreover, the presence of a check in the original version indicates that arr_type can be null, and there is no protection against this.
/label add hotspot-compiler |
@dholmes-ora |
@dholmes-ora |
I've added some additional mailing lists to ensure better coverage here. Also I think you need to update the JBS (and PR) title to reflect the broader scope of the changes. |
I'm not clear that the original issue is necessarily a bug that needs fixing with a skip to the next else if case. The justification for adding So, the prior test offers no guarantee that in this restricted case a null pointer should or should not be possible. The original design may assume that a successful test for So, rather than just skip to the next case we might need to handle this with an assert and fix whatever code is producing an ideal copy with null fields. Given the level of analysis offered for this case I am suspicious as to whether the other cases tacked onto this issue ought to be accepted at face value without some justification as to why a null check and skip to the next case is correct. I'm also wondering how and why all these cases and associated amendments were arrived at? Is this perhaps based on output from a code analysis tool (perhaps even a genAI tool). If so then I think 1) we ought to be told and 2) we ought to treat its recommendations with a very healthy dose of skepticism. |
Co-authored-by: David Holmes <[email protected]>
Co-authored-by: David Holmes <[email protected]>
n.b. Before accepting any of the changes in this PR I'd really like to know whether they have arisen from reports of an actual null pointer dereference or they are simply derived from some theoretical analysis. In the latter case then I think we would need a better explanation of why an error can happen than we have seen so far. Given that requirement I also think each of the changes should be submitted in its own PR with its own justification. We should not modify control flow logic on the nod. |
Please provide an example of an updated title… |
Most of these mitigations to guard against a possible null pointer dereference are inside |
The defect has been detected and confirmed in the function ArchDesc::declareClasses() located in the file src/hotspot/share/adlc/output_h.cpp with static code analysis. This defect can potentially lead to a null pointer dereference.
The pointer instr->_matrule is dereferenced in line 1952 without checking for nullptr, although earlier in line 1858 the same pointer is checked for nullptr, which indicates that it can be null.
According to this comment, this PR contains fixes for similar cases in other places.
Progress
Issue
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/26798/head:pull/26798
$ git checkout pull/26798
Update a local copy of the PR:
$ git checkout pull/26798
$ git pull https://git.openjdk.org/jdk.git pull/26798/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 26798
View PR using the GUI difftool:
$ git pr show -t 26798
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/26798.diff
Using Webrev
Link to Webrev Comment