Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NFC][analyzer] OOB test consolidation II: constraint checking #126748

Merged
merged 4 commits into from
Feb 12, 2025

Conversation

NagyDonat
Copy link
Contributor

This commit heavily refactors out-of-bounds-constraint-check.c:

  1. The complex combinations of several clang_analyzer_eval calls were replaced by clang_analyzer_value, which can directly query the range of a symbol.
  2. Testcases were renamed to a (hopefully) more consistent scheme.
  3. The use of size_t as an argument type was replaced by unsigned long long, which is usually a no-op, but seems to be a better choice if I look for 64u in the output of clang_analyzer_value.
  4. The single "dynamic extent" case was generalized into a full set of tests that use malloc.
  5. Half of the testcases (the ones that don't use malloc) were changed to use an int[5] array instead of a string literal. After this change the tests in this file cover every functionality that was tested by the testcases test_assume_after_access{,2} in the file out-of-bounds.c so I was able to delete those two testcases (and therefore consolidate the validation of these constraints within a single test file).

This is the second commit in a series that reorganizes the tests of security.ArrayBound to system that's easier to understand and maintain. (Note that this file wasn't significantly modified by the recent commit 6e17ed9 which renamed alpha.security.ArrayBoundV2 to security.ArrayBound; but I still felt that this cleanup may be useful.)

This commit heavily refactors `out-of-bounds-constraint-check.c`:
1. The complex combinations of several `clang_analyzer_eval` calls were
   replaced by `clang_analyzer_value`, which can directly query the
   range of a symbol.
2. Testcases were renamed to a (hopefully) more consistent scheme.
3. The use of `size_t` as an argument type was replaced by `unsigned
   long long`, which is usually a no-op, but seems to be a better choice
   if I look for `64u` in the output of `clang_analyzer_value`.
4. The single "dynamic extent" case was generalized into a full set of
   tests that use `malloc`.
5. Half of the testcases (the ones that don't use `malloc`) were changed
   to use an `int[5]` array instead of a string literal. After this
   change the tests in this file cover every functionality that was
   tested by the testcases `test_assume_after_access{,2}` in the file
   `out-of-bounds.c` so I was able to delete those two testcases (and
   therefore consolidate the validation of these constraints within a
   single test file).

This is the second commit in a series that reorganizes the tests of
`security.ArrayBound` to system that's easier to understand and
maintain. (Note that this file wasn't significantly modified by the
recent commit 6e17ed9 which renamed
`alpha.security.ArrayBoundV2` to `security.ArrayBound`; but I still felt
that this cleanup may be useful.)
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Feb 11, 2025
@llvmbot
Copy link
Member

llvmbot commented Feb 11, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-static-analyzer-1

Author: Donát Nagy (NagyDonat)

Changes

This commit heavily refactors out-of-bounds-constraint-check.c:

  1. The complex combinations of several clang_analyzer_eval calls were replaced by clang_analyzer_value, which can directly query the range of a symbol.
  2. Testcases were renamed to a (hopefully) more consistent scheme.
  3. The use of size_t as an argument type was replaced by unsigned long long, which is usually a no-op, but seems to be a better choice if I look for 64u in the output of clang_analyzer_value.
  4. The single "dynamic extent" case was generalized into a full set of tests that use malloc.
  5. Half of the testcases (the ones that don't use malloc) were changed to use an int[5] array instead of a string literal. After this change the tests in this file cover every functionality that was tested by the testcases test_assume_after_access{,2} in the file out-of-bounds.c so I was able to delete those two testcases (and therefore consolidate the validation of these constraints within a single test file).

This is the second commit in a series that reorganizes the tests of security.ArrayBound to system that's easier to understand and maintain. (Note that this file wasn't significantly modified by the recent commit 6e17ed9 which renamed alpha.security.ArrayBoundV2 to security.ArrayBound; but I still felt that this cleanup may be useful.)


Full diff: https://github.com/llvm/llvm-project/pull/126748.diff

2 Files Affected:

  • (modified) clang/test/Analysis/out-of-bounds-constraint-check.c (+155-104)
  • (modified) clang/test/Analysis/out-of-bounds.c (+1-15)
diff --git a/clang/test/Analysis/out-of-bounds-constraint-check.c b/clang/test/Analysis/out-of-bounds-constraint-check.c
index df48c8c1707138..b89af08cff2229 100644
--- a/clang/test/Analysis/out-of-bounds-constraint-check.c
+++ b/clang/test/Analysis/out-of-bounds-constraint-check.c
@@ -1,112 +1,163 @@
 // RUN: %clang_analyze_cc1 -analyzer-checker=core,unix.Malloc,security.ArrayBound,debug.ExprInspection \
 // RUN:   -analyzer-config eagerly-assume=false -verify %s
 
-void clang_analyzer_eval(int);
-void clang_analyzer_printState(void);
-
-typedef typeof(sizeof(int)) size_t;
-const char a[] = "abcd"; // extent: 5 bytes
-
-void symbolic_size_t_and_int0(size_t len) {
-  (void)a[len + 1]; // no-warning
-  // We infered that the 'len' must be in a specific range to make the previous indexing valid.
-  // len: [0,3]
-  clang_analyzer_eval(len <= 3); // expected-warning {{TRUE}}
-  clang_analyzer_eval(len <= 2); // expected-warning {{UNKNOWN}}
-}
-
-void symbolic_size_t_and_int1(size_t len) {
-  (void)a[len]; // no-warning
-  // len: [0,4]
-  clang_analyzer_eval(len <= 4); // expected-warning {{TRUE}}
-  clang_analyzer_eval(len <= 3); // expected-warning {{UNKNOWN}}
-}
-
-void symbolic_size_t_and_int2(size_t len) {
-  (void)a[len - 1]; // no-warning
-  // len: [1,5]
-  clang_analyzer_eval(1 <= len && len <= 5); // expected-warning {{TRUE}}
-  clang_analyzer_eval(2 <= len);             // expected-warning {{UNKNOWN}}
-  clang_analyzer_eval(len <= 4);             // expected-warning {{UNKNOWN}}
-}
-
-void symbolic_uint_and_int0(unsigned len) {
-  (void)a[len + 1]; // no-warning
-  // len: [0,3]
-  clang_analyzer_eval(0 <= len && len <= 3); // expected-warning {{TRUE}}
-  clang_analyzer_eval(1 <= len);             // expected-warning {{UNKNOWN}}
-  clang_analyzer_eval(len <= 2);             // expected-warning {{UNKNOWN}}
-}
-
-void symbolic_uint_and_int1(unsigned len) {
-  (void)a[len]; // no-warning
-  // len: [0,4]
-  clang_analyzer_eval(0 <= len && len <= 4); // expected-warning {{TRUE}}
-  clang_analyzer_eval(1 <= len);             // expected-warning {{UNKNOWN}}
-  clang_analyzer_eval(len <= 3);             // expected-warning {{UNKNOWN}}
-}
-void symbolic_uint_and_int2(unsigned len) {
-  (void)a[len - 1]; // no-warning
-  // len: [1,5]
-  clang_analyzer_eval(1 <= len && len <= 5); // expected-warning {{TRUE}}
-  clang_analyzer_eval(2 <= len);             // expected-warning {{UNKNOWN}}
-  clang_analyzer_eval(len <= 4);             // expected-warning {{UNKNOWN}}
-}
-
-void symbolic_int_and_int0(int len) {
-  (void)a[len + 1]; // no-warning
-  // len: [-1,3]
-  clang_analyzer_eval(-1 <= len && len <= 3); // expected-warning {{TRUE}}
-  clang_analyzer_eval(0 <= len);              // expected-warning {{UNKNOWN}}
-  clang_analyzer_eval(len <= 2);              // expected-warning {{UNKNOWN}}
-}
-void symbolic_int_and_int1(int len) {
-  (void)a[len]; // no-warning
-  // len: [0,4]
-  clang_analyzer_eval(0 <= len && len <= 4); // expected-warning {{TRUE}}
-  clang_analyzer_eval(1 <= len);             // expected-warning {{UNKNOWN}}
-  clang_analyzer_eval(len <= 3);             // expected-warning {{UNKNOWN}}
-}
-void symbolic_int_and_int2(int len) {
-  (void)a[len - 1]; // no-warning
-  // len: [1,5]
-  clang_analyzer_eval(1 <= len && len <= 5); // expected-warning {{TRUE}}
-  clang_analyzer_eval(2 <= len);             // expected-warning {{UNKNOWN}}
-  clang_analyzer_eval(len <= 4);             // expected-warning {{UNKNOWN}}
-}
-
-void symbolic_longlong_and_int0(long long len) {
-  (void)a[len + 1]; // no-warning
-  // len: [-1,3]
-  clang_analyzer_eval(-1 <= len && len <= 3); // expected-warning {{TRUE}}
-  clang_analyzer_eval(0 <= len);              // expected-warning {{UNKNOWN}}
-  clang_analyzer_eval(len <= 2);              // expected-warning {{UNKNOWN}}
+// When the checker security.ArrayBound encounters an array subscript operation
+// that _may be_ in bounds, it assumes that indexing _is_ in bound. This test
+// file validates these assumptions.
+
+void clang_analyzer_value(int);
+
+// Simple case: memory area with a static extent.
+
+int FiveInts[5] = {1, 2, 3, 4, 5};
+
+void int_plus_one(int len) {
+  (void)FiveInts[len + 1]; // no-warning
+  clang_analyzer_value(len); // expected-warning {{32s:{ [-1, 3] }}}
+}
+
+void int_neutral(int len) {
+  (void)FiveInts[len]; // no-warning
+  clang_analyzer_value(len); // expected-warning {{32s:{ [0, 4] }}}
+}
+
+void int_minus_one(int len) {
+  (void)FiveInts[len - 1]; // no-warning
+  clang_analyzer_value(len); // expected-warning {{32s:{ [1, 5] }}}
+}
+
+void unsigned_plus_one(unsigned len) {
+  (void)FiveInts[len + 1]; // no-warning
+  clang_analyzer_value(len); // expected-warning {{32u:{ [0, 3] }}}
+}
+
+void unsigned_neutral(unsigned len) {
+  (void)FiveInts[len]; // no-warning
+  clang_analyzer_value(len); // expected-warning {{32u:{ [0, 4] }}}
+}
+
+void unsigned_minus_one(unsigned len) {
+  (void)FiveInts[len - 1]; // no-warning
+  clang_analyzer_value(len); // expected-warning {{32u:{ [1, 5] }}}
+}
+
+void ll_plus_one(long long len) {
+  (void)FiveInts[len + 1]; // no-warning
+  clang_analyzer_value(len); // expected-warning {{64s:{ [-1, 3] }}}
+}
+
+void ll_neutral(long long len) {
+  (void)FiveInts[len]; // no-warning
+  clang_analyzer_value(len); // expected-warning {{64s:{ [0, 4] }}}
+}
+
+void ll_minus_one(long long len) {
+  (void)FiveInts[len - 1]; // no-warning
+  clang_analyzer_value(len); // expected-warning {{64s:{ [1, 5] }}}
+}
+
+void ull_plus_one(unsigned long long len) {
+  (void)FiveInts[len + 1]; // no-warning
+  clang_analyzer_value(len); // expected-warning {{64u:{ [0, 3] }}}
+}
+
+void ull_neutral(unsigned long long len) {
+  (void)FiveInts[len]; // no-warning
+  clang_analyzer_value(len); // expected-warning {{64u:{ [0, 4] }}}
+}
+
+void ull_minus_one(unsigned long long len) {
+  (void)FiveInts[len - 1]; // no-warning
+  clang_analyzer_value(len); // expected-warning {{64u:{ [1, 5] }}}
 }
 
+// Also try the same with a dynamically allocated memory block, because in the
+// past there were issues with the type/signedness of dynamic extent symbols.
+
+typedef __typeof(sizeof(int)) size_t;
 void *malloc(size_t);
 void free(void *);
-void symbolic_longlong_and_int0_dynamic_extent(long long len) {
-  char *b = malloc(5);
-  (void)b[len + 1]; // no-warning
-  // len: [-1,3]
-  clang_analyzer_eval(-1 <= len && len <= 3); // expected-warning {{TRUE}}
-  clang_analyzer_eval(0 <= len);              // expected-warning {{UNKNOWN}}
-  clang_analyzer_eval(len <= 2);              // expected-warning {{UNKNOWN}}
-  free(b);
-}
-
-void symbolic_longlong_and_int1(long long len) {
-  (void)a[len]; // no-warning
-  // len: [0,4]
-  clang_analyzer_eval(0 <= len && len <= 4); // expected-warning {{TRUE}}
-  clang_analyzer_eval(1 <= len);             // expected-warning {{UNKNOWN}}
-  clang_analyzer_eval(len <= 3);             // expected-warning {{UNKNOWN}}
-}
-
-void symbolic_longlong_and_int2(long long len) {
-  (void)a[len - 1]; // no-warning
-  // len: [1,5]
-  clang_analyzer_eval(1 <= len && len <= 5); // expected-warning {{TRUE}}
-  clang_analyzer_eval(2 <= len);             // expected-warning {{UNKNOWN}}
-  clang_analyzer_eval(len <= 4);             // expected-warning {{UNKNOWN}}
+
+void dyn_int_plus_one(int len) {
+  char *p = malloc(5);
+  p[len + 1] = 1; // no-warning
+  clang_analyzer_value(len); // expected-warning {{32s:{ [-1, 3] }}}
+  free(p);
+}
+
+void dyn_int_neutral(int len) {
+  char *p = malloc(5);
+  p[len] = 1; // no-warning
+  clang_analyzer_value(len); // expected-warning {{32s:{ [0, 4] }}}
+  free(p);
+}
+
+void dyn_int_minus_one(int len) {
+  char *p = malloc(5);
+  p[len - 1] = 1; // no-warning
+  clang_analyzer_value(len); // expected-warning {{32s:{ [1, 5] }}}
+  free(p);
+}
+
+void dyn_unsigned_plus_one(unsigned len) {
+  char *p = malloc(5);
+  p[len + 1] = 1; // no-warning
+  clang_analyzer_value(len); // expected-warning {{32u:{ [0, 3] }}}
+  free(p);
+}
+
+void dyn_unsigned_neutral(unsigned len) {
+  char *p = malloc(5);
+  p[len] = 1; // no-warning
+  clang_analyzer_value(len); // expected-warning {{32u:{ [0, 4] }}}
+  free(p);
+}
+
+void dyn_unsigned_minus_one(unsigned len) {
+  char *p = malloc(5);
+  p[len - 1] = 1; // no-warning
+  clang_analyzer_value(len); // expected-warning {{32u:{ [1, 5] }}}
+  free(p);
+}
+
+void dyn_ll_plus_one(long long len) {
+  char *p = malloc(5);
+  p[len + 1] = 1; // no-warning
+  clang_analyzer_value(len); // expected-warning {{64s:{ [-1, 3] }}}
+  free(p);
+}
+
+void dyn_ll_neutral(long long len) {
+  char *p = malloc(5);
+  p[len] = 1; // no-warning
+  clang_analyzer_value(len); // expected-warning {{64s:{ [0, 4] }}}
+  free(p);
+}
+
+void dyn_ll_minus_one(long long len) {
+  char *p = malloc(5);
+  p[len - 1] = 1; // no-warning
+  clang_analyzer_value(len); // expected-warning {{64s:{ [1, 5] }}}
+  free(p);
+}
+
+void dyn_ull_plus_one(unsigned long long len) {
+  char *p = malloc(5);
+  p[len + 1] = 1; // no-warning
+  clang_analyzer_value(len); // expected-warning {{64u:{ [0, 3] }}}
+  free(p);
+}
+
+void dyn_ull_neutral(unsigned long long len) {
+  char *p = malloc(5);
+  p[len] = 1; // no-warning
+  clang_analyzer_value(len); // expected-warning {{64u:{ [0, 4] }}}
+  free(p);
+}
+
+void dyn_ull_minus_one(unsigned long long len) {
+  char *p = malloc(5);
+  p[len - 1] = 1; // no-warning
+  clang_analyzer_value(len); // expected-warning {{64u:{ [1, 5] }}}
+  free(p);
 }
diff --git a/clang/test/Analysis/out-of-bounds.c b/clang/test/Analysis/out-of-bounds.c
index 923797200d0b40..9f410e884d7633 100644
--- a/clang/test/Analysis/out-of-bounds.c
+++ b/clang/test/Analysis/out-of-bounds.c
@@ -1,6 +1,4 @@
-// RUN: %clang_analyze_cc1 -Wno-array-bounds -analyzer-checker=core,security.ArrayBound,debug.ExprInspection -verify %s
-
-void clang_analyzer_eval(int);
+// RUN: %clang_analyze_cc1 -Wno-array-bounds -analyzer-checker=core,security.ArrayBound -verify %s
 
 // Tests doing an out-of-bounds access after the end of an array using:
 // - constant integer index
@@ -142,12 +140,6 @@ void test4(int x) {
     buf[x] = 1; // expected-warning{{Out of bound access to memory}}
 }
 
-void test_assume_after_access(unsigned long x) {
-  int buf[100];
-  buf[x] = 1;
-  clang_analyzer_eval(x <= 99); // expected-warning{{TRUE}}
-}
-
 // Don't warn when indexing below the start of a symbolic region's whose
 // base extent we don't know.
 int *get_symbolic(void);
@@ -180,12 +172,6 @@ void test_extern_void(void) {
   p[1] = 42; // no-warning
 }
 
-void test_assume_after_access2(unsigned long x) {
-  char buf[100];
-  buf[x] = 1;
-  clang_analyzer_eval(x <= 99); // expected-warning{{TRUE}}
-}
-
 struct incomplete;
 char test_comparison_with_extent_symbol(struct incomplete *p) {
   // Previously this was reported as a (false positive) overflow error because

@NagyDonat
Copy link
Contributor Author

🤔 This commit would increase the length of this test file. If you feel that some tests are redundant, and could be left out, feel free to suggest so.

Copy link
Contributor

@steakhal steakhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked at every test, but I like the new format.
To me, it already looks really good.

clang/test/Analysis/out-of-bounds-constraint-check.c Outdated Show resolved Hide resolved
clang/test/Analysis/out-of-bounds-constraint-check.c Outdated Show resolved Hide resolved
clang/test/Analysis/out-of-bounds.c Show resolved Hide resolved
@NagyDonat NagyDonat requested a review from steakhal February 12, 2025 11:10
Copy link
Contributor

@steakhal steakhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid matching the types of the clang_analyzer_value dumps, but be sure to match the whole range sets there, including their curlies. It's still important to know that it would have a single range associated.

@NagyDonat
Copy link
Contributor Author

Please avoid matching the types of the clang_analyzer_value dumps, but be sure to match the whole range sets there, including their curlies. It's still important to know that it would have a single range associated.

Yes, that's what I was planning. Pushed a commit that removes the type markers, but keeps the curlies.

Also, my previous commit accidentally removed '-Wno-array-bounds', I re-added it.

@NagyDonat NagyDonat merged commit 2577540 into llvm:main Feb 12, 2025
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:static analyzer clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants