Skip to content

[analyzer] Fix crash when modelling 'getline' function in checkers #145229

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jun 24, 2025

Conversation

vbvictor
Copy link
Contributor

@vbvictor vbvictor commented Jun 22, 2025

Fixed crash caused by types mismatch of expected getline and getdelim signatures and actual user-defined signatures.

There are 2 ways to fix the issue:

  1. Make sure that all parameters of CallEvent underlying FunctionDecl has the exact types of unix getline function. This way bloats the code by rigorous checking but may eliminate unnecessary transitions. I didn't find a way to eliminate code duplication, is there a Utils directory for CSA to place such utility functions?

  2. Add more error-handling to CheckGetDelim methods to avoid crushes (as for now, it crushed trying to dereference an invalid optional). However, we will still try to model "incorrect" getline methods, wasting cpu.

I currently implemented 1st solution, but I'm happy to implement 2nd if it's more appropriate for CSA, WDYT?
I will create more PRs for other potential crushes of unix-function if we settle on the way of fixing it.

Addresses #144884.

@vbvictor vbvictor requested review from steakhal and NagyDonat June 22, 2025 11:29
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:static analyzer labels Jun 22, 2025
@llvmbot
Copy link
Member

llvmbot commented Jun 22, 2025

@llvm/pr-subscribers-clang

Author: Baranov Victor (vbvictor)

Changes

Fixed crush caused by types mismatch of expected getline and getdelim signatures and actual user-defined signatures.

There are 2 ways to fix the issue:

  1. Make sure that all parameters of CallEvent underlying FunctionDecl has the exact types of unix getline function. This way bloats the code by rigorous checking but may eliminate unnecessary transitions. I didn't find a way to eliminate code duplication, is there a Utils directory for CSA to place such utility functions?

  2. Add more error-handling to CheckGetDelim methods to avoid crushes (as for now, it crushed trying to dereference an invalid optional). However, we will still try to model "incorrect" getline methods, wasting cpu.

I currently implemented 1st solution, but I'm happy to implement 2nd if it's more appropriate for CSA, WDYT?
I will create more PRs for other potential crushes of unix-function if we settle on the way of fixing it.

Addresses #144884.


Full diff: https://github.com/llvm/llvm-project/pull/145229.diff

4 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+3)
  • (modified) clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp (+54-2)
  • (modified) clang/lib/StaticAnalyzer/Checkers/UnixAPIChecker.cpp (+50-1)
  • (added) clang/test/Analysis/getline-unixapi-invalid-signatures.c (+130)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 12816eed2e8b5..b6f6dce29a87b 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1060,6 +1060,9 @@ impact the linker behaviour like the other `-static-*` flags.
 Crash and bug fixes
 ^^^^^^^^^^^^^^^^^^^
 
+- Fixed a crash in ``UnixAPIMisuseChecker`` and ``MallocChecker`` when analyzing
+  code with non-standard ``getline`` or ``getdelim`` function signatures.
+
 Improvements
 ^^^^^^^^^^^^
 
diff --git a/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
index 35e98a5e2719a..b3f0110539ffa 100644
--- a/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
@@ -1482,11 +1482,63 @@ static bool isFromStdNamespace(const CallEvent &Call) {
   return FD->isInStdNamespace();
 }
 
+static bool IsGetDelim(const CallEvent &Call, CheckerContext &C) {
+  const FunctionDecl *FD = dyn_cast_if_present<FunctionDecl>(Call.getDecl());
+  if (!FD || FD->getKind() != Decl::Function)
+    return false;
+
+  const StringRef FName = C.getCalleeName(FD);
+  const bool IsDelim = FName == "getdelim";
+  const bool IsLine = FName == "getline";
+
+  unsigned ExpectedNumArgs = 0;
+  if (IsDelim)
+    ExpectedNumArgs = 4;
+  else if (IsLine)
+    ExpectedNumArgs = 3;
+  else
+    return false;
+
+  if (FD->getNumParams() != ExpectedNumArgs)
+    return false;
+
+  // Should be char**
+  const QualType CharPtrType = FD->getParamDecl(0)->getType()->getPointeeType();
+  if (CharPtrType.isNull())
+    return false;
+  const QualType CharType = CharPtrType->getPointeeType();
+  if (CharType.isNull() || !CharType->isCharType())
+    return false;
+
+  // Should be size_t*
+  const QualType SizePtrType = FD->getParamDecl(1)->getType();
+  if (!SizePtrType->isPointerType())
+    return false;
+  const QualType SizeArgType = SizePtrType->getPointeeType();
+  const ASTContext &Ctx = C.getASTContext();
+  if (SizeArgType.isNull() ||
+      !Ctx.hasSameUnqualifiedType(SizeArgType, Ctx.getSizeType()))
+    return false;
+
+  // For getdelim, should be int
+  if (IsDelim)
+    if (!FD->getParamDecl(2)->getType()->isIntegerType())
+      return false;
+
+  // Last parameter should be FILE*
+  const QualType Pointee =
+      FD->getParamDecl(IsDelim ? 3 : 2)->getType()->getPointeeType();
+  if (Pointee.isNull() || Pointee.getAsString() != "FILE")
+    return false;
+
+  return true;
+}
+
 void MallocChecker::preGetdelim(ProgramStateRef State, const CallEvent &Call,
                                 CheckerContext &C) const {
   // Discard calls to the C++ standard library function std::getline(), which
   // is completely unrelated to the POSIX getline() that we're checking.
-  if (isFromStdNamespace(Call))
+  if (isFromStdNamespace(Call) || !IsGetDelim(Call, C))
     return;
 
   const auto LinePtr = getPointeeVal(Call.getArgSVal(0), State);
@@ -1509,7 +1561,7 @@ void MallocChecker::checkGetdelim(ProgramStateRef State, const CallEvent &Call,
                                   CheckerContext &C) const {
   // Discard calls to the C++ standard library function std::getline(), which
   // is completely unrelated to the POSIX getline() that we're checking.
-  if (isFromStdNamespace(Call))
+  if (isFromStdNamespace(Call) || !IsGetDelim(Call, C))
     return;
 
   // Handle the post-conditions of getline and getdelim:
diff --git a/clang/lib/StaticAnalyzer/Checkers/UnixAPIChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/UnixAPIChecker.cpp
index 79d10d99e11d0..11d82f6de5e62 100644
--- a/clang/lib/StaticAnalyzer/Checkers/UnixAPIChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/UnixAPIChecker.cpp
@@ -147,6 +147,55 @@ ProgramStateRef UnixAPIMisuseChecker::EnsurePtrNotNull(
   return PtrNotNull;
 }
 
+static bool IsGetDelim(const CallEvent &Call, const FunctionDecl *FD,
+                       CheckerContext &C) {
+  const StringRef FName = C.getCalleeName(FD);
+  const bool IsDelim = FName == "getdelim";
+  const bool IsLine = FName == "getline";
+
+  unsigned ExpectedNumArgs = 0;
+  if (IsDelim)
+    ExpectedNumArgs = 4;
+  else if (IsLine)
+    ExpectedNumArgs = 3;
+  else
+    return false;
+
+  if (FD->getNumParams() != ExpectedNumArgs)
+    return false;
+
+  // Should be char**
+  const QualType CharPtrType = FD->getParamDecl(0)->getType()->getPointeeType();
+  if (CharPtrType.isNull())
+    return false;
+  const QualType CharType = CharPtrType->getPointeeType();
+  if (CharType.isNull() || !CharType->isCharType())
+    return false;
+
+  // Should be size_t*
+  const QualType SizePtrType = FD->getParamDecl(1)->getType();
+  if (!SizePtrType->isPointerType())
+    return false;
+  const QualType SizeArgType = SizePtrType->getPointeeType();
+  const ASTContext &Ctx = C.getASTContext();
+  if (SizeArgType.isNull() ||
+      !Ctx.hasSameUnqualifiedType(SizeArgType, Ctx.getSizeType()))
+    return false;
+
+  // For getdelim, should be int
+  if (IsDelim)
+    if (!FD->getParamDecl(2)->getType()->isIntegerType())
+      return false;
+
+  // Last parameter should be FILE*
+  const QualType Pointee =
+      FD->getParamDecl(IsDelim ? 3 : 2)->getType()->getPointeeType();
+  if (Pointee.isNull() || Pointee.getAsString() != "FILE")
+    return false;
+
+  return true;
+}
+
 //===----------------------------------------------------------------------===//
 // "open" (man 2 open)
 //===----------------------------------------------------------------------===/
@@ -176,7 +225,7 @@ void UnixAPIMisuseChecker::checkPreCall(const CallEvent &Call,
   else if (FName == "pthread_once")
     CheckPthreadOnce(C, Call);
 
-  else if (is_contained({"getdelim", "getline"}, FName))
+  else if (IsGetDelim(Call, FD, C))
     CheckGetDelim(C, Call);
 }
 void UnixAPIMisuseChecker::ReportOpenBug(CheckerContext &C,
diff --git a/clang/test/Analysis/getline-unixapi-invalid-signatures.c b/clang/test/Analysis/getline-unixapi-invalid-signatures.c
new file mode 100644
index 0000000000000..ad218937f1184
--- /dev/null
+++ b/clang/test/Analysis/getline-unixapi-invalid-signatures.c
@@ -0,0 +1,130 @@
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix -verify %s -DTEST_CORRECT
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix -verify %s -DTEST_GETLINE_1
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix -verify %s -DTEST_GETLINE_2
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix -verify %s -DTEST_GETLINE_3
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix -verify %s -DTEST_GETLINE_4
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix -verify %s -DTEST_GETLINE_5
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix -verify %s -DTEST_GETLINE_GH144884
+
+// emulator of "system-header-simulator.h" because of redefinition of 'getline' function
+typedef struct _FILE FILE;
+typedef __typeof(sizeof(int)) size_t;
+typedef long ssize_t;
+#define NULL 0
+
+int fclose(FILE *fp);
+FILE *tmpfile(void);
+
+#ifdef TEST_CORRECT
+ssize_t getline(char **lineptr, size_t *n, FILE *stream);
+ssize_t getdelim(char **lineptr, size_t *n, int delimiter, FILE *stream);
+
+void test_correct() {
+  FILE *F1 = tmpfile();
+  if (!F1)
+    return;
+  char *buffer = NULL;
+  getline(&buffer, NULL, F1); // expected-warning {{Size pointer might be NULL}}
+  fclose(F1);
+}
+
+void test_delim_correct() {
+  FILE *F1 = tmpfile();
+  if (!F1)
+    return;
+  char *buffer = NULL;
+  getdelim(&buffer, NULL, ',', F1); // expected-warning {{Size pointer might be NULL}}
+  fclose(F1);
+}
+#endif
+
+#ifdef TEST_GETLINE_1
+// expected-no-diagnostics
+ssize_t getline(int lineptr);
+
+void test() {
+  FILE *F1 = tmpfile();
+  if (!F1)
+    return;
+  int buffer = 0;
+  getline(buffer);
+  fclose(F1);
+}
+#endif
+
+#ifdef TEST_GETLINE_2
+// expected-no-diagnostics
+ssize_t getline(char **lineptr, size_t *n);
+
+void test() {
+  FILE *F1 = tmpfile();
+  if (!F1)
+    return;
+  char *buffer = NULL;
+  getline(&buffer, NULL);
+  fclose(F1);
+}
+#endif
+
+#ifdef TEST_GETLINE_3
+// expected-no-diagnostics
+ssize_t getline(char **lineptr, size_t n, FILE *stream);
+
+void test() {
+  FILE *F1 = tmpfile();
+  if (!F1)
+    return;
+  char *buffer = NULL;
+  getline(&buffer, 0, F1);
+  fclose(F1);
+}
+#endif
+
+#ifdef TEST_GETLINE_4
+// expected-no-diagnostics
+ssize_t getline(char **lineptr, size_t *n, int stream);
+ssize_t getdelim(char **lineptr, size_t *n, int delimiter, int stream);
+
+void test() {
+  FILE *F1 = tmpfile();
+  if (!F1)
+    return;
+  char *buffer = NULL;
+  getline(&buffer, NULL, 1);
+  fclose(F1);
+}
+
+void test_delim() {
+  FILE *F1 = tmpfile();
+  if (!F1)
+    return;
+  char *buffer = NULL;
+  getdelim(&buffer, NULL, ',', 1);
+  fclose(F1);
+}
+#endif
+
+#ifdef TEST_GETLINE_5
+// expected-no-diagnostics
+ssize_t getdelim(char **lineptr, size_t *n, const char* delimiter, FILE *stream);
+
+void test_delim() {
+  FILE *F1 = tmpfile();
+  if (!F1)
+    return;
+  char *buffer = NULL;
+  getdelim(&buffer, NULL, ",", F1);
+  fclose(F1);
+}
+#endif
+
+#ifdef TEST_GETLINE_GH144884
+// expected-no-diagnostics
+struct AW_string {};
+void getline(int *, struct AW_string);
+void top() {
+  struct AW_string line;
+  int getline_file_info;
+  getline(&getline_file_info, line);
+}
+#endif

@llvmbot
Copy link
Member

llvmbot commented Jun 22, 2025

@llvm/pr-subscribers-clang-static-analyzer-1

Author: Baranov Victor (vbvictor)

Changes

Fixed crush caused by types mismatch of expected getline and getdelim signatures and actual user-defined signatures.

There are 2 ways to fix the issue:

  1. Make sure that all parameters of CallEvent underlying FunctionDecl has the exact types of unix getline function. This way bloats the code by rigorous checking but may eliminate unnecessary transitions. I didn't find a way to eliminate code duplication, is there a Utils directory for CSA to place such utility functions?

  2. Add more error-handling to CheckGetDelim methods to avoid crushes (as for now, it crushed trying to dereference an invalid optional). However, we will still try to model "incorrect" getline methods, wasting cpu.

I currently implemented 1st solution, but I'm happy to implement 2nd if it's more appropriate for CSA, WDYT?
I will create more PRs for other potential crushes of unix-function if we settle on the way of fixing it.

Addresses #144884.


Full diff: https://github.com/llvm/llvm-project/pull/145229.diff

4 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+3)
  • (modified) clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp (+54-2)
  • (modified) clang/lib/StaticAnalyzer/Checkers/UnixAPIChecker.cpp (+50-1)
  • (added) clang/test/Analysis/getline-unixapi-invalid-signatures.c (+130)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 12816eed2e8b5..b6f6dce29a87b 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1060,6 +1060,9 @@ impact the linker behaviour like the other `-static-*` flags.
 Crash and bug fixes
 ^^^^^^^^^^^^^^^^^^^
 
+- Fixed a crash in ``UnixAPIMisuseChecker`` and ``MallocChecker`` when analyzing
+  code with non-standard ``getline`` or ``getdelim`` function signatures.
+
 Improvements
 ^^^^^^^^^^^^
 
diff --git a/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
index 35e98a5e2719a..b3f0110539ffa 100644
--- a/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
@@ -1482,11 +1482,63 @@ static bool isFromStdNamespace(const CallEvent &Call) {
   return FD->isInStdNamespace();
 }
 
+static bool IsGetDelim(const CallEvent &Call, CheckerContext &C) {
+  const FunctionDecl *FD = dyn_cast_if_present<FunctionDecl>(Call.getDecl());
+  if (!FD || FD->getKind() != Decl::Function)
+    return false;
+
+  const StringRef FName = C.getCalleeName(FD);
+  const bool IsDelim = FName == "getdelim";
+  const bool IsLine = FName == "getline";
+
+  unsigned ExpectedNumArgs = 0;
+  if (IsDelim)
+    ExpectedNumArgs = 4;
+  else if (IsLine)
+    ExpectedNumArgs = 3;
+  else
+    return false;
+
+  if (FD->getNumParams() != ExpectedNumArgs)
+    return false;
+
+  // Should be char**
+  const QualType CharPtrType = FD->getParamDecl(0)->getType()->getPointeeType();
+  if (CharPtrType.isNull())
+    return false;
+  const QualType CharType = CharPtrType->getPointeeType();
+  if (CharType.isNull() || !CharType->isCharType())
+    return false;
+
+  // Should be size_t*
+  const QualType SizePtrType = FD->getParamDecl(1)->getType();
+  if (!SizePtrType->isPointerType())
+    return false;
+  const QualType SizeArgType = SizePtrType->getPointeeType();
+  const ASTContext &Ctx = C.getASTContext();
+  if (SizeArgType.isNull() ||
+      !Ctx.hasSameUnqualifiedType(SizeArgType, Ctx.getSizeType()))
+    return false;
+
+  // For getdelim, should be int
+  if (IsDelim)
+    if (!FD->getParamDecl(2)->getType()->isIntegerType())
+      return false;
+
+  // Last parameter should be FILE*
+  const QualType Pointee =
+      FD->getParamDecl(IsDelim ? 3 : 2)->getType()->getPointeeType();
+  if (Pointee.isNull() || Pointee.getAsString() != "FILE")
+    return false;
+
+  return true;
+}
+
 void MallocChecker::preGetdelim(ProgramStateRef State, const CallEvent &Call,
                                 CheckerContext &C) const {
   // Discard calls to the C++ standard library function std::getline(), which
   // is completely unrelated to the POSIX getline() that we're checking.
-  if (isFromStdNamespace(Call))
+  if (isFromStdNamespace(Call) || !IsGetDelim(Call, C))
     return;
 
   const auto LinePtr = getPointeeVal(Call.getArgSVal(0), State);
@@ -1509,7 +1561,7 @@ void MallocChecker::checkGetdelim(ProgramStateRef State, const CallEvent &Call,
                                   CheckerContext &C) const {
   // Discard calls to the C++ standard library function std::getline(), which
   // is completely unrelated to the POSIX getline() that we're checking.
-  if (isFromStdNamespace(Call))
+  if (isFromStdNamespace(Call) || !IsGetDelim(Call, C))
     return;
 
   // Handle the post-conditions of getline and getdelim:
diff --git a/clang/lib/StaticAnalyzer/Checkers/UnixAPIChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/UnixAPIChecker.cpp
index 79d10d99e11d0..11d82f6de5e62 100644
--- a/clang/lib/StaticAnalyzer/Checkers/UnixAPIChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/UnixAPIChecker.cpp
@@ -147,6 +147,55 @@ ProgramStateRef UnixAPIMisuseChecker::EnsurePtrNotNull(
   return PtrNotNull;
 }
 
+static bool IsGetDelim(const CallEvent &Call, const FunctionDecl *FD,
+                       CheckerContext &C) {
+  const StringRef FName = C.getCalleeName(FD);
+  const bool IsDelim = FName == "getdelim";
+  const bool IsLine = FName == "getline";
+
+  unsigned ExpectedNumArgs = 0;
+  if (IsDelim)
+    ExpectedNumArgs = 4;
+  else if (IsLine)
+    ExpectedNumArgs = 3;
+  else
+    return false;
+
+  if (FD->getNumParams() != ExpectedNumArgs)
+    return false;
+
+  // Should be char**
+  const QualType CharPtrType = FD->getParamDecl(0)->getType()->getPointeeType();
+  if (CharPtrType.isNull())
+    return false;
+  const QualType CharType = CharPtrType->getPointeeType();
+  if (CharType.isNull() || !CharType->isCharType())
+    return false;
+
+  // Should be size_t*
+  const QualType SizePtrType = FD->getParamDecl(1)->getType();
+  if (!SizePtrType->isPointerType())
+    return false;
+  const QualType SizeArgType = SizePtrType->getPointeeType();
+  const ASTContext &Ctx = C.getASTContext();
+  if (SizeArgType.isNull() ||
+      !Ctx.hasSameUnqualifiedType(SizeArgType, Ctx.getSizeType()))
+    return false;
+
+  // For getdelim, should be int
+  if (IsDelim)
+    if (!FD->getParamDecl(2)->getType()->isIntegerType())
+      return false;
+
+  // Last parameter should be FILE*
+  const QualType Pointee =
+      FD->getParamDecl(IsDelim ? 3 : 2)->getType()->getPointeeType();
+  if (Pointee.isNull() || Pointee.getAsString() != "FILE")
+    return false;
+
+  return true;
+}
+
 //===----------------------------------------------------------------------===//
 // "open" (man 2 open)
 //===----------------------------------------------------------------------===/
@@ -176,7 +225,7 @@ void UnixAPIMisuseChecker::checkPreCall(const CallEvent &Call,
   else if (FName == "pthread_once")
     CheckPthreadOnce(C, Call);
 
-  else if (is_contained({"getdelim", "getline"}, FName))
+  else if (IsGetDelim(Call, FD, C))
     CheckGetDelim(C, Call);
 }
 void UnixAPIMisuseChecker::ReportOpenBug(CheckerContext &C,
diff --git a/clang/test/Analysis/getline-unixapi-invalid-signatures.c b/clang/test/Analysis/getline-unixapi-invalid-signatures.c
new file mode 100644
index 0000000000000..ad218937f1184
--- /dev/null
+++ b/clang/test/Analysis/getline-unixapi-invalid-signatures.c
@@ -0,0 +1,130 @@
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix -verify %s -DTEST_CORRECT
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix -verify %s -DTEST_GETLINE_1
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix -verify %s -DTEST_GETLINE_2
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix -verify %s -DTEST_GETLINE_3
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix -verify %s -DTEST_GETLINE_4
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix -verify %s -DTEST_GETLINE_5
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix -verify %s -DTEST_GETLINE_GH144884
+
+// emulator of "system-header-simulator.h" because of redefinition of 'getline' function
+typedef struct _FILE FILE;
+typedef __typeof(sizeof(int)) size_t;
+typedef long ssize_t;
+#define NULL 0
+
+int fclose(FILE *fp);
+FILE *tmpfile(void);
+
+#ifdef TEST_CORRECT
+ssize_t getline(char **lineptr, size_t *n, FILE *stream);
+ssize_t getdelim(char **lineptr, size_t *n, int delimiter, FILE *stream);
+
+void test_correct() {
+  FILE *F1 = tmpfile();
+  if (!F1)
+    return;
+  char *buffer = NULL;
+  getline(&buffer, NULL, F1); // expected-warning {{Size pointer might be NULL}}
+  fclose(F1);
+}
+
+void test_delim_correct() {
+  FILE *F1 = tmpfile();
+  if (!F1)
+    return;
+  char *buffer = NULL;
+  getdelim(&buffer, NULL, ',', F1); // expected-warning {{Size pointer might be NULL}}
+  fclose(F1);
+}
+#endif
+
+#ifdef TEST_GETLINE_1
+// expected-no-diagnostics
+ssize_t getline(int lineptr);
+
+void test() {
+  FILE *F1 = tmpfile();
+  if (!F1)
+    return;
+  int buffer = 0;
+  getline(buffer);
+  fclose(F1);
+}
+#endif
+
+#ifdef TEST_GETLINE_2
+// expected-no-diagnostics
+ssize_t getline(char **lineptr, size_t *n);
+
+void test() {
+  FILE *F1 = tmpfile();
+  if (!F1)
+    return;
+  char *buffer = NULL;
+  getline(&buffer, NULL);
+  fclose(F1);
+}
+#endif
+
+#ifdef TEST_GETLINE_3
+// expected-no-diagnostics
+ssize_t getline(char **lineptr, size_t n, FILE *stream);
+
+void test() {
+  FILE *F1 = tmpfile();
+  if (!F1)
+    return;
+  char *buffer = NULL;
+  getline(&buffer, 0, F1);
+  fclose(F1);
+}
+#endif
+
+#ifdef TEST_GETLINE_4
+// expected-no-diagnostics
+ssize_t getline(char **lineptr, size_t *n, int stream);
+ssize_t getdelim(char **lineptr, size_t *n, int delimiter, int stream);
+
+void test() {
+  FILE *F1 = tmpfile();
+  if (!F1)
+    return;
+  char *buffer = NULL;
+  getline(&buffer, NULL, 1);
+  fclose(F1);
+}
+
+void test_delim() {
+  FILE *F1 = tmpfile();
+  if (!F1)
+    return;
+  char *buffer = NULL;
+  getdelim(&buffer, NULL, ',', 1);
+  fclose(F1);
+}
+#endif
+
+#ifdef TEST_GETLINE_5
+// expected-no-diagnostics
+ssize_t getdelim(char **lineptr, size_t *n, const char* delimiter, FILE *stream);
+
+void test_delim() {
+  FILE *F1 = tmpfile();
+  if (!F1)
+    return;
+  char *buffer = NULL;
+  getdelim(&buffer, NULL, ",", F1);
+  fclose(F1);
+}
+#endif
+
+#ifdef TEST_GETLINE_GH144884
+// expected-no-diagnostics
+struct AW_string {};
+void getline(int *, struct AW_string);
+void top() {
+  struct AW_string line;
+  int getline_file_info;
+  getline(&getline_file_info, line);
+}
+#endif

@steakhal
Copy link
Contributor

Ideally, we could reuse the signature matching of the StdLibraryFunctionsChecker. It shouldn't be particularly difficult to hoist it, but also a considerable work by the size of the change.
I think the Checkers library component (or directory) is suitable for it. We could call it Signature.{cpp,h}.

@NagyDonat
Copy link
Contributor

Personally I feel that option 2 ("add more error handling to avoid crashes in checkGetdelim") would be more "natural" for the static analyzer, because checking the exact types of the parameters is rare in the analyzer (I don't know any significant example apart from StdLibraryFunctionsChecker). Not checking the exact types is also useful because this makes the analyzer more robust against slight differences between different standard library implementations (e.g. MSVC headers are notorious for minor awkward differences).

Your concern that "However, we will still try to model "incorrect" getline methods, wasting cpu." is very premature optimization, the runtime change would be totally negligible -- and probably less than the (also negligible) cost of detailed type checking. The majority of the analyzer runtime is focused in a few hot spots, while the majority of the logic is mostly irrelevant from a performance POV.

@vbvictor
Copy link
Contributor Author

Your concern that "However, we will still try to model "incorrect" getline methods, wasting cpu." is very premature optimization, the runtime change would be totally negligible

Indeed, without measurements, this argument is not viable.

I've investigated the code of StdLibraryFunctionsChecker and the part that deals with function signatures can be extracted in separate file without many problems.

For now, I can implement more error-handling as in option 2 and refactor StdLibraryFunctionsChecker later if needed.

@steakhal
Copy link
Contributor

Modeling a function should only happen if that's the intended function. Sometimes this matching is deliberately fuzzy, but most often it's incidentally so and they should have really check the parameter and return types for exact match. Or at least the parameters that the model cares about. Such incidental matches are usually not a big deal, unless they lead to false assumptions thus crashes like in this case.
It's also true that it's just so cumbersome and ugly to match these types that we had an implicit tendency to simply avoid matching these. So if we had convenient tools for matching these types and signatures, the story would look different.

I've investigated the code of StdLibraryFunctionsChecker and the part that deals with function signatures can be extracted in separate file without many problems.

For now, I can implement more error-handling as in option 2 and refactor StdLibraryFunctionsChecker later if needed.

I don't want to push you work, but I'm curious. If it was easy to hoist the Signature class, would it be easier to use this Signature class for implementing the matching code instead of hand-rolling out something custom?

Copy link

github-actions bot commented Jun 23, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@vbvictor
Copy link
Contributor Author

would it be easier to use this Signature class for implementing the matching code instead of hand-rolling out something custom?

It's "easier" to some extent, I think it's more readable and maintainable with full dedicated signature matching. I created a small PR in my LLVM fork to demonstrate how it can look like vbvictor#1.

I updated this PR and implemented option 2. It has more gentle error handling, just avoid crushes. But it still accept all functions that "look like" getline. Note how there are false-positives with absurd functions, like in TEST_GETLINE_2.

I think it's still helpful to have strict function signature matching. Even in option 2 I needed to explicitly check types to avoid crush, so having a dedicated instrument would make it better.

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update 🙂 !

I think the current revision of this PR is a good patch for fixing the crash in a clean and self-contained way. I also looked at the Signature interface (on the private branch that you linked), and it seems to be promising architectural improvement, but I would prefer keeping this bugfix and that internal architectural improvement in two separate commits if that isn't too much trouble.

I marked a few opportunities for minor simplification in inline comments, but once those are handled, I'd happily support merging this commit.

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but wait a bit for @steakhal.

@vbvictor
Copy link
Contributor Author

I would prefer keeping this bugfix and that internal architectural improvement in two separate commits if that isn't too much trouble.

Yes, making Signature class should be a separate refactor change. I think we should come to a consensus in which checks (or a subset of functions inside checks) we want to have strict type-matching

Copy link
Contributor

@steakhal steakhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good. I think we can simplify some code by using dyn_cast_or_null at a couple of places. I also found a bug.
I think this is heading to the right direction.

@vbvictor vbvictor requested a review from steakhal June 24, 2025 16:02
Copy link
Contributor

@steakhal steakhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have one question remaining, but it's already great.
Thank you for fixing this crash!

@steakhal steakhal changed the title [clang][analyzer] fix crash when modelling 'getline' function in checkers [analyzer] Fix crash when modelling 'getline' function in checkers Jun 24, 2025
@vbvictor
Copy link
Contributor Author

Also updated names of function in MallocChecker to match corresponding functions names in UnixApiChecker

@steakhal steakhal merged commit de2ec22 into llvm:main Jun 24, 2025
6 of 8 checks passed
DrSergei pushed a commit to DrSergei/llvm-project that referenced this pull request Jun 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:static analyzer clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants