In this repository, we host all the data and code related to our paper titled "Replacing Training with Reasoning: Reinterpreting Classic ML Pipelines with LLMs".
Large Language Models (LLMs) are increasingly used in software engineering tasks due to their strong performance across diverse applications. In this paper, we ask a fundamental and novel question: To what extent can LLMs replace traditional machine learning pipelines that rely on labeled data, feature engineering, and retraining? Our intuition is that many long-standing approaches in software engineering can be reimagined through the lens of reasoning rather than training. Unlike conventional pipelines that learn statistical patterns from data, LLMs can directly reason about contextual consistency using their pretrained knowledge. To illustrate this idea, we revisit a well-known anomaly detection pipeline (CHABADA for Android apps) and show how its clustering and retraining stages can be replaced with a simple prompting strategy. The result is a streamlined, zero-shot workflow that leverages semantic reasoning without labeled datasets, feature extraction, or retraining. Our goal is not to propose a new tool, but to highlight a broader paradigm: LLMs open the door to reinterpreting established ML-based workflows as reasoning pipelines. This perspective suggests a path toward lighter-weight, training-free alternatives for many specialized software engineering tasks.
The repository is organized into two main directories:
- 📁 **0_Data**: This directory contains all the data needed to run our experiments.
- 📁 **1_Code**: Contains all the code related to our approach. The code is provided as multiple Jupyter Notebooks to facilitate execution.
To launch the Jupyter Notebooks, you will need several libraries. We provide a `requirements.txt` file that you can use to set up a conda environment.
Follow the steps below:

- Create a conda environment named `demoEnv`: `conda create --name demoEnv python=3.8`
- Activate the newly created environment: `conda activate demoEnv`
- Install the required packages using `pip` and `requirements.txt`: `pip install -r requirements.txt`

Once these steps are complete, your environment will be set up with all the necessary libraries.
To decompile APKs, ApkTool must be installed on your system. Follow the steps below to set it up:
- **Download ApkTool:** Visit the official ApkTool page at https://ibotpeaches.github.io/Apktool/ and download the latest version.
- **Install ApkTool:** Follow the installation instructions for your operating system, which typically involve:
  - Placing the downloaded JAR file in a suitable directory.
  - Adding the ApkTool executable to your system's PATH for easier access.
- **Verify Installation:** Ensure ApkTool is installed correctly by running `apktool` in your terminal. This should display the ApkTool usage instructions if the installation was successful.
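If you prefer to check the installation from Python before running the notebooks, a minimal sketch is shown below. It only assumes that the executable is invoked by the name `apktool` on your PATH; the helper name is our own.

```python
import shutil

def apktool_available():
    """Return True if an `apktool` executable can be found on PATH."""
    return shutil.which("apktool") is not None

# Usage: guard notebook cells that decompile APKs.
if not apktool_available():
    print("ApkTool not found on PATH; install it before running the notebooks.")
```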
To execute the entire code, two API keys are required: one for AndroZoo and one for the OpenAI API. These keys should be set in an environment file named `.env`, placed in the main folder of the provided repository.
The keys should be named `ANDROZOO_API_KEY` and `OPENAI_API_KEY`.
- **ANDROZOO_API_KEY**: This key is necessary to download apps from the AndroZoo repository, as various operations on the APK files are performed "on-the-fly," such as app download, extraction, and deletion. It can be requested here: https://androzoo.uni.lu/access
- **OPENAI_API_KEY**: This key is required to utilize the Embedding models from OpenAI through their official API (https://platform.openai.com/overview).

💸 **Note:** Please be aware that using OpenAI's models may incur costs depending on the volume and type of API usage. Refer to OpenAI's pricing page for details.
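For illustration, a `.env` file with the two keys can be loaded with the `python-dotenv` package, or with a minimal hand-rolled loader such as the sketch below. The loader and the placeholder key values are our own illustration, not code from the notebooks.

```python
import os

# Expected .env contents (placeholder values, not real keys):
#   ANDROZOO_API_KEY=your-androzoo-key
#   OPENAI_API_KEY=your-openai-key

def load_env(path=".env"):
    """Minimal stand-in for python-dotenv's load_dotenv(): reads KEY=VALUE
    lines into os.environ, skipping blank lines and '#' comments."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                os.environ.setdefault(key.strip(), value.strip())
```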
The provided Jupyter Notebooks facilitate the execution of our approach. The notebooks should be executed in the order listed below to ensure that data is processed correctly and inter-notebook dependencies are met.
1. **1_SensitiveAPIsExtraction.ipynb**
   Run this notebook first to extract all sensitive APIs invoked by each app using ApkTool and Androguard, based on a permission-to-method mapping from prior work.
2. **2_AnalysisWithLLM.ipynb**
   This notebook performs anomaly detection by prompting an LLM to assess whether each sensitive API call aligns with the app's inferred functionality, producing context-mismatch scores from 1 (benign) to 5 (suspicious).
3. **3_AblationStudy.ipynb**
   This notebook repeats the detection process without generating an intermediate summary of app functionalities, directly evaluating APIs against the raw app description to study the impact of summarization.
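As an illustration of the scoring step in `2_AnalysisWithLLM.ipynb`, the sketch below assembles a context-mismatch prompt for one sensitive API call and parses the model's numeric reply. The prompt wording, the `build_prompt`/`parse_score` helpers, and the example app are our own illustration, not the exact prompt used in the notebook.

```python
def build_prompt(app_summary, api_call):
    """Assemble a context-mismatch prompt for one sensitive API call.

    The wording here is illustrative; the notebook defines its own prompt.
    """
    return (
        "You are auditing an Android app.\n"
        f"App functionality summary: {app_summary}\n"
        f"Sensitive API call: {api_call}\n"
        "On a scale from 1 (clearly consistent with the functionality) to "
        "5 (highly suspicious), how well does this API call fit the app? "
        "Answer with a single digit."
    )

def parse_score(reply):
    """Extract the first 1-5 digit from the model's reply; None if absent."""
    for ch in reply:
        if ch in "12345":
            return int(ch)
    return None

prompt = build_prompt(
    "A flashlight app that toggles the camera LED.",
    "android.telephony.SmsManager.sendTextMessage",
)
# A reply such as "5 - sending SMS is unrelated to a flashlight" parses to 5.
```

The prompt string would be sent to the OpenAI API; only the parsing of the returned score is exercised here, so the sketch runs without an API key.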