Skip to content

Move autoharness_analyzer to scripts/ #350

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 6 additions & 4 deletions .github/workflows/kani.yml
Original file line number Diff line number Diff line change
Expand Up @@ -176,12 +176,14 @@ jobs:
# Step 3: Add output to job summary
- name: Add Autoharness Analyzer output to job summary
run: |
pushd scripts/autoharness_analyzer
echo "# Autoharness Failure Summary" >> "$GITHUB_STEP_SUMMARY"
echo "## Crate core, all functions" >> "$GITHUB_STEP_SUMMARY"
cat autoharness_analyzer/core_autoharness_data.md >> "$GITHUB_STEP_SUMMARY"
cat core_autoharness_data.md >> "$GITHUB_STEP_SUMMARY"
echo "## Crate core, unsafe functions" >> "$GITHUB_STEP_SUMMARY"
cat autoharness_analyzer/core_autoharness_data.md >> "$GITHUB_STEP_SUMMARY"
cat core_autoharness_data.md >> "$GITHUB_STEP_SUMMARY"
echo "## Crate std, all functions" >> "$GITHUB_STEP_SUMMARY"
cat autoharness_analyzer/std_autoharness_data.md >> "$GITHUB_STEP_SUMMARY"
cat std_autoharness_data.md >> "$GITHUB_STEP_SUMMARY"
echo "## Crate std, unsafe functions" >> "$GITHUB_STEP_SUMMARY"
cat autoharness_analyzer/std_unsafe_autoharness_data.md >> "$GITHUB_STEP_SUMMARY"
cat std_unsafe_autoharness_data.md >> "$GITHUB_STEP_SUMMARY"
popd
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@ Session.vim
/kani_build/
/target
library/target
scripts/autoharness_analyzer/target/
*.rlib
*.rmeta
*.mir
Expand Down
13 changes: 13 additions & 0 deletions scripts/autoharness_analyzer/Cargo.toml
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
[package]
name = "autoharness_analyzer"
version = "0.1.0"
edition = "2021"

[dependencies]
anyhow = "1.0.97"
clap = {version = "4.5.37", features = ["derive"] }
serde = { version = "1.0.219", features = ["derive"] }
serde_json = "1.0.140"
strum = "0.27.1"
strum_macros = "0.27.1"
to_markdown_table = "0.1.5"
16 changes: 16 additions & 0 deletions scripts/autoharness_analyzer/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
Invoke with:
```
cargo run metadata/ scanner_results/
```

The metadata/ folder contains the kani-metadata.json files produced by running:

```
kani autoharness --std ./library -Z function-contracts -Z mem-predicates -Z float-lib -Z c-ffi -Z loop-contracts --only-codegen --no-assert-contracts -j --output-format=terse -Z unstable-options -Z autoharness --cbmc-args --object-bits 12
```

on the standard library. The scanner_results/ directory contains the CSV files that the [scanner tool in Kani](https://github.com/model-checking/kani/tree/main/tools/scanner) produces.

The output is `autoharness_data.md`, which contains Markdown tables summarizing the autoharness application across all the crates in the standard library.

One of the tables has a column for "Skipped Type Categories." Generally speaking, "precise types" are what we think of as actual Rust types, and "type categories" are my subjective sense of how to group those types further. For example, `&mut i32` and `&mut u32` are two precise types, but they're in the same type category `&mut`. See the code for exact details on how we create type categories; the TL;DR is that we have a few hardcoded ones for raw pointers and references, and the rest we create using a fully-qualified path splitting heuristic.
167 changes: 167 additions & 0 deletions scripts/autoharness_analyzer/src/main.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,167 @@
use anyhow::Result;
use clap::Parser;
use make_tables::compute_metrics;
use parse_scanner_output::process_scan_fns;
use serde::{Deserialize, Serialize};
use serde_json::Value;
use std::{
collections::{BTreeMap, HashMap},
fs,
path::Path,
};
use strum_macros::{Display, EnumString};

mod make_tables;
mod parse_scanner_output;

#[derive(Parser, Debug)]
struct Args {
/// Path to directory with kani_metadata.json files
#[arg(required = true)]
metadata_dir_path: String,

/// Scanner results directory path
#[arg(required = true)]
scanner_results_path: String,

/// Generate per-crate tables instead of combined data
#[arg(long)]
per_crate: bool,

/// Process only the specified crate
#[arg(long, value_name = "CRATE")]
for_crate: Option<String>,

/// Show precise types in the output tables
#[arg(long)]
show_precise_types: bool,

/// Only output data for unsafe functions
#[arg(long)]
unsafe_fns_only: bool,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct AutoHarnessMetadata {
/// Functions we generated automatic harnesses for.
pub chosen: Vec<String>,
/// Map function names to the reason why we did not generate an automatic harness for that function.
/// We use an ordered map so that when we print the data, it is ordered alphabetically by function name.
pub skipped: BTreeMap<String, AutoHarnessSkipReason>,
}

/// Reasons that Kani does not generate an automatic harness for a function.
#[derive(Debug, Clone, Serialize, Deserialize, Display, EnumString)]
pub enum AutoHarnessSkipReason {
/// The function is generic.
#[strum(serialize = "Generic Function")]
GenericFn,
/// A Kani-internal function: already a harness, implementation of a Kani associated item or Kani contract instrumentation functions).
#[strum(serialize = "Kani implementation")]
KaniImpl,
/// At least one of the function's arguments does not implement kani::Arbitrary
/// (The Vec<(String, String)> contains the list of (name, type) tuples for each argument that does not implement it
#[strum(serialize = "Missing Arbitrary implementation for argument(s)")]
MissingArbitraryImpl(Vec<(String, String)>),
/// The function does not have a body.
#[strum(serialize = "The function does not have a body")]
NoBody,
/// The function doesn't match the user's provided filters.
#[strum(serialize = "Did not match provided filters")]
UserFilter,
}

impl Default for AutoHarnessMetadata {
fn default() -> Self {
Self::new()
}
}

impl AutoHarnessMetadata {
pub fn new() -> Self {
Self {
chosen: Vec::new(),
skipped: BTreeMap::new(),
}
}

pub fn extend(&mut self, other: AutoHarnessMetadata) {
self.chosen.extend(other.chosen);
self.skipped.extend(other.skipped);
}
}

fn main() -> Result<()> {
let args = Args::parse();

// Collection of data from all crates
let mut cross_crate_fn_to_row_data = HashMap::new();
let mut cross_crate_autoharness_md = AutoHarnessMetadata::new();

// Iterate over all kani-metadata.json files; one per crate
for entry in fs::read_dir(&args.metadata_dir_path)? {
let entry = entry?;
let path = entry.path();
if !path.to_string_lossy().contains("kani-metadata.json") {
continue;
}

let kani_md_file_data =
std::fs::read_to_string(&path).unwrap_or_else(|_| panic!("Unable to read {:?}", path));
let v: Value = serde_json::from_str(&kani_md_file_data)?;
let crate_name = v["crate_name"].as_str().unwrap();

// Skip if a specific crate was requested and this isn't it
if let Some(ref target_crate) = args.for_crate {
if target_crate != crate_name {
continue;
}
}

println!("Processing crate {crate_name}");

let scanner_fn_csv = format!(
"{}/{}_scan_functions.csv",
args.scanner_results_path, crate_name
);
let scanner_fn_csv_path = Path::new(&scanner_fn_csv);
let fn_to_row_data = process_scan_fns(scanner_fn_csv_path)?;

let autoharness_md: AutoHarnessMetadata =
serde_json::from_value(v["autoharness_md"].clone())?;

if args.per_crate {
// Process each crate separately
compute_metrics(
crate_name,
&autoharness_md,
&fn_to_row_data,
args.show_precise_types,
args.unsafe_fns_only,
)?;
} else if args.for_crate.is_some() {
return compute_metrics(
crate_name,
&autoharness_md,
&fn_to_row_data,
args.show_precise_types,
args.unsafe_fns_only,
);
} else {
cross_crate_fn_to_row_data.extend(fn_to_row_data);
cross_crate_autoharness_md.extend(autoharness_md);
}
}

// Process combined data if not doing per-crate or single-crate analysis
if !args.per_crate {
compute_metrics(
"all_crates",
&cross_crate_autoharness_md,
&cross_crate_fn_to_row_data,
args.show_precise_types,
args.unsafe_fns_only,
)?;
}

Ok(())
}
Loading
Loading