
Commit 22a3372

add scripts and yml files
Signed-off-by: Venky Ganesh <[email protected]>
1 parent 2b27810 commit 22a3372

26 files changed: 9,964 additions, 0 deletions

cpp/CMakeLists.txt

Lines changed: 5 additions & 0 deletions
@@ -328,6 +328,11 @@ endif()

 set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DBUILD_SYSTEM=cmake_oss ")

+# Generate dependency files (.d) to track all header dependencies. This creates
+# .d files alongside .o files, showing all headers used.
+set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -MD -MP")
+set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -MD -MP")
+
 # note: cmake expr generation $<BOOL:${ENABLE_MULTI_DEVICE}> is a build time
 # evaluation so hard to debug at cmake time
 if(ENABLE_MULTI_DEVICE)

cpp/dependency_scan/README.md

Lines changed: 203 additions & 0 deletions
@@ -0,0 +1,203 @@
1+
# CPP Dependency Scanner
2+
3+
Scans TensorRT-LLM build artifacts (headers, libraries, binaries) and maps them to source dependencies.
4+
5+
## Quick Start
6+
7+
```bash
8+
# Run scanner (scans ../build by default)
9+
python scan_build_artifacts.py
10+
11+
# Output: scan_output/known.yml, scan_output/unknown.yml
12+
```
13+
14+
## Usage
15+
16+
```bash
17+
# Custom build directory
18+
python scan_build_artifacts.py --build-dir /path/to/build
19+
20+
# Custom output directory
21+
python scan_build_artifacts.py --output-dir reports/
22+
23+
# Validate YAML files
24+
python scan_build_artifacts.py --validate
25+
```
26+
27+
## Resolution Strategy
28+
29+
1. **dpkg-query**: System packages via Debian package manager
30+
2. **YAML patterns**: Non-dpkg packages (CUDA, TensorRT, PyTorch, etc.)
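
As a rough illustration of the first strategy, the sketch below asks `dpkg-query -S` which package owns a path and returns `None` when nothing matches, so that a caller could fall back to YAML patterns. The function names `parse_dpkg_search_output` and `dpkg_owner` are hypothetical and are not taken from `scan_build_artifacts.py`; the sample path in the docstring is only an example.

```python
import shutil
import subprocess
from typing import Optional


def parse_dpkg_search_output(output: str) -> Optional[str]:
    """Extract the owning package name from `dpkg-query -S` output.

    A typical line looks like `zlib1g-dev:amd64: /usr/include/zlib.h`.
    """
    for line in output.splitlines():
        if line.startswith("diversion "):
            continue  # diversion notices do not name an owning package
        owner, sep, _path = line.partition(": ")
        if not sep:
            continue
        # Several owners may be listed, separated by commas; keep the first.
        first = owner.split(",")[0].strip()
        # Drop the architecture qualifier (for example ":amd64") if present.
        return first.split(":")[0]
    return None


def dpkg_owner(path: str) -> Optional[str]:
    """Return the package owning `path`, or None if dpkg cannot tell."""
    if shutil.which("dpkg-query") is None:
        return None  # non-Debian system: let the YAML fallback handle it
    result = subprocess.run(
        ["dpkg-query", "-S", path],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    return parse_dpkg_search_output(result.stdout)
```

Keeping the parsing separate from the subprocess call makes the parsing testable on machines without dpkg.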
31+
32+
## Output Format
33+
34+
### known.yml
35+
36+
```yaml
37+
summary:
38+
total_artifacts: 6198
39+
mapped: 6198
40+
unmapped: 0
41+
coverage: "100.0%"
42+
43+
dependencies:
44+
cuda-cudart:
45+
- /usr/local/cuda-12.9/include/cuda_runtime.h
46+
- /usr/local/cuda-12.9/include/cuda.h
47+
48+
libc6:
49+
- /usr/include/stdio.h
50+
- -lpthread
51+
- -ldl
52+
53+
pytorch:
54+
- /usr/local/lib/python3.12/dist-packages/torch/include/torch/torch.h
55+
- -ltorch
56+
```
57+
58+
### unknown.yml
59+
60+
```yaml
61+
summary:
62+
count: 36
63+
action_required: "Add patterns to YAML files in metadata/"
64+
65+
artifacts:
66+
- /build/3rdparty/newlib/include/foo.h
67+
- /build/unknown/libmystery.so
68+
```
69+
70+
## Iterative Workflow
71+
72+
1. **Run scanner** on build directory
73+
2. **Review** `scan_output/unknown.yml` for unmapped artifacts
74+
3. **Add patterns** to `metadata/*.yml` files
75+
4. **Re-run** to verify improved coverage
76+
5. **Repeat** until all artifacts are mapped
77+
78+
## Pattern Matching
79+
80+
### Strategy Priority (High → Low)
81+
82+
1. **Exact match**: `libcudart.so.12` → `cuda-cudart`
83+
2. **Path alias**: `/build/pytorch/include/` → `pytorch`
84+
3. **Generic inference**: `libfoobar.so` → `foobar`
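
The priority order above can be sketched as a single function that tries each tier in turn. This is a minimal illustration, not the scanner's actual `PatternMatcher`: the tables `EXACT_PATTERNS` and `PATH_ALIASES`, the regular expression, and the name `match_artifact` are all assumptions, and a real implementation would load its tables from `metadata/*.yml`.

```python
import os
import re
from typing import Dict, Optional

# Hypothetical lookup tables, hard-coded here for illustration only.
EXACT_PATTERNS: Dict[str, str] = {"libcudart.so.12": "cuda-cudart"}
PATH_ALIASES: Dict[str, str] = {"/build/pytorch/include/": "pytorch"}

# Matches names such as libfoobar.so, libfoobar.so.1.2, or libfoobar.a.
_LIB_NAME = re.compile(r"^lib(?P<name>[A-Za-z0-9_+-]+)\.(?:so|a)(?:\.[0-9.]+)?$")


def match_artifact(artifact: str) -> Optional[str]:
    """Map an artifact to a dependency name, trying the highest tier first."""
    basename = os.path.basename(artifact)

    # Tier 1: exact match on the file name.
    if basename in EXACT_PATTERNS:
        return EXACT_PATTERNS[basename]

    # Tier 2: path alias, matched as a substring of the full path.
    for fragment, dependency in PATH_ALIASES.items():
        if fragment in artifact:
            return dependency

    # Tier 3: generic inference from a lib<name>.so / lib<name>.a file name.
    inferred = _LIB_NAME.match(basename)
    if inferred:
        return inferred.group("name")

    return None
```

Returning as soon as a tier matches is what gives exact patterns precedence over the looser path and name heuristics.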
85+
86+
### Adding Patterns
87+
88+
Edit an existing YAML file in `metadata/` or create a new one:
89+
90+
```yaml
91+
name: newlib
92+
version: "4.0"
93+
description: Newlib C library for embedded systems
94+
95+
patterns:
96+
- libnewlib.so
97+
98+
linker_flags:
99+
- -lnewlib
100+
101+
path_components:
102+
- newlib
103+
- 3rdparty/newlib
104+
```
105+
106+
See `metadata/_template.yml` and `metadata/README.md` for details.
107+
108+
## YAML Dependencies
109+
110+
Each dependency file contains:
111+
112+
```yaml
113+
name: pytorch
114+
version: "2.0"
115+
description: PyTorch machine learning framework
116+
license: BSD-3-Clause
117+
copyright: Copyright (c) PyTorch Contributors
118+
homepage: https://pytorch.org/
119+
source: pip
120+
121+
patterns:
122+
- libtorch.so
123+
- libc10.so
124+
125+
linker_flags:
126+
- -ltorch
127+
- -lc10
128+
129+
path_components:
130+
- pytorch
131+
- torch
132+
133+
aliases:
134+
- torch
135+
```
136+
137+
Multiple dependencies can be grouped in list format (see `metadata/dpkg.yml`, `metadata/cuda.yml`).
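
A loader that accepts both layouts might normalize them before matching. The sketch below assumes that a single-dependency file parses to one mapping and that a grouped file parses to a plain list of such mappings; the real layout of `metadata/dpkg.yml` and the rules in `metadata/_schema.yml` may differ. The names `normalize_entries` and `entry_problems`, and the idea that an entry needs a `name` plus at least one matching key, are illustrative assumptions.

```python
from typing import Any, Dict, List

# Keys from the example above that can produce a match (assumed set).
MATCH_KEYS = ("patterns", "linker_flags", "path_components")


def normalize_entries(document: Any) -> List[Dict[str, Any]]:
    """Return a list of dependency entries for either layout."""
    if isinstance(document, dict):
        return [document]
    if isinstance(document, list):
        return list(document)
    raise TypeError(f"unsupported metadata layout: {type(document).__name__}")


def entry_problems(entry: Dict[str, Any]) -> List[str]:
    """List human-readable problems found in one entry (empty if none)."""
    problems: List[str] = []
    if not entry.get("name"):
        problems.append("missing 'name'")
    if not any(entry.get(key) for key in MATCH_KEYS):
        problems.append("no patterns, linker_flags, or path_components")
    return problems
```

The input here is already-parsed data, so the sketch stays independent of whichever YAML parser the scanner uses.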
138+
139+
## Testing
140+
141+
```bash
142+
cd tests
143+
python -m pytest test_scan_build_artifacts.py -v
144+
# Expected: 34 passed
145+
```
146+
147+
## Troubleshooting
148+
149+
**Low dpkg coverage**
150+
- The scanner is running on a non-Debian system
151+
- YAML patterns act as the fallback and will resolve more of the artifacts
152+
153+
**Many unknown artifacts**
154+
1. Review `scan_output/unknown.yml`
155+
2. Add patterns to `metadata/*.yml`
156+
3. Run `--validate` to check syntax
157+
4. Re-scan to verify
158+
159+
**Wrong mappings**
160+
- Check pattern priorities in YAML files
161+
- More specific patterns should be listed first
162+
163+
**Slow performance**
164+
- Use `--build-dir` to target specific subdirectories
165+
- Reduce the scope of the build artifacts being scanned
166+
167+
## Architecture
168+
169+
```
170+
scan_build_artifacts.py (1,000 lines)
171+
├── DpkgResolver - dpkg-query for system packages
172+
├── ArtifactCollector - Parse D files, link files, wheels
173+
├── PatternMatcher - 3-tier YAML pattern matching
174+
└── OutputGenerator - Generate YAML reports
175+
```
176+
177+
**Artifact Sources:**
178+
- D files: CMake dependency files with headers
179+
- link.txt: Linker commands with libraries
180+
- Wheels: Python binaries via readelf
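
To make the first two sources concrete, here is a minimal sketch of how a Makefile-style `.d` file (as produced by `-MD -MP`) and a linker command line could be parsed. The function names `parse_depfile` and `parse_link_command` are hypothetical, and the parsing is deliberately simplified: paths containing escaped spaces are not handled, and the linker's own `-o` output file is not filtered out.

```python
import shlex
from typing import List, Set


def parse_depfile(text: str) -> Set[str]:
    """Return every prerequisite named in a Makefile-style .d file."""
    # Join lines that end with a backslash continuation.
    joined = text.replace("\\\n", " ")
    prerequisites: Set[str] = set()
    for rule in joined.splitlines():
        _target, sep, rest = rule.partition(":")
        if not sep:
            continue
        # With -MP, each header also appears as an empty rule ("foo.h:");
        # those contribute nothing here because `rest` is empty.
        prerequisites.update(rest.split())
    return prerequisites


def parse_link_command(command: str) -> List[str]:
    """Return -l flags and library paths from one linker command line."""
    libraries: List[str] = []
    for token in shlex.split(command):
        if token.startswith("-l") and len(token) > 2:
            libraries.append(token)
        elif token.endswith((".so", ".a")) or ".so." in token:
            libraries.append(token)
    return libraries
```

For example, `parse_link_command("c++ main.o -o app -lpthread /usr/lib/libz.so")` yields the `-lpthread` flag and the `libz.so` path, which matches the mix of header paths and linker flags shown in the `known.yml` sample.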
181+
182+
**Resolution Flow:**
183+
1. Collect artifacts from build directory
184+
2. Try dpkg-query resolution (PRIMARY)
185+
3. Fall back to YAML patterns (FALLBACK)
186+
4. Generate known.yml and unknown.yml reports
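
The flow above can be sketched as a loop that tries a primary resolver, then a fallback, and sorts each artifact into a known or unknown bucket. The names `resolve_all` and `summarize` are hypothetical, and the resolvers are passed in as plain callables so that the sketch does not depend on dpkg or on any YAML files; the summary keys mirror the `known.yml` sample.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Resolver = Callable[[str], Optional[str]]


def resolve_all(
    artifacts: Iterable[str], primary: Resolver, fallback: Resolver
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Group artifacts by dependency; collect the ones nothing resolves."""
    known: Dict[str, List[str]] = {}
    unknown: List[str] = []
    for artifact in artifacts:
        # The fallback is consulted only when the primary finds nothing.
        dependency = primary(artifact) or fallback(artifact)
        if dependency is None:
            unknown.append(artifact)
        else:
            known.setdefault(dependency, []).append(artifact)
    return known, unknown


def summarize(known: Dict[str, List[str]], unknown: List[str]) -> Dict[str, object]:
    """Build a summary block shaped like the one in known.yml."""
    mapped = sum(len(paths) for paths in known.values())
    total = mapped + len(unknown)
    coverage = 100.0 * mapped / total if total else 0.0
    return {
        "total_artifacts": total,
        "mapped": mapped,
        "unmapped": len(unknown),
        "coverage": f"{coverage:.1f}%",
    }
```

Passing the resolvers as arguments keeps the ordering policy (primary first, fallback second) in one place and makes it easy to exercise with stand-in functions.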
187+
188+
## Files
189+
190+
- `scan_build_artifacts.py` - Main scanner script
191+
- `metadata/*.yml` - Dependency patterns (8 dependencies defined)
192+
- `metadata/_template.yml` - Template for new dependencies
193+
- `metadata/_schema.yml` - YAML validation schema
194+
- `metadata/README.md` - Pattern documentation
195+
- `tests/test_scan_build_artifacts.py` - Unit tests
196+
197+
## Requirements
198+
199+
Python 3.8+ with stdlib only. No external dependencies required.
200+
201+
## License
202+
203+
Same as TensorRT-LLM parent project.
