Local evals tutorial fails with generic 'Engine core initialization failed' when Python dev headers are missing

## Summary
Running `tutorial_notebooks/rag-contexteng/rf-tutorial-rag-fiqa.ipynb` locally (Linux, non-Colab) fails at `experiment.run_evals(...)` with:

- `RayTaskError(RuntimeError)`
- `Failed to initialize pipeline: RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}`

The top-level error is not actionable and hides the real root cause.

## Repro
1. Open and run `tutorial_notebooks/rag-contexteng/rf-tutorial-rag-fiqa.ipynb` locally.
2. Execute:
```python
results = experiment.run_evals(
    config_group=config_group,
    dataset=fiqa_dataset,
    num_actors=1,
    num_shards=4,
    seed=42,
)
```
3. Observe that actor initialization fails with the generic engine-core message shown above.

## Actual Root Cause (from Ray worker logs)
In Ray worker stderr, the underlying failure is:

`fatal error: Python.h: No such file or directory`

and then Triton/vLLM fails while compiling runtime CUDA helper code:

`CalledProcessError ... /usr/bin/gcc ... -I/usr/include/python3.12 ... returned non-zero exit status 1`

The machine was missing the Python development headers package (`python3.12-dev` / `python3-dev`).
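
The root-cause lines quoted in this section could be pulled out of captured worker stderr with a simple marker scan. This is only a minimal sketch under assumptions: `extract_root_cause` and `ROOT_CAUSE_MARKERS` are hypothetical names, not existing project code, and the markers are just the strings seen in this particular failure.

```python
# Hypothetical helper: scan captured worker stderr for lines that look like
# the real failure, so they can be shown next to the generic engine-core error.
ROOT_CAUSE_MARKERS = (
    "fatal error:",        # compiler diagnostics, e.g. a missing Python.h
    "CalledProcessError",  # failed compiler invocation reported by Python
)


def extract_root_cause(stderr_text, markers=ROOT_CAUSE_MARKERS, max_lines=5):
    """Return up to max_lines stripped stderr lines that contain any marker."""
    hits = []
    for line in stderr_text.splitlines():
        if any(marker in line for marker in markers):
            hits.append(line.strip())
            if len(hits) >= max_lines:
                break
    return hits
```

A plain substring match keeps the sketch dependency-free; a real implementation would likely need a broader marker list and some deduplication.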

## Why this happens
`vllm` + `torch._inductor` + Triton can compile native/CUDA helper modules at runtime. On Linux, this requires the Python C headers (`Python.h`) and a compiler toolchain. If the headers are missing, the compile step fails and model engine initialization fails with it.

## Confirmed Fix
Installing Python dev headers resolved the issue:

```bash
sudo apt-get update
sudo apt-get install -y python3.12-dev
```

(Equivalent package names apply on other setups, e.g. the unversioned `python3-dev` on Debian/Ubuntu or `python3-devel` on Fedora/RHEL.)
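
To confirm the headers are usable after installing them, one option is to compile a one-line C file that includes `Python.h`, which roughly mirrors the compile step that failed. This is a sketch only: `can_compile_against_python_headers` is a hypothetical helper, and it assumes `gcc` is the compiler in use (as in the log above).

```python
import os
import shutil
import subprocess
import sysconfig
import tempfile


def can_compile_against_python_headers(cc="gcc", include_dir=None):
    """Try to compile a tiny C file that includes Python.h.

    Returns (ok, detail). Hypothetical helper, not part of any library.
    """
    if shutil.which(cc) is None:
        return False, f"compiler '{cc}' not found on PATH"
    include_dir = include_dir or sysconfig.get_paths()["include"]
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe.c")
        obj = os.path.join(tmp, "probe.o")
        with open(src, "w") as f:
            f.write("#include <Python.h>\n")
        # -c compiles without linking, which is enough to exercise the headers.
        proc = subprocess.run(
            [cc, f"-I{include_dir}", "-c", src, "-o", obj],
            capture_output=True,
            text=True,
        )
    if proc.returncode != 0:
        return False, proc.stderr.strip()
    return True, f"compiled against headers in {include_dir}"


if __name__ == "__main__":
    ok, detail = can_compile_against_python_headers()
    print("OK" if ok else "FAILED", "-", detail)
```

Run it with the same interpreter the notebook uses, so that `sysconfig` reports the include directory for that environment.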

## Requested Improvements
1. **Preflight dependency checks** before actor/model init in evals mode:
   - verify `Python.h` exists (`os.path.join(sysconfig.get_paths()["include"], "Python.h")`)
   - verify compiler exists (`gcc`/`cc`)
   - fail fast with actionable install instructions.
2. **Improve surfaced error message** in `QueryProcessingActor.initialize_for_pipeline` path:
   - include root cause snippet from worker stderr (not only generic “Engine core initialization failed”).
3. **Docs update** for local/tutorial setup (non-Colab):
   - Linux prerequisites should explicitly include Python dev headers and build essentials.
4. **Add CI smoke test** for missing headers scenario:
   - assert user-facing error is clear and actionable.
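
The preflight check requested in item 1 might look roughly like the sketch below. The function names (`find_missing_build_prereqs`, `preflight_or_raise`) are hypothetical, the install hints assume a Debian/Ubuntu system, and where such a check would be called from in the evals code path is left open.

```python
import os
import shutil
import sysconfig


def find_missing_build_prereqs(include_dir=None, compilers=("gcc", "cc")):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    include_dir = include_dir or sysconfig.get_paths()["include"]
    header = os.path.join(include_dir, "Python.h")
    if not os.path.isfile(header):
        problems.append(
            f"Python.h not found at {header}. Install the Python development "
            "headers, e.g. `sudo apt-get install -y python3-dev`."
        )
    # One working compiler is enough, so only complain if none is on PATH.
    if not any(shutil.which(name) for name in compilers):
        problems.append(
            f"No C compiler found (looked for: {', '.join(compilers)}). "
            "Install one, e.g. `sudo apt-get install -y build-essential`."
        )
    return problems


def preflight_or_raise():
    """Fail fast before actor/model init instead of deep inside a Ray worker."""
    problems = find_missing_build_prereqs()
    if problems:
        raise RuntimeError(
            "Missing build prerequisites:\n- " + "\n- ".join(problems)
        )
```

Returning a list of problems rather than raising on the first one lets the user see every missing prerequisite in a single run, and makes the "missing headers" case in item 4 easy to assert on by pointing `include_dir` at an empty directory.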

## Environment
- OS: Linux
- Python: 3.12
- vLLM: 0.10.2
- GPU: NVIDIA L4

## User Impact
This is a first-run blocker for local users and can be mistaken for a GPU/vLLM incompatibility. Better guardrails and messaging would significantly improve onboarding and reduce support load.


Local evals tutorial fails with generic 'Engine core initialization failed' when Python dev headers are missing #180
