@Walnut356 Walnut356 commented Nov 18, 2025

Adds a pretty substantial debug section. There are pages for GDB, but those contain mostly links to GDB's documentation because I don't know a ton about it at the moment. I'm currently working on the debuginfo test rewrite, which starts with tests for LLDB. Once I get around to writing tests for GDB, I will end up sufficiently knowledgeable to fill out the GDB sections (if they haven't been filled out already by then).

There are 2 somewhat questionable changes:

  1. I have included the raw PDF file of the CodeView 1.0 specification, which is linked to in the intro page. AFAICT Microsoft does not host it anywhere. The sources that do host it have dubious longevity. The direct source of this PDF had their website die for a short while just in the last year. I think it's important that this is preserved and accessible; I used it extensively when working with PDB debug info. If you'd rather not have the full PDF included, I can go ahead and take it out.
  2. The old debug info page was simply moved into the new debug info section (and labelled as lecture notes). Some of the info is outdated, some of it isn't. I didn't feel super comfortable chopping it up and interspersing it in what I wrote since it's directly from a lecture.

Probably resolves #1661

Collaborator

rustbot commented Nov 18, 2025

Thanks for the PR. If you have write access, feel free to merge this PR if it does not need reviews. You can request a review using r? rustc-dev-guide or r? <username>.

rustbot added the `S-waiting-on-review` label (Status: this PR is waiting for a reviewer to verify its content) on Nov 18, 2025
Contributor

ehuss commented Nov 18, 2025

This is amazing!

May I suggest that when this is merged that you share it, maybe in the #t-compiler Zulip channel to let people know it exists? I imagine there will be people who will find it interesting (even if they aren't directly involved with it).

@Walnut356
Author

r? rustc-dev-guide

@Walnut356
Author

also probably @Kobzol

Member

Kobzol commented Nov 22, 2025

This is really incredible, thank you very much for writing this down! Tbh I would maybe send this to #t-compiler even before it gets merged, to maybe even find more people that could take a look and review this 😅

Contributor

madsmtm left a comment

Interesting read, thanks! Always in favour of more documentation!

Maybe CC @tromey, since you held the talk that most of the existing documentation comes from.

optimizations. In short, LLDB attempts to cache the child-values of variables (e.g. struct fields,
array elements) when stepping through code. A heuristic is used to determine which values are safely
cache-able, and `const` is part of that heuristic. Research has not been done into how this would
interact with things like Rust's interrior mutability constructs.
Contributor

Suggested change:
- interact with things like Rust's interrior mutability constructs.
+ interact with things like Rust's interior mutability constructs.

| `(T1, T2)` | `tuple$<T1, T2>`|
| `*const T` | `ptr_const$<T>` |
| `*mut T` | `ptr_mut$<T>` |
| `usize` | `size_t`** |
Contributor

Suggestion: Maybe use footnotes here (`[^footnotename]`); I would confuse this for a double pointer to `size_t`. Or at least use some symbol other than `*`.

Comment on lines +181 to +183
# Source Information

TODO
Contributor

Highlighting this, unsure if you intend to finish this?

Author

I'll do my best to get around to the incomplete sections eventually, but it'll probably be on the order of months since I have other priorities.

My major focus was more or less "document everything necessary to understand the visualizer scripts" since right now nobody feels confident reviewing changes to them, and we'll need people who are. These docs are a precursor to rewriting the debug info test suite, which will likely require moderate-to-substantial changes to the scripts.

Most of the WIP sections require a significant time investment for me to do more research before I'd feel comfortable writing about them. Having them marked also means that if anyone does know about them, it's obvious the section needs to be filled out (rather than it being omitted intentionally).

In this case, visualizers deal almost exclusively with type information, so the source mapping documentation isn't super important in the short term.

@@ -0,0 +1,12 @@
# (WIP) LLVM Codegen
Contributor

Again, highlighting an incomplete section.

Contributor

Personally, I tend towards not having this section at all, and just folding it into rust-codegen.md as a small subsection? And then strongly recommend that people read https://llvm.org/docs/SourceLevelDebugging.html if they want to work with the LLVM parts?

Author

LLVM's docs mostly cover how the IR works, but once this section is complete, it'll ideally cover the actual structure of the relevant code sections. LLVM's code is pretty obtuse and doesn't have a ton of comments. Trying to figure out what happens between Rust calling a DIBuilder and LLDB handing you a deserialized node from a SymbolFile is a real headache.

It's not often that we'd need to make changes to LLVM's codegen, but it does come up. It's also relevant for writing TypeSystemRust if we ever end up doing that.

It could be bundled with rustc-codegen, but it's so much less necessary for the day-to-day maintenance that I figured it should be split.

@@ -0,0 +1,62 @@
# Debugger Visualizers
Contributor

Suggestion: I think I'd be interested in a high-level section here about how rust-lldb is configured with the visualizers, as well as a brief overview of how the #![debugger_visualizer] attribute works.

Rust will almost always need to override `unsigned char`, `signed char`, `char`, `u8`, and `i8`, to
(unsigned) decimal format.

## Synthetic Providers
Contributor

Suggestion: I'll admit that I skimmed it, but it feels like most of the information in here isn't actually specific to Rust? Maybe it'd make sense to submit this upstream to LLVM instead (in some shape or form)?

(Excepting perhaps the Vec<T> example, but even that could probably be a useful real-life example to have in the LLVM docs.)

Author

To be completely honest, LLDB is in such deep documentation debt I don't really want to touch it. I am by no means an expert on LLDB. There's lots of weird behavior and huge sections of the codebase that I'm completely oblivious about. I don't think I have it in me right now to write/research with the kind of rigor and formality I'd deem necessary for official documentation.

This section also subtly serves as a "style guide" of sorts for our visualizer scripts going forward. I can make that more explicit though.

workarounds can help, but at the end of the day Rust's needs are secondary compared to making sure
C and C++ compilation and debugging work correctly.

LLDB is receptive to adding a `TypeSystemRust`, but it is a massive undertaking. This section serves
Contributor

Is this really true? I wrote one of these and my recollection is that they didn't want it.

If it is true then maybe that work can be resurrected. It's all still on a branch somewhere.

Author

AFAIK it is. I've heard second hand from Greg Clayton by way of @davidbarsky that they'd be interested in a TypeSystemRust, even if that means including a rust subdirectory in LLVM's repo to allow for direct compiler integration.


The `TypeSystem` is typically written to have a counterpart that can handle expression parsing. It
requires implementing a few extra functions in the `TypeSystem` interface. The bulk of the
expression parsing code should live in [lldb/source/Plugins/ExpressionParser][expr].
Contributor

### Generics

Rust outputs generic *type* information (`T` in `ArrayVec<T, N: usize>`), but not generic *value*
information (`N` in `ArrayVec<T, N: usize>`).
Contributor

Is there a Rust issue for this? DWARF can represent this so it seems like rustc ought to emit it.
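
For readers less familiar with the two kinds of generic parameters discussed in the quoted passage, here is a minimal sketch. The `ArrayVec` below is a hypothetical stand-in written for illustration, not the type from any real crate, and the method names are assumptions. In valid Rust the value parameter is declared as `const N: usize`. Per the passage above, the debug info for `ArrayVec<u8, 4>` would describe the type parameter (`u8`) but not the value parameter (`4`).

```rust
// Hypothetical fixed-capacity vector, for illustration only.
// `T` is a generic *type* parameter; `N` is a generic *value* (const) parameter.
struct ArrayVec<T, const N: usize> {
    items: [Option<T>; N],
    len: usize,
}

impl<T, const N: usize> ArrayVec<T, N> {
    fn new() -> Self {
        // Build an array of N empty slots without requiring `T: Copy`.
        Self { items: std::array::from_fn(|_| None), len: 0 }
    }

    // The capacity lives only in the type, not in any field, so a debugger
    // that is not told `N` cannot read it back from the value's memory.
    fn capacity(&self) -> usize {
        N
    }

    // Returns false when the vector is already full.
    fn push(&mut self, value: T) -> bool {
        if self.len == N {
            return false;
        }
        self.items[self.len] = Some(value);
        self.len += 1;
        true
    }

    fn get(&self, index: usize) -> Option<&T> {
        self.items.get(index)?.as_ref()
    }
}

fn main() {
    let mut v: ArrayVec<u8, 4> = ArrayVec::new();
    v.push(7);
    println!("len = {}, capacity = {}, first = {:?}", v.len, v.capacity(), v.get(0));
}
```

Because `N` is not stored in any field here, a visualizer would have to recover it some other way, for example from the array field's length, if the value parameter itself is missing from the debug info.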

Contributor

tromey commented Nov 23, 2025

Thank you for doing this.

# PDB/CodeView

The primary debug info format for `*-msvc` targets. PDB is a proprietary container format created by
Member

Dunno if it's worth mentioning, but PDB can also be created for windows-gnu* targets when linking with LLD, by passing `--pdb` to LLD.

Successfully merging this pull request may close these issues.

Talk about debuginfo somewhere in the codegen chapter