Hi everyone 👋
I’m evaluating Chandra OCR for document OCR on scanned PDFs and images, and I have a question about the structure of the output it can produce.
**What I’m trying to achieve**
I’m looking for an output format similar to a layout-aware document DOM, for example:
- Page → blocks → children hierarchy
- Explicit `block_type` (Page, SectionHeader, Text, Table, TableCell, etc.)
- Bounding boxes / polygons for each block
- HTML serialization per block (paragraphs, tables, headers)
- Stable IDs like `/page/0/Table/4`
- Section hierarchy tracking
Example:

```json
{
  "id": "/page/0/Table/4",
  "block_type": "Table",
  "html": "<table>...</table>",
  "bbox": [x1, y1, x2, y2],
  "polygon": [[...]],
  "children": [...]
}
```
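For context, here is a minimal sketch of how I would want to consume such a block tree. To be clear, this schema is my own invention for illustration, not something Chandra documents; the field names (`block_type`, `children`, `bbox`, `html`) are all assumptions:

```python
# Hypothetical consumer of the block-tree schema sketched above.
# All field names here are assumed, not taken from Chandra's actual output.

def iter_blocks(block):
    """Depth-first walk over a block and all of its descendants."""
    yield block
    for child in block.get("children", []):
        yield from iter_blocks(child)

def tables_with_geometry(page):
    """Collect every Table block that carries both HTML and a bounding box."""
    return [
        b for b in iter_blocks(page)
        if b.get("block_type") == "Table" and "html" in b and "bbox" in b
    ]
```

The point is that with explicit geometry plus per-block HTML, downstream code can pair each extracted table with its location on the page without a separate layout-detection pass.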
Does Chandra expose (or plan to expose) this kind of layout-aware structured output? Or is it intended to provide semantic OCR only (Markdown / HTML / raw text) without explicit geometry, so that a separate layout-detection step would be required?
I just want to confirm the intended scope and the recommended practice here.
Thanks!