Commit 5e30cda
authored
feat: Parameter to send custom page range when splitting pdf (#125)
# New parameter
Add a client side param called `split_pdf_page_range` which takes a list
of two integers, `[start_page, end_page]`. If `split_pdf_page` is `True`
and a range is set, slice the doc from `start_page` up to and including
`end_page`. Only this page range will be sent to the API. The subset of
pages is still split up as needed.
# Other changes
Allow our custom hooks to properly access list parameters, so we're able
to intercept `split_pdf_page_range`. We need extra handling to get list
params out of the request in `parse_form_data`, and to rebuild the
payload in `create_request_body`.
# Testing
Check out this branch and set up a request to your local API:
```
client = UnstructuredClient(api_key_auth="", server_url="localhost:8000")
filename = "_sample_docs/layout-parser-paper.pdf"
with open(filename, "rb") as f:
files = shared.Files(
content=f.read(),
file_name=filename,
)
req = shared.PartitionParameters(
files=files,
strategy="fast",
split_pdf_page=True,
split_pdf_page_range=[1, 16],
)
resp = client.general.partition(req)
```
Test out various page ranges and confirm that the returned elements are
within the range. Invalid ranges should throw a ValueError (pages are
out of bounds, or end_page < start_page).1 parent 2d50bdf commit 5e30cda
File tree
11 files changed
+290
-44
lines changed- _test_unstructured_client
- integration
- unit
- src/unstructured_client
- _hooks/custom
11 files changed
+290
-44
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
55 | 55 | | |
56 | 56 | | |
57 | 57 | | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
58 | 62 | | |
59 | 63 | | |
60 | 64 | | |
| |||
110 | 114 | | |
111 | 115 | | |
112 | 116 | | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
113 | 121 | | |
114 | 122 | | |
115 | 123 | | |
| |||
139 | 147 | | |
140 | 148 | | |
141 | 149 | | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
142 | 154 | | |
143 | 155 | | |
144 | 156 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
14 | 14 | | |
15 | 15 | | |
16 | 16 | | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
17 | 21 | | |
18 | 22 | | |
19 | 23 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
110 | 110 | | |
111 | 111 | | |
112 | 112 | | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
| 161 | + | |
| 162 | + | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
| 166 | + | |
| 167 | + | |
| 168 | + | |
| 169 | + | |
| 170 | + | |
| 171 | + | |
| 172 | + | |
| 173 | + | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
| 182 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
9 | 9 | | |
10 | 10 | | |
11 | 11 | | |
| 12 | + | |
12 | 13 | | |
13 | 14 | | |
14 | 15 | | |
| |||
122 | 123 | | |
123 | 124 | | |
124 | 125 | | |
125 | | - | |
| 126 | + | |
| 127 | + | |
126 | 128 | | |
127 | 129 | | |
128 | 130 | | |
| |||
133 | 135 | | |
134 | 136 | | |
135 | 137 | | |
| 138 | + | |
136 | 139 | | |
137 | 140 | | |
138 | 141 | | |
139 | 142 | | |
140 | 143 | | |
141 | | - | |
142 | | - | |
143 | | - | |
144 | | - | |
145 | | - | |
146 | | - | |
147 | 144 | | |
148 | 145 | | |
149 | | - | |
150 | | - | |
151 | | - | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
152 | 154 | | |
153 | 155 | | |
154 | 156 | | |
155 | | - | |
156 | | - | |
| 157 | + | |
| 158 | + | |
157 | 159 | | |
158 | 160 | | |
159 | 161 | | |
| |||
164 | 166 | | |
165 | 167 | | |
166 | 168 | | |
167 | | - | |
| 169 | + | |
| 170 | + | |
| 171 | + | |
| 172 | + | |
168 | 173 | | |
169 | 174 | | |
170 | 175 | | |
| |||
191 | 196 | | |
192 | 197 | | |
193 | 198 | | |
194 | | - | |
| 199 | + | |
| 200 | + | |
195 | 201 | | |
196 | 202 | | |
| 203 | + | |
| 204 | + | |
| 205 | + | |
| 206 | + | |
| 207 | + | |
| 208 | + | |
| 209 | + | |
| 210 | + | |
| 211 | + | |
| 212 | + | |
| 213 | + | |
| 214 | + | |
| 215 | + | |
| 216 | + | |
| 217 | + | |
| 218 | + | |
| 219 | + | |
| 220 | + | |
| 221 | + | |
| 222 | + | |
| 223 | + | |
| 224 | + | |
| 225 | + | |
| 226 | + | |
| 227 | + | |
197 | 228 | | |
198 | | - | |
| 229 | + | |
199 | 230 | | |
200 | 231 | | |
201 | 232 | | |
| |||
204 | 235 | | |
205 | 236 | | |
206 | 237 | | |
| 238 | + | |
207 | 239 | | |
208 | 240 | | |
209 | 241 | | |
| |||
212 | 244 | | |
213 | 245 | | |
214 | 246 | | |
| 247 | + | |
215 | 248 | | |
216 | 249 | | |
217 | 250 | | |
| |||
366 | 399 | | |
367 | 400 | | |
368 | 401 | | |
| 402 | + | |
| 403 | + | |
| 404 | + | |
| 405 | + | |
| 406 | + | |
| 407 | + | |
| 408 | + | |
| 409 | + | |
| 410 | + | |
| 411 | + | |
| 412 | + | |
| 413 | + | |
| 414 | + | |
| 415 | + | |
| 416 | + | |
| 417 | + | |
| 418 | + | |
| 419 | + | |
| 420 | + | |
| 421 | + | |
| 422 | + | |
| 423 | + | |
| 424 | + | |
| 425 | + | |
| 426 | + | |
| 427 | + | |
| 428 | + | |
| 429 | + | |
| 430 | + | |
| 431 | + | |
| 432 | + | |
| 433 | + | |
| 434 | + | |
0 commit comments