mirror of
https://github.com/Unstructured-IO/unstructured.git
synced 2025-09-25 16:29:53 +00:00

# Description:
Passing the `max_pages` argument rejects PDF files that exceed the given page limit when the `hi_res` strategy is chosen. By default there is no limit, so PDF files with any number of pages are parsed.

# Testing:
```python
from unstructured.partition.auto import partition

# No limit set: should pass
elements = partition(filename="unstructured/example-docs/pdf/reliance.pdf", strategy='hi_res')

# Limit not exceeded: should pass
elements = partition(filename="unstructured/example-docs/pdf/reliance.pdf", strategy='hi_res', max_pages=4)

# Limit exceeded: should raise PdfMaxPagesExceededError
elements = partition(filename="unstructured/example-docs/pdf/reliance.pdf", strategy='hi_res', max_pages=2)
```
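
The behavior described here can be illustrated with a minimal, library-independent sketch. It is not the code from this change: `check_page_limit` and `limit_is_exceeded` are hypothetical helpers, the `PdfMaxPagesExceededError` class below is a local stand-in that only shares its name with the error mentioned above, and the page count of 3 is an illustrative assumption.

```python
from typing import Optional


class PdfMaxPagesExceededError(ValueError):
    """Local stand-in for the error named in the description (not the library's class)."""


def check_page_limit(page_count: int, max_pages: Optional[int] = None) -> None:
    """Raise if page_count exceeds max_pages; max_pages=None means no limit."""
    if max_pages is not None and page_count > max_pages:
        raise PdfMaxPagesExceededError(
            f"PDF has {page_count} pages, which exceeds max_pages={max_pages}"
        )


def limit_is_exceeded(page_count: int, max_pages: Optional[int] = None) -> bool:
    """Hypothetical convenience wrapper: report whether the check would reject the file."""
    try:
        check_page_limit(page_count, max_pages)
    except PdfMaxPagesExceededError:
        return True
    return False


if __name__ == "__main__":
    pages = 3  # assumed page count, for illustration only
    print(limit_is_exceeded(pages))               # no limit set
    print(limit_is_exceeded(pages, max_pages=4))  # limit not exceeded
    print(limit_is_exceeded(pages, max_pages=2))  # limit exceeded
```

Treating `None` as "no limit" mirrors the stated default of allowing PDFs with any number of pages, and the strict `>` comparison means a file with exactly `max_pages` pages is still accepted in this sketch.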