This section describes two methods for extracting tables from PDF files.

.. note::

   To extract tables from any document, set the ``strategy`` parameter to ``hi_res`` for both methods below.

Method 1: Using `partition_pdf`
-------------------------------
To extract tables from PDF files using `partition_pdf <https://unstructured-io.github.io/unstructured/core/partition.html#partition-pdf>`__, set the ``infer_table_structure`` parameter to ``True`` and the ``strategy`` parameter to ``hi_res``.

**Usage**
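
.. code-block:: python

    from unstructured.partition.pdf import partition_pdf

    fname = "example-docs/layout-parser-paper.pdf"

    # hi_res is required for table detection; infer_table_structure
    # populates metadata.text_as_html on Table elements.
    elements = partition_pdf(
        filename=fname,
        infer_table_structure=True,
        strategy="hi_res",
    )

    # Keep only the elements classified as tables.
    tables = [el for el in elements if el.category == "Table"]

    print(tables[0].text)
    print(tables[0].metadata.text_as_html)
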
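
The ``text_as_html`` value is an HTML string, so it usually needs one more step before it can be used as rows and columns. The following is a minimal sketch of that step using only the Python standard library. The names ``TableRowParser`` and ``html_table_to_rows`` are hypothetical helpers, not part of ``unstructured``, and ``sample_html`` is made-up data standing in for ``tables[0].metadata.text_as_html``. The sketch assumes a simple table without nested tables.

.. code-block:: python

    from html.parser import HTMLParser


    class TableRowParser(HTMLParser):
        """Collect the cell text of an HTML table as a list of rows."""

        def __init__(self):
            super().__init__()
            self.rows = []     # completed rows
            self._row = None   # row being built, or None outside <tr>
            self._cell = None  # text fragments of the open cell, or None

        def handle_starttag(self, tag, attrs):
            if tag == "tr":
                self._row = []
            elif tag in ("td", "th") and self._row is not None:
                self._cell = []

        def handle_data(self, data):
            # Only keep text that appears inside a cell.
            if self._cell is not None:
                self._cell.append(data)

        def handle_endtag(self, tag):
            if tag in ("td", "th") and self._cell is not None:
                self._row.append("".join(self._cell).strip())
                self._cell = None
            elif tag == "tr" and self._row is not None:
                self.rows.append(self._row)
                self._row = None


    def html_table_to_rows(html):
        """Return the table cells in ``html`` as a list of lists of strings."""
        parser = TableRowParser()
        parser.feed(html)
        parser.close()
        return parser.rows


    # Hypothetical sample standing in for tables[0].metadata.text_as_html.
    sample_html = (
        "<table><tr><th>Model</th><th>F1</th></tr>"
        "<tr><td>A</td><td>0.9</td></tr></table>"
    )
    rows = html_table_to_rows(sample_html)
    print(rows)

Header and data cells are treated alike here, so the first row holds the column names whenever the table has a header row.
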

Method 2: Using Auto Partition or Unstructured API
--------------------------------------------------

By default, table extraction is enabled for all file types. To extract tables from PDFs and images using `Auto Partition <https://unstructured-io.github.io/unstructured/core/partition.html#partition>`__ or the `Unstructured API parameters <https://unstructured-io.github.io/unstructured/apis/api_parameters.html>`__, set the ``strategy`` parameter to ``hi_res``.
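
As a rough sketch of the Auto Partition route, the snippet below calls ``partition`` from ``unstructured.partition.auto`` with ``strategy="hi_res"`` and keeps only the ``Table`` elements. The helper names ``select_tables`` and ``extract_tables`` are hypothetical and exist only for illustration. The sketch also assumes the installed ``unstructured`` version has table extraction enabled by default, as described above.

.. code-block:: python

    def select_tables(elements):
        """Keep only the elements whose category is "Table"."""
        return [el for el in elements if el.category == "Table"]


    def extract_tables(filename):
        """Partition a file with the hi_res strategy and return its tables."""
        # Imported inside the function so that select_tables stays usable
        # even where unstructured is not installed.
        from unstructured.partition.auto import partition

        elements = partition(filename=filename, strategy="hi_res")
        return select_tables(elements)


    # Example usage (requires unstructured with its hi_res dependencies):
    # for table in extract_tables("example-docs/layout-parser-paper.pdf"):
    #     print(table.metadata.text_as_html)

Separating the filter from the partition call keeps the filter reusable for elements returned by the Unstructured API as well, provided they are first converted to objects that expose a ``category`` attribute.
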