Unstructured Core Library
=========================
The ``unstructured`` library is designed to help preprocess and structure unstructured text documents
for use in downstream machine learning tasks. Examples of documents that can be processed
using the ``unstructured`` library include PDFs, XML and HTML documents.

Library Documentation
---------------------
:doc:`installing`
  Instructions on how to install the ``unstructured`` library on your system.

:doc:`api`
  Access all the power of ``unstructured`` through the ``unstructured-api`` or learn to host it locally.

:doc:`core`
  Learn more about the core partitioning, chunking, cleaning, and staging functionality within the
  Unstructured library.

:doc:`ingest/index`
  Connect to your favorite data storage platforms for effortless batch processing of your files.

:doc:`metadata`
  Learn more about how metadata is tracked in the ``unstructured`` library.

:doc:`examples`
  Examples of other types of workflows within the ``unstructured`` package.

:doc:`integrations`
  We make it easy for you to connect your output with other popular ML services.

:doc:`best_practices`
  Learn best practices to optimize document information extraction using the ``unstructured`` library.

.. Hidden TOCs
.. toctree::
   :caption: Documentation
   :maxdepth: 2
   :hidden:

   introduction
   installing
   api
   core
   ingest/index
   metadata
   examples
   integrations
   best_practices