
Unstructured Core Library
=========================

The ``unstructured`` library helps you preprocess and structure unstructured text documents for use in
downstream machine learning tasks. Supported document types include PDF, XML, and HTML files.

Library Documentation
---------------------

:doc:`installing`
  Instructions on how to install the ``unstructured`` library on your system.

:doc:`api`
  Access ``unstructured`` through the ``unstructured-api``, or learn how to host the API locally.

:doc:`core`
  Learn more about the core partitioning, chunking, cleaning, and staging functionality within the
  ``unstructured`` library.

:doc:`ingest/index`
  Connect to your favorite data storage platforms for effortless batch processing of your files.

:doc:`metadata`
  Learn more about how metadata is tracked in the ``unstructured`` library.

:doc:`examples`
  Examples of other types of workflows within the ``unstructured`` package.

:doc:`integrations`
  We make it easy for you to connect your output with other popular ML services.

:doc:`best_practices`
  Learn best practices for optimizing document information extraction with the ``unstructured`` library.

.. Hidden TOCs

.. toctree::
   :caption: Documentation
   :maxdepth: 2
   :hidden:

   introduction
   installing
   api
   core
   ingest/index
   metadata
   examples
   integrations
   best_practices