Matt Robinson d9c035edb1
docs: no more bricks (#1967)
### Summary

We no longer use the "bricks" terminology for partitioning functions, etc.
in the library. This PR updates various references to bricks within the
repo and the docs. This is just an initial pass to swap the terminology
out; it'll likely be helpful to reorganize the docs a bit as well.

---------

Co-authored-by: qued <64741807+qued@users.noreply.github.com>
Co-authored-by: ryannikolaidis <1208590+ryannikolaidis@users.noreply.github.com>
2023-11-02 09:43:26 -05:00

Core Functionality
==================
The ``unstructured`` library includes functions to partition, chunk, clean, and stage
raw source documents. These functions serve as the primary public interfaces within the library.
After reading this section, you should understand the following:

* How to partition a document into JSON or CSV.
* How to remove unwanted content from document elements using cleaning functions.
* How to extract content from a document using the extraction functions.
* How to prepare data for downstream use cases using staging functions.
* How to chunk partitioned documents for use cases such as Retrieval Augmented Generation (RAG).
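
The way these steps fit together can be sketched with a toy pipeline in plain Python.
This is only an illustration of the partition, clean, extract, chunk, and stage flow:
every function name below is a hypothetical stand-in written for this sketch, and none
of them is part of the ``unstructured`` API. The pages linked below document the real
functions.

```python
import json
import re

# NOTE: every function here is a hypothetical stand-in written for this sketch.
# None of them is part of the ``unstructured`` API.


def partition(text):
    """Split raw text into elements, one per blank-line-separated block."""
    elements = []
    for block in re.split(r"\n\s*\n", text):
        stripped = block.strip()
        if not stripped:
            continue
        # Toy heuristic: a short single line with no trailing period is a title.
        is_title = (
            "\n" not in stripped
            and len(stripped) < 40
            and not stripped.endswith(".")
        )
        kind = "Title" if is_title else "NarrativeText"
        elements.append({"type": kind, "text": stripped})
    return elements


def clean(element):
    """Return a copy of the element with runs of whitespace collapsed."""
    text = re.sub(r"\s+", " ", element["text"]).strip()
    return {**element, "text": text}


def extract_emails(element):
    """Pull email-like strings out of an element's text."""
    return re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", element["text"])


def chunk_on_titles(elements):
    """Group elements into chunks, starting a new chunk at each title."""
    chunks = []
    for element in elements:
        if element["type"] == "Title" or not chunks:
            chunks.append([])
        chunks[-1].append(element)
    return chunks


def stage_as_json(chunks):
    """Serialize each chunk as one JSON record for a downstream consumer."""
    records = [{"text": " ".join(el["text"] for el in chunk)} for chunk in chunks]
    return json.dumps(records, indent=2)


raw = "Intro\n\nThe  quick   brown fox.\n\nDetails\n\nContact: a@b.com  for more."
elements = [clean(el) for el in partition(raw)]
emails = [email for el in elements for email in extract_emails(el)]
chunks = chunk_on_titles(elements)
print(stage_as_json(chunks))
```

Each stage takes the previous stage's output as plain data, so any step can be swapped
out or skipped without touching the others.
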

.. toctree::
   :maxdepth: 1

   core/partition
   core/cleaning
   core/extracting
   core/staging
   core/chunking
   core/embedding