Core Functionality
==================

The ``unstructured`` library includes functions to partition, chunk, clean, and stage
raw source documents. These functions serve as the primary public interfaces within the library.
After reading this section, you should understand the following:

* How to partition a document into JSON or CSV.
* How to remove unwanted content from document elements using cleaning functions.
* How to extract content from a document using the extraction functions.
* How to prepare data for downstream use cases using staging functions.
* How to chunk partitioned documents for use cases such as Retrieval-Augmented Generation (RAG).

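To make the stages listed above concrete, here is a minimal, standard-library-only sketch of how they might fit together. It is a conceptual illustration, not the ``unstructured`` API: the helper names (``partition``, ``clean``, ``extract_emails``, ``chunk``, ``stage``), the dictionary element format, and the sample text are all hypothetical stand-ins chosen for this example. The pages linked below document the library's actual functions.

```python
import json
import re


def partition(raw: str) -> list[dict]:
    """Toy partitioning: split raw text into elements on blank lines."""
    blocks = [b for b in re.split(r"\n\s*\n", raw) if b.strip()]
    return [{"type": "Text", "text": b} for b in blocks]


def clean(element: dict) -> dict:
    """Toy cleaning: collapse runs of whitespace into single spaces."""
    return {**element, "text": " ".join(element["text"].split())}


def extract_emails(element: dict) -> list[str]:
    """Toy extraction: pull email-like strings out of an element's text."""
    return re.findall(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", element["text"])


def chunk(elements: list[dict], max_chars: int) -> list[str]:
    """Toy chunking: greedily merge consecutive elements up to max_chars.

    A single element longer than max_chars is kept whole as its own chunk.
    """
    chunks, current = [], ""
    for el in elements:
        candidate = f"{current} {el['text']}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = el["text"]
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def stage(elements: list[dict]) -> str:
    """Toy staging: serialize elements to JSON for a downstream consumer."""
    return json.dumps(elements, indent=2)


# Hypothetical sample input, used only for this illustration.
RAW = (
    "Quarterly   Report\n\n"
    "Contact   the team at\nhelp@example.com for details.\n\n"
    "Revenue grew  in every region."
)

elements = partition(RAW)
cleaned = [clean(el) for el in elements]
emails = [addr for el in cleaned for addr in extract_emails(el)]
chunks = chunk(cleaned, max_chars=80)
staged = stage(cleaned)
```

The ordering here (partition first, then clean and extract per element, then chunk or stage the results) mirrors the order of the list above. Real documents generally need the format-aware partitioning and richer element types that the library provides.
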
.. toctree::
   :maxdepth: 1

   core/partition
   core/cleaning
   core/extracting
   core/staging
   core/chunking
   core/embedding