mirror of
https://github.com/Unstructured-IO/unstructured.git
synced 2025-06-27 02:30:08 +00:00

### Description Leverage a similar pattern to what is used for connectors, where there is a nested config dataclass as a field, along with cached content for things like the client and sample embedding for each. This required an update on the embeddings config in ingest and I left a TODO in there because the current approach breaks on other encoders such as bedrock because the parameters in that config don't map to all encoders. But this keeps the existing functionality working. This update makes sure all variables associated with the dataclass exist when it's instantiated rather than being added in the `__post_init__()` method or the `initialize()`, allowing other libraries like pydantic to appropriately generate schemas from it. It also now follows the pattern of the connectors in that each class has a nested config class used to instantiate the client itself as well as a field/property approach used to cache the client.