yujunjun/unstructured
Mirror of https://github.com/Unstructured-IO/unstructured.git, synced 2025-07-12 19:45:56 +00:00
unstructured/requirements/ingest/embed-huggingface.in
6 lines, 96 B, Plaintext
feat: update dependencies and remove constraint on pydantic (#2841)
### Description
* The `consistent-deps.sh` script was fixed to take the ingest dependencies into account, which caused some errors to show up. New constraints were added to make that script pass.
* Updated all requirements without a constraint on pydantic, allowing the latest version to be pulled in.
* `pikepdf` is causing a conflict, but there is a fix on their `main` branch that only needs the next release to be published. A question was opened to see whether that can happen sooner: [Do releases happen on a schedule?](https://github.com/pikepdf/pikepdf/discussions/574). For now, `lxml<5` was added to the constraints.

A couple of optimizations:
* `constraints.in` renamed to `constraints.txt`, since all dependencies in it are already pinned and the file never gets compiled
* `constraints.txt` moved to a `requirements/deps` directory, as it never gets compiled by `pip-compile`
* Other dependency files updated to reference the new location of `base.in` and `constraints.txt`
* Makefile updated, since it was originally written to avoid the `base.in` and `constraints.in` files
2024-04-04 15:58:23 -04:00
-c ../deps/constraints.txt
fix: make pip compile (#2015) - add missing make file in ingest folder
2023-11-06 16:26:12 -06:00
-c ../base.txt
feat: Adds local embedding model (#1619) This PR adds a local embedding model option as an alternative to using our OpenAI embedding brick. This brick uses LangChain's HuggingFaceEmbeddings.
2023-10-19 11:51:36 -05:00
huggingface
refactor: isolate ingest dependencies into local scopes (#2509)
This PR:
- Moves ingest dependencies into local scopes so that ingest connector classes can be imported without installing their external dependencies. This allows lightweight use of the classes (not the instances; using the instances as intended still requires the dependencies).
- Upgrades the embed module dependencies from `langchain` to `langchain-community` (to pass CI rather than introducing a pin)
- Does pip-compile
- Does minor refactors in other files to pass `ruff 2.0` checks, which were introduced by pip-compile
2024-02-06 21:28:55 +00:00
langchain-community
feat: extend ingest options to support multiple embedding modules, add deterministic ingest test for embeddings (#1918)
Closes #1782
This PR:
- Extends the ingest pipeline so that an embedding provider can be selected from a range of providers
- Changes the ingest embedding test to a diff test, since the embedding vectors are reproducible after supporting multiple providers

Additional info on the provider chosen for the test:
- Found `langchain.embeddings.HuggingFaceEmbeddings` to be deterministic even when no seed is set
- Took 6.84s to pass a unit test with the provider (without cache, including model download)
- `langchain.embeddings.HuggingFaceEmbeddings` runs locally, making it zero cost

For all these reasons, testing embedding modules with the Huggingface model makes sense.
---------
Co-authored-by: cragwolfe <crag@unstructured.io>
Co-authored-by: ryannikolaidis <1208590+ryannikolaidis@users.noreply.github.com>
Co-authored-by: ahmetmeleq <ahmetmeleq@users.noreply.github.com>
2023-11-06 12:26:12 +00:00
sentence_transformers
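Because the blame view interleaves commit messages with the file's own lines, it can be hard to see what the file contains. The following is a minimal sketch, not part of the repository, that takes the five file lines visible in this blame and separates the `-c` constraint references from the package names. The helper name `split_requirements` and the constant `EMBED_HUGGINGFACE_IN` are hypothetical. The parsing is deliberately simplified: it only handles `-c` lines, whole-line comments, and bare package names, which is all this file appears to use.

```python
from typing import List, Tuple


def split_requirements(text: str) -> Tuple[List[str], List[str]]:
    """Split a pip requirements input into (constraint files, package names).

    Hypothetical helper for illustration; it handles only `-c <path>` lines,
    blank lines, whole-line `#` comments, and bare requirement names.
    """
    constraints: List[str] = []
    packages: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("-c "):
            constraints.append(line[3:].strip())  # path after the -c flag
        else:
            packages.append(line)
    return constraints, packages


# File lines as they appear in the blame view above (assumed to be complete
# apart from blank lines).
EMBED_HUGGINGFACE_IN = """\
-c ../deps/constraints.txt
-c ../base.txt
huggingface
langchain-community
sentence_transformers
"""

constraints, packages = split_requirements(EMBED_HUGGINGFACE_IN)
print("constraints:", constraints)
print("packages:", packages)
```

Read this way, the file pins nothing itself: versions come from the two constraint files, and the `.in` file only names the packages needed for the Hugging Face embedding option.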