unstructured

mirror of https://github.com/Unstructured-IO/unstructured.git synced 2025-12-31 01:03:30 +00:00

Author	SHA1	Message	Date
cragwolfe	76213ecba7	build(fixtures-update): all CI jobs on smaller worker (#1934 )	2023-10-28 21:19:51 -07:00
cragwolfe	4e669d419f	build(fixtures-update): run on smaller worker (#1932 ) Run fixtures-update workflow on smaller github runner until larger one is available again.	2023-10-27 20:36:05 -07:00
cragwolfe	22b3edb226	build: re-enable ingest on normal CI workers (#1931 ) temporarily, until large workers are working again.	2023-10-27 19:46:40 -07:00
qued	450e7f0614	build: streamline ci (#1909 ) Updated CI to shave time off in some conditions with no real downside. - When the base cache already exists, we don't download it during setup, and we skip all other steps as well. - During ingest setup, we check if the ingest cache exists before downloading the base cache, and if the ingest cache already exists, we skip everything else. - `check-deps` doesn't have to wait on `setup` or download a cache, as the dependencies aren't needed, only `pip`.	2023-10-27 02:15:22 -05:00
Roman Isecke	2d5ffa4581	Fix ingest test CI job (#1864 ) Fixed some syntax errors in the update ingest CI job causing it to always fail	2023-10-25 02:23:00 +00:00
Roman Isecke	4802332de0	Roman/optimize ingest ci (#1799 ) ### Description Currently the CI caches the CI dependencies but uses the hash of all files in `requirements/`. This isn't completely accurate since the ingest dependencies are installed in a later step and don't affect the cached environment. As part of this PR: * ingest dependencies were isolated into their own folder in `requirements/ingest/` * A new cache setup was introduced in the CI to restore the base cache -> install ingest dependencies -> cache it with a new id * new make target created to install all ingest dependencies via `pip install -r ...` * updates to Dockerfile to use `find ...` to install all dependencies, avoiding the need to update this when new deps are added. * update to pip-compile script to run over all `*.in` files in `requirements/`	2023-10-24 14:54:00 +00:00
Yuming Long	ce40cdc55f	Chore (refactor): support table extraction with pre-computed ocr data (#1801 ) ### Summary Table OCR refactor, move the OCR part for table model in inference repo to unst repo. * Before this PR, the table model extracts OCR tokens with text and bounding boxes and fills the tokens into the table structure in the inference repo. This means we need to do an additional OCR for tables. * After this PR, we use the OCR data from entire page OCR and pass the OCR tokens to inference repo, which means we only do one OCR for the entire document. Tech details: * Combined env `ENTIRE_PAGE_OCR` and `TABLE_OCR` to `OCR_AGENT`, this means we use the same OCR agent for entire page and tables since we only do one OCR. * Bump inference repo to `0.7.9`, which allows the table model in inference to use pre-computed OCR data from unst repo. Please check in [PR](https://github.com/Unstructured-IO/unstructured-inference/pull/256). * All notebooks lint are made by `make tidy` * This PR also fixes [issue](https://github.com/Unstructured-IO/unstructured/issues/1564), I've added test for the issue in `test_pdf.py::test_partition_pdf_hi_table_extraction_with_languages` * Add same scaling logic to image [similar to previous Table OCR](https://github.com/Unstructured-IO/unstructured-inference/blob/main/unstructured_inference/models/tables.py#L109C1-L113), but now scaling is applied to entire image ### Test * Not much to manually test except that table extraction still works * But due to the change in scaling and the use of pre-computed OCR data from the entire page, there are some slight (better) changes in table output; here is a comparison of test outputs I found from the same test `test_partition_image_with_table_extraction`: screen shot for table in `layout-parser-paper-with-table.jpg`: <img width="343" alt="expected" src="https://github.com/Unstructured-IO/unstructured/assets/63475068/278d7665-d212-433d-9a05-872c4502725c"> before refactor: <img width="709" alt="before" src="https://github.com/Unstructured-IO/unstructured/assets/63475068/347fbc3b-f52b-45b5-97e9-6f633eaa0d5e"> after refactor: <img width="705" alt="after" src="https://github.com/Unstructured-IO/unstructured/assets/63475068/b3cbd809-cf67-4e75-945a-5cbd06b33b2d"> ### TODO (added as a ticket) Still have some clean up to do in inference repo since now unst repo has duplicate logic, but can keep them as a fallback plan. If we want to remove anything OCR related in inference, here are items that are deprecated and can be removed: * [`get_tokens`](https://github.com/Unstructured-IO/unstructured-inference/blob/main/unstructured_inference/models/tables.py#L77) (already noted in code) * parameter `extract_tables` in inference * [`interpret_table_block`](https://github.com/Unstructured-IO/unstructured-inference/blob/main/unstructured_inference/inference/layoutelement.py#L88) * [`load_agent`](https://github.com/Unstructured-IO/unstructured-inference/blob/main/unstructured_inference/models/tables.py#L197) * env `TABLE_OCR` ### Note if we want to fall back to an additional table OCR (may need this for using paddle for table), we need to: * pass `infer_table_structure` to inference with `extract_tables` parameter * stop passing `infer_table_structure` to `ocr.py` --------- Co-authored-by: Yao You <yao@unstructured.io>	2023-10-21 00:24:23 +00:00
cragwolfe	1b90028501	chore: fix paths in ingest-test-fixtures-update-pr.yml (#1815 ) Reference: https://github.com/marketplace/actions/create-pull-request#add-specific-paths	2023-10-20 09:49:02 -07:00
Roman Isecke	63861f537e	Add check for duplicate click options (#1775 ) ### Description Given that many of the options associated with the `Click` based cli ingest commands are added dynamically from a number of configs, a check was incorporated to make sure there were no duplicate entries to prevent new configs from overwriting already added options. ### Issues that were found and fixes: * duplicate api-key option set on Notion command conflicts with api key used for unstructured api. Added notion prefix. * retry logic configs had duplicates in biomed. Removed since this is not handled by the pipeline.	2023-10-20 14:00:19 +00:00
Trevor Bossert	62aa4fc4ed	Move python setup above cache restore on ingest (#1802 ) This moves the setup-python step on ingest job above the cache restore, otherwise cache is restored and setup-python breaks symlinks. This matches pattern on other jobs.	2023-10-19 21:40:06 +00:00
Klaijan	98d54e3184	build: ingest fixtures workflow to include metrics dir (#1789 ) Add `test_unstructured_ingest/metrics` path for evaluation metrics master file.	2023-10-18 11:30:31 -07:00
ryannikolaidis	d9a0bd741a	fix: build test failures (#1748 ) * Fix missing HF_TOKEN when running containerized test for the build process * Fix pytest args when running specific test ## Testing Example run of the HF_TOKEN assigned for the containerized test in the build process: https://github.com/Unstructured-IO/unstructured/actions/runs/6504556437/job/17666669155 Example run of the pytest args working for the arm test (ran in a new workflow for testing on push): https://github.com/Unstructured-IO/unstructured/actions/runs/6504213010	2023-10-13 01:08:27 -07:00
qued	8100f1e7e2	chore: process chipper hierarchy (#1634 ) PR to support schema changes introduced from [PR 232](https://github.com/Unstructured-IO/unstructured-inference/pull/232) in `unstructured-inference`. Specifically what needs to be supported is: * Change to the way `LayoutElement` from `unstructured-inference` is structured, specifically that this class is no longer a subclass of `Rectangle`, and instead `LayoutElement` has a `bbox` property that captures the location information and a `from_coords` method that allows construction of a `LayoutElement` directly from coordinates. * Removal of `LocationlessLayoutElement` since chipper now exports bounding boxes, and if we need to support elements without bounding boxes, we can make the `bbox` property mentioned above optional. * Getting hierarchy data directly from the inference elements rather than in post-processing * Don't try to reorder elements received from chipper v2, as they should already be ordered. #### Testing: The following demonstrates that the new version of chipper is inferring hierarchy. ```python from unstructured.partition.pdf import partition_pdf elements = partition_pdf("example-docs/layout-parser-paper-fast.pdf", strategy="hi_res", model_name="chipper") children = [el for el in elements if el.metadata.parent_id is not None] print(children) ``` Also verify that running the traditional `hi_res` gives different results: ```python from unstructured.partition.pdf import partition_pdf elements = partition_pdf("example-docs/layout-parser-paper-fast.pdf", strategy="hi_res") ``` --------- Co-authored-by: Sebastian Laverde Alfonso <lavmlk20201@gmail.com> Co-authored-by: ryannikolaidis <1208590+ryannikolaidis@users.noreply.github.com> Co-authored-by: christinestraub <christinemstraub@gmail.com>	2023-10-13 01:28:46 +00:00
Ahmet Melek	94836cfad4	feat: add file-based access permissions for SharePoint ingest (#1628 ) This PR: - defines rbac_data as a SourceMetadata field, - manages connections to an external api for obtaining rbac data with ConnectorRBAC class, - serializes rbac data and saves it to the disk, - matches the rbac_data in the disk to each IngestDoc, using a common field, - forwards rbac data to Elements, via the partition() function To test the changes, run `examples/ingest/sharepoint/ingest.sh` with the relevant rbac & connector credentials --------- Co-authored-by: ryannikolaidis <1208590+ryannikolaidis@users.noreply.github.com> Co-authored-by: ahmetmeleq <ahmetmeleq@users.noreply.github.com>	2023-10-13 00:38:08 +00:00
ryannikolaidis	d22044a44c	fix: unstructured-ingest embedding KeyError (#1727 ) Currently adding the embedding flag to any unstructured-ingest call results in this failure: ``` 2023-10-11 22:42:14,177 MainProcess ERROR 'b8a98c5d963a9dd75847a8f110cbf7c9' multiprocessing.pool.RemoteTraceback: """ Traceback (most recent call last): File "/Users/ryannikolaidis/.pyenv/versions/3.10.11/lib/python3.10/multiprocessing/pool.py", line 125, in worker result = (True, func(*args, **kwds)) File "/Users/ryannikolaidis/.pyenv/versions/3.10.11/lib/python3.10/multiprocessing/pool.py", line 48, in mapstar return list(map(*args)) File "/Users/ryannikolaidis/Development/unstructured/unstructured/unstructured/ingest/pipeline/copy.py", line 14, in run ingest_doc_json = self.pipeline_context.ingest_docs_map[doc_hash] File "<string>", line 2, in __getitem__ File "/Users/ryannikolaidis/.pyenv/versions/3.10.11/lib/python3.10/multiprocessing/managers.py", line 833, in _callmethod raise convert_to_error(kind, result) KeyError: 'b8a98c5d963a9dd75847a8f110cbf7c9' """ ``` This is because the run method for the embedding node is not adding the IngestDoc to the context map. This PR adds that logic and adds a test to validate that the embeddings option works as expected. NOTE: until https://github.com/Unstructured-IO/unstructured/pull/1719 goes in, the expected results include the duplicate element bug, however currently this does at least prove that embeddings are generated and the function doesn't error.	2023-10-12 20:27:30 +00:00
ryannikolaidis	3e101d3e4f	build(test): skip full python matrix for most ingest tests (#1687 ) We’re probably unfairly (to the test) making a large volume of new connections and requests to test services when all of our ingest tests run across the full python test matrix and when a lot of PRs are firing at once. Let's limit the full matrix run to a select few, but still have all ingest tests run on python v3.10. This is done by checking the version and skipping in ingest-test.sh. Bonus: Bumps ingest test fixture workflow to use 3.10. This technically shouldn't make a difference, but since we're making 3.10 the default of the matrix strategy, it probably makes sense to use 3.10 for the ingest fixture generation as well for consistency. ## Testing - [example](https://github.com/Unstructured-IO/unstructured/actions/runs/6460319121/job/17537900978?pr=1687) running all tests in 3.10 - [example](https://github.com/Unstructured-IO/unstructured/actions/runs/6460319121/job/17537899999?pr=1687) skipping/running the expected tests in 3.8	2023-10-10 16:39:34 +00:00
ryannikolaidis	1e32da6389	build: fix merge queue issues (#1654 ) Closes #1482 There are two known issues when attempting to merge PRs via merge queue: - CodeQL fails with: ``` Error: ref 'refs/heads/gh-readonly-queue/main/pr-968-499f37f64b27c66d4fc68446dbea519860d06cf7' not found in this repository ``` - CI.changelog fails with: ``` Get current git ref Error: The process '/usr/bin/git' failed with exit code 128 ``` The error with CodeQL is a known and still [open issue](https://github.com/github/codeql-action/issues/1572). We don't currently enforce branch protection for CodeQL, so probably our best compromise is to simply not run this on the merge queue event. There could be a narrow margin where some issue is introduced via merge, but we'll still see issues on individual branches and on pushes to main, so this is probably acceptable. The changelog job now has a checkout step prior to paths-filter which guarantees the git ref exists before attempting to execute the filter action. ## Testing Prior to this change, I was able to validate both the [CodeQL](https://github.com/ryan-nikolaidis/unstructured/actions/runs/6414128010) and [changelog](https://github.com/ryan-nikolaidis/unstructured/actions/runs/6414128007/job/17414065768) test errors. With these changes, validated that the merge queue was able to [successfully run](https://github.com/ryan-nikolaidis/unstructured/actions/runs/6414511843/job/17415024319) the changelog CI job.	2023-10-05 21:58:39 +00:00
Benjamin Torres	e0201e9a11	feat/add sources from unstructured inference (#1538 ) This PR adds support for the `source` property from `unstructured_inference`, allowing the user to see the origin of the data under the `detection_origin` field (environment variable `UNSTRUCTURED_INCLUDE_DEBUG_METADATA=true`). To try this feature you can use this code: ``` from unstructured.partition.pdf import partition_pdf_or_image yolox_elements = partition_pdf_or_image(filename='example-docs/loremipsum-flat.pdf', strategy='hi_res', model_name='yolox') sources = [e.detection_origin for e in yolox_elements] print(sources) ``` This prints 'yolox' as the source for all the elements	2023-10-05 20:26:47 +00:00
Roman Isecke	b2e997635f	roman/es ingest test fixes (#1610 ) ### Description update elasticsearch docker setup to use docker-compose Would close out https://github.com/Unstructured-IO/unstructured/issues/1609	2023-10-03 10:39:33 -04:00
Roman Isecke	11cdd8d71f	roman/drop downloads in ingest tests (#1614 ) ### Description In an effort to mitigate resource consumption when running CI tests, cleanup download dir for ingest tests after each one.	2023-10-02 20:47:24 +00:00
ryannikolaidis	ed2bf7eb66	build(test): ingest test fixture updates uses larger runners (#1612 )	2023-10-02 17:43:41 +00:00
Roman Isecke	81af879038	roman/increase ingest tests num processes (#1500 ) ### Description In an effort to speed up the ingest tests, bumping the number of processes to the max on the system for each	2023-09-26 16:06:53 -05:00
Roman Isecke	bd49cfbab7	feat: adds Azure Cognitive Search (full text) destination connector (#1459 ) ### Description New [Azure Cognitive Search](https://azure.microsoft.com/en-us/products/ai-services/cognitive-search) destination connector added. Takes each json element from the json files created via partition and writes that content to an index. Bonus bug fix: Due to a recent change where the default version of python used in the repo was bumped to `3.10` from `3.8`, running `pip-compile` now runs it against that version rather than the lowest we support, which is still `3.8`. This breaks the setup for those lower versions because some of the versions pulled in by `pip-compile` exist for `3.10` but not `3.8`. `pip-compile` was updated to run as a script that checks the version of python being used first, which helps guarantee that all dependencies meet the minimum python version requirement. Closes out https://github.com/Unstructured-IO/unstructured/issues/1466	2023-09-25 10:27:42 -04:00
ryannikolaidis	ca01b30c07	ci: more reliable release version alerts (#1479 )	2023-09-22 21:19:26 +00:00
Yuming Long	f962a1e57d	fix: fix ingest paddle hanging issue (#1441 ) ## Summary Ingest tests are hitting a paddle OOM issue which causes the tests to hang forever. The fix here is to remove paddle from ci and set both OCR envs `TABLE_OCR` and `ENTIRE_PAGE_OCR` to `tesseract`. (will have follow up PR to investigate why this is failing) ## Test please check ingest tests in CI	2023-09-19 17:20:23 +00:00
ryannikolaidis	ad69d93d53	ci: add new release version alert (#1413 )	2023-09-15 07:05:00 +00:00
Yao You	12d7628b10	update constraints to pin weaviate during ci (#1408 ) This PR ensures the version for `weaviate` is consistent in CI testing. The latest (3.24.1) is not compatible with our test needs and the last version that ran successfully in CI is 3.23.2.	2023-09-13 23:19:20 +00:00
Ahmet Melek	09cc4bfa5f	feat: jira connector (cloud) (#1238 ) This connector: - takes a Jira Cloud URL, user email and api token; to authenticate into Jira Cloud - ingests: - either all issues in all projects in a Jira Cloud Organization - or - issues in user specified projects, boards - user specified issues - processes this kind of data: - text fields such as issue summary, description, and comments - dropdown fields such as issue type, status, priority, assignee, reporter, labels, and components - other data such as issue id, issue key, project id, information on subtasks - notes down attachment URLs, however does not process attachments - stores each downloaded issue in a txt file, in a predefined template form (consisting of the data above) - then processes each downloaded issue document into elements using unstructured library - related to: https://github.com/Unstructured-IO/unstructured/issues/263 To test the changes, make the necessary setups and run the relevant ingest test scripts. --------- Co-authored-by: ryannikolaidis <1208590+ryannikolaidis@users.noreply.github.com> Co-authored-by: ahmetmeleq <ahmetmeleq@users.noreply.github.com>	2023-09-06 10:10:48 +00:00
David Potter	b710bafa89	feat: add salesforce connector (#1168 )	2023-09-02 08:50:31 -07:00
Benjamin Torres	5052e6cb3b	Added plain-text comparison for tests (#1180 ) This PR adds a comparison during ingest test for the content of the files in plain text (i.e.: without JSON format)	2023-08-29 23:23:14 +00:00
Roman Isecke	106ee965a6	Roman/delta table connector (#1132 ) ### Description Add delta table connector and test against a delta table generated via delta.io and uploaded to s3. Shows an example of how to use the connection options to leverage s3. I was able to get this to work with s3 if I pass in the access and secret keys as storage options. Even though the s3 bucket being used is public, would not work without those. --------- Co-authored-by: ryannikolaidis <1208590+ryannikolaidis@users.noreply.github.com> Co-authored-by: rbiseck3 <rbiseck3@users.noreply.github.com>	2023-08-22 10:19:46 -04:00
Roman Isecke	db8af4f5de	Roman/notion tests (#1072 ) ### Description * Add ingest test for Notion docs * Update default cache dir for connectors to include connector name. Makes debugging the cached content easier. --------- Co-authored-by: ryannikolaidis <1208590+ryannikolaidis@users.noreply.github.com> Co-authored-by: rbiseck3 <rbiseck3@users.noreply.github.com>	2023-08-21 15:16:50 -04:00
Newel H	e4aa7373e2	test: create CI pipelines for verifying base and extras pass respective tests (#1137 ) Summary Closes #747 * Create CI Pipeline for running text, xml, email, and html doc tests against the library installed without extras * Create CI Pipeline for running each library extra against their respective tests	2023-08-19 12:56:13 -04:00
cragwolfe	d835fb1086	chore: bump pip version in published image (#1111 ) for consistency with the development environment, i.e. the Makefile.	2023-08-14 21:59:31 +00:00
Ahmet Melek	627f78c16f	feat: airtable connector (#1012 ) * add the first version of airtable connector * change imports as inline to fail gracefully in case of lacking dependency * parse tables as csv rather than plain text * add relevant logic to be able to use --airtable-list-of-paths * add script for creation of resources for testing, add test script (large) for testing with a large number of tables to validate scroll functionality, update test script (diff) based on the new settings * fix ingest test names * add scripts for the large table test * remove large table test from diff test * make base and table ids explicit * add and remove comments * use -ne instead of != * update code based on the recent ingest refactor, update changelog and version * shellcheck fix * update comments * update check-num-rows-and-columns-output error message Co-authored-by: ryannikolaidis <1208590+ryannikolaidis@users.noreply.github.com> * update help comments * update help comments * update help comments * update workflows to set auth tokens and to run make install * add comments on create_scale_test_components * separate component ids from the test script, add comments to document test component creation * add LARGE_BASE test, implement LARGE_BASE component creation, replace component id * shellcheck fixes * shellcheck fixes * update docs * update comment * bump version * add wrongly deleted file * sort columns before saving to process * Update ingest test fixtures (#1098) Co-authored-by: ahmetmeleq <ahmetmeleq@users.noreply.github.com> --------- Co-authored-by: ryannikolaidis <1208590+ryannikolaidis@users.noreply.github.com> Co-authored-by: ahmetmeleq <ahmetmeleq@users.noreply.github.com>	2023-08-11 12:02:51 -07:00
rvztz	dee9b405cd	feat: Sharepoint connector (#918 )	2023-08-10 09:37:58 -07:00
ryannikolaidis	70365ea42d	chore: add Dropbox secrets to CI environments (#1029 )	2023-08-03 02:18:29 +00:00
cragwolfe	499f37f64b	build: enable merge_group checks (#1023 )	2023-08-02 04:52:06 +00:00
David Potter	1542607892	feat: adds Box connector (#996 )	2023-08-01 01:10:10 +00:00
Yuming Long	df1ba39905	Chore: add uns api repo unittests (#954 ) * stage * git clone * ci ignore markdown file * make install * use env instead * remove md * add script * wrong env value * add note * maybe don't rm * no cd../ --------- Co-authored-by: cragwolfe <crag@unstructured.io>	2023-07-26 20:55:35 +00:00
David Potter	f7e46af22f	feat: adds Outlook connector (#939 ) * bonus: fixes issue with email partitioning where From field was being assigned the To field value.	2023-07-26 04:09:26 +00:00
Ahmet Melek	b7674fb97e	feat: confluence connector (cloud) (#906 ) * Add confluence connector and an example script * add test script, add dependency installations * add authentication secret variables for ci tests and actions * add dependency installation commands for workflows * add dependency installation commands for workflows * Update ingest test fixtures (#907) Co-authored-by: ahmetmeleq <ahmetmeleq@users.noreply.github.com> * add ingest test fixtures update workflow for python 3.10, update example script with dummy values * change workflow name to avoid confusion * change workflow name to avoid confusion * only leave 3.8 in ingest test matrix to test consistent partitioning among python versions, remove 3.10 workflow for the test fixtures update * only leave 3.8 in ingest test matrix to test consistent partitioning among python versions * Update ingest test fixtures (#911) Co-authored-by: ahmetmeleq <ahmetmeleq@users.noreply.github.com> * revert back the test python version matrix * recompile dependencies * modifications for shellcheck * update changelog and version * changelog and version * remove comments * Update ingest test fixtures (#915) Co-authored-by: ahmetmeleq <ahmetmeleq@users.noreply.github.com> * add the option to state the number of spaces to be fetched * add scroll functionality, expose --confluence-num-of-spaces, --confluence-list-of-spaces and --confluence-num-of-docs-from-each-space to users * add help message * add docstrings for two tests, validate grabbing every doc in the fetched spaces, count number of files instead of diffing for confluence2 test * change test names * rename connector arg Co-authored-by: ryannikolaidis <1208590+ryannikolaidis@users.noreply.github.com> * change arg name for connector Co-authored-by: ryannikolaidis <1208590+ryannikolaidis@users.noreply.github.com> * add comment to example * change arg names * add new tests to ingest test * shellcheck remove redundant statement * Update ingest test fixtures (#932) Co-authored-by: ahmetmeleq <ahmetmeleq@users.noreply.github.com> * Update ingest test fixtures (#936) Co-authored-by: ahmetmeleq <ahmetmeleq@users.noreply.github.com> * linting * change file extensions to parse as html * Update ingest test fixtures (#943) Co-authored-by: ahmetmeleq <ahmetmeleq@users.noreply.github.com> * remove old fixtures * update version to 0.8.2-dev3 * change file to trigger CI * change file to trigger CI * change file to trigger CI * change file to trigger CI --------- Co-authored-by: ryannikolaidis <1208590+ryannikolaidis@users.noreply.github.com> Co-authored-by: ahmetmeleq <ahmetmeleq@users.noreply.github.com>	2023-07-18 19:29:41 +01:00
Yuming Long	067eb5701f	Fix: docker build with missing dependency (#931 ) * pip -compile * test trigger * Revert "test trigger" This reverts commit 69d4c8cd9f285f6ef4bf445f5fb27b5c62e1391c. * version conflict and pip compile	2023-07-14 22:20:11 +00:00
Matt Robinson	685e33f890	build: remove docs-build branch (#933 )	2023-07-14 16:23:47 -04:00
rvztz	ce20c3f2bc	feat: add OneDrive connector (#834 )	2023-07-13 20:57:54 +00:00
Matt Robinson	b3936893b8	build: add python 3.11 to CI (#908 ) * remove argilla; bump reqs * enable py 3.11 * add 3.11 to setup.py * make pip-compile * ignore cli mypy errors * install argilla * fix constraints * install argilla * changelog and version * skip argilla in docker * dont import argilla in docker * skip all of argilla if in container * only import argilla if outside docker * more docker skips * remove weird pypi settings	2023-07-10 18:52:25 +00:00
Trevor Bossert	66f2d4b280	Add both arm and amd builds to manifests (#899 )	2023-07-10 10:15:15 -07:00
cragwolfe	209054f0db	build(image): revert docker build tweak for arm64 (#887 ) arm64 Images (and amd64 ones) now building again in CI 😐 .	2023-07-06 06:46:40 +00:00
Ahmet Melek	5ea216cf07	feat: elasticsearch connector (#817 )	2023-07-01 17:45:28 +00:00
cragwolfe	cb2866b159	build(image): docker build tweak for arm64 (#871 ) Fixes issue where arm64 docker builds were failing and preventing images from being published.	2023-06-30 20:49:31 -07:00

105 Commits