* Remove aggregate_df from final communities and final text units
* Semver
* Ruff and format
* Format
* Format
* Fix tests, ruff and checks
* Remove some leftover prints
* Removed _final_join method
Corrected a misspelling of 'customizability' in the env_vars.md documentation. This change ensures clarity and accuracy in the description of input data handling configurations.
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
* Create entrypoint for cli and api (#1067)
* Add cli and api entrypoints for update index
* Semver
* Update docs
* Run tests on feature branch main
* Better /main handling in tests
* Incremental indexing/file delta (#1123)
* Calculate new inputs and deleted inputs on update
* Semver
* Clear ruff checks
* Fix pyright
* Fix PyRight
* Ruff again
* Update Final Entities merging in new and existing entities from delta
* Update formatting
* Pyright
* Ruff
* Fix for pyright
* Yet Another Pyright test
* Pyright
* Format
* Migrate towards using static output directories
- Fixes load_config eagerly resolving directories.
Directories are only resolved when the output
directories are local.
- Add support for `--output` and `--reporting` flags
for the index CLI. To achieve the previous output structure, run
`index --output run1/artifacts --reports run1/reports`.
- Use static output directories when initializing
a new project.
- Maintains backward compatibility for those using
timestamp outputs locally.
* fix smoke tests
* update query cli to work with static directories
* remove eager path resolution from load_config. Support CLI overrides that can be resolved.
* add docs and output logs/artifacts to same directory
* use match statement
* switch back to if statement
---------
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
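
The file delta step above ("Calculate new inputs and deleted inputs on update") can be pictured as two set differences between the previously indexed inputs and the inputs currently on disk. This is a minimal sketch, not the code from #1123: `InputDelta` and `compute_input_delta` are hypothetical names, and inputs are assumed to be identified by file name only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InputDelta:
    """Hypothetical container for the result of comparing two input sets."""

    new_inputs: set[str]
    deleted_inputs: set[str]


def compute_input_delta(previous: set[str], current: set[str]) -> InputDelta:
    """Compare previously indexed inputs with the inputs available now.

    Assumption: each input is identified by its file name alone.
    """
    return InputDelta(
        new_inputs=current - previous,      # present now, not indexed before
        deleted_inputs=previous - current,  # indexed before, missing now
    )


# Example: b.txt is unchanged, c.txt was added, a.txt was removed.
delta = compute_input_delta({"a.txt", "b.txt"}, {"b.txt", "c.txt"})
```

Only `new_inputs` would need to be run through the indexing pipeline, while `deleted_inputs` identifies rows to drop from the existing outputs before merging.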
* Collapse create_final_communities
* Semver
* Spellcheck
* Clean up filtering
* Add space in title
* Format
* Cleanup imports and format
* Spruce up the tests
* Update dictionary.txt
* Spellcheck
---------
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
* Setup basic verb test runner
* Replace join_text_units_to_entity_ids with subflow
* Update comments
* Replace join_text_units_to_relationship_ids subflow
* Roll in final select
* Reuse assertion util
* Small fix + format
* Format/typing
* Semver
* Format/typing
* Semver
* Revert format changes
* Fix smoke test subworkflow count
* Edit subworkflows for another smoke test
* Update test parquets for covariates
* Collapse covariate join
* Rework subtasks for per-flow customization
* Format
* Semver
* Fix smoke test
* fix: stop the community context builder from repeating a report twice in local mode.
* Fix duplicates in community context builder
* Small tweaks on code
---------
Co-authored-by: jarlor <zjl58960902@outlook.com>
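
One common way to avoid the repeated report described above is to filter the selected reports by identifier while keeping their order. This is an illustrative sketch only, not the actual change to the community context builder: `dedupe_reports` and the `"id"` key are assumptions.

```python
from typing import Any


def dedupe_reports(reports: list[dict[str, Any]], key: str = "id") -> list[dict[str, Any]]:
    """Return reports with duplicates removed, keeping the first occurrence.

    Assumption: each report is a dict carrying a unique identifier under `key`.
    """
    seen: set[Any] = set()
    unique: list[dict[str, Any]] = []
    for report in reports:
        identifier = report[key]
        if identifier in seen:
            continue  # already included in the context, skip the repeat
        seen.add(identifier)
        unique.append(report)
    return unique


# Example: the report with id "1" appears twice and is kept only once.
reports = [
    {"id": "1", "title": "Community A"},
    {"id": "2", "title": "Community B"},
    {"id": "1", "title": "Community A"},
]
unique_reports = dedupe_reports(reports)
```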
Update factories.py to allow the usage of the request timeout ChatOpenAI parameter
allow the usage of the request timeout ChatOpenAI parameter
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
Correct links to verbs in comments
Updated the links in comments to reflect new paths for 'derive' and 'aggregate' verbs. This improves documentation and ensures that references are up to date for future developers.
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
* Moved query loading from file to helper function
* added loading parquets from blob to function
* resolved adlfs async error
* debugging cleanup and small fixes
* added connection string support
* semversioner and ruff fixes
* completed testing for merge with main
* more ruff changes
* fixed unbound vars warning
* rewrote function to use storage utils
* removed unused vars
---------
Co-authored-by: Kenny Zhang <zhangken@microsoft.com>
* Create entrypoint for cli and api (#1067)
* Add cli and api entrypoints for update index
* Semver
* Update docs
* Run tests on feature branch main
* Better /main handling in tests
* Clean and organize run index code
* Ruff fix
* Pyright fix
* Format fixes
* Pyright fix
* Format
* Fix integ tests
* Fix ruff
* Reorganize and clean up