* Added graphrag_import_neo4j_cypher Notebook
* changed to a procedure for setting the embedding property to save disk space
* Reformat and cleanup
* semver
* Poetry lock update
* Update AAIS docs
* Rename contrib folder
* Merge from main
* Revert "Merge from main"
This reverts commit a399dde97b689a5b5c62dc2e9c2290cb2503b3a4.
* Fix ruff check
* Add readme and fix tests
* Fix community reports
---------
Co-authored-by: Michael Hunger <github@jexp.de>
* changed placement of lancedb dir to under /artifacts
* ruff checks and semversioner
* added support for static paths
* added support for streaming
* more ruff changes
* ruff format changes
* removed string concat for path formation
* added more ruff checks
* removed os.path.join usage
* more ruff fixes and removed unnecessary path creations
* replaced cast calls with str()
---------
Co-authored-by: Kenny Zhang <zhangken@microsoft.com>
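The path-handling entries in this group (no string concatenation, no `os.path.join`, `str()` instead of `cast`) can be illustrated with `pathlib`. This is a minimal sketch, not the code from the commits: the function name `lancedb_dir`, the `root`/`run_id` parameters, and the directory layout are assumptions chosen for illustration.

```python
from pathlib import Path


def lancedb_dir(root: str, run_id: str) -> Path:
    """Build <root>/<run_id>/artifacts/lancedb (hypothetical layout)."""
    # The "/" operator joins path segments without manual string
    # concatenation and without os.path.join.
    return Path(root) / run_id / "artifacts" / "lancedb"


db_path = lancedb_dir("output", "20240812-120000")

# Convert explicitly with str() when an API expects a plain string,
# rather than using typing.cast (which only affects type checking).
db_uri = str(db_path)
```

`Path` objects also normalize separators per platform, which helps with the Windows path issues mentioned elsewhere in this log.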
* Remove excess vars from gh-pages build
* Delete redundant javascript ci
* Pull apart testing CI
* Clean up integration tests build
* Move storage tests to integration CI
* Take py 3.10 out of smoke tests matrix
* Use minimum supported python version for most tests
* Re-run main CI on any test change
* Add Josh and Kenny to author list
* Update auto-resolve perms
* Initial Index API
- Implement main API entry point: build_index
- Rely on GraphRagConfig instead of PipelineConfig
- This unifies the API signature with the
prompt_tune and query API entry points
- Derive cache settings, config, and resuming from
the config and other arguments to
simplify/reduce arguments to build_index
- Add preflight config file validations
- Add semver change
* fix smoke tests
* fix smoke tests
* Use asyncio
* Add e2e artifacts in GH actions
* Remove unnecessary E2E test, and add skip_validations flag to cli
* Nicer imports
* Reorganize API functions.
* Add license headers and module docstrings
* Fix ignored ruff rule
---------
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
* Added streaming output support for global search. Introduce `--streaming` flag to enable or disable streaming mode
* ran ruff format --preview
* update
* cleanup code and streaming api
* update cli argument
* remove whitespace
* checkpoint - add context data to streaming api
* cleanup help menu
* ruff format update
* add context data to streaming response
* add semversioner file
* rename variable for better readability
* rename variable for better readability
* ruff fixes
* fix abstract class type annotation
* add documentation for --streaming CLI flag
---------
Co-authored-by: 6GOD <55304045+6ixGODD@users.noreply.github.com>
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
* Add support for both float and int on schema validation for community report generation
* Cast instead of type check
* Add missing file
* Add prompt with ints to smoke tests
* Fix unit tests
* Fix unit tests
* Add stricter filtering and tests for cli data directory discovery
* Semver
* Ignore ruff on error type
* Format
* Fix for windows paths
* Fix for windows paths
* Uncomment blob tests
* Sort by timestamp name instead of modified date
* Format
* Add additional folder name test
* fix strategy config in entity_extraction
* should not post token list to the embedding model
* fix embedding in local query
* add semversioner
* remove strategy
---------
Co-authored-by: KylinMountain <kose2livs@gmail.com>
* Fix sort_context max_tokens & max_tokens param in verb
* Fix sort_context for windows test
* add semversioner file
---------
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
* initial API redesign
* typo fix
* update docstring
* update docstring
* remove artifacts caused by the merge from main
* minor typo updates
* add semversioner check
* switch API to async function calls
---------
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
* Remove outdated references to entity resolution
* Clarify covariate extraction
* Minor edits from other PR feedback
* Remove duplicate line
* Semver
---------
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
* fixed json issue
* change to use try_parse_json_object only
* pyproject add json-repair
* add check extra description before and after json object
* json.loads() before repair_json, based on jbradley1's suggestion.
* Fix json parsing and formatting
* semver
* Nicer tuple parsing
---------
Co-authored-by: paulg <paul.guo@iag.com.au>
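The JSON-parsing entries in this group describe an ordering: try strict `json.loads()` first, check for extra description before and after the JSON object, and only then fall back to repair. The sketch below shows that ordering under stated assumptions; it is not the project's `try_parse_json_object`. The name `parse_json_object` and the optional `repair` callable are hypothetical, and a function such as `repair_json` from the json-repair package could be passed in as `repair`.

```python
import json
from typing import Any, Callable, Optional


def _loads_dict(text: str) -> Optional[dict[str, Any]]:
    """Return the parsed object if text is a valid JSON object, else None."""
    try:
        result = json.loads(text)
    except json.JSONDecodeError:
        return None
    return result if isinstance(result, dict) else None


def parse_json_object(
    text: str, repair: Optional[Callable[[str], str]] = None
) -> dict[str, Any]:
    """Hypothetical helper: strict parse, then trim, then optional repair."""
    # 1. Strict parse first, so well-formed output is never altered.
    parsed = _loads_dict(text)
    if parsed is not None:
        return parsed

    # 2. Drop any description before the first "{" and after the last "}".
    start, end = text.find("{"), text.rfind("}")
    candidate = text[start : end + 1] if start != -1 and end > start else text
    parsed = _loads_dict(candidate)
    if parsed is not None:
        return parsed

    # 3. Last resort: run the caller-supplied repair function, if any.
    if repair is not None:
        parsed = _loads_dict(repair(candidate))
        if parsed is not None:
            return parsed

    # Nothing usable was found; an empty dict signals failure in this sketch.
    return {}
```

For example, `parse_json_object('Here is the report: {"title": "x"} Thanks.')` recovers the embedded object in step 2 without invoking any repair logic.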
* added default title_column and collection_name values for workflows using the vector store option
* incorporated vector database support to the query client
* Updated documentation to reflect the new query client param.
* Fixed ruff formatting
* added new poetry lock file
---------
Co-authored-by: Gabriel Nieves-Ponce <gnievesponce@microsoft.com>
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>