229 Commits

Author SHA1 Message Date
Alonso Guevara
75735bd103
Release v0.3.2 (#1034) v0.3.2 2024-08-26 17:57:16 -06:00
Alonso Guevara
32c0cdfcc0
Patch "past" dependency issues (#1033)
* Patch "past" dependency issues

* Semver
2024-08-26 17:03:51 -06:00
Josh Bradley
a90d210497
Improve search type hint (#1031)
* update get_local_search_engine and get_global_search_engine return annotation

* add semversioner file

* reorder imports

* fix pyright errors

* revert change and ignore previous pyright error

---------

Co-authored-by: wanhua.gu <wanhua.gu@wiz.ai>
Co-authored-by: longyunfeigu <2514553187@qq.com>
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-08-26 15:31:46 -06:00
Alonso Guevara
4c2f5376a8
Add missing config parameter for prompt tuning docs (#1017) 2024-08-26 14:38:59 -06:00
Josh Bradley
fd8e56ce6f
Update developer guide (#1029) 2024-08-26 12:28:03 -04:00
Alonso Guevara
55e74a0c2e
Fix weight casting during graph extraction (#1016)
* Fix weight casting during graph extraction

* Format

* Format
2024-08-23 20:51:59 -06:00
Alonso Guevara
e15df44f0d
Ensure entity types to be str in prompt tune (#1015) 2024-08-23 18:35:24 -06:00
dependabot[bot]
13e17d2dac
Bump ruff from 0.5.7 to 0.6.2 (#1014)
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.5.7 to 0.6.2.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/0.5.7...0.6.2)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-08-23 18:00:11 -06:00
dependabot[bot]
b1d4ddd799
Bump micromatch from 4.0.5 to 4.0.8 in /docsite (#1013)
Bumps [micromatch](https://github.com/micromatch/micromatch) from 4.0.5 to 4.0.8.
- [Release notes](https://github.com/micromatch/micromatch/releases)
- [Changelog](https://github.com/micromatch/micromatch/blob/4.0.8/CHANGELOG.md)
- [Commits](https://github.com/micromatch/micromatch/compare/4.0.5...4.0.8)

---
updated-dependencies:
- dependency-name: micromatch
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-23 17:38:26 -06:00
Alonso Guevara
cb0aae7e6b
Add graphrag_import_neo4j_cypher Notebook (#593)
* Added graphrag_import_neo4j_cypher Notebook

* changed to procedure for setting embedding property to save disk space

* Reformat and cleanup

* semver

* Poetry lock update

* Update AAIS docs

* Rename contrib folder

* Merge from main

* Revert "Merge from main"

This reverts commit a399dde97b689a5b5c62dc2e9c2290cb2503b3a4.

* Fix ruff check

* Add readme and fix tests

* Fix community reports

---------

Co-authored-by: Michael Hunger <github@jexp.de>
2024-08-23 15:18:35 -06:00
KennyZhang1
dd71135995
Change lancedb placement (#996)
* changed placement of lancedb dir to under /artifacts

* ruff checks and semversioner

* added support for static paths

* added support for streaming

* more ruff changes

* ruff format changes

* removed string concat for path formation

* added more ruff checks

* removed os.path.join usage

* more ruff fixes and removed unnecessary path creations

* replaced cast calls with str()

---------

Co-authored-by: Kenny Zhang <zhangken@microsoft.com>
2024-08-22 11:39:55 -06:00
Josh Bradley
4b9fdc0dfe
Add context data to query responses (#1003)
* add context data to query responses

* add semversioner file

* ignore typechecking ruff suggestion
2024-08-22 12:07:50 -04:00
Alonso Guevara
9c6f5e090a
Release v0.3.1 (#1001) v0.3.1 2024-08-21 17:03:55 -06:00
Nathan Evans
f5b4d2fea5
CI streamline (#988)
* Remove excess vars from gh-pages build

* Delete redundant javascript ci

* Pull apart testing CI

* Clean up integration tests build

* Move storage tests to integration CI

* Take py 3.10 out of smoke tests matrix

* Use minimum supported python version for most tests

* Re-run main CI on any test change

* Add Josh and Kenny to author list

* Update auto-resolve perms
2024-08-21 15:16:15 -06:00
Nathan Evans
98cabba38b
Notebook tests (#978)
* Fix notebook test runs

* Delete old issue template

* Add notebook CI action

* Print temp directories

* Print more env

* Move printing up

* Use runner_temp

* Try using current directory

* Try TMP env

* Re-write TMP

* Wrong yml

* Fix echo

* Only export if windows

* More logging

* Move export

* Reformat env write

* Fix braces

* Switch to in-memory execution

* Downgrade action perms

* Unused import
2024-08-20 17:19:37 -06:00
dependabot[bot]
8a9a2f7574
Bump uvloop from 0.19.0 to 0.20.0 (#969)
Bumps [uvloop](https://github.com/MagicStack/uvloop) from 0.19.0 to 0.20.0.
- [Release notes](https://github.com/MagicStack/uvloop/releases)
- [Commits](https://github.com/MagicStack/uvloop/compare/v0.19.0...v0.20.0)

---
updated-dependencies:
- dependency-name: uvloop
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-08-20 16:18:45 -06:00
Derek Worthen
6b4de3d841
Index API (#953)
* Initial Index API

- Implement main API entry point: build_index
- Rely on GraphRagConfig instead of PipelineConfig
    - This unifies the API signature with the
    prompt_tune and query API entry points
- Derive cache settings, config, and resuming from
    the config and other arguments to
    simplify/reduce arguments to build_index
- Add preflight config file validations
- Add semver change

* fix smoke tests

* fix smoke tests

* Use asyncio

* Add e2e artifacts in GH actions

* Remove unnecessary E2E test, and add skip_validations flag to cli

* Nicer imports

* Reorganize API functions.

* Add license headers and module docstrings

* Fix ignored ruff rule

---------

Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-08-20 15:42:20 -06:00
dependabot[bot]
5a781dd234
Bump nltk from 3.8.1 to 3.9.1 (#966)
* Bump nltk from 3.8.1 to 3.9.1

Bumps [nltk](https://github.com/nltk/nltk) from 3.8.1 to 3.9.1.
- [Changelog](https://github.com/nltk/nltk/blob/develop/ChangeLog)
- [Commits](https://github.com/nltk/nltk/compare/3.8.1...3.9.1)

---
updated-dependencies:
- dependency-name: nltk
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Download punkt_tab

* Semver

* Add missing installs

* Add missing installs

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-08-20 14:49:39 -06:00
Josh Bradley
62546a3c14
Add streaming support for local/global search (#944)
* Added streaming output support for global search. Introduce `--streaming` flag to enable or disable streaming mode

* ran ruff format --preview

* update

* cleanup code and streaming api

* update cli argument

* remove whitespace

* checkpoint - add context data to streaming api

* cleanup help menu

* ruff format update

* add context data to streaming response

* add semversioner file

* rename variable for better readability

* rename variable for better readability

* ruff fixes

* fix abstract class type annotation

* add documentation for --streaming CLI flag

---------

Co-authored-by: 6GOD <55304045+6ixGODD@users.noreply.github.com>
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-08-20 13:44:48 -06:00
longyunfeigu
a6238c654a
Move embeddings target position (#938)
move embeddings target position

Co-authored-by: wanhua.gu <wanhua.gu@wiz.ai>
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-08-20 13:02:52 -06:00
Alonso Guevara
e4daf358b9
Fix gh-pages publishing (#976)
* Remove indexer run from gh-pages, and use a local zip to avoid running

* Semver
2024-08-19 16:30:55 -06:00
Nayeon Kim
84f9bae129
Update 0-architecture.md (#961) 2024-08-19 12:21:40 -06:00
KennyZhang1
3c0a98c2d8
Add preflight config file validations (#952)
Co-authored-by: Kenny Zhang <zhangken@microsoft.com>
Co-authored-by: Josh Bradley <joshbradley@microsoft.com>
2024-08-16 17:53:32 -04:00
Nathan Evans
4040f02508
Update general_issue.yml (#956)
Copy checklist from bug/feature to general
2024-08-16 13:26:24 -07:00
Nathan Evans
bd5be7bb1a
Update issues-autoresolve.yml (#955)
Add write permissions for actions so it can update the cache
2024-08-16 13:17:23 -07:00
Alonso Guevara
0b7c5a6ae9
Add cast check on schema validation for community reports (#932)
* Add support for both float and int on schema validation for community report generation

* Cast instead of type check

* Add missing file

* Add prompt with ints to smoke tests

* Fix unit tests

* Fix unit tests
2024-08-14 16:40:47 -06:00
dependabot[bot]
36facbd000
Bump textual from 0.74.0 to 0.76.0 (#901)
Bumps [textual](https://github.com/Textualize/textual) from 0.74.0 to 0.76.0.
- [Release notes](https://github.com/Textualize/textual/releases)
- [Changelog](https://github.com/Textualize/textual/blob/main/CHANGELOG.md)
- [Commits](https://github.com/Textualize/textual/compare/v0.74.0...v0.76.0)

---
updated-dependencies:
- dependency-name: textual
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-14 13:06:55 -06:00
dependabot[bot]
1ec1d2f920
Bump azure-storage-blob from 12.21.0 to 12.22.0 (#900)
Bumps [azure-storage-blob](https://github.com/Azure/azure-sdk-for-python) from 12.21.0 to 12.22.0.
- [Release notes](https://github.com/Azure/azure-sdk-for-python/releases)
- [Changelog](https://github.com/Azure/azure-sdk-for-python/blob/main/doc/esrp_release.md)
- [Commits](https://github.com/Azure/azure-sdk-for-python/compare/azure-storage-blob_12.21.0...azure-storage-blob_12.22.0)

---
updated-dependencies:
- dependency-name: azure-storage-blob
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-13 22:48:07 -06:00
dependabot[bot]
ba63eda7a4
Bump pyyaml from 6.0.1 to 6.0.2 (#898)
Bumps [pyyaml](https://github.com/yaml/pyyaml) from 6.0.1 to 6.0.2.
- [Release notes](https://github.com/yaml/pyyaml/releases)
- [Changelog](https://github.com/yaml/pyyaml/blob/main/CHANGES)
- [Commits](https://github.com/yaml/pyyaml/compare/6.0.1...6.0.2)

---
updated-dependencies:
- dependency-name: pyyaml
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-08-13 18:48:51 -06:00
Nathan Evans
ac504e31a0
Add stricter filtering and tests for cli data directory discovery (#910)
* Add stricter filtering and tests for cli data directory discovery

* Semver

* Ignore ruff on error type

* Format

* Fix for windows paths

* Fix for windows paths

* Uncomment blob tests

* Sort by timestamp name instead of modified date

* Format

* Add additional folder name test
2024-08-13 17:34:14 -06:00
Alonso Guevara
d68e323193
Disable fail fast on tests (#911) 2024-08-13 12:20:14 -06:00
Alonso Guevara
f9c1bdd748
Release v0.3.0 (#912) v0.3.0 2024-08-12 18:14:52 -06:00
Alonso Guevara
4b9f268604
Fix/query embedding (#909)
* fix strategy config in entity_extraction

* should not post token list to the embedding model

* fix embedding in local query

* add semversioner

* remove strategy

---------

Co-authored-by: KylinMountain <kose2livs@gmail.com>
2024-08-12 17:12:51 -06:00
benx13
3f31af80d2
typo summarize prompt (#907)
* typo in entity_summarization prompt

* typo in summarize prompt

---------

Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-08-12 16:03:08 -06:00
Andres Morales
5a7dbaa051
Fix sort_context max_tokens & max_tokens param in verb (#888)
* Fix sort_context max_tokens & max_tokens param in verb

* Fix sort_context for windows test

* add semversioner file

---------

Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-08-12 15:55:31 -06:00
Josh Bradley
238f1c2adc
Implement prompt tuning API (#855)
* initial setup commit

* cleanup API and CLI interfaces

* move datatype definition to types.py

* code cleanup

* add semversioner file

* remove unused import

---------

Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-08-12 15:09:00 -06:00
Josh Bradley
4bcbfd10eb
Implement query api (#839)
* initial API redesign

* typo fix

* update docstring

* update docstring

* remove artifacts caused by the merge from main

* minor typo updates

* add semversioner check

* switch API to async function calls

---------

Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-08-12 13:40:10 -06:00
Alonso Guevara
7fd23fa79c
Stabilize smoke tests for query community context building (#908)
* Stabilize smoke tests for query community context building

* Fix CODEOWNERS
2024-08-12 13:17:40 -06:00
Alonso Guevara
073f650ba9
Fix/json dumps ascii (#873)
* Ensure ensure_ascii=False in json.dumps, to support non-ASCII chars

* Format

* Semver
2024-08-09 17:05:48 -06:00
Alonso Guevara
7376f149d2
Release v0.2.2 (#872) v0.2.2 2024-08-08 16:48:47 -06:00
dependabot[bot]
85a5a61340
Bump tenacity from 8.5.0 to 9.0.0 (#823)
Bumps [tenacity](https://github.com/jd/tenacity) from 8.5.0 to 9.0.0.
- [Release notes](https://github.com/jd/tenacity/releases)
- [Commits](https://github.com/jd/tenacity/compare/8.5.0...9.0.0)

---
updated-dependencies:
- dependency-name: tenacity
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-08 16:39:15 -06:00
dependabot[bot]
c88dbb3575
Bump json-repair from 0.25.3 to 0.26.0 (#824)
Bumps [json-repair](https://github.com/mangiucugna/json_repair) from 0.25.3 to 0.26.0.
- [Release notes](https://github.com/mangiucugna/json_repair/releases)
- [Commits](https://github.com/mangiucugna/json_repair/compare/0.25.3...0.26.0)

---
updated-dependencies:
- dependency-name: json-repair
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-08-08 15:27:13 -06:00
Alonso Guevara
c451aa0093
Update smoke tests (#861)
* Run smoke tests on 4o

* Shorten dulce for smoke tests

* Update secrets for consistency
2024-08-08 13:07:44 -06:00
Dayenne Souza
1e10bd342e
Re-enable smoke tests (#848)
* add smoke tests again

* add smoke tests separated action

* add patch version

* disable blob test

* blob conn again

* add file as cache type

* remove cache type entirely

* increase timeout

* remove comment

---------

Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-08-07 12:23:46 -06:00
Nathan Evans
c749fe2a15
Docs updates aug06 (#852)
* Remove outdated references to entity resolution

* Clarify covariate extraction

* Minor edits from other PR feedback

* Remove duplicate line

* Semver

---------

Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-08-06 16:31:47 -07:00
Ha Trinh
8a1221e0e4
Fix community context builder for local search (#850)
* add a check for empty context

* remove log and format code

* add changelog

---------

Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-08-06 16:08:45 -07:00
Alonso Guevara
53268406fe
Release v0.2.1 (#835) v0.2.1 2024-08-05 18:45:28 -06:00
Alonso Guevara
bd326d2614
Only repair broken responses (#834)
* Only repair broken responses

* Format
2024-08-05 18:25:08 -06:00
Ha Trinh
482246528d
fix json parsing logic and warning message (#833)
* fix json parsing logic and warning message

* amended warning message

---------

Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-08-05 16:31:36 -06:00
Alonso Guevara
7b656af50c
Fix embeddings loading on local search cli (#831)
* Fix embeddings loading on local search cli

* Update lockfile

* Update rules in ruff check
2024-08-05 16:00:31 -06:00