229 Commits

Author SHA1 Message Date
Alonso Guevara
e807e580e4 Remove threshold 2024-10-03 18:03:18 -06:00
Alonso Guevara
9510efe5a5 Format 2024-10-03 17:57:04 -06:00
Alonso Guevara
e870a7616c Fix relationship lookup 2024-10-03 17:53:45 -06:00
Alonso Guevara
4dd7605d65 Fix relationship lookup 2024-10-03 17:48:10 -06:00
Alonso Guevara
2159d9c085 Format 2024-10-03 17:08:53 -06:00
Alonso Guevara
75628d3433 Make threshold percentual 2024-10-03 17:06:34 -06:00
Alonso Guevara
8a10f4a598 Fix format from main 2024-10-03 15:42:49 -06:00
Alonso Guevara
d70457087e Pyright fixes 2024-10-03 15:40:02 -06:00
Alonso Guevara
082db97614 Pyright and fixes 2024-10-03 15:36:34 -06:00
Alonso Guevara
a153cadaa4 Spellcheck 2024-10-03 13:01:54 -06:00
Alonso Guevara
45cc6469db Add more documentation 2024-10-03 12:59:19 -06:00
Alonso Guevara
4f93aa675c Update final nodes output 2024-10-03 12:54:22 -06:00
Alonso Guevara
43ec92e173 merge from main 2024-10-02 13:08:02 -06:00
Alonso Guevara
6d23d6a03b
Incremental indexing/update final text units (#1241)
* Update final text units

* Format

* Address comments
2024-10-02 12:59:53 -06:00
Nathan Evans
f5c5876dde
Reorganize flows (#1240)
* Extract base docs and entity graph

* Move extracted entities and text units

* Move communities and community reports

* Move covariates and final documents

* Move entities, nodes, relationships

* Move text_units and summarized entities

* Assert all snapshot null cases

* Remove disabled steps util

* Remove incorrect use of input "others"

* Convert text_embed_df to just return the embeddings, not update the df

* Convert snapshot functions to noops

* Semver

* Remove lingering covariates_enabled param

* Name consistency

* Syntax cleanup
2024-10-02 08:57:08 -07:00
Nathan Evans
d501813181 Collapse create base extracted entities (#1235)
* Set up base assertions

* Replace entity_extract

* Finish collapsing workflow

* Semver

* Update smoke tests
2024-10-01 15:10:02 -06:00
Nathan Evans
3103ae3435 Collapse create summarized entities (#1237)
* Collapse entity summarize

* Semver
2024-10-01 15:10:02 -06:00
Nathan Evans
f259d0c81c Collapse create base entity graph (#1233)
* Collapse create_base_entity_graph

* Format/typing

* Semver

* Fix smoke tests

* Simplify assignment
2024-10-01 15:10:02 -06:00
Nathan Evans
a44788bfad Collapse create final community reports (#1227)
* Remove extraneous param

* Add community report mocking assertions

* Collapse primary report generation

* Collapse embeddings

* Format

* Semver

* Remove extraneous check

* Move option set
2024-10-01 15:10:02 -06:00
Nathan Evans
9070ea5c3c
Collapse create base extracted entities (#1235)
* Set up base assertions

* Replace entity_extract

* Finish collapsing workflow

* Semver

* Update smoke tests
2024-09-30 17:32:56 -07:00
Alonso Guevara
336e6f9ca1
Update relationships after inc index (#1236) 2024-09-30 18:18:58 -06:00
Nathan Evans
630679f8e3
Collapse create summarized entities (#1237)
* Collapse entity summarize

* Semver
2024-09-30 17:17:44 -07:00
Nathan Evans
5220bb7ecc
Collapse create base entity graph (#1233)
* Collapse create_base_entity_graph

* Format/typing

* Semver

* Fix smoke tests

* Simplify assignment
2024-09-30 15:39:42 -07:00
Nathan Evans
00d5e77568
Collapse create final community reports (#1227)
* Remove extraneous param

* Add community report mocking assertions

* Collapse primary report generation

* Collapse embeddings

* Format

* Semver

* Remove extraneous check

* Move option set
2024-09-30 10:46:07 -07:00
Alonso Guevara
4d713f6b23 Merge branch 'main' into incremental_indexing/main 2024-09-27 17:17:12 -06:00
Alonso Guevara
0d348d6070
Remove unused cols from final entities (#1226)
* Remove unused cols from final entities

* Move verbs test to integ

* Move verbs test to integ

* Move to smoke tests
2024-09-27 17:10:52 -06:00
Alonso Guevara
737a471d18
Pandas-ify Create Final Entities (#1225) 2024-09-26 15:09:40 -06:00
Nathan Evans
ce71bcf7fb
Collapse create final entities (#1220)
* Collapse create_final_entities

* Update smoke tests

* Semver

* Remove prints

* Update embedding assertions
2024-09-25 17:35:44 -07:00
Nathan Evans
3217013019
Revisit create final text units (#1216)
* Add embeddings to collapsed subflow

* Semver

* Fix smoke tests
2024-09-25 16:55:27 -07:00
Alonso Guevara
bf45f42969 Merge branch 'main' into incremental_indexing/main 2024-09-25 17:33:33 -06:00
Nathan Evans
73e709b686
Collapse create final covariates (#1215)
* Add covariate test

* Add detailed mock assertions

* Collapse create_final_covariates

* Delete unused doc_id field

* Semver

* Update smoke test

* Remove unused subject/object type columns
2024-09-25 16:30:22 -07:00
Alonso Guevara
0952014fa9
Fix issue 1173 - Nested json parsing (#1218) 2024-09-25 17:11:49 -06:00
Nathan Evans
14750f4d37
Collapse create final documents (#1217)
* Collapse create_final_documents

* Semver
2024-09-25 15:50:46 -07:00
Alonso Guevara
dda4edd0fd
Pandas-ify Create Base Documents (#1209) 2024-09-24 18:37:45 -06:00
Nathan Evans
f518c8b80b
Collapse relationship embeddings (#1199)
* Merge text_embed into a single relationships subflow

* Update smoke tests

* Semver

* Spelling
2024-09-24 15:03:26 -07:00
Nathan Evans
1755afbdec
Collapse create base text units (#1178)
* Collapse non-attribute verbs

* Include document_column_attributes in collapse

* Remove merge_override verb

* Semver

* Setup initial test and config

* Collapse create_base_text_units

* Semver

* Spelling

* Fix smoke tests

* Address PR comments

---------

Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-09-23 16:55:53 -07:00
Alonso Guevara
00048b3dd2 Merge from main 2024-09-23 17:00:30 -06:00
Alonso Guevara
be7d3eb189
Remove aggregate_df from final communities and final text units (#1179)
* Remove aggregate_df from final communities and final text units

* Semver

* Ruff and format

* Format

* Format

* Fix tests, ruff and checks

* Remove some leftover prints

* Removed _final_join method
2024-09-23 16:54:15 -06:00
Nathan Evans
fbc483e4e5
Collapse create base documents (#1176)
* Collapse non-attribute verbs

* Include document_column_attributes in collapse

* Remove merge_override verb

* Semver

* Clean up some df/tests
2024-09-23 13:24:06 -07:00
JunHo Kim (김준호)
ea468204bc
Fix typo in documentation for customizability (#1160)
Corrected a misspelling of 'customizability' in the env_vars.md documentation. This change ensures clarity and accuracy in the description of input data handling configurations.

Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-09-20 14:52:44 -06:00
Nathan Evans
f8ab1b30dc
Collapse create_final_nodes (#1171)
* Collapse create_final_nodes

* Update smoke tests

* Typo

---------

Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-09-20 13:48:56 -07:00
Alonso Guevara
fb65989c05
Incremental indexing/update old outputs (#1155)
* Create entrypoint for cli and api (#1067)

* Add cli and api entrypoints for update index

* Semver

* Update docs

* Run tests on feature branch main

* Better /main handling in tests

* Incremental indexing/file delta (#1123)

* Calculate new inputs and deleted inputs on update

* Semver

* Clear ruff checks

* Fix pyright

* Fix PyRight

* Ruff again

* Update Final Entities merging in new and existing entities from delta

* Update formatting

* Pyright

* Ruff

* Fix for pyright

* Yet Another Pyright test

* Pyright

* Format
2024-09-20 14:21:50 -06:00
Chris Trevino
1dbcc42b81
Remove redundant code from error-handling code in GlobalSearch (#1170)
* remove a redundant retry

* semver

* formatting

---------

Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-09-20 11:29:56 -06:00
Alonso Guevara
16b4ea5dc9
Release v0.3.6 (#1172) v0.3.6 2024-09-19 18:29:52 -06:00
dependabot[bot]
b61c4ec737
Bump JamesIves/github-pages-deploy-action from 4.6.3 to 4.6.4 (#1104)
Bumps [JamesIves/github-pages-deploy-action](https://github.com/jamesives/github-pages-deploy-action) from 4.6.3 to 4.6.4.
- [Release notes](https://github.com/jamesives/github-pages-deploy-action/releases)
- [Commits](https://github.com/jamesives/github-pages-deploy-action/compare/v4.6.3...v4.6.4)

---
updated-dependencies:
- dependency-name: JamesIves/github-pages-deploy-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-09-19 18:07:44 -06:00
Nathan Evans
ae094bb144
Collapse create final relationships (#1158)
* Collapse pre/post embedding workflows

* Semver

* Fix smoke tests

---------

Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-09-19 17:38:01 -06:00
dependabot[bot]
bd2c1da9a8
Bump path-to-regexp from 6.2.1 to 6.3.0 in /docsite (#1130)
Bumps [path-to-regexp](https://github.com/pillarjs/path-to-regexp) from 6.2.1 to 6.3.0.
- [Release notes](https://github.com/pillarjs/path-to-regexp/releases)
- [Changelog](https://github.com/pillarjs/path-to-regexp/blob/master/History.md)
- [Commits](https://github.com/pillarjs/path-to-regexp/compare/v6.2.1...v6.3.0)

---
updated-dependencies:
- dependency-name: path-to-regexp
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-19 15:55:31 -06:00
Alonso Guevara
84fb14ce4d
Chore/dependency cleanup (#1169)
* fix dependencies with deptry

* change order in pyproject.toml

* fix

* Dependency updates and cleanup

* Future required

---------

Co-authored-by: Florian Maas <fpgmaas@gmail.com>
2024-09-19 15:08:13 -06:00
Alonso Guevara
96a2460375
Release v0.3.5 (#1166) v0.3.5 2024-09-19 11:34:49 -06:00
longyunfeigu
95409ff4bf
Remove lancedb_dir redundant assignments (#1163)
Co-authored-by: wanhua.gu <wanhua.gu@wiz.ai>
Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-09-19 09:25:10 -06:00