5 Commits

Author SHA1 Message Date
Nathan Evans
f5c5876dde
Reorganize flows (#1240)
* Extract base docs and entity graph

* Move extracted entities and text units

* Move communities and community reports

* Move covariates and final documents

* Move entities, nodes, relationships

* Move text_units and summarized entities

* Assert all snapshot null cases

* Remove disabled steps util

* Remove incorrect use of input "others"

* Convert text_embed_df to just return the embeddings, not update the df

* Convert snapshot functions to noops

* Semver

* Remove lingering covariates_enabled param

* Name consistency

* Syntax cleanup
2024-10-02 08:57:08 -07:00
Nathan Evans
ce71bcf7fb
Collapse create final entities (#1220)
* Collapse create_final_entities

* Update smoke tests

* Semver

* Remove prints

* Update embedding assertions
2024-09-25 17:35:44 -07:00
Nathan Evans
3217013019
Revisit create final text units (#1216)
* Add embeddings to collapsed subflow

* Semver

* Fix smoke tests
2024-09-25 16:55:27 -07:00
Nathan Evans
aa5b426f1d
Collapse final communities workflow (#1150)
* Collapse create_final_communities

* Semver

* Spellcheck

* Clean up filtering

* Add space in title

* Format

* Cleanup imports and format

* Spruce up the tests

* Update dictionary.txt

* Spellcheck

---------

Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
2024-09-17 17:04:42 -07:00
Nathan Evans
a473265580
Collapse verbs: create_final_text_units (#1143)
* Load default config in verb tests

* Load proper workflow config

* Collapse text unit pre-embedding steps

* Format

* Update smoke tests

* Semver

* Format

* Merge join* subflows into create_final_text_units

* Remove join_text_units_to_covariate_ids

* Format

* Remove join_text_units_to_entity_ids

* Remove join_text_units_to_relationship_ids

* Clean up merges and aggregations

* Remove unnecessary cast
2024-09-17 10:32:25 -07:00