graphrag

mirror of https://github.com/microsoft/graphrag.git synced 2025-11-03 19:30:10 +00:00

Author	SHA1	Message	Date
Kenny Zhang	14b1eccbff	partially resolved merge conflicts	2024-12-19 17:13:14 -05:00
Kenny Zhang	9ca67643b4	partially fixed merge conflicts	2024-12-18 15:10:45 -05:00
Kenny Zhang	82548e11f8	refactored collection_name variable naming	2024-12-04 12:58:49 -05:00
Kenny Zhang	bf5c72dec0	tested cosmosdb vector store querying	2024-12-03 15:22:29 -05:00
Kenny Zhang	c3e2394304	tested cosmosdb vector_store indexing	2024-12-03 12:03:19 -05:00
Kenny Zhang	dccd4aee68	modified query string to return all cols	2024-11-26 17:20:02 -05:00
Kenny Zhang	68dfb20961	modified factory class	2024-11-26 16:09:14 -05:00
Kenny Zhang	6d9ec16efb	added filter_by_id function	2024-11-26 15:09:19 -05:00
Kenny Zhang	01db1424a1	implemented similarity search methods	2024-11-26 10:50:14 -05:00
Kenny Zhang	fe2e718f8a	implemented load_document and search_by_id methods	2024-11-25 14:14:52 -05:00
Kenny Zhang	b50a7a8e70	implemented container creation and deletion functions	2024-11-22 14:56:37 -05:00
Kenny Zhang	72306b2529	implemented database creation and deletion functions	2024-11-22 13:56:28 -05:00
Kenny Zhang	863363d086	added cosmosdb vector store class outline	2024-11-22 13:33:49 -05:00
Kenny Zhang	9d899fc400	removed some whitespace	2024-11-21 15:19:41 -05:00
Kenny Zhang	232cd07762	simplified create_database and create_container functions	2024-11-20 14:26:18 -05:00
Kenny Zhang	76511d0180	collapsed cosmosdb schema to use minimal containers and databases	2024-11-19 15:30:59 -05:00
Kenny Zhang	c5281bb79a	tested query for cosmosdb	2024-11-19 14:45:59 -05:00
Kenny Zhang	31c0a7a316	added cosmosdb functionality to query pipeline	2024-11-18 14:53:06 -05:00
Kenny Zhang	6eb61342c0	fixed more merge conflicts	2024-11-18 11:52:22 -05:00
Kenny Zhang	594f332606	Merge branch 'main' of github.com:microsoft/graphrag into add-cosmosdb-to-storage	2024-11-18 11:49:41 -05:00
Kenny Zhang	dac0b861bd	merged with main and resolved conflicts	2024-11-18 11:47:51 -05:00
Alonso Guevara	6d21ef2683	Release v0.5.0 (#1415) v0.5.0	2024-11-18 00:06:54 -06:00
Josh Bradley	22a57d14c7	Improve CLI speed with lazy imports (#1319)	2024-11-15 19:41:10 -05:00
Nathan Evans	9b4f24ebce	First cut at config cleanup (#1411) * Firsst cut at config cleanup * Reorder top nav * Add query prompts to tuning page * Remove dynamic notebook from nav * Add more thorough yml config descriptions in docs * Further clean out the config * Semver * Add new blog post * Emphasize yaml * Clarify output * Fix unit test * Fix bullet nesting	2024-11-15 14:33:26 -08:00
Kenny Zhang	0d93d0d305	added basic support for parquet emitter using internal conversions	2024-11-15 15:54:03 -05:00
Kenny Zhang	5e5f76d281	readded initial non-parquet emitter fix	2024-11-15 14:31:11 -05:00
Kenny Zhang	66641d66d7	removed nested try statement	2024-11-15 13:46:52 -05:00
Nathan Evans	425dbc60e3	Docs update (#1408) * Fix footer contrast * Fix broken links * Remove a few unneeded examples * Point python API example to the whole folder * Convert schema bullets to tables	2024-11-14 21:26:29 -06:00
JunHo Kim (김준호)	ec9cdcce4d	fix typo. Correct the wording "global search" to "drift search" in drift search documentation (#1383) Updated the wording of the example scenario from "global search" to "drift search" to accurately reflect the topic. This improves clarity and ensures the documentation accurately describes its content. Co-authored-by: Alonso Guevara <alonsog@microsoft.com>	2024-11-14 16:55:44 -06:00
Jeff Baumes	0a5801041a	Fix documentation for generate_indexing_prompts (#1336) Co-authored-by: Alonso Guevara <alonsog@microsoft.com>	2024-11-14 16:53:59 -06:00
Kenny Zhang	6d1a4d9914	Merge branch 'main' of github.com:microsoft/graphrag into add-cosmosdb-to-storage	2024-11-14 17:01:28 -05:00
Kenny Zhang	65c93bb098	reverted merged changed from closed branch	2024-11-14 17:01:10 -05:00
Kenny Zhang	716bfa4083	require base_dir to be typed as str	2024-11-14 15:42:43 -05:00
Alonso Guevara	c90166ca32	Add Parquet as part of the default emitters when not present (#1407) Add Parquet as part of the default emitters when not pressent	2024-11-14 13:04:19 -06:00
Nathan Evans	51912b2e03	Move prompts (#1404) * Move indexing prompts to root * Move query prompts to root * Export query prompts during init * Extract general knowledge prompt * Load query prompts from disk * Semver * Fix unit tests	2024-11-14 10:45:37 -08:00
Kenny Zhang	d1fc4f05df	removed extraneous container_name setting	2024-11-14 12:49:33 -05:00
Kenny Zhang	d6c3afcaad	first successful run of cosmosdb indexing	2024-11-14 12:24:42 -05:00
Kenny Zhang	0982efe6e0	Merge remote-tracking branch 'origin/fix/non-default-emitters' into add-cosmosdb-to-storage	2024-11-14 01:28:49 -05:00
Alonso Guevara	297066c168	ruff	2024-11-13 18:41:26 -06:00
Alonso Guevara	ea7a404098	Ruff	2024-11-13 18:23:56 -06:00
Alonso Guevara	d206e673a6	Format	2024-11-13 18:19:06 -06:00
Alonso Guevara	6d2427e118	Fix non-default emitters	2024-11-13 18:15:40 -06:00
Nathan Evans	c8c354e357	Artifact cleanup (#1341) * Add source documents for verb tests * Remove entity_type erroneous column * Add new test data * Remove source/target degree columns * Remove top_level_node_id * Remove chunk column configs * Rename "chunk" to "text" * Rename "chunk" to "text" in base * Re-map document input to use base text units * Revert base text units as final documents dep * Update test data * Split/rename node source_id * Drop node size (dup of degree) * Drop document_ids from covariates * Remove unused document_ids from models * Remove n_tokens from covariate table * Fix missed document_ids delete * Wire base text units to final documents * Rename relationship rank as combined_degree * Add rank as first-class property to Relationship * Remove split_text operation * Fix relationships test parquet * Update test parquets * Add entity ids to community table * Remove stored graph embedding columns * Format * Semver * Fix JSON typo * Spelling * Rename lancedb * Sort lancedb * Fix unit test * Fix test to account for changing period * Update tests for separate embeddings * Format * Better assertion printing * Fix unit test for windows * Rename document.raw_content -> document.text * Remove read_documents function * Remove unused document summary from model * Remove unused imports * Format * Add new snapshots to default init * Use util to construct embeddings collection name * Align inc index model with branch changes * Update data and tests for int ids * Clean up embedding locs * Switch entity "name" to "title" for consistency * Fix short_id -> human_readable_id defaults * Format * Rework community IDs * Fix community size compute * Fix unit tests * Fix report read * Pare down nodes table output * Fix unit test * Fix merge * Fix community loading * Format * Fix community id report extraction * Update tests * Consistent short IDs and ordering * Update ordering and tests * Update incremental for new nodes model * Guard document columns loc * Match column ordering * Fix document guard * Update smoke tests * Fill NA on community extract * Logging for smoke test debug * Add parquet schema details doc * Fix community hierarchy guard * Use better empty hierarchy guard * Back-compat shims * Semver * Fix warning * Format * Remove default fallback * Reuse key	2024-11-13 15:11:19 -08:00
Kenny Zhang	e0a0546958	modified cosmosdb setter to require json	2024-11-12 16:28:11 -05:00
Kenny Zhang	5436166450	Merge branch 'main' of github.com:microsoft/graphrag into add-cosmosdb-to-storage	2024-11-12 13:28:09 -05:00
Alonso Guevara	e53422366d	Implement dynamic community selection for global search (#1396) * update gitignore * add dynamic community sleection to updated main branch * update SearchResult to record output_tokens. * update search result * dynamic search working * format * add llm_calls_categories and prompt_tokens and output_tokens cate * update * formatting * log drift search output and prompt tokens separately * update global_search.ipynb. update operate dulce dataset and add create_final_communities. update dynamic community selection init * add .ipynb back to cspell.config.yaml * format * add notebook example on dynamic search * rearrange * update gitignore * format code * code format * code format * fix default variable --------- Co-authored-by: Bryan Li <bryanlimy@gmail.com>	2024-11-11 16:45:07 -08:00
Kenny Zhang	73d1e42a6c	Merge branch 'main' of github.com:microsoft/graphrag into add-cosmosdb-to-storage	2024-11-11 10:40:11 -05:00
Alonso Guevara	ba50caab4d	Release v0.4.1 (#1387) * Release v0.4.1 * Spellcheck v0.4.1	2024-11-08 17:59:57 -06:00
Kenny Zhang	a76eb54b2f	replaced primary key cosmosdb initialization with connection strings	2024-11-08 13:05:12 -05:00
Kenny Zhang	b263569167	Merge branch 'main' of github.com:microsoft/graphrag into add-cosmosdb-to-storage	2024-11-07 15:08:05 -05:00

303 Commits