mirror of https://github.com/microsoft/graphrag.git synced 2025-12-12 15:31:24 +00:00

* Add source documents for verb tests

* Remove entity_type erroneous column

* Add new test data

* Remove source/target degree columns

* Remove top_level_node_id

* Remove chunk column configs

* Rename "chunk" to "text"

* Rename "chunk" to "text" in base

* Re-map document input to use base text units

* Revert base text units as final documents dep

* Update test data

* Split/rename node source_id

* Drop node size (dup of degree)

* Drop document_ids from covariates

* Remove unused document_ids from models

* Remove n_tokens from covariate table

* Fix missed document_ids delete

* Wire base text units to final documents

* Rename relationship rank as combined_degree

* Add rank as first-class property to Relationship

* Remove split_text operation

* Fix relationships test parquet

* Update test parquets

* Add entity ids to community table

* Remove stored graph embedding columns

* Format

* Semver

* Fix JSON typo

* Spelling

* Rename lancedb

* Sort lancedb

* Fix unit test

* Fix test to account for changing period

* Update tests for separate embeddings

* Format

* Better assertion printing

* Fix unit test for windows

* Rename document.raw_content -> document.text

* Remove read_documents function

* Remove unused document summary from model

* Remove unused imports

* Format

* Add new snapshots to default init

* Use util to construct embeddings collection name

* Align inc index model with branch changes

* Update data and tests for int ids

* Clean up embedding locs

* Switch entity "name" to "title" for consistency

* Fix short_id -> human_readable_id defaults

* Format

* Rework community IDs

* Fix community size compute

* Fix unit tests

* Fix report read

* Pare down nodes table output

* Fix unit test

* Fix merge

* Fix community loading

* Format

* Fix community id report extraction

* Update tests

* Consistent short IDs and ordering

* Update ordering and tests

* Update incremental for new nodes model

* Guard document columns loc

* Match column ordering

* Fix document guard

* Update smoke tests

* Fill NA on community extract

* Logging for smoke test debug

* Add parquet schema details doc

* Fix community hierarchy guard

* Use better empty hierarchy guard

* Back-compat shims

* Semver

* Fix warning

* Format

* Remove default fallback

* Reuse key

2024-11-13 15:11:19 -08:00
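
Several entries in the commit message above rename columns ("chunk" to "text", entity "name" to "title", document.raw_content to document.text), and one mentions back-compat shims. As a rough illustration only, and not GraphRAG's actual implementation, a shim of that kind could map legacy column names onto the new ones when reading older rows. The table keys (`text_units`, `entities`, `documents`), the `RENAMES` map, and the `upgrade_row` helper are all hypothetical names chosen for this sketch.

```python
# Hypothetical old-name -> new-name maps, one per table. The renames are
# taken from the commit message; the table keys and the assignment of each
# rename to a table are assumptions made for illustration.
RENAMES = {
    "text_units": {"chunk": "text"},
    "entities": {"name": "title"},
    "documents": {"raw_content": "text"},
}


def upgrade_row(table: str, row: dict) -> dict:
    """Return a copy of `row` with legacy column names mapped to new ones.

    If a row carries both the legacy and the new column, the new column's
    value is kept and the legacy one is dropped.
    """
    renames = RENAMES.get(table, {})
    upgraded = {}
    for key, value in row.items():
        new_key = renames.get(key, key)
        if new_key != key and new_key in row:
            continue  # new-style column already present; skip the legacy one
        upgraded[new_key] = value
    return upgraded


legacy_entity = {"id": 1, "name": "ACME"}
print(upgrade_row("entities", legacy_entity))
```

Keeping the mapping in one table-keyed dictionary means unknown tables and already-upgraded rows pass through unchanged, so the same function can be applied to old and new data alike.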

Name | Last commit | Date
.github | Update CI/CD - skip running unit tests on documentation-only PRs (#1371) | 2024-11-06 14:19:21 -05:00
.semversioner | Artifact cleanup (#1341) | 2024-11-13 15:11:19 -08:00
.vscode | Feat/update cli (#1376) | 2024-11-07 06:59:10 -06:00
docs | Artifact cleanup (#1341) | 2024-11-13 15:11:19 -08:00
examples | Correct links to datashaper verbs in comments (#1068) | 2024-09-12 12:44:38 -06:00
examples_notebooks/community_contrib | Artifact cleanup (#1341) | 2024-11-13 15:11:19 -08:00
graphrag | Artifact cleanup (#1341) | 2024-11-13 15:11:19 -08:00
scripts | Index API (#953) | 2024-08-20 15:42:20 -06:00
tests | Artifact cleanup (#1341) | 2024-11-13 15:11:19 -08:00
.gitattributes | move mkdocs-typer to devdeps (#1331) | 2024-10-30 14:49:30 -07:00
.gitignore | Implement dynamic community selection for global search (#1396) | 2024-11-11 16:45:07 -08:00
.vsts-ci.yml | Initial Release | 2024-07-01 15:25:30 -06:00
CHANGELOG.md | Release v0.4.1 (#1387) | 2024-11-08 17:59:57 -06:00
CODE_OF_CONDUCT.md | Initial Release | 2024-07-01 15:25:30 -06:00
CODEOWNERS | Stabilize smoke tests for query community context building (#908) | 2024-08-12 13:17:40 -06:00
CONTRIBUTING.md | Initial Release | 2024-07-01 15:25:30 -06:00
cspell.config.yaml | Replace current docs by mkdocs (#1263) | 2024-10-11 13:39:03 -06:00
DEVELOPING.md | fix typo. Update documentation URLs for consistency (#1298) | 2024-10-21 17:24:17 -06:00
dictionary.txt | Add visualization guide (#1340) | 2024-11-06 14:06:50 -05:00
LICENSE | Initial Release | 2024-07-01 15:25:30 -06:00
mkdocs.yaml | Artifact cleanup (#1341) | 2024-11-13 15:11:19 -08:00
poetry.lock | Updated the variable names within the for-loop to differentiate betwe… (#1356) | 2024-11-05 11:45:29 -06:00
pyproject.toml | Release v0.4.1 (#1387) | 2024-11-08 17:59:57 -06:00
RAI_TRANSPARENCY.md | Initial Release | 2024-07-01 15:25:30 -06:00
README.md | fix typo. Update documentation URLs for consistency (#1298) | 2024-10-21 17:24:17 -06:00
SECURITY.md | Initial Release | 2024-07-01 15:25:30 -06:00
SUPPORT.md | Initial Release | 2024-07-01 15:25:30 -06:00
v1-breaking-changes.md | Drift Search CLI, API, Docs and Example Notebook (#1348) | 2024-11-05 12:05:19 -06:00

README.md

GraphRAG

👉 Use the GraphRAG Accelerator solution
👉 Microsoft Research Blog Post
👉 Read the docs
👉 GraphRAG Arxiv

Overview

The GraphRAG project is a data pipeline and transformation suite designed to extract meaningful, structured data from unstructured text using LLMs.

To learn more about GraphRAG and how it can be used to enhance your LLM's ability to reason about your private data, please visit the Microsoft Research Blog Post.

Quickstart

To get started with the GraphRAG system, we recommend trying the Solution Accelerator package. This provides a user-friendly end-to-end experience with Azure resources.

Repository Guidance

This repository presents a methodology for using knowledge graph memory structures to enhance LLM outputs. Please note that the provided code serves as a demonstration and is not an officially supported Microsoft offering.

⚠️ Warning: GraphRAG indexing can be an expensive operation. Please read all of the documentation to understand the process and costs involved, and start small.

Diving Deeper

To learn about our contribution guidelines, see CONTRIBUTING.md
To start developing GraphRAG, see DEVELOPING.md
Join the conversation and provide feedback in the GitHub Discussions tab!

Prompt Tuning

Using GraphRAG with your data out of the box may not yield the best possible results. We strongly recommend fine-tuning your prompts by following the Prompt Tuning Guide in our documentation.

Responsible AI FAQ

See RAI_TRANSPARENCY.md

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.

Privacy

Microsoft Privacy Statement
