Docs: Add models page (#1842)

* Add models page

* Update config docs for new params

* Spelling

* Add comment on CoT with o-series

* Add notes about managed identity

* Update the viz guide

* Spruce up the getting started wording

* Capitalization

* Add BYOG page

* More BYOG edits

* Update dictionary

* Change example model name
Nathan Evans 2025-04-28 17:36:08 -07:00 committed by GitHub
parent c8621477ed
commit 25bbae8642
10 changed files with 230 additions and 68 deletions


@ -79,6 +79,8 @@ mkdocs
fnllm
typer
spacy
kwargs
ollama
# Library Methods
iterrows
@ -190,6 +192,7 @@ Arxiv
kwds
jsons
txts
byog
# Dulce
astrotechnician


@ -29,4 +29,4 @@ The `init` command will create the following files in the specified directory:
## Next Steps
After initializing your workspace, you can either run the [Prompt Tuning](../prompt_tuning/auto_prompt_tuning.md) command to adapt the prompts to your data or even start running the [Indexing Pipeline](../index/overview.md) to index your data. For more information on the configuration options available, see the [YAML details page](yaml.md).

docs/config/models.md (new file, 101 lines)

@ -0,0 +1,101 @@
# Language Model Selection and Overriding
This page contains information on selecting a model to use and options to supply your own model for GraphRAG. Note that this is not a guide to finding the right model for your use case.
## Default Model Support
GraphRAG was built and tested using OpenAI models, so this is the default model set we support. This is not intended to be a limiter or statement of quality or fitness for your use case, only that it's the set we are most familiar with for prompting, tuning, and debugging.
GraphRAG also utilizes a language model wrapper library used by several projects within our team, called fnllm. fnllm provides two important functions for GraphRAG: rate limiting configuration to help us maximize throughput for large indexing jobs, and robust caching of API calls to minimize consumption on repeated indexes for testing, experimentation, or incremental ingest. fnllm uses the OpenAI Python SDK under the covers, so OpenAI-compliant endpoints are a base requirement out-of-the-box.
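For illustration, rate limits are configured per model entry in settings.yaml. A minimal sketch (the throttling field names and values shown here are assumptions; check the [YAML details page](yaml.md) for the exact options in your version):

```yaml
models:
  default_chat_model:
    type: openai_chat
    api_key: ${GRAPHRAG_API_KEY}
    model: gpt-4o
    concurrent_requests: 25      # open requests allowed at once
    requests_per_minute: 500     # assumed throttling knob; tune for your quota
    tokens_per_minute: 150000    # assumed throttling knob; tune for your quota
```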
## Model Selection Considerations
GraphRAG has been most thoroughly tested with the gpt-4 series of models from OpenAI, including gpt-4, gpt-4-turbo, gpt-4o, and gpt-4o-mini. Our [arXiv paper](https://arxiv.org/abs/2404.16130), for example, performed quality evaluation using gpt-4-turbo.
Versions of GraphRAG before 2.2.0 made extensive use of `max_tokens` and `logit_bias` to control generated response length or content. The introduction of the o-series of models added new, non-compatible parameters because these models include a reasoning component that has different consumption patterns and response generation attributes than non-reasoning models. GraphRAG 2.2.0 now supports these models, but there are important differences that need to be understood before you switch.
- Previously, GraphRAG used `max_tokens` to limit responses in a few locations. This was done so that we could have predictable content sizes when building downstream context windows for summarization. We have now switched from `max_tokens` to a prompted approach, which is working well in our tests. We suggest using `max_tokens` in your language model config only for budgetary reasons if you want to limit consumption, not for expected response length control. We now also support the o-series equivalent `max_completion_tokens`, but if you use this, keep in mind that there may be some unknown fixed reasoning consumption amount in addition to the response tokens, so it is not a good technique for response control.
- Previously, GraphRAG used a combination of `max_tokens` and `logit_bias` to strictly control a binary yes/no question during gleanings. This is not possible with reasoning models, so again we have switched to a prompted approach. Our tests with gpt-4o, gpt-4o-mini, and o1 show that this works consistently, but could have issues if you have an older or smaller model.
- The o-series models are much slower and more expensive. It may be useful to use an asymmetric approach to model use in your config: you can define as many models as you like in the `models` block of your settings.yaml and reference them by key for every workflow that requires a language model. You could use gpt-4o for indexing and o1 for query, for example. Experiment to find the right balance of cost, speed, and quality for your use case.
- The o-series models contain a form of native chain-of-thought reasoning that is absent in the non-o-series models. GraphRAG's prompts sometimes contain CoT because it was an effective technique with the gpt-4* series. It may be counterproductive with the o-series, so you may want to tune or even re-write large portions of the prompt templates (particularly for graph and claim extraction).
Example config with asymmetric model use:
```yaml
models:
  extraction_chat_model:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_chat
    auth_type: api_key
    model: gpt-4o
    model_supports_json: true
  query_chat_model:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_chat
    auth_type: api_key
    model: o1
    model_supports_json: true

...

extract_graph:
  model_id: extraction_chat_model
  prompt: "prompts/extract_graph.txt"
  entity_types: [organization,person,geo,event]
  max_gleanings: 1

...

global_search:
  chat_model_id: query_chat_model
  map_prompt: "prompts/global_search_map_system_prompt.txt"
  reduce_prompt: "prompts/global_search_reduce_system_prompt.txt"
  knowledge_prompt: "prompts/global_search_knowledge_system_prompt.txt"
```
Another option would be to avoid using a language model at all for the graph extraction, instead using the `fast` [indexing method](../index/methods.md) that uses NLP for portions of the indexing phase in lieu of LLM APIs.
## Using Non-OpenAI Models
As noted above, our primary experience and focus has been on OpenAI models, so this is what is supported out-of-the-box. Many users have requested support for additional model types, but it's out of the scope of our research to handle the many models available today. There are two approaches you can use to connect to a non-OpenAI model:
### Proxy APIs
Many users have used platforms such as [ollama](https://ollama.com/) to proxy the underlying model HTTP calls to a different model provider. This seems to work reasonably well, but we frequently see issues with malformed responses (especially JSON), so if you do this please understand that your model needs to reliably return the specific response formats that GraphRAG expects. If you're having trouble with a model, you may need to try prompting to coax the format, or intercepting the response within your proxy to try and handle malformed responses.
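For example, a settings.yaml sketch that points the default chat model at ollama's OpenAI-compatible endpoint might look like the following. This is not an officially supported configuration; the `api_base` field and the model name are assumptions you should adjust for your setup:

```yaml
models:
  default_chat_model:
    type: openai_chat                      # ollama exposes an OpenAI-compatible API
    api_base: http://localhost:11434/v1    # assumed local ollama endpoint
    api_key: ollama                        # placeholder; ollama ignores the key
    model: llama3.1                        # hypothetical; use whatever model you have pulled
    model_supports_json: true              # only if your model reliably returns JSON
```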
### Model Protocol
As of GraphRAG 2.0.0, we support model injection through the use of a standard chat and embedding Protocol and an accompanying ModelFactory that you can use to register your model implementation. This is not supported with the CLI, so you'll need to use GraphRAG as a library.
- Our Protocol is [defined here](https://github.com/microsoft/graphrag/blob/main/graphrag/language_model/protocol/base.py)
- Our base implementation, which wraps fnllm, [is here](https://github.com/microsoft/graphrag/blob/main/graphrag/language_model/providers/fnllm/models.py)
- We have a simple mock implementation in our tests that you can [reference here](https://github.com/microsoft/graphrag/blob/main/tests/mock_provider.py)
Once you have a model implementation, you need to register it with our ModelFactory:
```python
class MyCustomModel:
    ...
    # implementation

# elsewhere...
ModelFactory.register_chat("my-custom-chat-model", lambda **kwargs: MyCustomModel(**kwargs))
```
Then in your config you can reference the type name you used:
```yaml
models:
  default_chat_model:
    type: my-custom-chat-model

extract_graph:
  model_id: default_chat_model
  prompt: "prompts/extract_graph.txt"
  entity_types: [organization,person,geo,event]
  max_gleanings: 1
```
Note that your custom model will be passed the same params for init and method calls that we use throughout GraphRAG. There is not currently any ability to define custom parameters, so you may need to use closure scope or a factory pattern within your implementation to get custom config values.
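For example, a factory closure can capture custom values so they never need to appear in settings.yaml. This is only a sketch; the import path and the `my_api_url` parameter are assumptions, not part of the GraphRAG configuration surface:

```python
from graphrag.language_model.factory import ModelFactory  # adjust import to your graphrag version

class MyCustomModel:
    def __init__(self, *, my_api_url: str, **kwargs):
        # kwargs receives the standard model config GraphRAG passes to every provider
        self.my_api_url = my_api_url
        self.config = kwargs

    # implement the chat Protocol methods here
    ...

# Capture custom settings in closure scope rather than in the config file.
my_api_url = "https://example.internal/llm"  # hypothetical custom setting
ModelFactory.register_chat(
    "my-custom-chat-model",
    lambda **kwargs: MyCustomModel(my_api_url=my_api_url, **kwargs),
)
```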


@ -4,8 +4,8 @@ The GraphRAG system is highly configurable. This page provides an overview of th
## Default Configuration Mode
The default configuration mode is the simplest way to get started with the GraphRAG system. It is designed to work out-of-the-box with minimal configuration. The main ways to set up GraphRAG in Default Configuration mode are via:
- [Init command](init.md) (recommended first step)
- [Edit settings.yaml for deeper control](yaml.md)
- [Purely using environment variables](env_vars.md) (not recommended)


@ -60,12 +60,14 @@ models:
- `concurrent_requests` **int** - The number of open requests to allow at once.
- `async_mode` **asyncio|threaded** - The async mode to use. Either `asyncio` or `threaded`.
- `responses` **list[str]** - If this model type is mock, this is a list of response strings to return.
- `n` **int** - The number of completions to generate.
- `max_tokens` **int** - The maximum number of output tokens. Not valid for o-series models.
- `temperature` **float** - The temperature to use. Not valid for o-series models.
- `top_p` **float** - The top-p value to use. Not valid for o-series models.
- `frequency_penalty` **float** - Frequency penalty for token generation. Not valid for o-series models.
- `presence_penalty` **float** - Presence penalty for token generation. Not valid for o-series models.
- `max_completion_tokens` **int** - Max number of tokens to consume for chat completion. Must be large enough to include an unknown amount for "reasoning" by the model. o-series models only.
- `reasoning_effort` **low|medium|high** - Amount of "thought" for the model to expend reasoning about a response. o-series models only.
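For example, a chat model entry targeting an o-series model would set only the o-series parameters (values are illustrative):

```yaml
models:
  default_chat_model:
    type: openai_chat
    api_key: ${GRAPHRAG_API_KEY}
    model: o1
    max_completion_tokens: 8000   # leave headroom for the hidden reasoning tokens
    reasoning_effort: medium
    # max_tokens, temperature, top_p, frequency_penalty, presence_penalty are not valid for o-series models
```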
## Input Files and Chunking
@ -212,7 +214,6 @@ Tune the language model-based graph extraction process.
- `prompt` **str** - The prompt file to use.
- `entity_types` **list[str]** - The entity types to identify.
- `max_gleanings` **int** - The maximum number of gleaning cycles to use.
- `encoding_model` **str** - The text encoding model to use. Default is to use the encoding model aligned with the language model (i.e., it is retrieved from tiktoken if unset). This is only used for gleanings during the logit_bias check.
### summarize_descriptions
@ -221,6 +222,7 @@ Tune the language model-based graph extraction process.
- `model_id` **str** - Name of the model definition to use for API calls.
- `prompt` **str** - The prompt file to use.
- `max_length` **int** - The maximum number of output tokens per summarization.
- `max_input_length` **int** - The maximum number of tokens to collect for summarization (this will limit how many descriptions you send to be summarized for a given entity or relationship).
### extract_graph_nlp
@ -274,7 +276,6 @@ These are the settings used for Leiden hierarchical clustering of the graph to c
- `prompt` **str** - The prompt file to use.
- `description` **str** - Describes the types of claims we want to extract.
- `max_gleanings` **int** - The maximum number of gleaning cycles to use.
- `encoding_model` **str** - The text encoding model to use. Default is to use the encoding model aligned with the language model (i.e., it is retrieved from tiktoken if unset). This is only used for gleanings during the logit_bias check.
### community_reports
@ -329,11 +330,7 @@ Indicates whether we should run UMAP dimensionality reduction. This is used to p
- `conversation_history_max_turns` **int** - The conversation history maximum turns.
- `top_k_entities` **int** - The top k mapped entities.
- `top_k_relationships` **int** - The top k mapped relations.
- `max_context_tokens` **int** - The maximum tokens to use building the request context.
- `top_p` **float | None** - The top-p value to use for token generation.
- `n` **int | None** - The number of completions to generate.
- `max_tokens` **int** - The maximum tokens.
- `llm_max_tokens` **int** - The LLM maximum tokens.
### global_search
@ -346,20 +343,14 @@ Indicates whether we should run UMAP dimensionality reduction. This is used to p
- `map_prompt` **str | None** - The global search mapper prompt to use.
- `reduce_prompt` **str | None** - The global search reducer prompt to use.
- `knowledge_prompt` **str | None** - The global search general prompt to use.
- `max_context_tokens` **int** - The maximum context size to create, in tokens.
- `data_max_tokens` **int** - The maximum tokens to use constructing the final response from the reduce responses.
- `map_max_length` **int** - The maximum length to request for map responses, in words.
- `reduce_max_length` **int** - The maximum length to request for reduce responses, in words.
- `map_max_tokens` **int** - The map llm maximum tokens.
- `reduce_max_tokens` **int** - The reduce llm maximum tokens.
- `concurrency` **int** - The number of concurrent requests.
- `dynamic_search_llm` **str** - LLM model to use for dynamic community selection.
- `dynamic_search_threshold` **int** - Rating threshold to include a community report.
- `dynamic_search_keep_parent` **bool** - Keep parent community if any of the child communities are relevant.
- `dynamic_search_num_repeats` **int** - Number of times to rate the same community report.
- `dynamic_search_use_summary` **bool** - Use community summary instead of full_context.
- `dynamic_search_concurrent_coroutines` **int** - Number of concurrent coroutines to rate community reports.
- `dynamic_search_max_level` **int** - The maximum level of community hierarchy to consider if none of the processed communities are relevant.
### drift_search
@ -370,11 +361,9 @@ Indicates whether we should run UMAP dimensionality reduction. This is used to p
- `embedding_model_id` **str** - Name of the model definition to use for Embedding calls.
- `prompt` **str** - The prompt file to use.
- `reduce_prompt` **str** - The reducer prompt file to use.
- `temperature` **float** - The temperature to use for token generation.
- `top_p` **float** - The top-p value to use for token generation.
- `n` **int** - The number of completions to generate.
- `max_tokens` **int** - The maximum context size in tokens.
- `data_max_tokens` **int** - The data llm maximum tokens.
- `reduce_max_tokens` **int** - The maximum tokens for the reduce phase. Only use with non-o-series models.
- `reduce_max_completion_tokens` **int** - The maximum tokens for the reduce phase. Only use for o-series models.
- `concurrency` **int** - The number of concurrent requests.
- `drift_k_followups` **int** - The number of top global results to retrieve.
- `primer_folds` **int** - The number of folds for search priming.
@ -388,7 +377,8 @@ Indicates whether we should run UMAP dimensionality reduction. This is used to p
- `local_search_temperature` **float** - The temperature to use for token generation in local search.
- `local_search_top_p` **float** - The top-p value to use for token generation in local search.
- `local_search_n` **int** - The number of completions to generate in local search.
- `local_search_llm_max_gen_tokens` **int** - The maximum number of generated tokens for the LLM in local search. Only use with non-o-series models.
- `local_search_llm_max_gen_completion_tokens` **int** - The maximum number of generated tokens for the LLM in local search. Only use for o-series models.
### basic_search
@ -397,13 +387,4 @@ Indicates whether we should run UMAP dimensionality reduction. This is used to p
- `chat_model_id` **str** - Name of the model definition to use for Chat Completion calls.
- `embedding_model_id` **str** - Name of the model definition to use for Embedding calls.
- `prompt` **str** - The prompt file to use.
- `k` **int | None** - Number of text units to retrieve from the vector store for context building.
- `community_prop` **float** - The community proportion.
- `conversation_history_max_turns` **int** - The conversation history maximum turns.
- `top_k_entities` **int** - The top k mapped entities.
- `top_k_relationships` **int** - The top k mapped relations.
- `temperature` **float | None** - The temperature to use for token generation.
- `top_p` **float | None** - The top-p value to use for token generation.
- `n` **int | None** - The number of completions to generate.
- `max_tokens` **int** - The maximum tokens.
- `llm_max_tokens` **int** - The LLM maximum tokens.


@ -10,13 +10,8 @@ To get started with the GraphRAG system, you have a few options:
👉 [Install from pypi](https://pypi.org/project/graphrag/). <br/>
👉 [Use it from source](developing.md)<br/>
The following is a simple end-to-end example for using the GraphRAG system, using the install from pypi option.
To get started with the GraphRAG system we recommend trying the [Solution Accelerator](https://github.com/Azure-Samples/graphrag-accelerator) package. This provides a user-friendly end-to-end experience with Azure resources.
# Overview
It shows how to use the system to index some text, and then use the indexed data to answer questions about the documents.
# Install GraphRAG
@ -25,8 +20,6 @@ It shows how to use the system to index some text, and then use the indexed data
pip install graphrag
```
The graphrag library includes a CLI for a no-code approach to getting started. Please review the full [CLI documentation](cli.md) for further detail.
# Running the Indexer
We need to set up a data project and some initial configuration. First let's get a sample dataset ready:
@ -53,17 +46,17 @@ graphrag init --root ./ragtest
This will create two files: `.env` and `settings.yaml` in the `./ragtest` directory.
- `.env` contains the environment variables required to run the GraphRAG pipeline. If you inspect the file, you'll see a single environment variable defined,
`GRAPHRAG_API_KEY=<API_KEY>`. Replace `<API_KEY>` with your own OpenAI or Azure API key.
- `settings.yaml` contains the settings for the pipeline. You can modify this file to change the settings for the pipeline.
<br/>
### Using OpenAI
If running in OpenAI mode, you only need to update the value of `GRAPHRAG_API_KEY` in the `.env` file with your OpenAI API key.
### Using Azure OpenAI
In addition to setting your API key, Azure OpenAI users should set the variables below in the settings.yaml file. To find the appropriate sections, just search for the `models:` root configuration; you should see two sections, one for the default chat endpoint and one for the default embeddings endpoint. Here is an example of what to add to the chat model config:
```yaml
type: azure_openai_chat # Or azure_openai_embedding for embeddings
@ -72,9 +65,15 @@ api_version: 2024-02-15-preview # You can customize this for other versions
deployment_name: <azure_model_deployment_name>
```
#### Using Managed Auth on Azure
To use managed auth, add an additional value to your model config and comment out or remove the api_key line:
```yaml
auth_type: azure_managed_identity # Default auth_type is api_key
# api_key: ${GRAPHRAG_API_KEY}
```
You will also need to log in with [az login](https://learn.microsoft.com/en-us/cli/azure/authenticate-azure-cli) and select the subscription that contains your endpoint.
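For example, from a terminal (substitute your own subscription):

```bash
az login
az account set --subscription "<subscription name or id>"
```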
## Running the Indexing pipeline
@ -86,13 +85,11 @@ graphrag index --root ./ragtest
![pipeline executing from the CLI](img/pipeline-running.png)
This process will take some time to run. This depends on the size of your input data, what model you're using, and the text chunk size being used (these can be configured in your `settings.yaml` file).
Once the pipeline is complete, you should see a new folder called `./ragtest/output` with a series of parquet files.
# Using the Query Engine
## Running the Query Engine
Now let's ask some questions using this dataset.
Here is an example using Global search to ask a high-level question:
@ -115,5 +112,9 @@ graphrag query \
Please refer to [Query Engine](query/overview.md) docs for detailed information about how to leverage our Local and Global search mechanisms for extracting meaningful insights from data after the Indexer has wrapped up execution.
# Going Deeper
- For more details about configuring GraphRAG, see the [configuration documentation](config/overview.md).
- To learn more about Initialization, refer to the [Initialization documentation](config/init.md).
- For more details about using the CLI, refer to the [CLI documentation](cli.md).
- Check out our [visualization guide](visualization_guide.md) for a more interactive experience in debugging and exploring the knowledge graph.

docs/index/byog.md (new file, 71 lines)

@ -0,0 +1,71 @@
# Bring Your Own Graph
Several users have asked if they can bring their own existing graph and have it summarized for query with GraphRAG. There are many possible ways to do this, but here we'll describe a simple method that aligns with the existing GraphRAG workflows quite easily.
To cover the basic use cases for GraphRAG query, you should have two or three tables derived from your data:
- entities.parquet - this is the list of entities found in the dataset, which are the nodes of the graph.
- relationships.parquet - this is the list of relationships found in the dataset, which are the edges of the graph.
- text_units.parquet - these are the source text chunks the graph was extracted from. This table is optional, depending on the query method you intend to use (described later).
The approach described here will be to run a custom GraphRAG workflow pipeline that assumes the text chunking, entity extraction, and relationship extraction have already occurred.
## Tables
### Entities
See the full entities [table schema](./outputs.md#entities). For graph summarization purposes, you only need id, title, description, and the list of text_unit_ids.
The additional properties are used for optional graph visualization purposes.
### Relationships
See the full relationships [table schema](./outputs.md#relationships). For graph summarization purposes, you only need id, source, target, description, weight, and the list of text_unit_ids.
> Note: the `weight` field is important because it is used to properly compute Leiden communities!
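As a minimal sketch, the two required tables can be assembled with pandas (the values are made up, and some workflows may expect additional columns from the full schemas):

```python
import pandas as pd  # requires pyarrow (or fastparquet) for to_parquet

entities = pd.DataFrame([
    {
        "id": "e1",
        "title": "ACME CORP",
        "description": "A fictional manufacturing company.",
        "text_unit_ids": ["t1"],
    },
])

relationships = pd.DataFrame([
    {
        "id": "r1",
        "source": "ACME CORP",
        "target": "ROADRUNNER LLC",
        "description": "ACME Corp supplies components to Roadrunner LLC.",
        "weight": 1.0,  # used when computing Leiden communities
        "text_unit_ids": ["t1"],
    },
])

entities.to_parquet("output/entities.parquet")
relationships.to_parquet("output/relationships.parquet")
```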
## Workflow Configuration
GraphRAG includes the ability to specify *only* the specific workflow steps that you need. For basic graph summarization and query, you need the following config in your settings.yaml:
```yaml
workflows: [create_communities, create_community_reports]
```
This will result in only the minimal workflows required for GraphRAG [Global Search](../query/global_search.md).
## Optional Additional Config
If you would like to run [Local](../query/local_search.md), [DRIFT](../query/drift_search.md), or [Basic](../query/overview.md#basic-search) Search, you will need to include text_units and some embeddings.
### Text Units
See the full text_units [table schema](./outputs.md#text_units). Text units are chunks of your documents that are sized to ensure they fit into the context window of your model. Some search methods use these, so you may want to include them if you have them.
### Expanded Config
To perform the other search types above, you need some of the content to be embedded. Simply add the embeddings workflow:
```yaml
workflows: [create_communities, create_community_reports, generate_text_embeddings]
```
### FastGraphRAG
[FastGraphRAG](./methods.md#fastgraphrag) uses text_units for the community reports instead of the entity and relationship descriptions. If your graph is sourced in such a way that it does not have descriptions, this might be a useful alternative. In this case, you would update your workflows list to include the text variant:
```yaml
workflows: [create_communities, create_community_reports_text, generate_text_embeddings]
```
This method requires that your entities and relationships tables have valid links to a list of text_unit_ids. Also note that `generate_text_embeddings` is still only required if you are doing searches other than Global Search.
## Setup
Putting it all together:
- `input`: GraphRAG does require an input document set, even if you don't need us to process it. You can create an input folder and drop a dummy.txt document in there to work around this.
- `output`: Create an output folder and put your entities and relationships (and optionally text_units) parquet files in it.
- Update your config as noted above to only run the workflows subset you need.
- Run `graphrag index --root <your project root>`
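A shell sketch of the steps above (paths and file names are illustrative):

```bash
mkdir -p ./mygraph/input ./mygraph/output
echo "placeholder" > ./mygraph/input/dummy.txt     # GraphRAG requires an input document set
cp entities.parquet relationships.parquet ./mygraph/output/
graphrag init --root ./mygraph                     # then edit settings.yaml to set the workflows list
graphrag index --root ./mygraph
```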


@ -26,6 +26,10 @@ DRIFT Search introduces a new approach to local search queries by including comm
To learn more about DRIFT Search, please refer to the [DRIFT Search](drift_search.md) documentation.
## Basic Search
GraphRAG includes a rudimentary implementation of basic vector RAG to make it easy to compare different search results based on the type of question you are asking. You can specify the top `k` text unit chunks to include in the summarization context.
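For example, in settings.yaml (the model ids refer to entries you have defined in your `models` block):

```yaml
basic_search:
  chat_model_id: default_chat_model
  embedding_model_id: default_embedding_model
  k: 10   # number of text unit chunks to retrieve for the context
```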
## Question Generation
This functionality takes a list of user queries and generates the next candidate questions. This is useful for generating follow-up questions in a conversation or for generating a list of questions for the investigator to dive deeper into the dataset.


@ -13,19 +13,19 @@ snapshots:
embed_graph:
  enabled: true # will generate node2vec embeddings for nodes
umap:
  enabled: true # will generate UMAP embeddings for nodes, giving the entities table an x/y position to plot
```
After running the indexing pipeline over your data, there will be an output folder (defined by the `storage.base_dir` setting).
- **Output Folder**: Contains artifacts from the LLM's indexing pass.
## 2. Locate the Knowledge Graph
In the output folder, look for a file named `graph.graphml`. GraphML is a standard [file format](http://graphml.graphdrawing.org) supported by many visualization tools. We recommend trying [Gephi](https://gephi.org).
## 3. Open the Graph in Gephi
1. Install and open Gephi
2. Navigate to the `output` folder containing the various parquet files.
3. Import the `graph.graphml` file into Gephi. This will result in a fairly plain view of the undirected graph nodes and edges.
<p align="center">
<img src="../img/viz_guide/gephi-initial-graph-example.png" alt="A basic graph visualization by Gephi" width="300"/>


@ -27,10 +27,11 @@ nav:
  - Development Guide: developing.md
  - Indexing:
    - Overview: "index/overview.md"
    - Architecture: "index/architecture.md"
    - Dataflow: "index/default_dataflow.md"
    - Methods: "index/methods.md"
    - Inputs: "index/inputs.md"
    - Outputs: "index/outputs.md"
    - Custom Graphs: "index/byog.md"
  - Prompt Tuning:
    - Overview: "prompt_tuning/overview.md"
    - Auto Tuning: "prompt_tuning/auto_prompt_tuning.md"
@ -49,8 +50,8 @@ nav:
  - Configuration:
    - Overview: "config/overview.md"
    - Init Command: "config/init.md"
    - Detailed Configuration: "config/yaml.md"
    - Language Model Selection: "config/models.md"
  - CLI: "cli.md"
  - Extras:
    - Microsoft Research Blog: "blog_posts.md"