diff --git a/data/operation_dulce/dataset.zip b/data/operation_dulce/dataset.zip index af16b962..e4ebdf87 100644 Binary files a/data/operation_dulce/dataset.zip and b/data/operation_dulce/dataset.zip differ diff --git a/index.html b/index.html index d45074f2..11507130 100644 --- a/index.html +++ b/index.html @@ -212,8 +212,6 @@ a {
  • CLI
  • -Prompt Tuning -
  • Configuration diff --git a/posts/config/custom/index.html b/posts/config/custom/index.html index f240fe85..5c8d95a8 100644 --- a/posts/config/custom/index.html +++ b/posts/config/custom/index.html @@ -212,8 +212,6 @@ a {
  • CLI
  • -Prompt Tuning -
  • Configuration diff --git a/posts/config/env_vars/index.html b/posts/config/env_vars/index.html index 367ae05b..254f3fa5 100644 --- a/posts/config/env_vars/index.html +++ b/posts/config/env_vars/index.html @@ -212,8 +212,6 @@ a {
  • CLI
  • -Prompt Tuning -
  • Configuration diff --git a/posts/config/json_yaml/index.html b/posts/config/json_yaml/index.html index 40f4c0b5..e029088f 100644 --- a/posts/config/json_yaml/index.html +++ b/posts/config/json_yaml/index.html @@ -212,8 +212,6 @@ a {
  • CLI
  • -Prompt Tuning -
  • Configuration diff --git a/posts/config/overview/index.html b/posts/config/overview/index.html index 6e5502fd..e8f63513 100644 --- a/posts/config/overview/index.html +++ b/posts/config/overview/index.html @@ -212,8 +212,6 @@ a {
  • CLI
  • -Prompt Tuning -
  • Configuration diff --git a/posts/config/template/index.html b/posts/config/template/index.html index f988001f..92c5c276 100644 --- a/posts/config/template/index.html +++ b/posts/config/template/index.html @@ -212,8 +212,6 @@ a {
  • CLI
  • -Prompt Tuning -
  • Configuration diff --git a/posts/developing/index.html b/posts/developing/index.html index 9c405a71..b524f61f 100644 --- a/posts/developing/index.html +++ b/posts/developing/index.html @@ -212,8 +212,6 @@ a {
  • CLI
  • -Prompt Tuning -
  • Configuration diff --git a/posts/get_started/index.html b/posts/get_started/index.html index 0c83e0b4..4d871bd6 100644 --- a/posts/get_started/index.html +++ b/posts/get_started/index.html @@ -212,8 +212,6 @@ a {
  • CLI
  • -Prompt Tuning -
  • Configuration diff --git a/posts/index/0-architecture/index.html b/posts/index/0-architecture/index.html index c34148d6..bb484b36 100644 --- a/posts/index/0-architecture/index.html +++ b/posts/index/0-architecture/index.html @@ -212,8 +212,6 @@ a {
  • CLI
  • -Prompt Tuning -
  • Configuration diff --git a/posts/index/1-default_dataflow/index.html b/posts/index/1-default_dataflow/index.html index 3337234d..4cafc1ab 100644 --- a/posts/index/1-default_dataflow/index.html +++ b/posts/index/1-default_dataflow/index.html @@ -212,8 +212,6 @@ a {
  • CLI
  • -Prompt Tuning -
  • Configuration diff --git a/posts/index/2-cli/index.html b/posts/index/2-cli/index.html index 10c562c6..f9e3071c 100644 --- a/posts/index/2-cli/index.html +++ b/posts/index/2-cli/index.html @@ -212,8 +212,6 @@ a {
  • CLI
  • -Prompt Tuning -
  • Configuration diff --git a/posts/index/3-prompt_tuning/index.html b/posts/index/3-prompt_tuning/index.html deleted file mode 100644 index a44c7038..00000000 --- a/posts/index/3-prompt_tuning/index.html +++ /dev/null @@ -1,350 +0,0 @@ - - - - - - - - - - Prompt Tuning - - - - - - - - - - - - - - -
    - - GraphRAG -
    -
    - - - - -
    -

    Prompt Tuning

    -

    The GraphRAG indexer, by default, will run with a handful of prompts that are designed to work well in the broad context of knowledge discovery. -However, it is quite common to want to tune the prompts to better suit your specific use case. -We provide a means for you to do this by allowing you to specify a custom prompt file, which will each use a series of token-replacements internally.

    -

    Each of these prompts may be overridden by writing a custom prompt file in plaintext. We use token-replacements in the form of {token_name}, and the descriptions for the available tokens can be found below.

    -

    Entity/Relationship Extraction

    -

    Prompt Source

    -

    Tokens (values provided by extractor)

    -
      -
    • {input_text} - The input text to be processed.
    • -
    • {entity_types} - A list of entity types
    • -
    • {tuple_delimiter} - A delimiter for separating values within a tuple. A single tuple is used to represent an individual entity or relationship.
    • -
    • {record_delimiter} - A delimiter for separating tuple instances.
    • -
    • {completion_delimiter} - An indicator for when generation is complete.
    • -
    -

    Summarize Entity/Relationship Descriptions

    -

    Prompt Source

    -

    Tokens (values provided by extractor)

    -
      -
    • {entity_name} - The name of the entity or the source/target pair of the relationship.
    • -
    • {description_list} - A list of descriptions for the entity or relationship.
    • -
    -

    Claim Extraction

    -

    Prompt Source

    -

    Tokens (values provided by extractor)

    -
      -
    • {input_text} - The input text to be processed.
    • -
    • {tuple_delimiter} - A delimiter for separating values within a tuple. A single tuple is used to represent an individual entity or relationship.
    • -
    • {record_delimiter} - A delimiter for separating tuple instances.
    • -
    • {completion_delimiter} - An indicator for when generation is complete.
    • -
    -

    Note: there is an additional parameter for the Claim Description that is used in claim extraction. -The default value is

    -

    "Any claims or facts that could be relevant to information discovery."

    -

    See the configuration documentation for details on how to change this.

    -

    Generate Community Reports

    -

    Prompt Source

    -

    Tokens (values provided by extractor)

    -
      -
    • {input_text} - The input text to generate the report with. This will contain tables of entities and relationships.
    • -
    - -
    -
    - - - \ No newline at end of file diff --git a/posts/index/overview/index.html b/posts/index/overview/index.html index 21fcecb7..14a7f5ce 100644 --- a/posts/index/overview/index.html +++ b/posts/index/overview/index.html @@ -212,8 +212,6 @@ a {
  • CLI
  • -Prompt Tuning -
  • Configuration diff --git a/posts/prompt_tuning/auto_prompt_tuning/index.html b/posts/prompt_tuning/auto_prompt_tuning/index.html index 25d3982b..cb761a82 100644 --- a/posts/prompt_tuning/auto_prompt_tuning/index.html +++ b/posts/prompt_tuning/auto_prompt_tuning/index.html @@ -212,8 +212,6 @@ a {
  • CLI
  • -Prompt Tuning -
  • Configuration @@ -288,8 +286,8 @@ a {

    Prompt Tuning ⚙️

    -

    GraphRAG provides the ability to create domain adaptive templates for the generation of the knowledge graph. This step is optional, tho is is highly encouraged to run it as it will yield better results when executing an Index Run.

    -

    This templates are generated by loading the inputs, splitting them into chunks and then running a series of LLM invocations and template subtitions to generate the final prompts. It is highly suggested to use the default values the script provides, but in this page you'll find the detail of each in case you want to further explore and tweak the template generation algorithm.

    +

    GraphRAG provides the ability to create domain adaptive templates for the generation of the knowledge graph. This step is optional, though it is highly encouraged to run it as it will yield better results when executing an Index Run.

    +

    The templates are generated by loading the inputs, splitting them into chunks (text units) and then running a series of LLM invocations and template substitutions to generate the final prompts. We suggest using the default values provided by the script, but on this page you'll find the details of each option in case you want to further explore and tweak the template generation algorithm.

    Usage

    You can run the main script from the command line with various options:

    @@ -303,40 +301,40 @@ a {

    Command-Line Options

    Example Usage

    -
    python -m graphrag.prompt_tune --root /path/to/project --domain "environmental news" --method random --limit 10 --max_tokens 2048 --chunk_size 256 --no-entity-types --output /path/to/output
    +
    python -m graphrag.prompt_tune --root /path/to/project --domain "environmental news" --method random --limit 10 --max-tokens 2048 --chunk-size 256 --no-entity-types --output /path/to/output
    -

    or

    +

    or, with minimal configuration (suggested):

    python -m graphrag.prompt_tune --root /path/to/project --no-entity-types
    @@ -345,11 +343,27 @@ a {
    +

    Document Selection Methods

    +

    The auto template feature ingests the input data and then divides it into text units the size of the chunk size parameter. +After that, it uses one of the following selection methods to pick a sample to work with for template generation:

    +

    Modify Env Vars

    -

    After running auto-templating, you should modify the following environment variables (or config variables) to pick up the new prompts on your index run.

    -

    GRAPHRAG_ENTITY_EXTRACTION_PROMPT_FILE = "prompts/entity_extraction.txt" -GRAPHRAG_COMMUNITY_REPORT_PROMPT_FILE = "prompts/community_report.txt" -GRAPHRAG_SUMMARIZE_DESCRIPTIONS_PROMPT_FILE = "prompts/summarize_descriptions.txt"

    +

    After running auto-templating, you should modify the following environment variables (or config variables) to pick up the new prompts on your index run. Note: make sure to update the path to point to the generated prompts; this example uses the default "prompts" path.

    +
    diff --git a/posts/prompt_tuning/manual_prompt_tuning/index.html b/posts/prompt_tuning/manual_prompt_tuning/index.html index 72e9504f..bc5805ce 100644 --- a/posts/prompt_tuning/manual_prompt_tuning/index.html +++ b/posts/prompt_tuning/manual_prompt_tuning/index.html @@ -212,8 +212,6 @@ a {
  • CLI
  • -Prompt Tuning -
  • Configuration diff --git a/posts/prompt_tuning/overview/index.html b/posts/prompt_tuning/overview/index.html index 443fe542..e5c725af 100644 --- a/posts/prompt_tuning/overview/index.html +++ b/posts/prompt_tuning/overview/index.html @@ -7,7 +7,7 @@ - Prompt Tuning 🤖 + Prompt Tuning ⚙️ @@ -212,8 +212,6 @@ a {
  • CLI
  • -Prompt Tuning -
  • Configuration @@ -287,7 +285,7 @@ a {
    -

    Prompt Tuning 🤖

    +

    Prompt Tuning ⚙️

    This page provides an overview of the prompt tuning options available for the GraphRAG indexing engine.

    Default Prompts

    The default prompts are the simplest way to get started with the GraphRAG system. They are designed to work out-of-the-box with minimal configuration. You can find more detail about these prompts in the following links:

    @@ -300,7 +298,7 @@ a {

    Auto Templating

    Auto Templating leverages your input data and LLM interactions to create domain adaptive templates for the generation of the knowledge graph. It is highly encouraged to run it as it will yield better results when executing an Index Run. For more details about how to use it, please refer to the Auto Templating documentation.

    Manual Configuration

    -

    Manual configuration is an advanced use-case. Most users will want to use the Auto Templating feature instead.. Details about how to use manual configuration are available in the Manual Prompt Configuration documentation.

    +

    Manual configuration is an advanced use-case. Most users will want to use the Auto Templating feature instead. Details about how to use manual configuration are available in the Manual Prompt Configuration documentation.

    diff --git a/posts/query/0-global_search/index.html b/posts/query/0-global_search/index.html index e6cd75eb..426e2d7c 100644 --- a/posts/query/0-global_search/index.html +++ b/posts/query/0-global_search/index.html @@ -212,8 +212,6 @@ a {
  • CLI
  • -Prompt Tuning -
  • Configuration diff --git a/posts/query/1-local_search/index.html b/posts/query/1-local_search/index.html index 326a45a1..d6594096 100644 --- a/posts/query/1-local_search/index.html +++ b/posts/query/1-local_search/index.html @@ -212,8 +212,6 @@ a {
  • CLI
  • -Prompt Tuning -
  • Configuration diff --git a/posts/query/2-question_generation/index.html b/posts/query/2-question_generation/index.html index a1145200..5a05d20b 100644 --- a/posts/query/2-question_generation/index.html +++ b/posts/query/2-question_generation/index.html @@ -212,8 +212,6 @@ a {
  • CLI
  • -Prompt Tuning -
  • Configuration diff --git a/posts/query/3-cli/index.html b/posts/query/3-cli/index.html index 7447f7cb..b3ac94f8 100644 --- a/posts/query/3-cli/index.html +++ b/posts/query/3-cli/index.html @@ -212,8 +212,6 @@ a {
  • CLI
  • -Prompt Tuning -
  • Configuration diff --git a/posts/query/notebooks/global_search_nb/index.html b/posts/query/notebooks/global_search_nb/index.html index 3e13ab56..69d82284 100644 --- a/posts/query/notebooks/global_search_nb/index.html +++ b/posts/query/notebooks/global_search_nb/index.html @@ -212,8 +212,6 @@ a {
  • CLI
  • -Prompt Tuning -
  • Configuration diff --git a/posts/query/notebooks/local_search_nb/index.html b/posts/query/notebooks/local_search_nb/index.html index 62b53657..814531e7 100644 --- a/posts/query/notebooks/local_search_nb/index.html +++ b/posts/query/notebooks/local_search_nb/index.html @@ -212,8 +212,6 @@ a {
  • CLI
  • -Prompt Tuning -
  • Configuration diff --git a/posts/query/notebooks/overview/index.html b/posts/query/notebooks/overview/index.html index 9e46b76a..605a4d71 100644 --- a/posts/query/notebooks/overview/index.html +++ b/posts/query/notebooks/overview/index.html @@ -212,8 +212,6 @@ a {
  • CLI
  • -Prompt Tuning -
  • Configuration diff --git a/posts/query/overview/index.html b/posts/query/overview/index.html index e0d52b84..c02f6fe7 100644 --- a/posts/query/overview/index.html +++ b/posts/query/overview/index.html @@ -212,8 +212,6 @@ a {
  • CLI
  • -Prompt Tuning -
  • Configuration