graphrag/docs/examples_notebooks/global_search.ipynb
Alonso Guevara e21a38f2ab
Fix/notebooks (#1614)
* Add new inputs and missing vector store for retrieving vectors

* Format

* Semver

* Remove .Identifier files

* Fix spellcheck

* Remove unnecessary input file for notebooks
2025-01-13 17:41:39 -06:00

{
"cells": [
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"# Copyright (c) 2024 Microsoft Corporation.\n",
"# Licensed under the MIT License."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"import pandas as pd\n",
"import tiktoken\n",
"\n",
"from graphrag.query.indexer_adapters import (\n",
" read_indexer_communities,\n",
" read_indexer_entities,\n",
" read_indexer_reports,\n",
")\n",
"from graphrag.query.llm.oai.chat_openai import ChatOpenAI\n",
"from graphrag.query.llm.oai.typing import OpenaiApiType\n",
"from graphrag.query.structured_search.global_search.community_context import (\n",
" GlobalCommunityContext,\n",
")\n",
"from graphrag.query.structured_search.global_search.search import GlobalSearch"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Global Search example\n",
"\n",
"The global search method generates answers by searching over all AI-generated community reports in a map-reduce fashion: batches of reports each produce an intermediate answer (map), and these are then combined into a final response (reduce). It is resource-intensive, but it often gives good responses to questions that require an understanding of the dataset as a whole (e.g., What are the most significant values of the herbs mentioned in this notebook?)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### LLM setup"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"api_key = os.environ[\"GRAPHRAG_API_KEY\"]\n",
"llm_model = os.environ[\"GRAPHRAG_LLM_MODEL\"]\n",
"\n",
"llm = ChatOpenAI(\n",
" api_key=api_key,\n",
" model=llm_model,\n",
" api_type=OpenaiApiType.OpenAI, # OpenaiApiType.OpenAI or OpenaiApiType.AzureOpenAI\n",
" max_retries=20,\n",
")\n",
"\n",
"token_encoder = tiktoken.encoding_for_model(llm_model)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Load community reports as context for global search\n",
"\n",
"- Load all community reports from the `create_final_community_reports` table produced by the GraphRAG indexing pipeline, to be used as context data for global search.\n",
"- Load entities from the `create_final_nodes` and `create_final_entities` tables, to be used for calculating community weights for context ranking. This is optional: if no entities are provided, community weights are not calculated and only the rank attribute in the community reports table is used for context ranking.\n",
"- Load all communities from the `create_final_communities` table, to be used to reconstruct the community graph hierarchy for dynamic community selection."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"# Parquet files generated by the indexing pipeline\n",
"INPUT_DIR = \"./inputs/operation dulce\"\n",
"COMMUNITY_TABLE = \"create_final_communities\"\n",
"COMMUNITY_REPORT_TABLE = \"create_final_community_reports\"\n",
"ENTITY_TABLE = \"create_final_nodes\"\n",
"ENTITY_EMBEDDING_TABLE = \"create_final_entities\"\n",
"\n",
"# Community level in the Leiden community hierarchy from which the community reports are loaded.\n",
"# A higher value uses reports from more fine-grained communities (at the cost of more computation).\n",
"COMMUNITY_LEVEL = 2"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Total report count: 72\n",
"Report count after filtering by community level 2: 56\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>id</th>\n",
" <th>human_readable_id</th>\n",
" <th>community</th>\n",
" <th>parent</th>\n",
" <th>level</th>\n",
" <th>title</th>\n",
" <th>summary</th>\n",
" <th>full_content</th>\n",
" <th>rank</th>\n",
" <th>rank_explanation</th>\n",
" <th>findings</th>\n",
" <th>full_content_json</th>\n",
" <th>period</th>\n",
" <th>size</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>16949a5d17b740b2b4a6f787b0a637f1</td>\n",
" <td>43</td>\n",
" <td>43</td>\n",
" <td>10</td>\n",
" <td>2</td>\n",
" <td>Ben Bloomberg and the Harmoniser Project</td>\n",
" <td>The community centers around Ben Bloomberg, a ...</td>\n",
" <td># Ben Bloomberg and the Harmoniser Project\\n\\n...</td>\n",
" <td>7.5</td>\n",
" <td>The impact severity rating is high due to the ...</td>\n",
" <td>[{'explanation': 'Ben Bloomberg is a pivotal f...</td>\n",
" <td>{\\n \"title\": \"Ben Bloomberg and the Harmoni...</td>\n",
" <td>2025-01-10</td>\n",
" <td>35</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>4ff756b7041f4dcab6612e016af2b14d</td>\n",
" <td>44</td>\n",
" <td>44</td>\n",
" <td>10</td>\n",
" <td>2</td>\n",
" <td>North Hampton and Influential Musicians</td>\n",
" <td>The community centers around North Hampton, a ...</td>\n",
" <td># North Hampton and Influential Musicians\\n\\nT...</td>\n",
" <td>6.5</td>\n",
" <td>The impact severity rating is moderately high ...</td>\n",
" <td>[{'explanation': 'North Hampton serves as the ...</td>\n",
" <td>{\\n \"title\": \"North Hampton and Influential...</td>\n",
" <td>2025-01-10</td>\n",
" <td>4</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2d3df394272743a781606ad80ccb5312</td>\n",
" <td>45</td>\n",
" <td>45</td>\n",
" <td>10</td>\n",
" <td>2</td>\n",
" <td>Prince of Monaco and Monaco</td>\n",
" <td>The community revolves around the Prince of Mo...</td>\n",
" <td># Prince of Monaco and Monaco\\n\\nThe community...</td>\n",
" <td>4.0</td>\n",
" <td>The impact severity rating is moderate due to ...</td>\n",
" <td>[{'explanation': 'The Prince of Monaco is a ke...</td>\n",
" <td>{\\n \"title\": \"Prince of Monaco and Monaco\",...</td>\n",
" <td>2025-01-10</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>becbd958973f42b0bd53cca9250feaf1</td>\n",
" <td>46</td>\n",
" <td>46</td>\n",
" <td>10</td>\n",
" <td>2</td>\n",
" <td>Robot Opera and Broadway</td>\n",
" <td>The community revolves around the Robot Opera,...</td>\n",
" <td># Robot Opera and Broadway\\n\\nThe community re...</td>\n",
" <td>7.5</td>\n",
" <td>The impact severity rating is high due to the ...</td>\n",
" <td>[{'explanation': 'The Robot Opera is a notable...</td>\n",
" <td>{\\n \"title\": \"Robot Opera and Broadway\",\\n ...</td>\n",
" <td>2025-01-10</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>f7d29921ae3e41a79ae7f88dae584892</td>\n",
" <td>47</td>\n",
" <td>47</td>\n",
" <td>13</td>\n",
" <td>2</td>\n",
" <td>Ben and Jacob's Fusion of Art and Technology</td>\n",
" <td>The community centers around Ben and Jacob, wh...</td>\n",
" <td># Ben and Jacob's Fusion of Art and Technology...</td>\n",
" <td>7.5</td>\n",
" <td>The impact severity rating is high due to the ...</td>\n",
" <td>[{'explanation': 'Ben and Jacob are key collab...</td>\n",
" <td>{\\n \"title\": \"Ben and Jacob's Fusion of Art...</td>\n",
" <td>2025-01-10</td>\n",
" <td>5</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" id human_readable_id community parent \\\n",
"0 16949a5d17b740b2b4a6f787b0a637f1 43 43 10 \n",
"1 4ff756b7041f4dcab6612e016af2b14d 44 44 10 \n",
"2 2d3df394272743a781606ad80ccb5312 45 45 10 \n",
"3 becbd958973f42b0bd53cca9250feaf1 46 46 10 \n",
"4 f7d29921ae3e41a79ae7f88dae584892 47 47 13 \n",
"\n",
" level title \\\n",
"0 2 Ben Bloomberg and the Harmoniser Project \n",
"1 2 North Hampton and Influential Musicians \n",
"2 2 Prince of Monaco and Monaco \n",
"3 2 Robot Opera and Broadway \n",
"4 2 Ben and Jacob's Fusion of Art and Technology \n",
"\n",
" summary \\\n",
"0 The community centers around Ben Bloomberg, a ... \n",
"1 The community centers around North Hampton, a ... \n",
"2 The community revolves around the Prince of Mo... \n",
"3 The community revolves around the Robot Opera,... \n",
"4 The community centers around Ben and Jacob, wh... \n",
"\n",
" full_content rank \\\n",
"0 # Ben Bloomberg and the Harmoniser Project\\n\\n... 7.5 \n",
"1 # North Hampton and Influential Musicians\\n\\nT... 6.5 \n",
"2 # Prince of Monaco and Monaco\\n\\nThe community... 4.0 \n",
"3 # Robot Opera and Broadway\\n\\nThe community re... 7.5 \n",
"4 # Ben and Jacob's Fusion of Art and Technology... 7.5 \n",
"\n",
" rank_explanation \\\n",
"0 The impact severity rating is high due to the ... \n",
"1 The impact severity rating is moderately high ... \n",
"2 The impact severity rating is moderate due to ... \n",
"3 The impact severity rating is high due to the ... \n",
"4 The impact severity rating is high due to the ... \n",
"\n",
" findings \\\n",
"0 [{'explanation': 'Ben Bloomberg is a pivotal f... \n",
"1 [{'explanation': 'North Hampton serves as the ... \n",
"2 [{'explanation': 'The Prince of Monaco is a ke... \n",
"3 [{'explanation': 'The Robot Opera is a notable... \n",
"4 [{'explanation': 'Ben and Jacob are key collab... \n",
"\n",
" full_content_json period size \n",
"0 {\\n \"title\": \"Ben Bloomberg and the Harmoni... 2025-01-10 35 \n",
"1 {\\n \"title\": \"North Hampton and Influential... 2025-01-10 4 \n",
"2 {\\n \"title\": \"Prince of Monaco and Monaco\",... 2025-01-10 2 \n",
"3 {\\n \"title\": \"Robot Opera and Broadway\",\\n ... 2025-01-10 2 \n",
"4 {\\n \"title\": \"Ben and Jacob's Fusion of Art... 2025-01-10 5 "
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"community_df = pd.read_parquet(f\"{INPUT_DIR}/{COMMUNITY_TABLE}.parquet\")\n",
"entity_df = pd.read_parquet(f\"{INPUT_DIR}/{ENTITY_TABLE}.parquet\")\n",
"report_df = pd.read_parquet(f\"{INPUT_DIR}/{COMMUNITY_REPORT_TABLE}.parquet\")\n",
"entity_embedding_df = pd.read_parquet(f\"{INPUT_DIR}/{ENTITY_EMBEDDING_TABLE}.parquet\")\n",
"\n",
"communities = read_indexer_communities(community_df, entity_df, report_df)\n",
"reports = read_indexer_reports(report_df, entity_df, COMMUNITY_LEVEL)\n",
"entities = read_indexer_entities(entity_df, entity_embedding_df, COMMUNITY_LEVEL)\n",
"\n",
"print(f\"Total report count: {len(report_df)}\")\n",
"print(\n",
" f\"Report count after filtering by community level {COMMUNITY_LEVEL}: {len(reports)}\"\n",
")\n",
"\n",
"report_df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Build global context based on community reports"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"context_builder = GlobalCommunityContext(\n",
" community_reports=reports,\n",
" communities=communities,\n",
" entities=entities, # set to None if you don't want to use community weights for ranking\n",
" token_encoder=token_encoder,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Perform global search"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"context_builder_params = {\n",
" \"use_community_summary\": False, # False means using full community reports. True means using community short summaries.\n",
" \"shuffle_data\": True,\n",
" \"include_community_rank\": True,\n",
" \"min_community_rank\": 0,\n",
" \"community_rank_name\": \"rank\",\n",
" \"include_community_weight\": True,\n",
" \"community_weight_name\": \"occurrence weight\",\n",
" \"normalize_community_weight\": True,\n",
" \"max_tokens\": 12_000, # change this based on the token limit you have on your model (if you are using a model with 8k limit, a good setting could be 5000)\n",
" \"context_name\": \"Reports\",\n",
"}\n",
"\n",
"map_llm_params = {\n",
" \"max_tokens\": 1000,\n",
" \"temperature\": 0.0,\n",
" \"response_format\": {\"type\": \"json_object\"},\n",
"}\n",
"\n",
"reduce_llm_params = {\n",
" \"max_tokens\": 2000, # change this based on the token limit you have on your model (if you are using a model with 8k limit, a good setting could be 1000-1500)\n",
" \"temperature\": 0.0,\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"search_engine = GlobalSearch(\n",
" llm=llm,\n",
" context_builder=context_builder,\n",
" token_encoder=token_encoder,\n",
" max_data_tokens=12_000, # change this based on the token limit you have on your model (if you are using a model with 8k limit, a good setting could be 5000)\n",
" map_llm_params=map_llm_params,\n",
" reduce_llm_params=reduce_llm_params,\n",
" allow_general_knowledge=False, # setting this to True adds an instruction that encourages the LLM to incorporate general knowledge in the response, which may increase hallucinations but could be useful in some use cases.\n",
" json_mode=True, # set this to False if your LLM does not support JSON mode.\n",
" context_builder_params=context_builder_params,\n",
" concurrent_coroutines=32,\n",
" response_type=\"multiple paragraphs\", # free-form text describing the response type and format; can be anything, e.g. prioritized list, single paragraph, multiple paragraphs, multiple-page report\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"### Cosmic Vocalization: An Overview\n",
"\n",
"Cosmic Vocalization is a term coined by Jordan Hayes to describe a repeating sequence found in cryptic communications. This concept is pivotal in the realm of interstellar communication, serving as a common reference point for both humanity and extraterrestrial entities. The repeating sequence identified by Hayes is crucial for understanding and interpreting the signals exchanged during what is known as the Interstellar Duet [Data: Reports (65)].\n",
"\n",
"### Key Figures and Entities\n",
"\n",
"#### Jordan Hayes\n",
"Jordan Hayes is a central figure in the development and understanding of Cosmic Vocalization. Hayes' work in identifying and describing the repeating sequence in cryptic communications has been instrumental in the ongoing efforts to interact with extraterrestrial intelligence. His contributions provide the foundational knowledge required to decode and engage with these alien signals [Data: Reports (65)].\n",
"\n",
"#### Paranormal Military Squad\n",
"The Paranormal Military Squad plays a significant role in activities related to Cosmic Vocalization. Their responsibilities include responding to alien signals and participating in the Interstellar Duet. This squad is part of a broader initiative aimed at facilitating interstellar communication, ensuring that humanity can effectively engage with extraterrestrial entities [Data: Reports (65)].\n",
"\n",
"### Implications\n",
"\n",
"The concept of Cosmic Vocalization and the involvement of key figures like Jordan Hayes and the Paranormal Military Squad highlight the collaborative efforts required to advance interstellar communication. By establishing a common ground for understanding cryptic signals, these efforts may pave the way for more profound interactions with extraterrestrial intelligence, potentially leading to significant advancements in our knowledge and capabilities.\n",
"\n",
"In summary, Cosmic Vocalization is a critical concept in the field of interstellar communication, with Jordan Hayes and the Paranormal Military Squad being key contributors to its development and application [Data: Reports (65)].\n"
]
}
],
"source": [
"result = await search_engine.asearch(\n",
" \"What is Cosmic Vocalization and who are involved in it?\"\n",
")\n",
"\n",
"print(result.response)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>id</th>\n",
" <th>title</th>\n",
" <th>occurrence weight</th>\n",
" <th>content</th>\n",
" <th>rank</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>50</td>\n",
" <td>Alex Mercer and the Dulce Base Team</td>\n",
" <td>0.956522</td>\n",
" <td># Alex Mercer and the Dulce Base Team\\n\\nThe c...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>35</td>\n",
" <td>Kevin Scott and Technology Development</td>\n",
" <td>0.608696</td>\n",
" <td># Kevin Scott and Technology Development\\n\\nTh...</td>\n",
" <td>7.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>53</td>\n",
" <td>Dulce Base and Paranormal Military Squad</td>\n",
" <td>0.565217</td>\n",
" <td># Dulce Base and Paranormal Military Squad\\n\\n...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>22</td>\n",
" <td>Paranormal Military Squad and Technological Ex...</td>\n",
" <td>0.434783</td>\n",
" <td># Paranormal Military Squad and Technological ...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>60</td>\n",
" <td>First Contact with Extraterrestrial Civilization</td>\n",
" <td>0.304348</td>\n",
" <td># First Contact with Extraterrestrial Civiliza...</td>\n",
" <td>9.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>61</td>\n",
" <td>Dulce Base Operations and Distress</td>\n",
" <td>0.173913</td>\n",
" <td># Dulce Base Operations and Distress\\n\\nThe co...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>57</td>\n",
" <td>Operation: Dulce in New Mexico</td>\n",
" <td>0.130435</td>\n",
" <td># Operation: Dulce in New Mexico\\n\\nThe commun...</td>\n",
" <td>7.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>48</td>\n",
" <td>Jacob Collier and Ben Bloomberg's First Tour</td>\n",
" <td>0.130435</td>\n",
" <td># Jacob Collier and Ben Bloomberg's First Tour...</td>\n",
" <td>7.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>51</td>\n",
" <td>Cosmic Translators and Alien Script</td>\n",
" <td>0.086957</td>\n",
" <td># Cosmic Translators and Alien Script\\n\\nThe c...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>64</td>\n",
" <td>Terminal and Deep Hum</td>\n",
" <td>0.086957</td>\n",
" <td># Terminal and Deep Hum\\n\\nThe community revol...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>29</td>\n",
" <td>Paranormal Military Squad and Cosmic Dialogue</td>\n",
" <td>0.086957</td>\n",
" <td># Paranormal Military Squad and Cosmic Dialogu...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>39</td>\n",
" <td>Extraterrestrial Signal Decryption Community</td>\n",
" <td>0.086957</td>\n",
" <td># Extraterrestrial Signal Decryption Community...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>34</td>\n",
" <td>Growth Mindset and Stanford</td>\n",
" <td>0.086957</td>\n",
" <td># Growth Mindset and Stanford\\n\\nThe community...</td>\n",
" <td>7.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>20</td>\n",
" <td>Jacob Collier and Taylor Swift's Albums</td>\n",
" <td>0.086957</td>\n",
" <td># Jacob Collier and Taylor Swift's Albums\\n\\nT...</td>\n",
" <td>7.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>0</td>\n",
" <td>Omberg and Jacob Collier Collaboration</td>\n",
" <td>0.086957</td>\n",
" <td># Omberg and Jacob Collier Collaboration\\n\\nTh...</td>\n",
" <td>7.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>65</td>\n",
" <td>Galactic Orchestra and Interstellar Duet</td>\n",
" <td>0.043478</td>\n",
" <td># Galactic Orchestra and Interstellar Duet\\n\\n...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>59</td>\n",
" <td>Alien Intelligence and Interstellar Siren's Call</td>\n",
" <td>0.043478</td>\n",
" <td># Alien Intelligence and Interstellar Siren's ...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>17</th>\n",
" <td>16</td>\n",
" <td>Jimmy Fallon Project on Primetime Television</td>\n",
" <td>0.043478</td>\n",
" <td># Jimmy Fallon Project on Primetime Television...</td>\n",
" <td>6.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td>42</td>\n",
" <td>Decryption Process and Digital Soundscape</td>\n",
" <td>0.043478</td>\n",
" <td># Decryption Process and Digital Soundscape\\n\\...</td>\n",
" <td>6.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>19</th>\n",
" <td>5</td>\n",
" <td>Jacob Collier's Video Production</td>\n",
" <td>0.043478</td>\n",
" <td># Jacob Collier's Video Production\\n\\nThe comm...</td>\n",
" <td>4.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>20</th>\n",
" <td>14</td>\n",
" <td>Ben Bloomberg's Phone System and House</td>\n",
" <td>0.043478</td>\n",
" <td># Ben Bloomberg's Phone System and House\\n\\nTh...</td>\n",
" <td>3.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>21</th>\n",
" <td>9</td>\n",
" <td>Jacob Collier's Video Production</td>\n",
" <td>0.043478</td>\n",
" <td># Jacob Collier's Video Production\\n\\nThe comm...</td>\n",
" <td>3.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>22</th>\n",
" <td>17</td>\n",
" <td>Ben Bloomberg's Phone System and Parental Conc...</td>\n",
" <td>0.043478</td>\n",
" <td># Ben Bloomberg's Phone System and Parental Co...</td>\n",
" <td>3.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>23</th>\n",
" <td>58</td>\n",
" <td>Paranormal Military Squad and Alien Communicat...</td>\n",
" <td>0.956522</td>\n",
" <td># Paranormal Military Squad and Alien Communic...</td>\n",
" <td>7.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>24</th>\n",
" <td>52</td>\n",
" <td>Paranormal Military Squad at Dulce Base</td>\n",
" <td>0.695652</td>\n",
" <td># Paranormal Military Squad at Dulce Base\\n\\nT...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25</th>\n",
" <td>11</td>\n",
" <td>Jacob Collier and His Musical Collaborations</td>\n",
" <td>0.565217</td>\n",
" <td># Jacob Collier and His Musical Collaborations...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>26</th>\n",
" <td>54</td>\n",
" <td>Dr. Jordan Hayes and the Paranormal Military S...</td>\n",
" <td>0.347826</td>\n",
" <td># Dr. Jordan Hayes and the Paranormal Military...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>27</th>\n",
" <td>55</td>\n",
" <td>Operation: Dulce and Paranormal Military Squad</td>\n",
" <td>0.260870</td>\n",
" <td># Operation: Dulce and Paranormal Military Squ...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>28</th>\n",
" <td>68</td>\n",
" <td>Earth's Interstellar Communication and Galacti...</td>\n",
" <td>0.260870</td>\n",
" <td># Earth's Interstellar Communication and Galac...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>29</th>\n",
" <td>70</td>\n",
" <td>Paranormal Military Squad and Interstellar Com...</td>\n",
" <td>0.130435</td>\n",
" <td># Paranormal Military Squad and Interstellar C...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>30</th>\n",
" <td>71</td>\n",
" <td>Threshold and Humankind's Communication with E...</td>\n",
" <td>0.130435</td>\n",
" <td># Threshold and Humankind's Communication with...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>31</th>\n",
" <td>56</td>\n",
" <td>Dulce Military Base and Paranormal Operations</td>\n",
" <td>0.130435</td>\n",
" <td># Dulce Military Base and Paranormal Operation...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>32</th>\n",
" <td>69</td>\n",
" <td>Paranormal Military Squad and Interstellar Com...</td>\n",
" <td>0.130435</td>\n",
" <td># Paranormal Military Squad and Interstellar C...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>33</th>\n",
" <td>33</td>\n",
" <td>Behind the Tech and Microsoft Community</td>\n",
" <td>0.130435</td>\n",
" <td># Behind the Tech and Microsoft Community\\n\\nT...</td>\n",
" <td>7.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>34</th>\n",
" <td>49</td>\n",
" <td>Djesse Vol. 3 and Djesse Albums Series</td>\n",
" <td>0.130435</td>\n",
" <td># Djesse Vol. 3 and Djesse Albums Series\\n\\nTh...</td>\n",
" <td>7.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>35</th>\n",
" <td>66</td>\n",
" <td>Humanity and Cosmic Relationships</td>\n",
" <td>0.086957</td>\n",
" <td># Humanity and Cosmic Relationships\\n\\nThe com...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>36</th>\n",
" <td>19</td>\n",
" <td>Pandemic and Its Impact on Work and Art</td>\n",
" <td>0.086957</td>\n",
" <td># Pandemic and Its Impact on Work and Art\\n\\nT...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>37</th>\n",
" <td>67</td>\n",
" <td>Decryption and Understanding of Alien Signal</td>\n",
" <td>0.043478</td>\n",
" <td># Decryption and Understanding of Alien Signal...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>38</th>\n",
" <td>12</td>\n",
" <td>Montreux Jazz Festival and Key Performers</td>\n",
" <td>0.043478</td>\n",
" <td># Montreux Jazz Festival and Key Performers\\n\\...</td>\n",
" <td>7.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>39</th>\n",
" <td>46</td>\n",
" <td>Robot Opera and Broadway</td>\n",
" <td>0.043478</td>\n",
" <td># Robot Opera and Broadway\\n\\nThe community re...</td>\n",
" <td>7.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>40</th>\n",
" <td>21</td>\n",
" <td>Taylor Swift's Albums and Documentary</td>\n",
" <td>0.043478</td>\n",
" <td># Taylor Swift's Albums and Documentary\\n\\nThe...</td>\n",
" <td>7.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>41</th>\n",
" <td>7</td>\n",
" <td>Stage Equipment and Transportation Network</td>\n",
" <td>0.043478</td>\n",
" <td># Stage Equipment and Transportation Network\\n...</td>\n",
" <td>6.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>42</th>\n",
" <td>37</td>\n",
" <td>Jaron Lanier and His Collection of Musical Ins...</td>\n",
" <td>0.043478</td>\n",
" <td># Jaron Lanier and His Collection of Musical I...</td>\n",
" <td>4.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>43</th>\n",
" <td>45</td>\n",
" <td>Prince of Monaco and Monaco</td>\n",
" <td>0.043478</td>\n",
" <td># Prince of Monaco and Monaco\\n\\nThe community...</td>\n",
" <td>4.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>44</th>\n",
" <td>62</td>\n",
" <td>Paranormal Military Squad at Dulce Base</td>\n",
" <td>1.000000</td>\n",
" <td># Paranormal Military Squad at Dulce Base\\n\\nT...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>45</th>\n",
" <td>63</td>\n",
" <td>Paranormal Military Squad and Operation: Dulce</td>\n",
" <td>0.782609</td>\n",
" <td># Paranormal Military Squad and Operation: Dul...</td>\n",
" <td>9.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>46</th>\n",
" <td>43</td>\n",
" <td>Ben Bloomberg and the Harmoniser Project</td>\n",
" <td>0.478261</td>\n",
" <td># Ben Bloomberg and the Harmoniser Project\\n\\n...</td>\n",
" <td>7.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>47</th>\n",
" <td>47</td>\n",
" <td>Ben and Jacob's Fusion of Art and Technology</td>\n",
" <td>0.173913</td>\n",
" <td># Ben and Jacob's Fusion of Art and Technology...</td>\n",
" <td>7.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>48</th>\n",
" <td>28</td>\n",
" <td>Mission to Uncover Dulce's Mysteries</td>\n",
" <td>0.130435</td>\n",
" <td># Mission to Uncover Dulce's Mysteries\\n\\nThe ...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>49</th>\n",
" <td>18</td>\n",
" <td>Taylor Swift and Album of the Year</td>\n",
" <td>0.130435</td>\n",
" <td># Taylor Swift and Album of the Year\\n\\nThe co...</td>\n",
" <td>7.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>50</th>\n",
" <td>40</td>\n",
" <td>Conversation between Kevin Scott and Jacob Col...</td>\n",
" <td>0.086957</td>\n",
" <td># Conversation between Kevin Scott and Jacob C...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>51</th>\n",
" <td>41</td>\n",
" <td>Humanity and the Unseen Partner</td>\n",
" <td>0.043478</td>\n",
" <td># Humanity and the Unseen Partner\\n\\nThe commu...</td>\n",
" <td>8.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>52</th>\n",
" <td>15</td>\n",
" <td>Jimmy Fallon Project on Primetime TV</td>\n",
" <td>0.043478</td>\n",
" <td># Jimmy Fallon Project on Primetime TV\\n\\nThe ...</td>\n",
" <td>7.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>53</th>\n",
" <td>38</td>\n",
" <td>Kevin Scott and the Engineering Mindset</td>\n",
" <td>0.043478</td>\n",
" <td># Kevin Scott and the Engineering Mindset\\n\\nT...</td>\n",
" <td>6.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>54</th>\n",
" <td>44</td>\n",
" <td>North Hampton and Influential Musicians</td>\n",
" <td>0.043478</td>\n",
" <td># North Hampton and Influential Musicians\\n\\nT...</td>\n",
" <td>6.5</td>\n",
" </tr>\n",
" <tr>\n",
" <th>55</th>\n",
" <td>36</td>\n",
" <td>Kevin Scott's Daughter and Her Fantasy Novel</td>\n",
" <td>0.043478</td>\n",
" <td># Kevin Scott's Daughter and Her Fantasy Novel...</td>\n",
" <td>2.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" id title occurrence weight \\\n",
"0 50 Alex Mercer and the Dulce Base Team 0.956522 \n",
"1 35 Kevin Scott and Technology Development 0.608696 \n",
"2 53 Dulce Base and Paranormal Military Squad 0.565217 \n",
"3 22 Paranormal Military Squad and Technological Ex... 0.434783 \n",
"4 60 First Contact with Extraterrestrial Civilization 0.304348 \n",
"5 61 Dulce Base Operations and Distress 0.173913 \n",
"6 57 Operation: Dulce in New Mexico 0.130435 \n",
"7 48 Jacob Collier and Ben Bloomberg's First Tour 0.130435 \n",
"8 51 Cosmic Translators and Alien Script 0.086957 \n",
"9 64 Terminal and Deep Hum 0.086957 \n",
"10 29 Paranormal Military Squad and Cosmic Dialogue 0.086957 \n",
"11 39 Extraterrestrial Signal Decryption Community 0.086957 \n",
"12 34 Growth Mindset and Stanford 0.086957 \n",
"13 20 Jacob Collier and Taylor Swift's Albums 0.086957 \n",
"14 0 Omberg and Jacob Collier Collaboration 0.086957 \n",
"15 65 Galactic Orchestra and Interstellar Duet 0.043478 \n",
"16 59 Alien Intelligence and Interstellar Siren's Call 0.043478 \n",
"17 16 Jimmy Fallon Project on Primetime Television 0.043478 \n",
"18 42 Decryption Process and Digital Soundscape 0.043478 \n",
"19 5 Jacob Collier's Video Production 0.043478 \n",
"20 14 Ben Bloomberg's Phone System and House 0.043478 \n",
"21 9 Jacob Collier's Video Production 0.043478 \n",
"22 17 Ben Bloomberg's Phone System and Parental Conc... 0.043478 \n",
"23 58 Paranormal Military Squad and Alien Communicat... 0.956522 \n",
"24 52 Paranormal Military Squad at Dulce Base 0.695652 \n",
"25 11 Jacob Collier and His Musical Collaborations 0.565217 \n",
"26 54 Dr. Jordan Hayes and the Paranormal Military S... 0.347826 \n",
"27 55 Operation: Dulce and Paranormal Military Squad 0.260870 \n",
"28 68 Earth's Interstellar Communication and Galacti... 0.260870 \n",
"29 70 Paranormal Military Squad and Interstellar Com... 0.130435 \n",
"30 71 Threshold and Humankind's Communication with E... 0.130435 \n",
"31 56 Dulce Military Base and Paranormal Operations 0.130435 \n",
"32 69 Paranormal Military Squad and Interstellar Com... 0.130435 \n",
"33 33 Behind the Tech and Microsoft Community 0.130435 \n",
"34 49 Djesse Vol. 3 and Djesse Albums Series 0.130435 \n",
"35 66 Humanity and Cosmic Relationships 0.086957 \n",
"36 19 Pandemic and Its Impact on Work and Art 0.086957 \n",
"37 67 Decryption and Understanding of Alien Signal 0.043478 \n",
"38 12 Montreux Jazz Festival and Key Performers 0.043478 \n",
"39 46 Robot Opera and Broadway 0.043478 \n",
"40 21 Taylor Swift's Albums and Documentary 0.043478 \n",
"41 7 Stage Equipment and Transportation Network 0.043478 \n",
"42 37 Jaron Lanier and His Collection of Musical Ins... 0.043478 \n",
"43 45 Prince of Monaco and Monaco 0.043478 \n",
"44 62 Paranormal Military Squad at Dulce Base 1.000000 \n",
"45 63 Paranormal Military Squad and Operation: Dulce 0.782609 \n",
"46 43 Ben Bloomberg and the Harmoniser Project 0.478261 \n",
"47 47 Ben and Jacob's Fusion of Art and Technology 0.173913 \n",
"48 28 Mission to Uncover Dulce's Mysteries 0.130435 \n",
"49 18 Taylor Swift and Album of the Year 0.130435 \n",
"50 40 Conversation between Kevin Scott and Jacob Col... 0.086957 \n",
"51 41 Humanity and the Unseen Partner 0.043478 \n",
"52 15 Jimmy Fallon Project on Primetime TV 0.043478 \n",
"53 38 Kevin Scott and the Engineering Mindset 0.043478 \n",
"54 44 North Hampton and Influential Musicians 0.043478 \n",
"55 36 Kevin Scott's Daughter and Her Fantasy Novel 0.043478 \n",
"\n",
" content rank \n",
"0 # Alex Mercer and the Dulce Base Team\\n\\nThe c... 8.5 \n",
"1 # Kevin Scott and Technology Development\\n\\nTh... 7.5 \n",
"2 # Dulce Base and Paranormal Military Squad\\n\\n... 8.5 \n",
"3 # Paranormal Military Squad and Technological ... 8.5 \n",
"4 # First Contact with Extraterrestrial Civiliza... 9.5 \n",
"5 # Dulce Base Operations and Distress\\n\\nThe co... 8.5 \n",
"6 # Operation: Dulce in New Mexico\\n\\nThe commun... 7.5 \n",
"7 # Jacob Collier and Ben Bloomberg's First Tour... 7.5 \n",
"8 # Cosmic Translators and Alien Script\\n\\nThe c... 8.5 \n",
"9 # Terminal and Deep Hum\\n\\nThe community revol... 8.5 \n",
"10 # Paranormal Military Squad and Cosmic Dialogu... 8.5 \n",
"11 # Extraterrestrial Signal Decryption Community... 8.5 \n",
"12 # Growth Mindset and Stanford\\n\\nThe community... 7.5 \n",
"13 # Jacob Collier and Taylor Swift's Albums\\n\\nT... 7.5 \n",
"14 # Omberg and Jacob Collier Collaboration\\n\\nTh... 7.5 \n",
"15 # Galactic Orchestra and Interstellar Duet\\n\\n... 8.5 \n",
"16 # Alien Intelligence and Interstellar Siren's ... 8.5 \n",
"17 # Jimmy Fallon Project on Primetime Television... 6.5 \n",
"18 # Decryption Process and Digital Soundscape\\n\\... 6.5 \n",
"19 # Jacob Collier's Video Production\\n\\nThe comm... 4.0 \n",
"20 # Ben Bloomberg's Phone System and House\\n\\nTh... 3.0 \n",
"21 # Jacob Collier's Video Production\\n\\nThe comm... 3.0 \n",
"22 # Ben Bloomberg's Phone System and Parental Co... 3.0 \n",
"23 # Paranormal Military Squad and Alien Communic... 7.5 \n",
"24 # Paranormal Military Squad at Dulce Base\\n\\nT... 8.5 \n",
"25 # Jacob Collier and His Musical Collaborations... 8.5 \n",
"26 # Dr. Jordan Hayes and the Paranormal Military... 8.5 \n",
"27 # Operation: Dulce and Paranormal Military Squ... 8.5 \n",
"28 # Earth's Interstellar Communication and Galac... 8.5 \n",
"29 # Paranormal Military Squad and Interstellar C... 8.5 \n",
"30 # Threshold and Humankind's Communication with... 8.5 \n",
"31 # Dulce Military Base and Paranormal Operation... 8.5 \n",
"32 # Paranormal Military Squad and Interstellar C... 8.5 \n",
"33 # Behind the Tech and Microsoft Community\\n\\nT... 7.5 \n",
"34 # Djesse Vol. 3 and Djesse Albums Series\\n\\nTh... 7.5 \n",
"35 # Humanity and Cosmic Relationships\\n\\nThe com... 8.5 \n",
"36 # Pandemic and Its Impact on Work and Art\\n\\nT... 8.5 \n",
"37 # Decryption and Understanding of Alien Signal... 8.5 \n",
"38 # Montreux Jazz Festival and Key Performers\\n\\... 7.5 \n",
"39 # Robot Opera and Broadway\\n\\nThe community re... 7.5 \n",
"40 # Taylor Swift's Albums and Documentary\\n\\nThe... 7.0 \n",
"41 # Stage Equipment and Transportation Network\\n... 6.5 \n",
"42 # Jaron Lanier and His Collection of Musical I... 4.5 \n",
"43 # Prince of Monaco and Monaco\\n\\nThe community... 4.0 \n",
"44 # Paranormal Military Squad at Dulce Base\\n\\nT... 8.5 \n",
"45 # Paranormal Military Squad and Operation: Dul... 9.0 \n",
"46 # Ben Bloomberg and the Harmoniser Project\\n\\n... 7.5 \n",
"47 # Ben and Jacob's Fusion of Art and Technology... 7.5 \n",
"48 # Mission to Uncover Dulce's Mysteries\\n\\nThe ... 8.5 \n",
"49 # Taylor Swift and Album of the Year\\n\\nThe co... 7.5 \n",
"50 # Conversation between Kevin Scott and Jacob C... 8.5 \n",
"51 # Humanity and the Unseen Partner\\n\\nThe commu... 8.5 \n",
"52 # Jimmy Fallon Project on Primetime TV\\n\\nThe ... 7.5 \n",
"53 # Kevin Scott and the Engineering Mindset\\n\\nT... 6.5 \n",
"54 # North Hampton and Influential Musicians\\n\\nT... 6.5 \n",
"55 # Kevin Scott's Daughter and Her Fantasy Novel... 2.0 "
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# inspect the community reports (title, occurrence weight, content, rank) used to build the context for the LLM responses\n",
"result.context_data[\"reports\"]"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"LLM calls: 4. Prompt tokens: 33015. Output tokens: 655.\n"
]
}
],
"source": [
"# inspect the number of LLM calls and the prompt and output token counts\n",
"print(\n",
" f\"LLM calls: {result.llm_calls}. Prompt tokens: {result.prompt_tokens}. Output tokens: {result.output_tokens}.\"\n",
")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "graphrag-ta_-cxM1-py3.10",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}