graphrag/docs/examples_notebooks/custom_vector_store.ipynb

{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Copyright (c) 2024 Microsoft Corporation.\n",
    "# Licensed under the MIT License."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Bring-Your-Own Vector Store\n",
    "\n",
    "This notebook demonstrates how to implement a custom vector store and register for usage with GraphRAG.\n",
    "\n",
    "## Overview\n",
    "\n",
    "GraphRAG uses a plug-and-play architecture that allow for easy integration of custom vector stores (outside of what is natively supported) by following a factory design pattern. This allows you to:\n",
    "\n",
    "- **Extend functionality**: Add support for new vector database backends\n",
    "- **Customize behavior**: Implement specialized search logic or data structures\n",
    "- **Integrate existing systems**: Connect GraphRAG to your existing vector database infrastructure\n",
    "\n",
    "### What You'll Learn\n",
    "\n",
    "1. Understanding the `BaseVectorStore` interface\n",
    "2. Implementing a custom vector store class\n",
    "3. Registering your vector store with the `VectorStoreFactory`\n",
    "4. Testing and validating your implementation\n",
    "5. Configuring GraphRAG to use your custom vector store\n",
    "\n",
    "Let's get started!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 1: Import Required Dependencies\n",
    "\n",
    "First, let's import the necessary GraphRAG components and other dependencies we'll need.\n",
    "\n",
    "```bash\n",
    "pip install graphrag\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from typing import Any\n",
    "\n",
    "import numpy as np\n",
    "import yaml\n",
    "\n",
    "from graphrag.data_model.types import TextEmbedder\n",
    "\n",
    "# GraphRAG vector store components\n",
    "from graphrag.vector_stores.base import (\n",
    "    BaseVectorStore,\n",
    "    VectorStoreDocument,\n",
    "    VectorStoreSearchResult,\n",
    ")\n",
    "from graphrag.vector_stores.factory import VectorStoreFactory"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 2: Understand the BaseVectorStore Interface\n",
    "\n",
    "Before using a custom vector store, let's examine the `BaseVectorStore` interface to understand what methods need to be implemented."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Let's inspect the BaseVectorStore class to understand the required methods\n",
    "import inspect\n",
    "\n",
    "print(\"BaseVectorStore Abstract Methods:\")\n",
    "print(\"=\" * 40)\n",
    "\n",
    "abstract_methods = []\n",
    "for name, method in inspect.getmembers(BaseVectorStore, predicate=inspect.isfunction):\n",
    "    if getattr(method, \"__isabstractmethod__\", False):\n",
    "        signature = inspect.signature(method)\n",
    "        abstract_methods.append(f\"• {name}{signature}\")\n",
    "        print(f\"• {name}{signature}\")\n",
    "\n",
    "print(f\"\\nTotal abstract methods to implement: {len(abstract_methods)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 3: Implement a Custom Vector Store\n",
    "\n",
    "Now let's implement a simple in-memory vector store as an example. This vector store will:\n",
    "\n",
    "- Store documents and vectors in memory using Python data structures\n",
    "- Support all required BaseVectorStore methods\n",
    "\n",
    "**Note**: This is a simplified example for demonstration. Production vector stores would typically use optimized libraries like FAISS, more sophisticated indexing, and persistent storage."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class SimpleInMemoryVectorStore(BaseVectorStore):\n",
    "    \"\"\"A simple in-memory vector store implementation for demonstration purposes.\n",
    "\n",
    "    This vector store stores documents and their embeddings in memory and provides\n",
    "    basic similarity search functionality using cosine similarity.\n",
    "\n",
    "    WARNING: This is for demonstration only - not suitable for production use.\n",
    "    For production, consider using optimized vector databases like LanceDB,\n",
    "    Azure AI Search, or other specialized vector stores.\n",
    "    \"\"\"\n",
    "\n",
    "    # Internal storage for documents and vectors\n",
    "    documents: dict[str, VectorStoreDocument]\n",
    "    vectors: dict[str, np.ndarray]\n",
    "    connected: bool\n",
    "\n",
    "    def __init__(self, **kwargs: Any):\n",
    "        \"\"\"Initialize the in-memory vector store.\"\"\"\n",
    "        super().__init__(**kwargs)\n",
    "\n",
    "        self.documents: dict[str, VectorStoreDocument] = {}\n",
    "        self.vectors: dict[str, np.ndarray] = {}\n",
    "        self.connected = False\n",
    "\n",
    "        print(\n",
    "            f\"🚀 SimpleInMemoryVectorStore initialized for collection: {self.collection_name}\"\n",
    "        )\n",
    "\n",
    "    def connect(self, **kwargs: Any) -> None:\n",
    "        \"\"\"Connect to the vector storage (no-op for in-memory store).\"\"\"\n",
    "        self.connected = True\n",
    "        print(f\"✅ Connected to in-memory vector store: {self.collection_name}\")\n",
    "\n",
    "    def load_documents(\n",
    "        self, documents: list[VectorStoreDocument], overwrite: bool = True\n",
    "    ) -> None:\n",
    "        \"\"\"Load documents into the vector store.\"\"\"\n",
    "        if not self.connected:\n",
    "            msg = \"Vector store not connected. Call connect() first.\"\n",
    "            raise RuntimeError(msg)\n",
    "\n",
    "        if overwrite:\n",
    "            self.documents.clear()\n",
    "            self.vectors.clear()\n",
    "\n",
    "        loaded_count = 0\n",
    "        for doc in documents:\n",
    "            if doc.vector is not None:\n",
    "                doc_id = str(doc.id)\n",
    "                self.documents[doc_id] = doc\n",
    "                self.vectors[doc_id] = np.array(doc.vector, dtype=np.float32)\n",
    "                loaded_count += 1\n",
    "\n",
    "        print(f\"📚 Loaded {loaded_count} documents into vector store\")\n",
    "\n",
    "    def _cosine_similarity(self, vec1: np.ndarray, vec2: np.ndarray) -> float:\n",
    "        \"\"\"Calculate cosine similarity between two vectors.\"\"\"\n",
    "        # Normalize vectors\n",
    "        norm1 = np.linalg.norm(vec1)\n",
    "        norm2 = np.linalg.norm(vec2)\n",
    "\n",
    "        if norm1 == 0 or norm2 == 0:\n",
    "            return 0.0\n",
    "\n",
    "        return float(np.dot(vec1, vec2) / (norm1 * norm2))\n",
    "\n",
    "    def similarity_search_by_vector(\n",
    "        self, query_embedding: list[float], k: int = 10, **kwargs: Any\n",
    "    ) -> list[VectorStoreSearchResult]:\n",
    "        \"\"\"Perform similarity search using a query vector.\"\"\"\n",
    "        if not self.connected:\n",
    "            msg = \"Vector store not connected. Call connect() first.\"\n",
    "            raise RuntimeError(msg)\n",
    "\n",
    "        if not self.vectors:\n",
    "            return []\n",
    "\n",
    "        query_vec = np.array(query_embedding, dtype=np.float32)\n",
    "        similarities = []\n",
    "\n",
    "        # Calculate similarity with all stored vectors\n",
    "        for doc_id, stored_vec in self.vectors.items():\n",
    "            similarity = self._cosine_similarity(query_vec, stored_vec)\n",
    "            similarities.append((doc_id, similarity))\n",
    "\n",
    "        # Sort by similarity (descending) and take top k\n",
    "        similarities.sort(key=lambda x: x[1], reverse=True)\n",
    "        top_k = similarities[:k]\n",
    "\n",
    "        # Create search results\n",
    "        results = []\n",
    "        for doc_id, score in top_k:\n",
    "            document = self.documents[doc_id]\n",
    "            result = VectorStoreSearchResult(document=document, score=score)\n",
    "            results.append(result)\n",
    "\n",
    "        return results\n",
    "\n",
    "    def similarity_search_by_text(\n",
    "        self, text: str, text_embedder: TextEmbedder, k: int = 10, **kwargs: Any\n",
    "    ) -> list[VectorStoreSearchResult]:\n",
    "        \"\"\"Perform similarity search using text (which gets embedded first).\"\"\"\n",
    "        # Embed the text first\n",
    "        query_embedding = text_embedder(text)\n",
    "\n",
    "        # Use vector search with the embedding\n",
    "        return self.similarity_search_by_vector(query_embedding, k, **kwargs)\n",
    "\n",
    "    def filter_by_id(self, include_ids: list[str] | list[int]) -> Any:\n",
    "        \"\"\"Build a query filter to filter documents by id.\n",
    "\n",
    "        For this simple implementation, we return the list of IDs as the filter.\n",
    "        \"\"\"\n",
    "        return [str(id_) for id_ in include_ids]\n",
    "\n",
    "    def search_by_id(self, id: str) -> VectorStoreDocument:\n",
    "        \"\"\"Search for a document by id.\"\"\"\n",
    "        doc_id = str(id)\n",
    "        if doc_id not in self.documents:\n",
    "            msg = f\"Document with id '{id}' not found\"\n",
    "            raise KeyError(msg)\n",
    "\n",
    "        return self.documents[doc_id]\n",
    "\n",
    "    def get_stats(self) -> dict[str, Any]:\n",
    "        \"\"\"Get statistics about the vector store (custom method).\"\"\"\n",
    "        return {\n",
    "            \"collection_name\": self.collection_name,\n",
    "            \"document_count\": len(self.documents),\n",
    "            \"vector_count\": len(self.vectors),\n",
    "            \"connected\": self.connected,\n",
    "            \"vector_dimension\": len(next(iter(self.vectors.values())))\n",
    "            if self.vectors\n",
    "            else 0,\n",
    "        }\n",
    "\n",
    "\n",
    "print(\"✅ SimpleInMemoryVectorStore class defined!\")"
   ]
  },
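  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`similarity_search_by_vector` above scores one stored vector at a time in a Python loop. As a minimal sketch of an alternative, assuming the stored vectors are stacked into a single non-empty 2-D NumPy array, the same cosine scores and top-k selection can be computed in one pass. `top_k_cosine` and the toy data are hypothetical names used for illustration and are not part of GraphRAG.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "\n",
    "def top_k_cosine(query, matrix, k=10):\n",
    "    # Return (row_index, score) pairs for the k rows most similar to query\n",
    "    query = np.asarray(query, dtype=np.float32)\n",
    "    matrix = np.asarray(matrix, dtype=np.float32)\n",
    "\n",
    "    # Norms of every stored row and of the query\n",
    "    row_norms = np.linalg.norm(matrix, axis=1)\n",
    "    query_norm = np.linalg.norm(query)\n",
    "\n",
    "    # Zero vectors get a score of 0.0, matching _cosine_similarity above\n",
    "    denom = row_norms * query_norm\n",
    "    scores = np.zeros(len(matrix), dtype=np.float32)\n",
    "    nonzero = denom > 0\n",
    "    scores[nonzero] = (matrix[nonzero] @ query) / denom[nonzero]\n",
    "\n",
    "    # Sort by score (descending) and keep the top k\n",
    "    order = np.argsort(-scores)[:k]\n",
    "    return [(int(i), float(scores[i])) for i in order]\n",
    "\n",
    "\n",
    "# Hypothetical toy data: three 2-D vectors\n",
    "stored = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]\n",
    "results = top_k_cosine([1.0, 0.0], stored, k=2)\n",
    "print(results)\n",
    "```\n",
    "\n",
    "To use this idea in the class above, the row indices would need to be mapped back to document IDs, for example by keeping a list of IDs in the same order as the rows."
   ]
  },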
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 4: Register the Custom Vector Store\n",
    "\n",
    "Now let's register our custom vector store with the `VectorStoreFactory` so it can be used throughout GraphRAG."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Register our custom vector store with a unique identifier\n",
    "CUSTOM_VECTOR_STORE_TYPE = \"simple_memory\"\n",
    "\n",
    "# Register the vector store class\n",
    "VectorStoreFactory.register(CUSTOM_VECTOR_STORE_TYPE, SimpleInMemoryVectorStore)\n",
    "\n",
    "print(f\"✅ Registered custom vector store with type: '{CUSTOM_VECTOR_STORE_TYPE}'\")\n",
    "\n",
    "# Verify registration\n",
    "available_types = VectorStoreFactory.get_vector_store_types()\n",
    "print(f\"\\n📋 Available vector store types: {available_types}\")\n",
    "print(\n",
    "    f\"🔍 Is our custom type supported? {VectorStoreFactory.is_supported_type(CUSTOM_VECTOR_STORE_TYPE)}\"\n",
    ")"
   ]
  },
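  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The registration step above follows a registry-style factory pattern. The sketch below illustrates the general idea in plain Python, independent of GraphRAG. `ToyStoreFactory` and `EchoStore` are hypothetical names, and this is not GraphRAG's actual implementation; it only assumes that configuration keys are forwarded to the registered class as keyword arguments.\n",
    "\n",
    "```python\n",
    "from typing import Any, Callable\n",
    "\n",
    "\n",
    "class ToyStoreFactory:\n",
    "    # Hypothetical registry mapping a type name to a callable that builds a store\n",
    "    _registry: dict[str, Callable[..., Any]] = {}\n",
    "\n",
    "    @classmethod\n",
    "    def register(cls, store_type: str, creator: Callable[..., Any]) -> None:\n",
    "        cls._registry[store_type] = creator\n",
    "\n",
    "    @classmethod\n",
    "    def create(cls, store_type: str, kwargs: dict[str, Any]) -> Any:\n",
    "        if store_type not in cls._registry:\n",
    "            msg = f'Unknown store type: {store_type}'\n",
    "            raise ValueError(msg)\n",
    "        # Configuration keys are forwarded to the creator as keyword arguments\n",
    "        return cls._registry[store_type](**kwargs)\n",
    "\n",
    "\n",
    "class EchoStore:\n",
    "    # Hypothetical store that only records what it was constructed with\n",
    "    def __init__(self, collection_name: str, **kwargs: Any):\n",
    "        self.collection_name = collection_name\n",
    "        self.extra = kwargs\n",
    "\n",
    "\n",
    "ToyStoreFactory.register('echo', EchoStore)\n",
    "store = ToyStoreFactory.create(\n",
    "    'echo', {'collection_name': 'demo', 'custom_parameter': 'x'}\n",
    ")\n",
    "print(store.collection_name, store.extra)\n",
    "```\n",
    "\n",
    "Because callers only pass a type name and a configuration dictionary, new backends can be added without changing the code that requests a store."
   ]
  },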
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 5: Test the Custom Vector Store\n",
    "\n",
    "Let's create some sample data and test our custom vector store implementation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create sample documents with mock embeddings\n",
    "def create_mock_embedding(dimension: int = 384) -> list[float]:\n",
    "    \"\"\"Create a random embedding vector for testing.\"\"\"\n",
    "    return np.random.normal(0, 1, dimension).tolist()\n",
    "\n",
    "\n",
    "# Sample documents\n",
    "sample_documents = [\n",
    "    VectorStoreDocument(\n",
    "        id=\"doc_1\",\n",
    "        text=\"GraphRAG is a powerful knowledge graph extraction and reasoning framework.\",\n",
    "        vector=create_mock_embedding(),\n",
    "        attributes={\"category\": \"technology\", \"source\": \"documentation\"},\n",
    "    ),\n",
    "    VectorStoreDocument(\n",
    "        id=\"doc_2\",\n",
    "        text=\"Vector stores enable efficient similarity search over high-dimensional data.\",\n",
    "        vector=create_mock_embedding(),\n",
    "        attributes={\"category\": \"technology\", \"source\": \"research\"},\n",
    "    ),\n",
    "    VectorStoreDocument(\n",
    "        id=\"doc_3\",\n",
    "        text=\"Machine learning models can process and understand natural language text.\",\n",
    "        vector=create_mock_embedding(),\n",
    "        attributes={\"category\": \"AI\", \"source\": \"article\"},\n",
    "    ),\n",
    "    VectorStoreDocument(\n",
    "        id=\"doc_4\",\n",
    "        text=\"Custom implementations allow for specialized behavior and integration.\",\n",
    "        vector=create_mock_embedding(),\n",
    "        attributes={\"category\": \"development\", \"source\": \"tutorial\"},\n",
    "    ),\n",
    "]\n",
    "\n",
    "print(f\"📝 Created {len(sample_documents)} sample documents\")"
   ]
  },
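  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`create_mock_embedding` above draws from NumPy's global random state, so the vectors (and therefore the search results later in this notebook) change from run to run. If reproducible output is wanted, one option is a seeded generator, sketched below. `make_seeded_embedder` is a hypothetical helper and is not part of GraphRAG.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "\n",
    "def make_seeded_embedder(seed=42, dimension=384):\n",
    "    # Hypothetical helper: returns a function that produces mock embeddings\n",
    "    # from its own seeded generator, independent of NumPy's global state\n",
    "    rng = np.random.default_rng(seed)\n",
    "\n",
    "    def embed():\n",
    "        return rng.normal(0, 1, dimension).tolist()\n",
    "\n",
    "    return embed\n",
    "\n",
    "\n",
    "embed_a = make_seeded_embedder(seed=0, dimension=8)\n",
    "embed_b = make_seeded_embedder(seed=0, dimension=8)\n",
    "first_a, first_b = embed_a(), embed_b()\n",
    "print(first_a == first_b)  # same seed, same sequence\n",
    "```\n",
    "\n",
    "Each call to the returned function advances the generator, so successive embeddings differ from each other while the overall sequence stays the same across runs."
   ]
  },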
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Test creating vector store using the factory\n",
    "vector_store_config = {\"collection_name\": \"test_collection\"}\n",
    "\n",
    "# Create vector store instance using factory\n",
    "vector_store = VectorStoreFactory.create_vector_store(\n",
    "    CUSTOM_VECTOR_STORE_TYPE, vector_store_config\n",
    ")\n",
    "\n",
    "print(f\"✅ Created vector store instance: {type(vector_store).__name__}\")\n",
    "print(f\"📊 Initial stats: {vector_store.get_stats()}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Connect and load documents\n",
    "vector_store.connect()\n",
    "vector_store.load_documents(sample_documents)\n",
    "\n",
    "print(f\"📊 Updated stats: {vector_store.get_stats()}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Test similarity search\n",
    "query_vector = create_mock_embedding()  # Random query vector for testing\n",
    "\n",
    "search_results = vector_store.similarity_search_by_vector(\n",
    "    query_vector,\n",
    "    k=3,  # Get top 3 similar documents\n",
    ")\n",
    "\n",
    "print(f\"🔍 Found {len(search_results)} similar documents:\\n\")\n",
    "\n",
    "for i, result in enumerate(search_results, 1):\n",
    "    doc = result.document\n",
    "    print(f\"{i}. ID: {doc.id}\")\n",
    "    print(f\"   Text: {doc.text[:60]}...\")\n",
    "    print(f\"   Similarity Score: {result.score:.4f}\")\n",
    "    print(f\"   Category: {doc.attributes.get('category', 'N/A')}\")\n",
    "    print()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Test search by ID\n",
    "try:\n",
    "    found_doc = vector_store.search_by_id(\"doc_2\")\n",
    "    print(\"✅ Found document by ID:\")\n",
    "    print(f\"   ID: {found_doc.id}\")\n",
    "    print(f\"   Text: {found_doc.text}\")\n",
    "    print(f\"   Attributes: {found_doc.attributes}\")\n",
    "except KeyError as e:\n",
    "    print(f\"❌ Error: {e}\")\n",
    "\n",
    "# Test filter by ID\n",
    "id_filter = vector_store.filter_by_id([\"doc_1\", \"doc_3\"])\n",
    "print(f\"\\n🔧 ID filter result: {id_filter}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 6: Configuration for GraphRAG\n",
    "\n",
    "Now let's see how you would configure GraphRAG to use your custom vector store in a settings file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Example GraphRAG yaml settings\n",
    "example_settings = {\n",
    "    \"vector_store\": {\n",
    "        \"default_vector_store\": {\n",
    "            \"type\": CUSTOM_VECTOR_STORE_TYPE,  # \"simple_memory\"\n",
    "            \"collection_name\": \"graphrag_entities\",\n",
    "            # Add any custom parameters your vector store needs\n",
    "            \"custom_parameter\": \"custom_value\",\n",
    "        }\n",
    "    },\n",
    "    # Other GraphRAG configuration...\n",
    "    \"models\": {\n",
    "        \"default_embedding_model\": {\n",
    "            \"type\": \"openai_embedding\",\n",
    "            \"model\": \"text-embedding-3-small\",\n",
    "        }\n",
    "    },\n",
    "}\n",
    "\n",
    "# Convert to YAML format for settings.yml\n",
    "yaml_config = yaml.dump(example_settings, default_flow_style=False, indent=2)\n",
    "\n",
    "print(\"📄 Example settings.yml configuration:\")\n",
    "print(\"=\" * 40)\n",
    "print(yaml_config)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 7: Integration with GraphRAG Pipeline\n",
    "\n",
    "Here's how your custom vector store would be used in a typical GraphRAG pipeline."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Example of how GraphRAG would use your custom vector store\n",
    "def simulate_graphrag_pipeline():\n",
    "    \"\"\"Simulate how GraphRAG would use the custom vector store.\"\"\"\n",
    "    print(\"🚀 Simulating GraphRAG pipeline with custom vector store...\\n\")\n",
    "\n",
    "    # 1. GraphRAG creates vector store using factory\n",
    "    config = {\"collection_name\": \"graphrag_entities\", \"similarity_threshold\": 0.3}\n",
    "\n",
    "    store = VectorStoreFactory.create_vector_store(CUSTOM_VECTOR_STORE_TYPE, config)\n",
    "    store.connect()\n",
    "\n",
    "    print(\"✅ Step 1: Vector store created and connected\")\n",
    "\n",
    "    # 2. During indexing, GraphRAG loads extracted entities\n",
    "    entity_documents = [\n",
    "        VectorStoreDocument(\n",
    "            id=f\"entity_{i}\",\n",
    "            text=f\"Entity {i} description: Important concept in the knowledge graph\",\n",
    "            vector=create_mock_embedding(),\n",
    "            attributes={\"type\": \"entity\", \"importance\": i % 3 + 1},\n",
    "        )\n",
    "        for i in range(10)\n",
    "    ]\n",
    "\n",
    "    store.load_documents(entity_documents)\n",
    "    print(f\"✅ Step 2: Loaded {len(entity_documents)} entity documents\")\n",
    "\n",
    "    # 3. During query time, GraphRAG searches for relevant entities\n",
    "    query_embedding = create_mock_embedding()\n",
    "    relevant_entities = store.similarity_search_by_vector(query_embedding, k=5)\n",
    "\n",
    "    print(f\"✅ Step 3: Found {len(relevant_entities)} relevant entities for query\")\n",
    "\n",
    "    # 4. GraphRAG uses these entities for context building\n",
    "    context_entities = [result.document for result in relevant_entities]\n",
    "\n",
    "    print(\"✅ Step 4: Context built using retrieved entities\")\n",
    "    print(f\"📊 Final stats: {store.get_stats()}\")\n",
    "\n",
    "    return context_entities\n",
    "\n",
    "\n",
    "# Run the simulation\n",
    "context = simulate_graphrag_pipeline()\n",
    "print(f\"\\n🎯 Retrieved {len(context)} entities for context building\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 8: Testing and Validation\n",
    "\n",
    "Let's create a comprehensive test suite to ensure our vector store works correctly."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def test_custom_vector_store():\n",
    "    \"\"\"Comprehensive test suite for the custom vector store.\"\"\"\n",
    "    print(\"🧪 Running comprehensive vector store tests...\\n\")\n",
    "\n",
    "    # Test 1: Basic functionality\n",
    "    print(\"Test 1: Basic functionality\")\n",
    "    store = VectorStoreFactory.create_vector_store(\n",
    "        CUSTOM_VECTOR_STORE_TYPE, {\"collection_name\": \"test\"}\n",
    "    )\n",
    "    store.connect()\n",
    "\n",
    "    # Load test documents\n",
    "    test_docs = sample_documents[:2]\n",
    "    store.load_documents(test_docs)\n",
    "\n",
    "    assert len(store.documents) == 2, \"Should have 2 documents\"\n",
    "    assert len(store.vectors) == 2, \"Should have 2 vectors\"\n",
    "    print(\"✅ Basic functionality test passed\")\n",
    "\n",
    "    # Test 2: Search functionality\n",
    "    print(\"\\nTest 2: Search functionality\")\n",
    "    query_vec = create_mock_embedding()\n",
    "    results = store.similarity_search_by_vector(query_vec, k=5)\n",
    "\n",
    "    assert len(results) <= 2, \"Should not return more results than documents\"\n",
    "    assert all(isinstance(r, VectorStoreSearchResult) for r in results), (\n",
    "        \"Should return VectorStoreSearchResult objects\"\n",
    "    )\n",
    "    assert all(-1 <= r.score <= 1 for r in results), (\n",
    "        \"Similarity scores should be between -1 and 1\"\n",
    "    )\n",
    "    print(\"✅ Search functionality test passed\")\n",
    "\n",
    "    # Test 3: Search by ID\n",
    "    print(\"\\nTest 3: Search by ID\")\n",
    "    found_doc = store.search_by_id(\"doc_1\")\n",
    "    assert found_doc.id == \"doc_1\", \"Should find correct document\"\n",
    "\n",
    "    try:\n",
    "        store.search_by_id(\"nonexistent\")\n",
    "        assert False, \"Should raise KeyError for nonexistent ID\"\n",
    "    except KeyError:\n",
    "        pass  # Expected\n",
    "\n",
    "    print(\"✅ Search by ID test passed\")\n",
    "\n",
    "    # Test 4: Filter functionality\n",
    "    print(\"\\nTest 4: Filter functionality\")\n",
    "    filter_result = store.filter_by_id([\"doc_1\", \"doc_2\"])\n",
    "    assert filter_result == [\"doc_1\", \"doc_2\"], \"Should return filtered IDs\"\n",
    "    print(\"✅ Filter functionality test passed\")\n",
    "\n",
    "    # Test 5: Error handling\n",
    "    print(\"\\nTest 5: Error handling\")\n",
    "    disconnected_store = VectorStoreFactory.create_vector_store(\n",
    "        CUSTOM_VECTOR_STORE_TYPE, {\"collection_name\": \"test2\"}\n",
    "    )\n",
    "\n",
    "    try:\n",
    "        disconnected_store.load_documents(test_docs)\n",
    "        assert False, \"Should raise error when not connected\"\n",
    "    except RuntimeError:\n",
    "        pass  # Expected\n",
    "\n",
    "    try:\n",
    "        disconnected_store.similarity_search_by_vector(query_vec)\n",
    "        assert False, \"Should raise error when not connected\"\n",
    "    except RuntimeError:\n",
    "        pass  # Expected\n",
    "\n",
    "    print(\"✅ Error handling test passed\")\n",
    "\n",
    "    print(\"\\n🎉 All tests passed! Your custom vector store is working correctly.\")\n",
    "\n",
    "\n",
    "# Run the tests\n",
    "test_custom_vector_store()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Summary and Next Steps\n",
    "\n",
    "Congratulations! You've successfully learned how to implement and register a custom vector store with GraphRAG. Here's what you accomplished:\n",
    "\n",
    "### What You Built\n",
    "- ✅ **Custom Vector Store Class**: Implemented `SimpleInMemoryVectorStore` with all required methods\n",
    "- ✅ **Factory Integration**: Registered your vector store with `VectorStoreFactory`\n",
    "- ✅ **Comprehensive Testing**: Validated functionality with a full test suite\n",
    "- ✅ **Configuration Examples**: Learned how to configure GraphRAG to use your vector store\n",
    "\n",
    "### Key Takeaways\n",
    "1. **Interface Compliance**: Always implement all methods from `BaseVectorStore`\n",
    "2. **Factory Pattern**: Use `VectorStoreFactory.register()` to make your vector store available\n",
    "3. **Configuration**: Vector stores are configured in GraphRAG settings files\n",
    "4. **Testing**: Thoroughly test all functionality before deploying\n",
    "\n",
    "### Next Steps\n",
    "Check out the API Overview notebook to learn how to index and query data via the graphrag API.\n",
    "\n",
    "### Resources\n",
    "- [GraphRAG Documentation](https://microsoft.github.io/graphrag/)\n",
    "\n",
    "Happy building! 🚀"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "graphrag-venv (3.10.18)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.18"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}