
  1. LlamaIndex (llm/llama_index_impl.py):
    • Provides integration with OpenAI and other providers through LlamaIndex
    • Supports both direct API access and proxy services such as LiteLLM
    • Handles embeddings and completions through consistent interfaces
    • See the example implementations below
Using LlamaIndex

LightRAG supports LlamaIndex for embeddings and completions in two ways: direct OpenAI usage or through a LiteLLM proxy.

Setup

First, install the required dependencies. For direct OpenAI usage:

pip install llama-index-llms-openai llama-index-embeddings-openai

For the LiteLLM proxy:

pip install llama-index-llms-litellm llama-index-embeddings-litellm

Standard OpenAI Usage

from lightrag import LightRAG
from lightrag.llm.llama_index_impl import llama_index_complete_if_cache, llama_index_embed
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from lightrag.utils import EmbeddingFunc, logger

# Initialize with direct OpenAI access
async def llm_model_func(prompt, system_prompt=None, history_messages=[], **kwargs):
    try:
        # Initialize OpenAI if not in kwargs
        if 'llm_instance' not in kwargs:
            llm_instance = OpenAI(
                model="gpt-4",
                api_key="your-openai-key",
                temperature=0.7,
            )
            kwargs['llm_instance'] = llm_instance

        response = await llama_index_complete_if_cache(
            kwargs['llm_instance'],
            prompt,
            system_prompt=system_prompt,
            history_messages=history_messages,
            **kwargs,
        )
        return response
    except Exception as e:
        logger.error(f"LLM request failed: {str(e)}")
        raise

# Initialize LightRAG with OpenAI
rag = LightRAG(
    working_dir="your/path",
    llm_model_func=llm_model_func,
    embedding_func=EmbeddingFunc(
        embedding_dim=3072,  # text-embedding-3-large produces 3072-dimensional vectors
        func=lambda texts: llama_index_embed(
            texts,
            embed_model=OpenAIEmbedding(
                model="text-embedding-3-large",
                api_key="your-openai-key"
            )
        ),
    ),
)
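
Once the instance is constructed you can ingest documents and query them. The snippet below is a minimal sketch using LightRAG's insert and query helpers with QueryParam; depending on your installed LightRAG version, you may also need to await rag.initialize_storages() before the first insert.

from lightrag import QueryParam

# Ingest a document: LightRAG chunks it, extracts entities and relations,
# and embeds the chunks with the embedding_func configured above.
rag.insert("LightRAG combines a knowledge graph with vector retrieval.")

# Query the indexed data in one of the supported modes
# ("naive", "local", "global", "hybrid").
print(rag.query("How does LightRAG retrieve context?", param=QueryParam(mode="hybrid")))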

Using LiteLLM Proxy

Routing requests through a LiteLLM proxy lets you:

  1. Use any LLM provider supported by LiteLLM
  2. Leverage LlamaIndex's embedding and completion capabilities
  3. Maintain consistent configuration across services

from lightrag import LightRAG
from lightrag.llm.llama_index_impl import llama_index_complete_if_cache, llama_index_embed
from llama_index.llms.litellm import LiteLLM
from llama_index.embeddings.litellm import LiteLLMEmbedding
from lightrag.utils import EmbeddingFunc, logger

# Initialize with LiteLLM proxy
# (`settings` below refers to your application's configuration; see "Environment Variables")
async def llm_model_func(prompt, system_prompt=None, history_messages=[], **kwargs):
    try:
        # Initialize LiteLLM if not in kwargs
        if 'llm_instance' not in kwargs:
            llm_instance = LiteLLM(
                model=f"openai/{settings.LLM_MODEL}",  # Format: "provider/model_name"
                api_base=settings.LITELLM_URL,
                api_key=settings.LITELLM_KEY,
                temperature=0.7,
            )
            kwargs['llm_instance'] = llm_instance

        response = await llama_index_complete_if_cache(
            kwargs['llm_instance'],
            prompt,
            system_prompt=system_prompt,
            history_messages=history_messages,
            **kwargs,
        )
        return response
    except Exception as e:
        logger.error(f"LLM request failed: {str(e)}")
        raise

# Initialize LightRAG with LiteLLM
rag = LightRAG(
    working_dir="your/path",
    llm_model_func=llm_model_func,
    embedding_func=EmbeddingFunc(
        embedding_dim=3072,  # must match the output dimension of EMBEDDING_MODEL (3072 for text-embedding-3-large)
        func=lambda texts: llama_index_embed(
            texts,
            embed_model=LiteLLMEmbedding(
                model_name=f"openai/{settings.EMBEDDING_MODEL}",
                api_base=settings.LITELLM_URL,
                api_key=settings.LITELLM_KEY,
            )
        ),
    ),
)
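
Usage is the same as in the direct-OpenAI setup. If your application is already asynchronous, the coroutine variants can be used instead; a minimal sketch, assuming LightRAG's ainsert/aquery counterparts of insert/query:

import asyncio
from lightrag import QueryParam

async def main():
    # Async counterparts of insert()/query(); every LLM and embedding call
    # is routed through the LiteLLM proxy configured above.
    await rag.ainsert("Document ingested through the LiteLLM proxy.")
    answer = await rag.aquery("Summarize the document.", param=QueryParam(mode="hybrid"))
    print(answer)

asyncio.run(main())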

Environment Variables

For OpenAI direct usage:

OPENAI_API_KEY=your-openai-key

For the LiteLLM proxy:

# LiteLLM Configuration
LITELLM_URL=http://litellm:4000
LITELLM_KEY=your-litellm-key

# Model Configuration
LLM_MODEL=gpt-4
EMBEDDING_MODEL=text-embedding-3-large
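
The settings object referenced in the LiteLLM example is not provided by LightRAG; it stands in for whatever configuration mechanism your application uses. A minimal sketch that reads the variables above from the environment, using a plain SimpleNamespace as a hypothetical stand-in for a real settings module:

import os
from types import SimpleNamespace

# Hypothetical stand-in for an application settings module
settings = SimpleNamespace(
    LITELLM_URL=os.environ.get("LITELLM_URL", "http://litellm:4000"),
    LITELLM_KEY=os.environ.get("LITELLM_KEY", ""),
    LLM_MODEL=os.environ.get("LLM_MODEL", "gpt-4"),
    EMBEDDING_MODEL=os.environ.get("EMBEDDING_MODEL", "text-embedding-3-large"),
)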

Key Differences

  1. Direct OpenAI:

    • Simpler setup
    • Direct API access
    • Requires OpenAI API key
  2. LiteLLM Proxy:

    • Model provider agnostic
    • Centralized API key management
    • Support for multiple providers
    • Better cost control and monitoring