LightRAG

mirror of https://github.com/HKUDS/LightRAG.git synced 2025-11-21 20:44:16 +00:00

Author	SHA1	Message	Date
yangdx	cb75e6631e	Remove quantized embedding info from LLM cache - Delete quantize_embedding function - Delete dequantize_embedding function - Remove embedding fields from CacheData - Update save_to_cache to exclude embedding data - Clean up unused quantization-related code	2025-08-05 17:58:34 +08:00
yangdx	32af45ff46	refactor: improve JSON parsing reliability with json-repair library Replace regex-based JSON extraction with json-repair for better handling of malformed LLM responses. Remove deprecated JSON parsing utilities and clean up keyword_extraction parameter across LLM providers. - Remove locate_json_string_body_from_string() and convert_response_to_json() - Use json-repair.loads() in extract_keywords_only() for robust parsing - Clean up LLM interfaces and remove unused parameters - Add json-repair dependency	2025-08-01 19:36:20 +08:00
yangdx	2af8a93dc7	fix: resolve _sort_key error in Redis get_docs_paginated function	2025-07-31 02:16:56 +08:00
yangdx	d0bc5e7c4a	Extend path filter to also cover POST requests	2025-07-31 02:06:56 +08:00
yangdx	3e5efd0b27	Add /documents/paginated to filtered logging paths	2025-07-31 02:00:00 +08:00
yangdx	6014b9bf73	feat: add track_id support for document processing progress monitoring - Add get_docs_by_track_id() method to all storage backends (MongoDB, PostgreSQL, Redis, JSON) - Implement automatic track_id generation with upload_/insert_ prefixes - Add /track_status/{track_id} API endpoint for frontend progress queries - Create database indexes for efficient track_id lookups - Enable real-time document processing status tracking across all storage types	2025-07-29 22:24:21 +08:00
yangdx	9923821d75	refactor: Remove deprecated `max_token_size` from embedding configuration This parameter is no longer used. Its removal simplifies the API and clarifies that token length management is handled by upstream text chunking logic rather than the embedding wrapper.	2025-07-29 10:49:35 +08:00
yangdx	e09929b42e	Refine rerank filtering log message for clarity	2025-07-27 16:57:38 +08:00
yangdx	f4bca7bfb2	Fix linting	2025-07-27 16:50:45 +08:00
yangdx	a9565d7379	feat: Skip rerank filtering when `min_rerank_score` is 0.0	2025-07-27 16:50:12 +08:00
yangdx	ebaff228aa	feat: Add rerank score filtering with configurable threshold - Add DEFAULT_MIN_RERANK_SCORE constant (default: 0.0) - Add MIN_RERANK_SCORE environment variable support - Filter chunks with rerank scores below threshold in process_chunks_unified - Add info-level logging for filtering operations - Handle empty results gracefully after filtering - Maintain backward compatibility with non-reranked chunks	2025-07-27 16:37:44 +08:00
yangdx	a67f93acc9	Replace hardcoded max tokens with DEFAULT_MAX_TOTAL_TOKENS constant - Use constant in process_chunks_unified - Update WebUI default to match (32000)	2025-07-26 11:23:54 +08:00
yangdx	7b915b34f6	Refactor: move build_file_path function from operate.py to utils.py	2025-07-26 10:52:59 +08:00
yangdx	d78fda1d89	Optimize logger message	2025-07-24 04:31:06 +08:00
yangdx	d97913873b	Update logger message	2025-07-24 03:44:02 +08:00
yangdx	3075691f72	Refactor: move reranking utilities from operate.py to utils.py • Move apply_rerank_if_enabled to utils • Move process_chunks_unified to utils	2025-07-24 03:33:38 +08:00
yangdx	5a5d32dc32	Optimize logger message	2025-07-24 02:13:39 +08:00
yangdx	02f79508e0	Optimize context building with weighted polling and round-robin data selection	2025-07-24 01:18:21 +08:00
zrguo	1541034816	Add DEFAULT_RELATED_CHUNK_NUMBER	2025-07-15 21:35:12 +08:00
SLKun	5f330ec11a	remove <think> tag for entities and keywords extraction	2025-07-08 14:59:15 +08:00
yangdx	e56734cb8b	Refactor: Optimize document deletion performance - Adding chunks_list to doc_status - Adding llm_cache_list to text_chunks - Implemented storage types: JsonKV and Redis	2025-07-03 04:18:25 +08:00
yangdx	271722405f	feat: Flatten LLM cache structure for improved recall efficiency Refactored the LLM cache to a flat Key-Value (KV) structure, replacing the previous nested format. The old structure used the 'mode' as a key and stored specific cache content as JSON nested under it. This change significantly enhances cache recall efficiency.	2025-07-02 16:11:53 +08:00
zrguo	ead82a8dbd	update delete_by_doc_id	2025-06-09 18:52:34 +08:00
yangdx	38b862e993	Remove unused functions	2025-05-18 07:16:52 +08:00
sa9arr	36b606d0db	Fix: Correct GraphML to JSON mapping in xml_to_json function	2025-05-17 19:32:25 +05:45
yangdx	2845e268e4	Ensure priority_limit_async_func_call decorator receives a callable	2025-05-13 02:00:01 +08:00
yangdx	4d57370c94	Refactor: Move get_env_value from api.config to utils Relocates the `get_env_value` utility function from `lightrag.api.config` to `lightrag.utils` to decouple LightRAG core from API Server	2025-05-10 08:58:18 +08:00
yangdx	3eb3b170ab	Remove list_of_list_to_dict function	2025-05-07 18:01:23 +08:00
yangdx	156244e260	Refactor: Unify naive context to JSON format - Merges 'mix' mode query handling into 'hybrid' mode, simplifying query logic by removing the dedicated `mix_kg_vector_query` function - Standardizes vector search result by using JSON string format to build context - Fixes a bug in `query_with_keywords` ensuring `hl_keywords` and `ll_keywords` are correctly passed to `kg_query_with_keywords`	2025-05-07 17:42:14 +08:00
yangdx	3146309fde	Change function name from list_of_list_to_json to list_of_list_to_dict	2025-05-07 10:52:26 +08:00
yangdx	dbfcf30801	Fix linting	2025-05-06 22:03:40 +08:00
yangdx	c8ecfa2d68	feat: Centralize configuration and update defaults This commit introduces `lightrag/constants.py` to centralize default values for various configurations across the API and core components. Key changes: - Added `constants.py` to centralize default values - Improved the `get_env_value` function in `api/config.py` to correctly handle string "None" as a None value and to catch `TypeError` during value conversion. - Updated the default `SUMMARY_LANGUAGE` to "English" - Set default `WORKERS` to 2	2025-05-06 22:00:43 +08:00
yangdx	a36abce8d6	Update comments	2025-05-05 11:26:31 +08:00
yangdx	62fd4a0540	Optimize log messages	2025-04-30 13:53:03 +08:00
yangdx	81953e6d46	Enhance the robustness of concurrency control and scheduling logic	2025-04-29 13:38:11 +08:00
yangdx	1afcbcbfb5	Fix race condition for health_check and ensure_workers	2025-04-29 00:08:52 +08:00
yangdx	1fc26127d5	Fix linting	2025-04-28 23:21:34 +08:00
yangdx	0ecae90002	Enhance the function's robustness	2025-04-28 22:52:31 +08:00
yangdx	e30afe8686	fix(utils): Fix TypeError in priority_limit_async_func_call when comparing Future objects	2025-04-28 21:07:01 +08:00
yangdx	2d59ac1ecb	Remove deprecated embedding cache logic	2025-04-28 18:51:43 +08:00
yangdx	5a393e563e	Remove duplicate priority setting for merge summarization	2025-04-28 18:37:51 +08:00
yangdx	140b1b3cbb	Add priority control for limited async decorator	2025-04-28 18:12:29 +08:00
yangdx	02e9055f9d	Fix linting	2025-04-24 20:04:42 +08:00
yangdx	f6129857a1	Improve quantize and dequantize handling of embedding	2025-04-24 20:03:01 +08:00
yangdx	6977db3dd1	Remove the single quotation marks that enclose the names of the entities	2025-04-23 21:30:07 +08:00
yangdx	21c0bb7abf	Merge branch 'context_format_csv_to_json'	2025-04-22 12:25:50 +08:00
yangdx	e7063b5f1e	Remove embedding_cache_config	2025-04-22 00:28:17 +08:00
yangdx	85684164f0	Fix linting	2025-04-21 20:18:05 +08:00
yangdx	17f5439952	Remove space between Chinese chars and English symbols	2025-04-21 19:21:30 +08:00
孟超	8064a2339f	change process_combine_contexts params type to list[dict[str, str]]	2025-04-21 12:08:12 +08:00


174 Commits
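
Commits ebaff228aa and a9565d7379 above describe filtering chunks by rerank score: chunks scoring below a configurable threshold are dropped, the filter is skipped when the threshold is 0.0, and chunks that were never reranked are kept. The following is a minimal sketch of that behavior as described in the commit messages, not the code from the repository. The function name filter_by_rerank_score and the "rerank_score" dictionary key are assumptions made for illustration; only DEFAULT_MIN_RERANK_SCORE and its default of 0.0 come from the log.

```python
from typing import Any

# Name and default value are taken from the commit message for ebaff228aa.
DEFAULT_MIN_RERANK_SCORE = 0.0


def filter_by_rerank_score(
    chunks: list[dict[str, Any]],
    min_rerank_score: float = DEFAULT_MIN_RERANK_SCORE,
) -> list[dict[str, Any]]:
    """Drop chunks whose rerank score is below the threshold.

    Hypothetical helper: the "rerank_score" key is an assumed field name.
    """
    # A threshold of 0.0 (or lower) disables filtering entirely.
    if min_rerank_score <= 0.0:
        return list(chunks)

    kept = []
    for chunk in chunks:
        score = chunk.get("rerank_score")
        # Chunks without a score were not reranked; keep them for
        # backward compatibility, as the commit message describes.
        if score is None or score >= min_rerank_score:
            kept.append(chunk)
    return kept


if __name__ == "__main__":
    sample = [
        {"id": "a", "rerank_score": 0.9},
        {"id": "b", "rerank_score": 0.2},
        {"id": "c"},  # never reranked
    ]
    print([c["id"] for c in filter_by_rerank_score(sample, 0.5)])
```

The function returns a new list and can return an empty one, so a caller would need to handle the case where every scored chunk falls below the threshold.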