shitao 2024-01-29 22:09:27 +08:00
commit ec721e3aa0
4 changed files with 6 additions and 6 deletions


@@ -32,7 +32,7 @@ tar -xzvf activation-beacon-eval.tar.gz
 ## Long-Context Generation
 ### Language Modeling Perplexity
 ```bash
-data_root="/data"
+data_root="/data/activation-beacon"
 # NOTE: on the first run, tokenization can be very slow (often about half an hour). However, the tokenized corpus will be saved and reused. Be patient.
@@ -75,7 +75,7 @@ The results can be found at `data/results/lm/pg19.log`.
 ### LongBench
 ```bash
-data_root="/data"
+data_root="/data/activation-beacon"
 ############## Llama-2 ##############
 torchrun --nproc_per_node 8 -m main.eval_longbench --data_root $data_root --max_length 3500 --use_flash_attention_2
@@ -102,7 +102,7 @@ The results can be found at `data/results/longbench/metrics.log`.
 ## Synthetic Tasks
 ### Topic Retrieval
 ```bash
-data_root="/data"
+data_root="/data/activation-beacon"
 ############## Llama-2 ##############
 torchrun --nproc_per_node 8 -m main.eval_longeval --data_root $data_root --use_flash_attention_2


@@ -38,7 +38,7 @@ FlagEmbedding focuses on retrieval-augmented LLMs, consisting of the following projects:
 ## News
 - 1/30/2024: Release **BGE-M3**, the first embedding model which supports multiple retrieval modes, multilingual and multi-granularity retrieval. [Technical Report]() and [Code](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3).
-- 1/9/2024: Release **Activation-Beacon**, an effective, efficient, compatible, and low-cost (training) method to extend the context length of LLMs. Model and code will be open-sourced. Please stay tuned. [Technical Report](https://arxiv.org/abs/2401.03462) :fire:
+- 1/9/2024: Release [Activation-Beacon](https://github.com/FlagOpen/FlagEmbedding/tree/master/Long_LLM/activation_beacon), an effective, efficient, compatible, and low-cost (training) method to extend the context length of LLMs. [Technical Report](https://arxiv.org/abs/2401.03462) :fire:
 - 12/24/2023: Release **LLaRA**, a LLaMA-7B based dense retriever, achieving state-of-the-art performance on MS MARCO and BEIR. Model and code will be open-sourced. Please stay tuned. [Technical Report](https://arxiv.org/abs/2312.15503) :fire:
 - 11/23/2023: Release [LM-Cocktail](https://github.com/FlagOpen/FlagEmbedding/tree/master/LM_Cocktail), a method to maintain general capabilities during fine-tuning by merging multiple language models. [Technical Report](https://arxiv.org/abs/2311.13534) :fire:
 - 10/12/2023: Release [LLM-Embedder](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/llm_embedder), a unified embedding model to support diverse retrieval augmentation needs for LLMs. [Technical Report](https://arxiv.org/pdf/2310.07554.pdf)


@@ -37,7 +37,7 @@ FlagEmbedding focuses on retrieval-augmented LLMs, currently including the following projects:
 ## News
 - 1/30/2024: Release **BGE-M3**, the first text retrieval model with multi-functionality, multilinguality, and multi-granularity, efficiently supporting multilingual, long-text, and hybrid retrieval. [Technical Report]() and [Code](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/BGE_M3).
-- 1/9/2024: Release **Activation-Beacon**, an effective, efficient, compatible, and low-cost (training) method to extend the context length of large language models. Model and code will be open-sourced. Stay tuned. [Technical Report](https://arxiv.org/abs/2401.03462) :fire:
+- 1/9/2024: Release [Activation-Beacon](https://github.com/FlagOpen/FlagEmbedding/tree/master/Long_LLM/activation_beacon), an effective, efficient, compatible, and low-cost (training) method to extend the context length of large language models. [Technical Report](https://arxiv.org/abs/2401.03462) :fire:
 - 12/24/2023: Release **LLaRA**, a LLaMA-7B based dense retriever, achieving state-of-the-art results on MS MARCO and BEIR. Model and code will be open-sourced. Stay tuned. [Technical Report](https://arxiv.org/abs/2312.15503) :fire:
 - 11/23/2023: Release [LM-Cocktail](https://github.com/FlagOpen/FlagEmbedding/tree/master/LM_Cocktail), a method to preserve a model's general capabilities during fine-tuning by merging multiple language models. [Technical Report](https://arxiv.org/abs/2311.13534) :fire:
 - 10/12/2023: Release [LLM-Embedder](https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/llm_embedder), an English embedding model designed for the diverse retrieval augmentation needs of large language models. [Technical Report](https://arxiv.org/pdf/2310.07554.pdf)


@@ -5,7 +5,7 @@ with open("README.md", mode="r", encoding="utf-8") as readme_file:
 setup(
     name='FlagEmbedding',
-    version='1.1.9',
+    version='1.2.0',
     description='FlagEmbedding',
     long_description=readme,
     long_description_content_type="text/markdown",