
# Examples

- [1. Introduction](#1-introduction)
- [2. Installation](#2-installation)
- [3. Inference](#3-inference)
- [4. Finetune](#4-finetune)
- [5. Evaluation](#5-evaluation)

## 1. Introduction

In this example, we show how to run **inference** with, **fine-tune**, and **evaluate** the BAAI general embedding (BGE) models.

## 2. Installation

* **with pip**

```shell
pip install -U FlagEmbedding
```

* **from source**

```shell
git clone https://github.com/FlagOpen/FlagEmbedding.git
cd FlagEmbedding
pip install .
```

For development, install as editable:

```shell
pip install -e .
```

## 3. Inference

We provide inference code for two types of models: the **embedder** and the **reranker**. They can be loaded with `FlagAutoModel` and `FlagAutoReranker`, respectively. For more detailed usage instructions, please refer to the documentation for the [embedder](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/inference/embedder) and [reranker](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/inference/reranker).

### 1. Embedder

```python
from FlagEmbedding import FlagAutoModel

# Chinese sample sentences for the Chinese model ("样例数据" means "sample data").
sentences_1 = ["样例数据-1", "样例数据-2"]
sentences_2 = ["样例数据-3", "样例数据-4"]

# Setting use_fp16 to True speeds up computation with a slight performance degradation.
# The query instruction is kept in Chinese because the model is Chinese; it roughly means
# "Generate a representation for this sentence to retrieve related articles:".
model = FlagAutoModel.from_finetuned('BAAI/bge-large-zh-v1.5',
                                     query_instruction_for_retrieval="为这个句子生成表示以用于检索相关文章：",
                                     use_fp16=True,
                                     devices=['cuda:0'])
embeddings_1 = model.encode_corpus(sentences_1)
embeddings_2 = model.encode_corpus(sentences_2)
similarity = embeddings_1 @ embeddings_2.T
print(similarity)

# For s2p (short query to long passage) retrieval, encode the queries with encode_queries(),
# which automatically adds the instruction to each query.
# The corpus can still be encoded with encode_corpus(), since passages do not need the instruction.
queries = ['query_1', 'query_2']
passages = ["样例文档-1", "样例文档-2"]  # "样例文档" means "sample document"
q_embeddings = model.encode_queries(queries)
p_embeddings = model.encode_corpus(passages)
scores = q_embeddings @ p_embeddings.T
print(scores)
```
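The `@` products above score every query (or sentence) against every passage. The following is a minimal, dependency-free sketch of what such a score matrix contains, assuming the embeddings are L2-normalized so that the inner product equals cosine similarity. The toy vectors and the helper names `l2_normalize` and `similarity_matrix` are made up for illustration and are not part of FlagEmbedding.

```python
import math

def l2_normalize(vec):
    # Scale a vector to unit length.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def similarity_matrix(queries, passages):
    # Entry [i][j] is the cosine similarity between query i and passage j.
    q = [l2_normalize(v) for v in queries]
    p = [l2_normalize(v) for v in passages]
    return [[sum(a * b for a, b in zip(qv, pv)) for pv in p] for qv in q]

# Hypothetical 2-dimensional "embeddings", for illustration only.
toy_queries = [[1.0, 0.0], [0.0, 2.0]]
toy_passages = [[3.0, 0.0], [1.0, 1.0]]
scores = similarity_matrix(toy_queries, toy_passages)
print(scores)
```

Row `i` of the result holds the scores of query `i` against all passages, so taking the largest entries in a row gives that query's top-ranked passages.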

### 2. Reranker

```python
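from FlagEmbedding import FlagAutoReranker

# Each pair is (query, passage); "样例数据" means "sample data".
pairs = [("样例数据-1", "样例数据-3"), ("样例数据-2", "样例数据-4")]

# Setting use_fp16 to True speeds up computation with a slight performance degradation.
model = FlagAutoReranker.from_finetuned('BAAI/bge-reranker-large',
                                        use_fp16=True,
                                        devices=['cuda:0'])
# normalize=True maps the raw relevance scores into the range 0 to 1.
similarity = model.compute_score(pairs, normalize=True)
print(similarity)

# Without normalize, compute_score() returns the raw relevance scores.
pairs = [("query_1", "样例文档-1"), ("query_2", "样例文档-2")]  # "样例文档" means "sample document"
scores = model.compute_score(pairs)
print(scores)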
```
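The two `compute_score` calls above differ only in `normalize`. The following is a minimal sketch of the mapping from raw scores to the range 0 to 1, assuming a sigmoid is applied. The raw score values are invented for illustration, and `sigmoid` here is a local helper rather than a FlagEmbedding function.

```python
import math

def sigmoid(x):
    # Map any real-valued score into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical raw reranker scores (logits), for illustration only.
raw_scores = [-5.6, 0.0, 5.3]
normalized = [sigmoid(s) for s in raw_scores]
print(normalized)
```

Because the sigmoid is monotonic, normalization changes the scale of the scores but not the ranking of the pairs.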

## 4. Finetune

We support fine-tuning a variety of BGE series models, including `bge-large-en-v1.5`, `bge-m3`, `bge-en-icl`, `bge-multilingual-gemma2`, `bge-reranker-v2-m3`, `bge-reranker-v2-gemma`, and `bge-reranker-v2-minicpm-layerwise`, among others. As examples, we use the basic models `bge-large-en-v1.5` and `bge-reranker-large`. For more details, please refer to the [embedder](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune/embedder) and [reranker](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune/reranker) sections.

If you do not have the `deepspeed` and `flash-attn` packages installed, you can install them with the following commands:
```shell
pip install deepspeed
pip install flash-attn --no-build-isolation
```

### 1. Embedder

```shell
torchrun --nproc_per_node 2 \
    -m FlagEmbedding.finetune.embedder.encoder_only.base \
    --model_name_or_path BAAI/bge-large-en-v1.5 \
    --cache_dir ./cache/model \
    --train_data ./finetune/embedder/example_data/retrieval \
    --cache_path ./cache/data \
    --train_group_size 8 \
    --query_max_len 512 \
    --passage_max_len 512 \
    --pad_to_multiple_of 8 \
    --query_instruction_for_retrieval 'Represent this sentence for searching relevant passages: ' \
    --query_instruction_format '{}{}' \
    --knowledge_distillation False \
    --output_dir ./test_encoder_only_base_bge-large-en-v1.5 \
    --overwrite_output_dir \
    --learning_rate 1e-5 \
    --fp16 \
    --num_train_epochs 1 \
    --per_device_train_batch_size 2 \
    --dataloader_drop_last True \
    --warmup_ratio 0.1 \
    --gradient_checkpointing \
    --deepspeed ./finetune/ds_stage0.json \
    --logging_steps 1 \
    --save_steps 1000 \
    --negatives_cross_device \
    --temperature 0.02 \
    --sentence_pooling_method cls \
    --normalize_embeddings True \
    --kd_loss_type kl_div
```
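The `--train_data` argument points at the training examples. The sketch below writes a single record to a JSON Lines file, assuming the `query` / `pos` / `neg` layout (one query, a list of positive passages, a list of negative passages). The texts and the file name `toy_train.jsonl` are invented for illustration; please check the linked embedder documentation for the authoritative format.

```python
import json
import os
import tempfile

# One hypothetical training record: a query, its positive passages, and negatives.
record = {
    "query": "what is a dense retriever?",
    "pos": ["A dense retriever encodes queries and passages into vectors."],
    "neg": ["Bananas are rich in potassium.", "The Eiffel Tower is in Paris."],
}

# Write the record as one JSON object per line.
out_path = os.path.join(tempfile.mkdtemp(), "toy_train.jsonl")
with open(out_path, "w", encoding="utf-8") as f:
    f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Read the file back to confirm that every line parses as JSON.
with open(out_path, encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f]
print(loaded[0]["query"])
```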

### 2. Reranker

```shell
torchrun --nproc_per_node 2 \
    -m FlagEmbedding.finetune.reranker.encoder_only.base \
    --model_name_or_path BAAI/bge-reranker-large \
    --cache_dir ./cache/model \
    --train_data ./finetune/reranker/example_data/normal/examples.jsonl \
    --cache_path ./cache/data \
    --train_group_size 8 \
    --query_max_len 256 \
    --passage_max_len 256 \
    --pad_to_multiple_of 8 \
    --knowledge_distillation False \
    --output_dir ./test_encoder_only_base_bge-reranker-large \
    --overwrite_output_dir \
    --learning_rate 6e-5 \
    --fp16 \
    --num_train_epochs 1 \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 1 \
    --dataloader_drop_last True \
    --warmup_ratio 0.1 \
    --gradient_checkpointing \
    --weight_decay 0.01 \
    --deepspeed ./finetune/ds_stage0.json \
    --logging_steps 1 \
    --save_steps 1000
```
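As a rough sanity check on the settings in the commands above, the sketch below computes the number of queries and passages processed per optimizer step. It assumes that each query is grouped with `train_group_size` passages (one positive plus negatives), which is an assumption about how the data is collated; `global_batch` is a hypothetical helper, not a FlagEmbedding function.

```python
def global_batch(nproc_per_node, per_device_batch_size, grad_accum_steps=1):
    # Queries contributing to one optimizer step across all processes.
    return nproc_per_node * per_device_batch_size * grad_accum_steps

# Values taken from the commands above.
queries_per_step = global_batch(nproc_per_node=2, per_device_batch_size=2, grad_accum_steps=1)
passages_per_step = queries_per_step * 8  # --train_group_size 8
print(queries_per_step, passages_per_step)
```

With these example settings the batch is small; on larger GPUs, raising `--per_device_train_batch_size` is the most direct way to increase it.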

## 5. Evaluation

We support evaluation on [MTEB](https://github.com/embeddings-benchmark/mteb), [BEIR](https://github.com/beir-cellar/beir), [MSMARCO](https://microsoft.github.io/msmarco/), [MIRACL](https://github.com/project-miracl/miracl), [MLDR](https://huggingface.co/datasets/Shitao/MLDR), [MKQA](https://github.com/apple/ml-mkqa), [AIR-Bench](https://github.com/AIR-Bench/AIR-Bench), [BRIGHT](https://brightbenchmark.github.io/), and custom datasets. Below is an example of evaluating on the MSMARCO passage dataset. For more details, please refer to the [evaluation examples](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/evaluation).

```shell
pip install pytrec_eval
# If pytrec_eval fails to install, try the following command instead:
# pip install pytrec-eval-terrier
# Note: per its filename, this faiss-gpu wheel targets CPython 3.10 on x86_64 Linux.
pip install https://github.com/kyamagu/faiss-wheels/releases/download/v1.7.3/faiss_gpu-1.7.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
python -m FlagEmbedding.evaluation.msmarco \
    --eval_name msmarco \
    --dataset_dir ./data/msmarco \
    --dataset_names passage \
    --splits dev dl19 dl20 \
    --corpus_embd_save_dir ./data/msmarco/corpus_embd \
    --output_dir ./data/msmarco/search_results \
    --search_top_k 1000 \
    --rerank_top_k 100 \
    --cache_path ./cache/data \
    --overwrite True \
    --k_values 10 100 \
    --eval_output_method markdown \
    --eval_output_path ./data/msmarco/msmarco_eval_results.md \
    --eval_metrics ndcg_at_10 mrr_at_10 recall_at_100 \
    --embedder_name_or_path BAAI/bge-large-en-v1.5 \
    --embedder_batch_size 512 \
    --embedder_query_max_length 512 \
    --embedder_passage_max_length 512 \
    --reranker_name_or_path BAAI/bge-reranker-v2-m3 \
    --reranker_batch_size 512 \
    --reranker_query_max_length 512 \
    --reranker_max_length 1024 \
    --devices cuda:0 cuda:1 cuda:2 cuda:3 cuda:4 cuda:5 cuda:6 cuda:7 \
    --cache_dir ./cache/model
```
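The `--eval_metrics` values above include MRR@10 and Recall@100. The following is a simplified per-query sketch of these two metrics, written in plain Python to show what they measure. It is not the evaluation code itself, and the document IDs and helper names are invented for illustration.

```python
def mrr_at_k(ranked_ids, relevant_ids, k):
    # Reciprocal rank of the first relevant document within the top k, else 0.
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0

def recall_at_k(ranked_ids, relevant_ids, k):
    # Fraction of all relevant documents that appear within the top k.
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)

# Hypothetical ranking for one query and its set of relevant documents.
ranked = ["d7", "d2", "d9", "d4"]
relevant = {"d2", "d4", "d5"}
print(mrr_at_k(ranked, relevant, 10), recall_at_k(ranked, relevant, 100))
```

Reported scores are averages of these per-query values over all queries in a split.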