From d8590e208f558457697cfac121a21bfd0191f373 Mon Sep 17 00:00:00 2001
From: hanhainebula <2512674094@qq.com>
Date: Wed, 30 Oct 2024 23:28:08 +0800
Subject: [PATCH] update link in README

---
 README.md                                        | 2 +-
 research/LM_Cocktail/README.md                   | 2 +-
 research/baai_general_embedding/README.md        | 2 +-
 research/old-examples/finetune/README.md         | 2 +-
 research/old-examples/unified_finetune/README.md | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index fe78204..b08427e 100644
--- a/README.md
+++ b/README.md
@@ -33,7 +33,7 @@
 
 
 
-[English](README.md) | [中文](https://github.com/hanhainebula/FlagEmbedding/blob/new-flagembedding-v1/README_zh.md)
+[English](README.md) | [中文](https://github.com/FlagOpen/FlagEmbedding/blob/master/README_zh.md)
 
 
 
diff --git a/research/LM_Cocktail/README.md b/research/LM_Cocktail/README.md
index 496a2a0..d77a20e 100644
--- a/research/LM_Cocktail/README.md
+++ b/research/LM_Cocktail/README.md
@@ -237,7 +237,7 @@ Merge 10 models fine-tuned on other tasks based on five examples for new tasks:
 - Examples Data for dataset from FLAN: [./llm_examples.json]()
 - MMLU dataset: https://huggingface.co/datasets/cais/mmlu (use the example in dev set to do in-context learning)
 
-You can use these models and our code to produce a new model and evaluate its performance using the [llm-embedder script](https://github.com/hanhainebula/FlagEmbedding/blob/new-flagembedding-v1/research/llm_embedder/docs/evaluation.md) as following:
+You can use these models and our code to produce a new model and evaluate its performance using the [llm-embedder script](https://github.com/FlagOpen/FlagEmbedding/blob/master/research/llm_embedder/docs/evaluation.md) as following:
 ```
 # for 30 tasks from FLAN
 torchrun --nproc_per_node 8 -m evaluation.eval_icl \
diff --git a/research/baai_general_embedding/README.md b/research/baai_general_embedding/README.md
index 268d862..e495506 100644
--- a/research/baai_general_embedding/README.md
+++ b/research/baai_general_embedding/README.md
@@ -17,7 +17,7 @@ Following this [example](https://github.com/FlagOpen/FlagEmbedding/tree/master/e
 
 Some suggestions:
 - Mine hard negatives following this [example](https://github.com/FlagOpen/FlagEmbedding/tree/master/examples/finetune/embedder#hard-negatives), which can improve the retrieval performance.
-- In general, larger hyper-parameter `per_device_train_batch_size` brings better performance. You can expand it by enabling `--fp16`, `--deepspeed df_config.json` (df_config.json can refer to [ds_config.json](https://github.com/hanhainebula/FlagEmbedding/blob/new-flagembedding-v1/examples/finetune/ds_stage0.json), `--gradient_checkpointing`, etc.
+- In general, larger hyper-parameter `per_device_train_batch_size` brings better performance. You can expand it by enabling `--fp16`, `--deepspeed df_config.json` (df_config.json can refer to [ds_config.json](https://github.com/FlagOpen/FlagEmbedding/blob/master/examples/finetune/ds_stage0.json), `--gradient_checkpointing`, etc.
 - If you want to maintain the performance on other tasks when fine-tuning on your data, you can use [LM-Cocktail](https://github.com/FlagOpen/FlagEmbedding/tree/master/research/LM_Cocktail) to merge the fine-tuned model and the original bge model. Besides, if you want to fine-tune on multiple tasks, you also can approximate the multi-task learning via model merging as [LM-Cocktail](https://github.com/FlagOpen/FlagEmbedding/tree/master/research/LM_Cocktail).
 - If you pre-train bge on your data, the pre-trained model cannot be directly used to calculate similarity, and it must be fine-tuned with contrastive learning before computing similarity.
 - If the accuracy of the fine-tuned model is still not high, it is recommended to use/fine-tune the cross-encoder model (bge-reranker) to re-rank top-k results. Hard negatives also are needed to fine-tune reranker.
diff --git a/research/old-examples/finetune/README.md b/research/old-examples/finetune/README.md
index 54a5a09..e08cf9d 100644
--- a/research/old-examples/finetune/README.md
+++ b/research/old-examples/finetune/README.md
@@ -142,7 +142,7 @@ Please replace the `query_instruction_for_retrieval` with your instruction if yo
 
 ### 6. Evaluate model
 
-We provide [a simple script](https://github.com/hanhainebula/FlagEmbedding/blob/new-flagembedding-v1/research/baai_general_embedding/finetune/eval_msmarco.py) to evaluate the model's performance.
+We provide [a simple script](https://github.com/FlagOpen/FlagEmbedding/blob/master/research/baai_general_embedding/finetune/eval_msmarco.py) to evaluate the model's performance.
 A brief summary of how the script works:
 
 1. Load the model on all available GPUs through [DataParallel](https://pytorch.org/docs/stable/generated/torch.nn.DataParallel.html).
diff --git a/research/old-examples/unified_finetune/README.md b/research/old-examples/unified_finetune/README.md
index 73c53bd..179671b 100644
--- a/research/old-examples/unified_finetune/README.md
+++ b/research/old-examples/unified_finetune/README.md
@@ -63,7 +63,7 @@ torchrun --nproc_per_node {number of gpus} \
 You can also refer to [this script](./unified_finetune_bge-m3_exmaple.sh) for more details. In this script, we use `deepspeed` to perform distributed training. Learn more about `deepspeed` at https://www.deepspeed.ai/getting-started/. Note that there are some important parameters to be modified in this script:
 
 - `HOST_FILE_CONTENT`: Machines and GPUs for training. If you want to use multiple machines for training, please refer to https://www.deepspeed.ai/getting-started/#resource-configuration-multi-node (note that you should configure `pdsh` and `ssh` properly).
-- `DS_CONFIG_FILE`: Path of deepspeed config file. [Here](https://github.com/hanhainebula/FlagEmbedding/blob/new-flagembedding-v1/examples/finetune/ds_stage0.json) is an example of `ds_config.json`.
+- `DS_CONFIG_FILE`: Path of deepspeed config file. [Here](https://github.com/FlagOpen/FlagEmbedding/blob/master/examples/finetune/ds_stage0.json) is an example of `ds_config.json`.
 - `DATA_PATH`: One or more paths of training data. **Each path must be a directory containing one or more jsonl files**.
 - `DEFAULT_BATCH_SIZE`: Default batch size for training. If you use efficient batching strategy, which means you have split your data to different parts by sequence length, then the batch size for each part will be decided by the `get_file_batch_size()` function in [`BGE_M3/data.py`](../../BGE_M3/data.py). Before starting training, you should set the corresponding batch size for each part in this function according to the GPU memory of your machines. `DEFAULT_BATCH_SIZE` will be used for the part whose sequence length is not in the `get_file_batch_size()` function.
 - `EPOCHS`: Number of training epochs.