"FlagEmbedding provides a high level class `FlagAutoModel` that unify the inference of embedding models. Besides BGE series, it also supports other popular open-source embedding models such as E5, GTE, SFR, etc. In this tutorial, we will have an idea how to use it."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"% pip install FlagEmbedding"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Usage"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, import `FlagAutoModel` from FlagEmbedding, and use the `from_finetuned()` function to initialize the model:"
" query_instruction_for_retrieval=\"Represent this sentence for searching relevant passages: \",\n",
" devices=\"cuda:0\", # if not specified, will use all available gpus or cpu when no gpu available\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then use the model exactly same to `FlagModel` (`FlagM3Model` if using BGE M3, `FlagLLMModel` if using BGE Multilingual Gemma2, `FlagICLModel` if using BGE ICL)"
"You're using a BertTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.\n"
"If you want to use your own models through `FlagAutoModel`, consider the following steps:\n",
"\n",
"1. Check the type of your embedding model and choose the appropriate model class, is it an encoder or a decoder?\n",
"2. What kind of pooling method it uses? CLS token, mean pooling, or last token?\n",
"3. Does your model needs `trust_remote_code=Ture` to ran?\n",
"4. Is there a query instruction format for retrieval?\n",
"\n",
"After these four attributes are assured, add your model name as the key and corresponding EmbedderConfig as the value to `MODEL_MAPPING`. Now have a try!"