831 Commits

Author SHA1 Message Date
cfli
4e9d0b386f update reranker finetune 2024-10-21 16:13:13 +08:00
cfli
dbbff43909 add source 2024-10-21 14:09:40 +08:00
hanhainebula
8f3b1e0c15 decrease num_epochs for embedder finetune examples 2024-10-21 13:38:02 +08:00
hanhainebula
dbbcabb888 fix a bug in AbsEmbedderSameDatasetTrainDataset
- allow empty pos_scores and neg_scores
2024-10-21 13:31:55 +08:00
hanhainebula
efa9703506 fix a bug in EncoderOnlyEmbedderTrainer 2024-10-21 13:28:34 +08:00
hanhainebula
ff611e127e fix a bug in AbsArguments.py for embedder 2024-10-21 12:59:12 +08:00
hanhainebula
918571ad6e upload example finetune scripts for embedder 2024-10-21 12:56:30 +08:00
hanhainebula
bc99e9a126 upload deepspeed configs for embedder finetune 2024-10-21 12:56:06 +08:00
hanhainebula
94822a897f upload example-data for embedder finetune 2024-10-21 12:55:31 +08:00
hanhainebula
dbb01ed6f8 upload finetune code for embedder: decoder icl 2024-10-21 01:38:54 +08:00
hanhainebula
d973adafad adapt finetune code of embedder for new para: kd
- encoder-only: base & m3
- decode-only: base
2024-10-21 01:38:18 +08:00
hanhainebula
70212678e9 update loss computation for embedder finetune
- add para: kd_loss_type & kd_loss_plus_normal_loss
- fix no_in_batch_neg option
- add two kinds of kd loss: kl_div, m3_kd_loss
2024-10-21 01:36:10 +08:00
hanhainebula
97be9b0f48 simplify example code for embedder inference 2024-10-19 22:31:30 +08:00
hanhainebula
effc2bb352 simplify example code for embedder inference 2024-10-19 22:23:53 +08:00
hanhainebula
9a8bcd7dfa update embedder inference code
- use_fp16=True, normalize_embeddings=True
- add more default para for auto_embedder
2024-10-19 22:23:28 +08:00
hanhainebula
33970b899b add expected results for m3 compute_score 2024-10-19 21:56:55 +08:00
hanhainebula
291dec2f6a fix a bug in M3Embedder 2024-10-19 21:44:27 +08:00
hanhainebula
46dc5a031f fix a bug in EncoderOnlyEmbedderM3Model 2024-10-19 21:40:10 +08:00
hanhainebula
8916939a8d fix a bug in EncoderOnlyEmbedderM3Model 2024-10-19 21:38:46 +08:00
hanhainebula
d2232b4159 fix a bug in EncoderOnlyEmbedderM3Model 2024-10-19 21:37:38 +08:00
hanhainebula
d3b4f7bb78 fix a bug in m3 embedder compute_score 2024-10-19 21:35:10 +08:00
hanhainebula
2c4c9629f3 upload m3 embedder compute_score examples 2024-10-19 21:33:16 +08:00
hanhainebula
7fc9fca908 implement compute_score func for m3 embedder 2024-10-19 21:23:22 +08:00
hanhainebula
02197abddf fix a bug for embedder inference: single input 2024-10-19 21:22:22 +08:00
hanhainebula
f4d46ff73a add expected results for embedder inference code 2024-10-19 20:48:46 +08:00
hanhainebula
ae72acf12c fix a bug in ICLLLMEmbedder inference 2024-10-19 20:28:52 +08:00
hanhainebula
fce4e46635 del debug code for FlagLLMModel inference 2024-10-19 20:28:04 +08:00
hanhainebula
0ba20823a2 fix a bug in ICLLLMEmbedder inference 2024-10-19 20:25:14 +08:00
hanhainebula
39b362112a fix bugs in embedder inference: enable use_fp16 2024-10-19 20:22:52 +08:00
hanhainebula
eb724106ad fix UserWarning: rm max_length para when padding 2024-10-19 20:11:31 +08:00
hanhainebula
99dca1d3d2 update example infer code for multilingual-gemma2 2024-10-19 20:03:05 +08:00
hanhainebula
78d1a8727c update example code for embedder 2024-10-19 18:57:56 +08:00
hanhainebula
baa06bf033 fix UserWarning: rm max_length para when encoding 2024-10-19 18:15:43 +08:00
hanhainebula
2d3b64a43c set weights_only=True when loading m3 linear 2024-10-19 18:09:33 +08:00
hanhainebula
4d2916cfe6 update m3 example code 2024-10-19 18:01:31 +08:00
hanhainebula
052954588a update multi-gpu code for decoder-only embedder 2024-10-19 18:01:08 +08:00
hanhainebula
0a18a87e0d update examples code for embedder inference 2024-10-19 13:01:48 +08:00
hanhainebula
d5cd99284b update M3Embedder compute lex scores func 2024-10-19 13:01:01 +08:00
hanhainebula
f3a0638193 update get_model func for M3Runnder 2024-10-19 13:00:08 +08:00
hanhainebula
950cf8ef0a rename BGEM3Model to BGEM3FlagModel 2024-10-19 12:38:56 +08:00
hanhainebula
d4cac4719b update BGEM3Model examples 2024-10-19 12:27:17 +08:00
hanhainebula
cfe14db868 update BGEM3Model multi-device example 2024-10-19 12:26:27 +08:00
hanhainebula
48ebe65232 update BGEM3Model example 2024-10-19 12:24:22 +08:00
hanhainebula
443765a4f1 update BGEM3Model example 2024-10-19 12:22:10 +08:00
hanhainebula
1b4bd85c26 update para for M3Embedder when compute lex scores 2024-10-19 12:19:48 +08:00
hanhainebula
48239045ee update para for M3Embedder when compute lex scores 2024-10-19 12:17:08 +08:00
hanhainebula
4708b4f7ea fix a bug in M3Embedder 2024-10-19 12:11:53 +08:00
hanhainebula
9dbaffbf03 fix a bug: EncoderOnlyEmbedderM3ModelForInference 2024-10-19 12:06:27 +08:00
hanhainebula
ea6537cca9 remove para convert_to_numpy from AbsEmbedder 2024-10-19 12:03:48 +08:00
hanhainebula
b5327dee5d fix a bug in M3Embedder 2024-10-19 11:57:56 +08:00