cfli 2025-05-15 22:13:46 +08:00
parent 2df953b301
commit b9dfc78a9a

@ -0,0 +1,26 @@
<h1 align="center">CodeR: Towards A Generalist Code Embedding Model</h1>
<p align="center">
<a href="https://huggingface.co/datasets/nebula2025/CodeR-Pile">
<img alt="Dataset: CodeR-Pile" src="https://img.shields.io/badge/🤗 Dataset-CodeR Pile-yellow">
</a>
<a href="https://huggingface.co/nebula2025/CodeR-full">
<img alt="Model: CodeR-full" src="https://img.shields.io/badge/🤗 Model-CodeR Full-green">
</a>
<a href="https://huggingface.co/nebula2025/CodeR-synthetic">
<img alt="Model: CodeR-synthetic" src="https://img.shields.io/badge/🤗 Model-CodeR Synthetic-blue">
</a>
</p>

This repo contains the data, training, and evaluation pipelines for CodeR / [BGE-Code-v1](https://huggingface.co/BAAI/bge-code-v1).

## :bell: News:
- 🥳 5/15/2025: We have released CodeR! :fire:
## Process Data
## Synthetic Data
## Training
## Evaluation