Commit 27015fa

Expand training and shortgpt_prune code to support more models
1 parent 5a4b905 commit 27015fa

2 files changed: 3 additions and 4 deletions


slm/pipelines/examples/contrastive_training/README.md

Lines changed: 2 additions & 1 deletion
@@ -18,10 +18,11 @@ pip install -r slm/pipelines/examples/contrastive_training/requirements.txt
 ```
 
 
-Download the DuReader-Retrieval Chinese dataset:
+Download the DuReader-Retrieval and MMarco-Retrieval Chinese datasets:
 ```
 cd data
 wget https://paddlenlp.bj.bcebos.com/datasets/dureader_dual.train.jsonl
+python download_mmarco.py
 ```
 
 ## Training

slm/pipelines/examples/contrastive_training/data/download_mmarco.py

Lines changed: 1 addition & 3 deletions
@@ -22,9 +22,7 @@
 print(len(dataset["train"]))
 
 
-fw = open(
-    "/141nfs/lizhuoqun/PaddleNLP_1022/PaddleNLP/slm/pipelines/examples/contrastive_training/data/mmarco.jsonl", "w"
-)
+fw = open("./mmarco.jsonl", "w")
 
 i = 0
 for data in tqdm.tqdm(dataset["train"]):
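
Review note: the new path `./mmarco.jsonl` is resolved against the current working directory, so the output only lands in `data/` when the script is run after `cd data`, as the README change instructs. The following is a minimal sketch of one possible alternative, not the code in this commit: it anchors the output next to the script file and uses a context manager so the handle is closed. The helper names `write_jsonl` and `output_next_to` are hypothetical, and the record fields in the usage line are placeholders, since the diff does not show what the script writes.

```python
import json
from pathlib import Path


def output_next_to(anchor_file, name="mmarco.jsonl"):
    """Return a path for `name` in the directory that contains `anchor_file`.

    Passing __file__ from the script makes the result independent of the
    directory the script is launched from.
    """
    return Path(anchor_file).resolve().parent / name


def write_jsonl(records, out_path):
    """Write each record as one JSON object per line and return the count."""
    count = 0
    # The with-block closes the file even if serialization fails midway.
    with Path(out_path).open("w", encoding="utf-8") as fw:
        for record in records:
            # ensure_ascii=False keeps Chinese text readable in the output.
            fw.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count
```

Inside the script this might be called as `write_jsonl(rows, output_next_to(__file__))`, where `rows` stands for whatever records the loop over `dataset["train"]` produces.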
