Commit 5d685cd

Add min_p option to benchmark script; Add an example usage to the README
1 parent 2430b17 commit 5d685cd

File tree

2 files changed: +3 −0

README.md

Lines changed: 1 addition & 0 deletions

````diff
@@ -138,6 +138,7 @@ To test generation latency (e.g. batch size = 1) with different sampling strategies:
 ```
 python benchmarks/benchmark_generation_mamba_simple.py --model-name "state-spaces/mamba-2.8b" --prompt "My cat wrote all this CUDA code for a new language model and" --topp 0.9 --temperature 0.7 --repetition-penalty 1.2
 python benchmarks/benchmark_generation_mamba_simple.py --model-name "EleutherAI/pythia-2.8b" --prompt "My cat wrote all this CUDA code for a new language model and" --topp 0.9 --temperature 0.7 --repetition-penalty 1.2
+python benchmarks/benchmark_generation_mamba_simple.py --model-name "state-spaces/mamba-2.8b" --prompt "My cat wrote all this CUDA code for a new language model and" --minp 0.05 --temperature 0.7 --repetition-penalty 1.2
 ```

 To test generation throughput with random prompts (e.g. large batch size):
````

benchmarks/benchmark_generation_mamba_simple.py

Lines changed: 2 additions & 0 deletions

```diff
@@ -22,6 +22,7 @@
 parser.add_argument("--temperature", type=float, default=1.0)
 parser.add_argument("--topk", type=int, default=1)
 parser.add_argument("--topp", type=float, default=1.0)
+parser.add_argument("--minp", type=float, default=0.0)
 parser.add_argument("--repetition-penalty", type=float, default=1.0)
 parser.add_argument("--batch", type=int, default=1)
 args = parser.parse_args()
@@ -62,6 +63,7 @@
         temperature=args.temperature,
         top_k=args.topk,
         top_p=args.topp,
+        min_p=args.minp,
         repetition_penalty=args.repetition_penalty,
     )
 else:
```
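The commit only threads the `--minp` flag through to `model.generate`; the filtering itself happens inside the library's sampling code. For reference, min-p sampling is commonly described as keeping only tokens whose probability is at least `min_p` times the probability of the most likely token, so the cutoff tightens when the model is confident and relaxes when it is not. A minimal NumPy sketch of that idea (the function name `min_p_filter` is illustrative, not the repository's API):

```python
import numpy as np

def min_p_filter(logits, min_p=0.05):
    """Zero out tokens whose probability falls below min_p times the
    top token's probability, then renormalize the survivors."""
    probs = np.exp(logits - logits.max())  # numerically stable softmax
    probs /= probs.sum()
    mask = probs >= min_p * probs.max()    # threshold scales with the peak probability
    filtered = np.where(mask, probs, 0.0)
    return filtered / filtered.sum()

# With min_p=0.1, only tokens within 10x of the top token's probability survive.
print(min_p_filter(np.array([3.0, 2.0, 0.0, -3.0]), min_p=0.1))
```

Note that `min_p=0.0` (the script's default) keeps every token, which is why adding the flag is backward compatible.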

0 commit comments
