Commit 375903c

feat: add MOE configuration parameters for GLM4_MOE in test models

1 parent: ccc308a

4 files changed: 32 additions & 0 deletions

File tree

test/convergence/bf16/test_mini_models.py

Lines changed: 8 additions & 0 deletions
```diff
@@ -1148,6 +1148,14 @@
         eos_token_id=2,  # 151329, 151336, 151338
         pad_token_id=2,  # 151329
         partial_rotary_factor=0.5,
+        moe_intermediate_size=1408,
+        num_experts_per_tok=2,
+        n_shared_experts=1,
+        n_routed_experts=8,
+        routed_scaling_factor=1.0,
+        n_group=1,
+        topk_group=1,
+        first_k_dense_replace=1,
         cross_attention_layers=None,
         dropout=0,
         hidden_act="silu",
```

test/convergence/bf16/test_mini_models_with_logits.py

Lines changed: 8 additions & 0 deletions
```diff
@@ -1096,6 +1096,14 @@
         eos_token_id=2,  # 151329, 151336, 151338
         pad_token_id=2,  # 151329
         partial_rotary_factor=0.5,
+        moe_intermediate_size=1408,
+        num_experts_per_tok=2,
+        n_shared_experts=1,
+        n_routed_experts=8,
+        routed_scaling_factor=1.0,
+        n_group=1,
+        topk_group=1,
+        first_k_dense_replace=1,
         cross_attention_layers=None,
         dropout=0,
         hidden_act="silu",
```

test/convergence/fp32/test_mini_models.py

Lines changed: 8 additions & 0 deletions
```diff
@@ -1084,6 +1084,14 @@
         eos_token_id=2,  # 151329, 151336, 151338
         pad_token_id=2,  # 151329
         partial_rotary_factor=0.5,
+        moe_intermediate_size=1408,
+        num_experts_per_tok=2,
+        n_shared_experts=1,
+        n_routed_experts=8,
+        routed_scaling_factor=1.0,
+        n_group=1,
+        topk_group=1,
+        first_k_dense_replace=1,
         cross_attention_layers=None,
         dropout=0,
         hidden_act="silu",
```

test/convergence/fp32/test_mini_models_with_logits.py

Lines changed: 8 additions & 0 deletions
```diff
@@ -1113,6 +1113,14 @@
         eos_token_id=2,  # 151329, 151336, 151338
         pad_token_id=2,  # 151329
         partial_rotary_factor=0.5,
+        moe_intermediate_size=1408,
+        num_experts_per_tok=2,
+        n_shared_experts=1,
+        n_routed_experts=8,
+        routed_scaling_factor=1.0,
+        n_group=1,
+        topk_group=1,
+        first_k_dense_replace=1,
         cross_attention_layers=None,
         dropout=0,
         hidden_act="silu",
```
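The parameters added above are DeepSeek/GLM4-MoE-style routing hyper-parameters: `n_routed_experts` experts are split into `n_group` groups, the best `topk_group` groups survive, and each token is finally routed to `num_experts_per_tok` experts, with weights scaled by `routed_scaling_factor` (`first_k_dense_replace` keeps the first layers dense, and `n_shared_experts` are always active). The sketch below illustrates that grouped top-k selection in pure Python with the exact values from this test config; it is an illustration of the routing scheme, not the model's actual tensor implementation, and the affinity scores passed in are made up.

```python
# Minimal sketch of grouped top-k expert routing, using the hyper-parameter
# values added in this commit. Real models do this with batched tensors.
n_routed_experts = 8      # total routed experts
num_experts_per_tok = 2   # top-k experts selected per token
n_group = 1               # experts are partitioned into this many groups
topk_group = 1            # groups kept before the final top-k
routed_scaling_factor = 1.0


def route(scores):
    """Pick top-k experts for one token from its per-expert affinity scores."""
    assert len(scores) == n_routed_experts
    group_size = n_routed_experts // n_group
    # 1) score each group by its best expert; keep the top `topk_group` groups
    best_groups = sorted(
        range(n_group),
        key=lambda g: max(scores[g * group_size:(g + 1) * group_size]),
        reverse=True,
    )[:topk_group]
    allowed = [e for g in best_groups
               for e in range(g * group_size, (g + 1) * group_size)]
    # 2) final top-k among experts in the surviving groups
    chosen = sorted(allowed, key=lambda e: scores[e],
                    reverse=True)[:num_experts_per_tok]
    # 3) normalize the chosen weights and apply the routed scaling factor
    total = sum(scores[e] for e in chosen)
    return {e: routed_scaling_factor * scores[e] / total for e in chosen}


# Hypothetical affinity scores for one token; experts 1 and 3 score highest.
print(route([0.05, 0.30, 0.10, 0.25, 0.05, 0.10, 0.05, 0.10]))
```

With `n_group=1` and `topk_group=1`, as in the config, grouping is a no-op and this reduces to plain top-2 routing over all 8 experts.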

0 commit comments