Description
When I try to start the deepseek_r1_distill_llama_8b_q40 model on my Raspberry Pi 4B 8GB machine, it fails with a segmentation fault as shown below. The same Pi can successfully run a smaller Llama 1B model.
sudo nice -n -20 ./dllama chat --model models/deepseek_r1_distill_llama_8b_q40/dllama_model_deepseek_r1_distill_llama_8b_q40.m --tokenizer models/deepseek_r1_distill_llama_8b_q40/dllama_tokenizer_deepseek_r1_distill_llama_8b_q40.t --buffer-float-type q80 --nthreads 3 --max-seq-len 256
[sudo] password for zhangddjs:
📄 BosId: 128000 (<|begin▁of▁sentence|>)
📄 EosId: 128001 (<|end▁of▁sentence|>)
📄 RegularVocabSize: 128000
📄 SpecialVocabSize: 256
💡 Arch: Llama
💡 HiddenAct: Silu
💡 Dim: 4096
💡 KvDim: 1024
💡 HiddenDim: 14336
💡 VocabSize: 128256
💡 nLayers: 32
💡 nHeads: 32
💡 nKvHeads: 8
💡 OrigSeqLen: 131072
💡 SeqLen: 256
💡 NormEpsilon: 0.000010
💡 RopeType: Llama3.1
💡 RopeTheta: 500000
💡 RopeScaling: f=8.0, l=1.0, h=4.0, o=8192
📀 RequiredMemory: 6285070 kB
🧠 CPU: neon fp16
💿 Loading weights...
[1] 36355 segmentation fault sudo nice -n -20 ./dllama chat --model --tokenizer --buffer-float-type q80
FAIL
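
Possibly relevant context: the loader reports RequiredMemory: 6285070 kB, which is roughly 6.0 GiB, leaving limited headroom on an 8 GB Pi once the OS and other processes are counted. The following is only a rough diagnostic sketch, assuming (not confirmed) that the crash during weight loading is memory-related; the commands are standard Linux tools, not part of dllama:

# Compare available RAM against the RequiredMemory value reported above (6285070 kB ≈ 6.0 GiB)
free -k
# The kernel usually logs user-space segfaults and any out-of-memory activity;
# the last few lines may show what happened to the dllama process
sudo dmesg | tail -n 20

If the "available" column in free is close to or below 6285070 kB, the failure may simply be the model not fitting in memory on this machine; if dmesg shows a segfault at a specific address instead, that points to a different cause.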