Skip to content

Latest commit

 

History

History
233 lines (180 loc) · 6.73 KB

File metadata and controls

233 lines (180 loc) · 6.73 KB

Quick Start Guide

🚀 Basic Usage

1. Model Inference

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load a SmallDoge model
model_name = "SmallDoge/Doge-60M-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

# Generate text
prompt = "Explain the concept of machine learning:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_length=200, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)

2. WebUI Interface

Launch the interactive web interface:

# Start WebUI (default: both backend and frontend)
small-doge-webui

# Development mode with auto-reload
small-doge-webui --dev

# Custom configuration
small-doge-webui --backend-host 127.0.0.1 --backend-port 8000 --frontend-port 7860

Access URLs:

3. Jupyter Notebook Tutorial

Follow our interactive tutorials:

📋 Available Models

Base Models (Pre-trained)

Model Parameters Speed (tokens/s on i7-11 CPU) Use Case
Doge-20M 20M 142 Ultra-fast prototyping
Doge-60M 60M 62 Balanced performance
Doge-160M 160M 28 Better reasoning
Doge-320M 320M 16 High performance

Instruction-tuned Models

Model Parameters Features
Doge-20M-Instruct 20M Chat & instruction following
Doge-60M-Instruct 60M Enhanced conversation
Doge-160M-Instruct 160M Advanced reasoning

🎓 Training Your Own Model

Quick Training Example

# One-stop dataset preparation
from transformers import AutoTokenizer
from small_doge.processor.pt_datasets_process import mix_datasets_by_radio

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("SmallDoge/Doge-tokenizer")

# Download, process, and mix datasets in one call
datasets_and_ratios = [
    {"fineweb-edu": 0.7},
    {"cosmopedia-v2": 0.2}, 
    {"python-edu": 0.05},
    {"finemath": 0.05},
]

mixed_dataset = mix_datasets_by_radio(
    datasets_and_ratios=datasets_and_ratios,
    total_sample_size=128000000,
    processing_class=tokenizer,
    max_length=2048,
    packing=True,
    seed=233,
    cache_dir="./cache",
)

# Save and start training
mixed_dataset.save_to_disk("./datasets/pt_dataset")
# Start pre-training (14 hours on RTX 4090)
ACCELERATE_LOG_LEVEL=info accelerate launch \
    --config_file recipes/accelerate_configs/single_gpu.yaml \
    ./src/small_doge/trainer/doge/pt.py \
    --config recipes/doge/Doge-20M/config_full.yaml

Training Stages

  1. Pre-training: Train from scratch or continue from checkpoint

    # Use Doge checkpoints for continued training
    # Doge-20M-checkpoint: learning_rate=8e-3
    # Doge-60M-checkpoint: learning_rate=6e-3  
    # Doge-160M-checkpoint: learning_rate=4e-3
    # Doge-320M-checkpoint: learning_rate=2e-3
  2. Instruction Fine-tuning: Create chat-capable models

    # SFT (Supervised Fine-tuning)
    accelerate launch --config_file ../accelerate_configs/single_gpu.yaml ../../src/small_doge/trainer/doge/sft.py --config Doge-20M-Instruct/sft/config_full.yaml
    
    # DPO (Direct Preference Optimization)  
    accelerate launch --config_file ../accelerate_configs/single_gpu.yaml ../../src/small_doge/trainer/doge/dpo.py --config Doge-20M-Instruct/dpo/config_full.yaml
  3. Reasoning Fine-tuning: Enhance reasoning capabilities

    # Distillation from teacher models
    accelerate launch --config_file ../accelerate_configs/single_gpu.yaml ../../src/small_doge/trainer/doge/sft.py --config Doge-20M-Reason/sft/config_full.yaml
    
    # GRPO (Group Relative Policy Optimization)
    accelerate launch --config_file ../accelerate_configs/single_gpu.yaml ../../src/small_doge/trainer/doge/grpo.py --config Doge-20M-Reason/grpo/config_full.yaml

📚 Detailed guide: Complete Training Guide

📊 Model Evaluation

Evaluate model performance using our evaluation toolkit:

# Install evaluation toolkit
pip install lighteval

# Linux evaluation
bash ./evaluation/eval_downstream_tasks.sh

# Windows evaluation
powershell ./evaluation/eval_downstream_tasks.ps1

# Custom evaluation
export MODEL="SmallDoge/Doge-60M"
export OUTPUT_DIR="./eval_results"
bash ./evaluation/eval_downstream_tasks.sh

Supported benchmarks: MMLU, ARC, PIQA, HellaSwag, Winogrande, TriviaQA, BBH, IFEval

🔍 Evaluation toolkit: Evaluation Guide

🔧 Configuration Examples

Low Resource Setup (4GB GPU)

# Use the smallest model
model_name = "SmallDoge/Doge-20M-Instruct"

# Enable gradient checkpointing
training_args = {
    "gradient_checkpointing": True,
    "dataloader_pin_memory": False,
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 8
}

High Performance Setup (24GB GPU)

# Use larger model
model_name = "SmallDoge/Doge-320M"

# Optimize for speed
training_args = {
    "per_device_train_batch_size": 8,
    "dataloader_num_workers": 4,
    "bf16": True,
    "tf32": True
}

📱 Integration Examples

OpenAI-compatible API

import openai

# Configure client for SmallDoge WebUI
client = openai.OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="not-needed"
)

# Chat completion
response = client.chat.completions.create(
    model="SmallDoge/Doge-60M-Instruct",
    messages=[{"role": "user", "content": "Hello!"}]
)

Streamlit App

import streamlit as st
from transformers import pipeline

@st.cache_resource
def load_model():
    return pipeline("text-generation", model="SmallDoge/Doge-60M-Instruct")

generator = load_model()
user_input = st.text_input("Enter your prompt:")
if user_input:
    result = generator(user_input, max_length=100)
    st.write(result[0]['generated_text'])

🎯 Next Steps