Description
I am experimenting with LoRA to fine-tune a model to process and analyze PDF files so that I can ask questions based on them. Essentially, I upload PDFs, the program splits them into chunks and "learns" from them, so I don't have to repeatedly upload files and it remembers the context from the files (I am building a Streamlit application), and then it generates a vector store for querying.
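For reference, `train_data` ends up as a list of dicts, each with a `"text"` field (that is what the tokenizer step in the function below expects). A minimal sketch of how the PDF chunks could be turned into that shape, using `pypdf` and a 1,000-character chunk size purely as placeholder choices, looks like this:

```python
# Illustrative only: how the PDF text might be chunked into train_data.
# pypdf and the 1,000-character chunk size are assumptions, not my exact code.
from pypdf import PdfReader

def build_train_data(pdf_paths, chunk_size=1000):
    train_data = []
    for path in pdf_paths:
        reader = PdfReader(path)
        # Concatenate the extracted text of every page
        full_text = "\n".join(page.extract_text() or "" for page in reader.pages)
        # Split into fixed-size character chunks
        for start in range(0, len(full_text), chunk_size):
            chunk = full_text[start:start + chunk_size].strip()
            if chunk:
                train_data.append({"text": chunk})
    return train_data
```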
Here is my fine-tuning function:
```python
def fine_tune_model_lora_with_suggestions(train_data):
    st.write("Starting high-performance fine-tuning with LoRA...")
    try:
        import torch
        from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments, Trainer
        from peft import LoraConfig, get_peft_model
        from datasets import Dataset

        # Define model name
        model_name = "bigscience/bloom-7b1"

        # Load tokenizer
        tokenizer = AutoTokenizer.from_pretrained(model_name)

        # Load the model in 8-bit on GPU if CUDA is available, otherwise on CPU
        if torch.cuda.is_available():
            model = AutoModelForCausalLM.from_pretrained(
                model_name,
                load_in_8bit=True,                      # Enable 8-bit quantization
                device_map="auto",                      # Automatically map layers between GPU and CPU
                llm_int8_enable_fp32_cpu_offload=True,  # Offload some layers to CPU in FP32
                torch_dtype=torch.float16,              # Use FP16 for GPU-loaded layers
            )
        else:
            model = AutoModelForCausalLM.from_pretrained(model_name)

        # Apply LoRA configuration
        lora_config = LoraConfig(
            r=16,
            lora_alpha=32,
            lora_dropout=0.05,
            bias="none",
            task_type="CAUSAL_LM",
            target_modules=["query_key_value"],  # Specify target modules for LoRA
        )
        model = get_peft_model(model, lora_config)

        # Prepare dataset
        dataset = Dataset.from_list(train_data)

        def tokenize_function(examples):
            tokens = tokenizer(
                examples["text"],
                padding="max_length",
                truncation=True,
                max_length=512,
            )
            tokens["labels"] = tokens["input_ids"].copy()
            return tokens

        tokenized_dataset = dataset.map(tokenize_function, batched=True)

        # Training arguments
        training_args = TrainingArguments(
            per_device_train_batch_size=4,
            gradient_accumulation_steps=4,
            max_steps=200,
            learning_rate=2e-4,
            fp16=torch.cuda.is_available(),  # Enable FP16 only if CUDA is available
            logging_steps=10,
            output_dir="./outputs",
            save_steps=10,
            save_total_limit=2,
            report_to="none",
        )

        # Initialize Trainer
        trainer = Trainer(
            model=model,
            args=training_args,
            train_dataset=tokenized_dataset,
        )

        # Train model
        trainer.train()

        # Save fine-tuned model
        model.save_pretrained("./fine_tuned_bloom_lora")
        tokenizer.save_pretrained("./fine_tuned_bloom_lora")

        st.write("Fine-tuning completed successfully.")
    except ImportError as e:
        st.error(f"Import Error: {e}")
    except Exception as e:
        st.error(f"Error during LoRA fine-tuning: {e}")
```
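Once fine-tuning finishes, I intend to reload the saved adapter for querying, roughly along these lines (just a sketch that assumes the same base model and the `./fine_tuned_bloom_lora` directory saved above; this part is not where the error occurs):

```python
# Sketch of reloading the fine-tuned LoRA adapter later for generation.
# Assumes the same base model and the ./fine_tuned_bloom_lora directory saved above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-7b1", torch_dtype=torch.float16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("./fine_tuned_bloom_lora")
model = PeftModel.from_pretrained(base, "./fine_tuned_bloom_lora")

# Hypothetical prompt, just to show the intended querying step
prompt = "Question about the uploaded PDFs: ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```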
Just as a side note, I am running this code in Google Colab.
When I run my code in its entirety, I get this error: `Error during LoRA fine-tuning: Got unexpected arguments: {'num_items_in_batch': 8192}`
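In case the library versions matter for this error, this is the small snippet I can run in the Colab runtime to report them (I am not assuming any particular versions here):

```python
# Print the versions of the relevant libraries in the current runtime
import torch, transformers, peft, datasets, accelerate
print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
print("peft:", peft.__version__)
print("datasets:", datasets.__version__)
print("accelerate:", accelerate.__version__)
```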
I would appreciate any help I could get on this! Thank you!