
Add Function Calling Fine-tuning LLMs on xLAM Dataset notebook #321


Open · wants to merge 1 commit into main

Conversation

@behroozazarkhalili commented Aug 1, 2025

Summary

This notebook demonstrates how to fine-tune language models for function calling using the xLAM dataset from Salesforce and the QLoRA technique.
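To make the data-preparation step concrete, here is a hedged sketch (not the notebook's exact code) of turning one xLAM record into chat-style messages. The field names (`query`, `tools`, `answers`, stored as JSON strings) are assumptions based on the public Salesforce/xlam-function-calling-60k dataset card.

```python
import json

def xlam_record_to_messages(record: dict) -> list[dict]:
    """Convert one xLAM record into a system/user/assistant message triple."""
    tools = json.loads(record["tools"])      # available function schemas
    answers = json.loads(record["answers"])  # expected function call(s)
    system = (
        "You are a function-calling assistant. Available tools:\n"
        + json.dumps(tools, indent=2)
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": record["query"]},
        {"role": "assistant", "content": json.dumps(answers)},
    ]
```

With the `datasets` library, such a function would typically be applied via `dataset.map(...)` before tokenization with the model's chat template.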

Key Features

  • Universal Model Support: Works with Llama, Qwen, Mistral, Gemma, Phi, and more
  • Memory Efficient: QLoRA training on consumer GPUs (16-24GB VRAM)
  • Automatic Configuration: Smart token detection and model setup
  • Production Ready: Comprehensive documentation and error handling
  • Complete Pipeline: From training to Hugging Face Hub deployment
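As a rough illustration of the QLoRA setup behind these features, a typical configuration looks like the sketch below (assumed libraries: transformers, peft, bitsandbytes; the model name and all parameter values are illustrative, not the notebook's exact choices).

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization is the "Q" in QLoRA: the frozen base weights
# are stored in 4 bits, which is what makes 16-24GB GPUs sufficient.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",  # any supported family (Llama, Qwen, ...)
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Small trainable low-rank adapters are attached to the attention projections;
# only these adapters receive gradients during fine-tuning.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```

This is a configuration fragment that requires a GPU and a model download, so it is shown for orientation only.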

Technical Details

  • Uses QLoRA (Quantized Low-Rank Adaptation) for efficient fine-tuning
  • Supports multiple model architectures with automatic pad token detection
  • Includes comprehensive testing and evaluation functions
  • Modular design with proper type hints and documentation
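The "automatic pad token detection" mentioned above can be sketched as follows. Many causal LMs ship without a pad token, so a common fallback is to reuse the EOS token; the helper name here is hypothetical.

```python
def ensure_pad_token(tokenizer) -> str:
    """Return a pad token, falling back to the EOS token if none is set."""
    if tokenizer.pad_token is None:
        # Reusing EOS as PAD is a widely used convention for causal LMs.
        tokenizer.pad_token = tokenizer.eos_token
    return tokenizer.pad_token
```

In a real transformers setup you would also keep `model.config.pad_token_id` in sync with `tokenizer.pad_token_id`.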

Contribution Guidelines Compliance

  • Notebook filename in lowercase: function_calling_fine_tuning_llms_on_xlam.ipynb
  • Author information added with GitHub profile link
  • Added to _toctree.yml in LLM Recipes section
  • Added to index.md in Latest notebooks section
  • Non-informative outputs removed from pip install cells
  • No empty code cells
  • Comprehensive documentation and markdown explanations

Test Plan

  • Notebook structure and organization verified
  • All cells contain proper documentation
  • Code quality and error handling implemented
  • Ready for community use and contribution

✅ All contribution guidelines followed according to the README

cc @merveenoyan, @stevhliu

ReviewNB bot: Check out this pull request on ReviewNB to see visual diffs & provide feedback on Jupyter Notebooks.

@stevhliu (Member) commented Aug 4, 2025

Thanks for your contribution!

My first impression is that it is very code-heavy without really any supporting text that explains what is happening and the rationale behind certain decisions. Breaking up these code blocks will make it easier for users to digest.

Also pinging @sergiopaniego, our recipe chef, for any other additional suggestions ❤️

@behroozazarkhalili (Author) replied:

> Thanks for your contribution!
>
> My first impression is that it is very code-heavy without really any supporting text that explains what is happening and the rationale behind certain decisions. Breaking up these code blocks will make it easier for users to digest.
>
> Also pinging @sergiopaniego, our recipe chef, for any other additional suggestions ❤️

Hi @stevhliu, thank you for the feedback.
Are you asking whether I should include an explanation for each step and the reasons for selecting specific parameters? I would be grateful for any further clarification.

@stevhliu (Member) commented Aug 5, 2025

Sorry I wasn't clear!

Yes, a general explanation for each step would be nice. You don't have to go too in-depth explaining why you selected specific parameters (unless it's important), but the user should be able to read a paragraph and have a good idea of what is happening at a step.

@behroozazarkhalili (Author) replied:

> Sorry I wasn't clear!
>
> Yes, a general explanation for each step would be nice. You don't have to go too in-depth explaining why you selected specific parameters (unless it's important), but the user should be able to read a paragraph and have a good idea of what is happening at a step.

No worries. I'll make the updates based on your comments and submit the pull request soon. :)

@sergiopaniego (Member) commented Aug 11, 2025 (via ReviewNB):

Let's remove the subsections here as they interfere with the correct rendering of the table of contents. Instead, we could use bold text.

Additionally, I'd remove the need to use <small>

Have we tested it on Colab? Remember all the recipes in Cookbook include an Open in Colab button, so it's important to consider that platform and add details about it if needed :)



@sergiopaniego (Member) commented Aug 11, 2025 (via ReviewNB):

Instead of adding all the imports here, we could add them when needed. This way, we can get rid of the code block. Also, it feels like it does a lot of stuff in the same block, which is difficult to understand from a reader perspective. Always consider the target audience for generating the code blocks :) (learners).



@sergiopaniego (Member) commented Aug 11, 2025 (via ReviewNB):

Why do we use <small>? I'd remove it everywhere possible, since it makes the text difficult to read.



@sergiopaniego (Member) commented Aug 11, 2025 (via ReviewNB):

These blocks feel OK for a .py script but are too much for a learning notebook. It would benefit from dividing these kinds of blocks into smaller ones, with explanations.



@sergiopaniego (Member) commented Aug 11, 2025 (via ReviewNB):

The indentation is not correct.



@sergiopaniego (Member) commented Aug 11, 2025 (via ReviewNB):

I'd remove this or simplify it; it could feel too heavy for a recipe.



@sergiopaniego (Member) left a review:

Thanks for the effort!! 😃 Following the same ideas suggested by @stevhliu and similar to #319:
Code blocks should be divided into smaller sections and explained. We don’t need an in-depth breakdown of every parameter, but rather an explanation of the problem we’re trying to solve and why each function or block of code is necessary.

A recipe should be aimed at readers who want to learn more about a specific technique or package, so the focus should be more educational rather than simply presenting a complete project with a lot of code. You can also reference other recipes to provide additional context and insights.
