Add DeepSeek R1 Qwen3 (8B) - GRPO Model #672
Conversation
**Walkthrough**
The change updates the title in the first markdown cell of a Jupyter notebook, renaming it from "GRPO Agent Demo" to "GRPO Model," and adds a newline at the end of the file. No code, logic, or output content within the notebook has been modified.
Summary of Changes
Hello @DhivyaBharathy-web, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly expands the `examples/cookbooks` directory by introducing five new Jupyter notebooks. These notebooks serve as practical demonstrations for leveraging various large language models, including DeepSeek's Qwen3-8B (though the code uses DeepSeek-MoE-16B-Chat) and several models from the Microsoft Phi series (Phi-3.5 Mini, Phi-3 Medium, Phi-4, and Phi-4 14B). The examples cover general conversational inference and specifically highlight the application of Guided Reasoning Prompt Optimization (GRPO) for enhanced reasoning capabilities in LLM agents.
Highlights
- **New DeepSeek GRPO Cookbook:** Introduced a new Jupyter notebook (`DeepSeek_Qwen3_GRPO.ipynb`) demonstrating the usage of DeepSeek's Qwen3-8B model with Guided Reasoning Prompt Optimization (GRPO) for interactive conversational reasoning tasks. Note: the code in the notebook currently loads `deepseek-ai/deepseek-moe-16b-chat`.
- **New Phi-3.5 Mini Conversational Cookbook:** Added a new cookbook (`Phi_3_5_Mini_Conversational.ipynb`) showcasing lightweight conversational inference using the Phi-3.5 Mini model, suitable for smaller hardware and educational use cases.
- **New Phi-3 Medium Conversational Cookbook:** Included a new Jupyter notebook (`Phi_3_Medium_Conversational.ipynb`) demonstrating conversational inference with the Phi-3 Medium model, illustrating basic loading, prompting, and response generation.
- **New Phi-4 (14B) GRPO Cookbook:** Integrated a new cookbook (`Phi_4_14B_GRPO.ipynb`) that demonstrates inference using the Phi-4 14B parameter model with the GRPO optimization strategy for conversational tasks.
- **New Phi-4 Conversational Cookbook:** Added a new Jupyter notebook (`Phi_4_Conversational.ipynb`) for basic conversational inference with the Phi-4 model, demonstrating chat-style turn-based communication.
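The cookbooks summarised above all follow the same basic Hugging Face `transformers` flow. A minimal sketch of that pattern is shown below; the model id and prompt are illustrative placeholders rather than lines taken from any of the notebooks:

```python
# Minimal sketch of the shared inference pattern (placeholders, not notebook code).
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "microsoft/Phi-3.5-mini-instruct"  # assumption: substitute the notebook's model id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # place weights on the available GPU/CPU automatically
    torch_dtype="auto",  # use fp16/bf16 when the hardware supports it
)

prompt = "Explain what a language model is in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():  # inference only, no gradients needed
    outputs = model.generate(**inputs, max_new_tokens=60)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Each notebook varies the model id, prompt, and generation settings around this skeleton.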
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either `/gemini <command>` or `@gemini-code-assist <command>`. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a `.gemini/` folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
**PR Reviewer Guide 🔍**
Here are some key observations to aid the review process:

**PR Code Suggestions ✨**
Explore these optional code suggestions:
**Codecov Report**
All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@           Coverage Diff           @@
## main #672 +/- ##
=======================================
Coverage 14.50% 14.50%
=======================================
Files 25 25
Lines 2517 2517
Branches 357 357
=======================================
Hits 365 365
Misses 2136 2136
Partials 16 16
Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
Code Review
This pull request adds several Jupyter notebooks demonstrating various AI models. The notebooks are generally well-structured and provide useful examples. Key areas for improvement include correcting a critical model mismatch in the `DeepSeek_Qwen3_GRPO.ipynb` notebook, clarifying the use of GRPO in the `Phi_4_14B_GRPO.ipynb` notebook, updating Colab badge URLs, and adding resource consideration notes for larger models.
"prompt:\n", | ||
" task: \"Reasoning over multi-step instructions\"\n", | ||
" context: \"User provides a math problem or logical question.\"\n", | ||
" model: \"deepseek-ai/deepseek-moe-16b-chat\"\n" |
The model ID `deepseek-ai/deepseek-moe-16b-chat` in this YAML example is inconsistent with the notebook's title and description, as well as the PR title, which all refer to an 8B Qwen3 model. Update this to reflect the intended 8B Qwen3 model ID.
model: "your_deepseek_qwen3_8b_grpo_model_id" # Please replace with the correct 8B model ID
"\n", | ||
"from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline\n", | ||
"\n", | ||
"model_id = \"deepseek-ai/deepseek-moe-16b-chat\"\n", |
The `model_id` used here is `"deepseek-ai/deepseek-moe-16b-chat"`, which corresponds to a 16B parameter model. This conflicts with the PR title, notebook title, and notebook description, all of which refer to an 8B model. Update the `model_id` to the correct Hugging Face ID for the intended "DeepSeek R1 Qwen3 (8B) - GRPO" model.
model_id = "your_deepseek_qwen3_8b_grpo_model_id" # Please replace with the correct 8B model ID for DeepSeek Qwen3 GRPO
{
  "cell_type": "markdown",
  "source": [
    "[](https://colab.research.google.com/github/DhivyaBharathy-web/PraisonAI/blob/main/examples/cookbooks/DeepSeek_Qwen3_GRPO.ipynb)\n"
The URL for the 'Open in Colab' badge points to https://colab.research.google.com/github/DhivyaBharathy-web/PraisonAI/... . Update the GitHub username in the URL to MervinPraison to ensure users are directed to the correct repository.
[](https://colab.research.google.com/github/MervinPraison/PraisonAI/blob/main/examples/cookbooks/DeepSeek_Qwen3_GRPO.ipynb)
"source": [ | ||
"**Description:**\n", | ||
"\n", | ||
"This notebook demonstrates inference using the Phi-4 14B parameter model with GRPO optimization strategy." |
"!pip install transformers accelerate\n", | ||
"!pip install torch\n", | ||
"```" | ||
] |
Actionable comments posted: 4
♻️ Duplicate comments (2)

examples/cookbooks/Phi_3_Medium_Conversational.ipynb (1)

- 45-48: Same dependency pinning comment applies. Consider locking `transformers`, `accelerate`, `torch`, `torchvision` versions.

examples/cookbooks/Phi_4_Conversational.ipynb (1)

- 87-95: 4B model GPU safety & no-grad context. Same remarks on `device_map` + `torch.no_grad()` as above.
🧹 Nitpick comments (5)

examples/cookbooks/DeepSeek_Qwen3_GRPO.ipynb (1)

- 51-55: Un-pinned dependency versions jeopardise reproducibility. `!pip install -q transformers accelerate` will always pull the latest versions, which may introduce breaking API changes. Pin to a known-good minor version (or at minimum a `~=` spec).

examples/cookbooks/Phi_3_5_Mini_Conversational.ipynb (1)

- 44-47: Pin package versions for deterministic installs. Repeatable notebooks are easier to debug. Suggest pinning `transformers`, `accelerate`, and `torch` to tested versions (e.g., `transformers==4.41.1`, `torch==2.3.*`).

examples/cookbooks/Phi_4_Conversational.ipynb (1)

- 44-47: Pin versions to avoid future incompatibilities.

examples/cookbooks/Phi_4_14B_GRPO.ipynb (2)

- 42-47: Pinned versions critical for the 14B model + GRPO. Large models depend heavily on matching `transformers`/`accelerate` versions; please lock them.

- 88-96: 14B checkpoint unlikely to fit on a free Colab GPU. Load with 8-bit/4-bit quantisation (`BitsAndBytesConfig`), or instruct users to switch to a high-RAM runtime. Failing that, the cell will OOM.

-from transformers import AutoTokenizer, AutoModelForCausalLM
+from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
@@
-model = AutoModelForCausalLM.from_pretrained("microsoft/phi-4-14b")
+bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4")
+model = AutoModelForCausalLM.from_pretrained(
+    "microsoft/phi-4-14b",
+    quantization_config=bnb_config,
+    device_map="auto",
+    torch_dtype="auto",
+)
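Putting that suggestion together with the pinning comments above, a self-contained version of the quantised load might look like the sketch below. The model id is the one shown in the notebook's diff (it may need adjusting to a real Hugging Face repo), and the version pins are examples only, not tested against these notebooks:

```python
# Sketch: 4-bit loading for a large checkpoint, following the suggestion above.
# Example pins only (untested here):
#   !pip install "transformers==4.41.*" "accelerate==0.30.*" bitsandbytes
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_id = "microsoft/phi-4-14b"  # id as it appears in the notebook diff; adjust if needed

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # shard across GPU/CPU as capacity allows
)

inputs = tokenizer("Summarise GRPO in one sentence.", return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

4-bit NF4 roughly quarters the weight memory, which is usually the difference between fitting and OOM-ing on a 16 GB Colab GPU.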
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- examples/cookbooks/DeepSeek_Qwen3_GRPO.ipynb (1 hunks)
- examples/cookbooks/Phi_3_5_Mini_Conversational.ipynb (1 hunks)
- examples/cookbooks/Phi_3_Medium_Conversational.ipynb (1 hunks)
- examples/cookbooks/Phi_4_14B_GRPO.ipynb (1 hunks)
- examples/cookbooks/Phi_4_Conversational.ipynb (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (5)
- GitHub Check: quick-test
- GitHub Check: test-examples
- GitHub Check: test-core (3.11)
- GitHub Check: performance-test
- GitHub Check: Run tests and collect coverage
"This notebook demonstrates the usage of DeepSeek's Qwen3-8B model with GRPO (Guided Reasoning Prompt Optimization) for interactive conversational reasoning tasks.\n", | ||
"It is designed to simulate a lightweight agent-style reasoning capability in an accessible and interpretable way." | ||
] |
Model mismatch – description says “Qwen3-8B” but code loads “deepseek-moe-16b-chat”.
The notebook title/markdown explicitly introduces a fine-tuned Qwen3-8B GRPO agent, yet the `model_id` points at `deepseek-ai/deepseek-moe-16b-chat`. This 16B MoE chat checkpoint is materially different (size, architecture, licensing) and will not reproduce results claimed for Qwen3-8B GRPO.
Either fix the markdown/filename or switch the `model_id` (and Colab link) to the intended Qwen3-8B GRPO artifact.
- model_id = "deepseek-ai/deepseek-moe-16b-chat"
+ model_id = "deepseek-ai/deepseek-qwen3-8b-grpo" # <- example id, adjust to the actual HF repo
Also applies to: 125-131
🤖 Prompt for AI Agents
In examples/cookbooks/DeepSeek_Qwen3_GRPO.ipynb around lines 30 to 32 and 125 to
131, the notebook description and title mention the Qwen3-8B model, but the code
loads the deepseek-ai/deepseek-moe-16b-chat model instead. To fix this, update
the model_id in the code and any related Colab links to point to the correct
Qwen3-8B GRPO model artifact that matches the description, or alternatively,
adjust the markdown and filename to accurately reflect the deepseek-moe-16b-chat
model being used.
"from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline\n", | ||
"\n", | ||
"model_id = \"deepseek-ai/deepseek-moe-16b-chat\"\n", | ||
"tokenizer = AutoTokenizer.from_pretrained(model_id)\n", | ||
"model = AutoModelForCausalLM.from_pretrained(model_id, device_map=\"auto\")\n", | ||
"\n", | ||
"pipe = pipeline(\"text-generation\", model=model, tokenizer=tokenizer)\n", | ||
"\n", | ||
"prompt = \"If a train travels 60 miles in 1.5 hours, what is its average speed?\"\n", | ||
"output = pipe(prompt, max_new_tokens=60)[0]['generated_text']\n", | ||
"print(\"🧠 Reasoned Output:\", output)\n" | ||
] |
🛠️ Refactor suggestion

Large-model loading lacks memory-safe settings.
16B (or even 8B) checkpoints frequently exceed Colab’s 16 GB GPU. Add `torch_dtype=torch.float16`, `device_map="auto"`, and consider `trust_remote_code=True` if the model repo uses custom layers. Wrap generation in `torch.no_grad()` to avoid unnecessary gradients.
-from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
+from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline, BitsAndBytesConfig
@@
-tokenizer = AutoTokenizer.from_pretrained(model_id)
-model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
+tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
+model = AutoModelForCausalLM.from_pretrained(
+ model_id,
+ device_map="auto",
+ torch_dtype="auto", # fp16/bf16 if available
+)
@@
-output = pipe(prompt, max_new_tokens=60)[0]['generated_text']
+with torch.no_grad():
+ output = pipe(prompt, max_new_tokens=60, return_full_text=False)[0]["generated_text"]
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In examples/cookbooks/DeepSeek_Qwen3_GRPO.ipynb around lines 123 to 134, the
model loading and generation code lacks memory-efficient settings for large
models. Fix this by adding torch_dtype=torch.float16 and device_map="auto"
parameters when loading the model, include trust_remote_code=True if the model
uses custom layers, and wrap the text generation call inside a torch.no_grad()
context to prevent gradient computation and reduce memory usage.
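Folding the diff above into the notebook cell, the memory-safe variant might read roughly as follows. This is a sketch only: the model id is the one currently in the notebook (see the mismatch comment above), and `trust_remote_code=True` is only needed if the repository ships custom modeling code:

```python
# Sketch of the memory-safe DeepSeek cell described in the comment above.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_id = "deepseek-ai/deepseek-moe-16b-chat"  # current notebook id; subject to the mismatch fix

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",       # spread weights across available devices
    torch_dtype="auto",      # fp16/bf16 where supported
    trust_remote_code=True,  # only if the repo defines custom layers
)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

prompt = "If a train travels 60 miles in 1.5 hours, what is its average speed?"
with torch.no_grad():
    output = pipe(prompt, max_new_tokens=60, return_full_text=False)[0]["generated_text"]

print("🧠 Reasoned Output:", output)
```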
"from transformers import AutoTokenizer, AutoModelForCausalLM\n", | ||
"\n", | ||
"tokenizer = AutoTokenizer.from_pretrained(\"microsoft/phi-3.5-mini\")\n", | ||
"model = AutoModelForCausalLM.from_pretrained(\"microsoft/phi-3.5-mini\")\n", | ||
"\n", | ||
"inputs = tokenizer(\"What is an AI model?\", return_tensors=\"pt\")\n", | ||
"outputs = model.generate(**inputs, max_new_tokens=40)\n", | ||
"print(tokenizer.decode(outputs[0], skip_special_tokens=True))" | ||
] |
🛠️ Refactor suggestion

Add device placement & inference-mode guards.
`AutoModelForCausalLM.from_pretrained` defaults to CPU; on GPU Colab you’ll silently OOM when the model is moved during `generate`. Use `device_map="auto"` and wrap the generation under `torch.no_grad()`.
-model = AutoModelForCausalLM.from_pretrained("microsoft/phi-3.5-mini")
+model = AutoModelForCausalLM.from_pretrained(
+ "microsoft/phi-3.5-mini",
+ device_map="auto",
+ torch_dtype="auto",
+)
@@
-outputs = model.generate(**inputs, max_new_tokens=40)
+with torch.no_grad():
+ outputs = model.generate(**inputs, max_new_tokens=40)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
"from transformers import AutoTokenizer, AutoModelForCausalLM\n", | |
"\n", | |
"tokenizer = AutoTokenizer.from_pretrained(\"microsoft/phi-3.5-mini\")\n", | |
"model = AutoModelForCausalLM.from_pretrained(\"microsoft/phi-3.5-mini\")\n", | |
"\n", | |
"inputs = tokenizer(\"What is an AI model?\", return_tensors=\"pt\")\n", | |
"outputs = model.generate(**inputs, max_new_tokens=40)\n", | |
"print(tokenizer.decode(outputs[0], skip_special_tokens=True))" | |
] | |
from transformers import AutoTokenizer, AutoModelForCausalLM | |
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-3.5-mini") | |
model = AutoModelForCausalLM.from_pretrained( | |
"microsoft/phi-3.5-mini", | |
device_map="auto", | |
torch_dtype="auto", | |
) | |
inputs = tokenizer("What is an AI model?", return_tensors="pt") | |
with torch.no_grad(): | |
outputs = model.generate(**inputs, max_new_tokens=40) | |
print(tokenizer.decode(outputs[0], skip_special_tokens=True)) |
🤖 Prompt for AI Agents
In examples/cookbooks/Phi_3_5_Mini_Conversational.ipynb around lines 87 to 95,
the model loading and generation code lacks device placement and inference mode
guards, which can cause out-of-memory errors on GPU. Fix this by adding
device_map="auto" to the from_pretrained call to automatically place the model
on the appropriate device, and wrap the model.generate call inside a
torch.no_grad() context to disable gradient calculations during inference.
"from transformers import AutoTokenizer, AutoModelForCausalLM\n", | ||
"import torch\n", | ||
"\n", | ||
"model = AutoModelForCausalLM.from_pretrained(\"microsoft/phi-3-medium\")\n", | ||
"tokenizer = AutoTokenizer.from_pretrained(\"microsoft/phi-3-medium\")\n", | ||
"\n", | ||
"prompt = \"What is the capital of France?\"\n", | ||
"inputs = tokenizer(prompt, return_tensors=\"pt\")\n", | ||
"outputs = model.generate(**inputs, max_new_tokens=20)\n", | ||
"print(tokenizer.decode(outputs[0], skip_special_tokens=True))" |
🛠️ Refactor suggestion

Loading the Phi-3 Medium model on GPU without `device_map` can exceed VRAM.
Add `device_map="auto"` and `torch_dtype="auto"`. Also guard with `torch.no_grad()`.
🤖 Prompt for AI Agents
In examples/cookbooks/Phi_3_Medium_Conversational.ipynb around lines 88 to 97,
the model loading and generation code does not specify device placement, which
can cause VRAM overflow on GPU. Fix this by adding device_map="auto" and
torch_dtype="auto" parameters to the from_pretrained call to enable automatic
device placement and dtype selection. Also, wrap the model.generate call inside
a torch.no_grad() context to prevent unnecessary gradient computation and reduce
memory usage.
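This comment has no committable suggestion attached, so a sketch of the adjusted Phi-3 Medium cell (same model id as the excerpt above) is given here for reference:

```python
# Sketch of the Phi-3 Medium cell with device placement and no-grad inference.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "microsoft/phi-3-medium"  # id as used in the notebook excerpt above

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # automatic device placement
    torch_dtype="auto",  # pick fp16/bf16 when available
)

prompt = "What is the capital of France?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```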
| GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
|---|---|---|---|---|---|
| 17682666 | Triggered | Generic High Entropy Secret | f205e78 | src/praisonai-agents/test_posthog_import.py | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secret safely. Learn here the best practices.
- Revoke and rotate this secret.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future, consider
- following these best practices for managing and storing secrets, including API keys and other credentials
- installing secret detection on pre-commit to catch secrets before they leave your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
User description
Experience reasoning-powered text generation using DeepSeek's fine-tuned Qwen3-8B model with GRPO.
The notebook demonstrates controlled output using structured prompts for assistant-style responses.
A practical setup for building custom LLM agents that handle nuanced dialogue and instruction following.
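One way to express the "structured prompts for assistant-style responses" mentioned above is a chat template. The sketch below assumes a chat-tuned checkpoint whose tokenizer defines a template; the model id and messages are illustrative, not copied from the notebook:

```python
# Sketch: structured assistant-style prompting via a chat template (illustrative values).
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "deepseek-ai/deepseek-moe-16b-chat"  # placeholder; see the model-id discussion above
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto", trust_remote_code=True
)

messages = [
    {"role": "system", "content": "You are a careful reasoning assistant. Show your steps."},
    {"role": "user", "content": "If a train travels 60 miles in 1.5 hours, what is its average speed?"},
]
prompt_text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

inputs = tokenizer(prompt_text, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=120)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```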
PR Type
Documentation
Description
• Add DeepSeek R1 Qwen3 GRPO model notebook
• Add four Phi model conversational examples
• Include Colab integration for all notebooks
• Demonstrate various model sizes and capabilities
Changes walkthrough 📝
**DeepSeek_Qwen3_GRPO.ipynb**: Add DeepSeek Qwen3 GRPO reasoning notebook
examples/cookbooks/DeepSeek_Qwen3_GRPO.ipynb
• Creates notebook demonstrating DeepSeek's Qwen3-8B model with GRPO
• Includes reasoning-powered text generation example
• Shows structured prompts for assistant-style responses
• Provides Colab integration badge

**Phi_3_5_Mini_Conversational.ipynb**: Add Phi-3.5 Mini conversational example
examples/cookbooks/Phi_3_5_Mini_Conversational.ipynb
• Creates lightweight inference example using Phi-3.5 Mini
• Demonstrates basic conversational AI capabilities
• Includes dependencies and tool descriptions
• Shows simple question-answer interaction

**Phi_3_Medium_Conversational.ipynb**: Add Phi-3 Medium conversational inference
examples/cookbooks/Phi_3_Medium_Conversational.ipynb
• Implements Phi-3 Medium model conversational inference
• Shows efficient pipeline usage for text generation
• Demonstrates basic loading and response generation
• Includes geography question example

**Phi_4_14B_GRPO.ipynb**: Add Phi-4 14B GRPO optimization example
examples/cookbooks/Phi_4_14B_GRPO.ipynb
• Creates Phi-4 14B parameter model with GRPO optimization
• Demonstrates healthcare AI consultation example
• Shows professional consultant system prompt usage
• Includes thoughtful AI application insights

**Phi_4_Conversational.ipynb**: Add Phi-4 conversational chat example
examples/cookbooks/Phi_4_Conversational.ipynb
• Implements basic Phi-4 conversational chat interface
• Shows turn-based communication capabilities
• Demonstrates tutor-style machine learning explanation
• Includes educational content generation example