Implement Tests for Hallucinations for Virtual Assistant Demo #199

riacheruvu · 2025-02-04T17:36:09Z

Description

Implement a suite of unit tests that check whether the output of the Large Language Model (LLM) used in the Virtual Assistant Demo contains hallucinations. This is a great opportunity to explore techniques for ensuring the reliability and accuracy of the assistant's responses by identifying and mitigating instances where the LLM generates incorrect or misleading information.

This project involves Python programming; basic experience with AI models from frameworks like PyTorch, TensorFlow, OpenVINO, or ONNX is beneficial. To learn more about the OpenVINO toolkit, see the OpenVINO documentation.

Examples of hallucinations:

- Factual errors
- Inaccurately summarizing information
- Creating nonsensical content, such as random sentences (this can also show up as the LLM trailing off mid-response into a tangential topic)
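
As a rough illustration of the last example, the sketch below shows two simple heuristics: one flags a response that stops without sentence-final punctuation, the other flags sentences that are repeated verbatim. The function names `ends_abruptly` and `repeated_sentences` are hypothetical and not part of the demo; they only cover surface-level symptoms, not factual errors.

```python
import re


def ends_abruptly(response):
    """Return True if the response is empty or lacks sentence-final punctuation."""
    stripped = response.rstrip()
    return not stripped or stripped[-1] not in ".!?\"')"


def repeated_sentences(response):
    """Return sentences that occur more than once (case-insensitive), in order."""
    seen, repeats = set(), []
    # Split on whitespace that follows sentence-final punctuation.
    for sentence in re.split(r"(?<=[.!?])\s+", response.strip()):
        key = sentence.lower()
        if key in seen and sentence not in repeats:
            repeats.append(sentence)
        seen.add(key)
    return repeats
```

For instance, `ends_abruptly("Your appointment is at")` would be flagged, while a response ending in a full stop would not.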

Steps:

1. Develop unit tests for hallucination detection:
   - Define criteria for what constitutes a hallucination and implement checks based on these criteria.
   - Create unit tests that analyze the output of the LLM used in the Virtual Assistant Demo to detect hallucinations.
2. Enhance the testing framework:
   - Integrate the new unit tests into the existing testing framework.
   - Ensure the tests are comprehensive and cover various scenarios where hallucinations might occur.
3. Documentation:
   - Provide clear and detailed documentation for setting up and running the unit tests.
   - Include comments in the code to explain key sections and logic.
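
One possible way to turn these steps into code is sketched below, under stated assumptions. It treats a hallucination as a response sentence that shares too few words with the source context, which is only one of many possible criteria and not necessarily the one the demo should adopt. The helper `unsupported_sentences`, the 0.5 threshold, and the sample context are all hypothetical; in a real test the answer would come from the demo's LLM rather than a hard-coded string.

```python
import re
import unittest


def _tokens(text):
    """Lowercase word tokens, ignoring words of one or two characters."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) > 2}


def unsupported_sentences(response, context, threshold=0.5):
    """Return response sentences whose word overlap with the context is below threshold."""
    context_tokens = _tokens(context)
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", response.strip()):
        words = _tokens(sentence)
        if not words:
            continue
        # Fraction of the sentence's words that also appear in the context.
        support = len(words & context_tokens) / len(words)
        if support < threshold:
            flagged.append(sentence)
    return flagged


class HallucinationTests(unittest.TestCase):
    # Hypothetical source text the assistant is expected to stay faithful to.
    CONTEXT = "The clinic is open Monday to Friday from 9 am to 5 pm."

    def test_grounded_answer_passes(self):
        answer = "The clinic is open Monday to Friday."
        self.assertEqual(unsupported_sentences(answer, self.CONTEXT), [])

    def test_unsupported_claim_is_flagged(self):
        answer = ("The clinic is open Monday to Friday. "
                  "It also offers free helicopter rides.")
        self.assertEqual(
            unsupported_sentences(answer, self.CONTEXT),
            ["It also offers free helicopter rides."],
        )
```

Saved as a test module, this can be run with `python -m unittest`. Word overlap is a deliberately crude proxy: it misses paraphrased facts and contradictions that reuse the context's vocabulary, so a semantic similarity or entailment check would likely be needed for the demo itself.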

How to Get Started:

1. Fork the OpenVINO Build Deploy repository.
2. Follow the installation instructions to set up your environment and install the required dependencies for this project.
3. Read the Demo Contribution Guide.
4. Build your feature and ensure it meets the requirements specified in the contribution guide.
5. Submit a pull request.

jnzw · 2025-03-03T01:00:02Z

Team 3 is working on this issue.

github-actions · 2025-04-02T02:06:43Z

This issue has been marked as stale because it has been open for 30 days with no activity. It is scheduled to close in 14 days.

github-actions · 2025-05-15T02:09:18Z

This issue has been marked as stale because it has been open for 30 days with no activity. It is scheduled to close in 14 days.

github-actions · 2025-05-29T02:11:34Z

This issue was closed automatically because it has been inactive for 14 days since being marked as stale. Please reopen if needed.

riacheruvu added the "good first issue" and "large_difficulty" labels on Feb 4, 2025

jnzw mentioned this issue on Mar 7, 2025 in "Support hallucination score" #218 (open)

github-actions bot added the "stale" label on Apr 2, 2025

adrianboguszewski removed the "stale" label on Apr 14, 2025

github-actions bot added the "stale" label on May 15, 2025

github-actions bot closed this as not planned on May 29, 2025

riacheruvu's opening comment (Feb 4, 2025) was edited by adrianboguszewski.