Commit 49748e6

Merge pull request #22 from armingh2000/feature/fact_scorer_demons
Feature/fact scorer demons
2 parents 851bede + 305c802 commit 49748e6

11 files changed, +246 -30 lines changed

CHANGELOG.md

Lines changed: 9 additions & 0 deletions
@@ -67,6 +67,15 @@ All notable changes to this project will be documented in this file.
 - Add tests for the fix.
 - Remove unnecessary code.

+## v 1.1.0 - 2024-04-18
+
+- Update fact scorer prompt
+- Add tests for fact scorer demon load
+- Fix demon format in atomic facts tests
+- Rename demon files
+- Add fact scorer demons json file
+- Add CONTRIBUTING.md guidelines
+
 <!--
 ### Added

CONTRIBUTING.md

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
+# Contributing to FactScoreLite
+
+We love your input! We want to make contributing to this project as easy and transparent as possible, whether it's:
+
+- Reporting a bug
+- Discussing the current state of the code
+- Submitting a fix
+- Proposing new features
+- Becoming a maintainer
+
+## We Develop with GitHub
+
+We use GitHub to host code, to track issues and feature requests, as well as accept pull requests.
+
+## We Use [GitHub Flow](https://guides.github.com/introduction/flow/index.html), So All Code Changes Happen Through Pull Requests
+
+Pull requests are the best way to propose changes to the codebase (we use [GitHub Flow](https://guides.github.com/introduction/flow/index.html)). We actively welcome your pull requests:
+
+1. Fork the repo and create your branch from `main`.
+2. If you've added code that should be tested, add tests.
+3. If you've changed APIs, update the documentation.
+4. Ensure the test suite passes.
+5. Make sure your code lints.
+6. Issue that pull request!
+
+## Any contributions you make will be under the MIT Software License
+
+In short, when you submit code changes, your submissions are understood to be under the same [MIT License](LICENSE.md) that covers the project. Feel free to contact the maintainers if that's a concern.
+
+## Report bugs using GitHub's [issues](https://github.com/armingh2000/FactScoreLite/issues)
+
+We use GitHub issues to track public bugs. Report a bug by [opening a new issue](https://github.com/armingh2000/FactScoreLite/issues/new); it's that easy!
+
+## Write bug reports with detail, background, and sample code
+
+**Great Bug Reports** tend to have:
+
+- A quick summary and/or background
+- Steps to reproduce
+  - Be specific!
+  - Give sample code if you can.
+- What you expected would happen
+- What actually happens
+- Notes (possibly including why you think this might be happening, or stuff you tried that didn't work)
+
+## Use a Consistent Coding Style
+
+- You can autoformat code using [black](https://github.com/psf/black) for Python
+- Include comments in your code where necessary
+- Write meaningful commit messages
+- If you are creating new functions/methods, make sure you add docstrings.
+
+## License
+
+By contributing, you agree that your contributions will be licensed under its MIT License.

FactScoreLite/atomic_facts.py

Lines changed: 2 additions & 2 deletions
@@ -45,15 +45,15 @@ def load_demons(self):
         Returns:
             list: A list of examples (demonstrations).
         """
-        with open(configs.demons_path, "r") as file:
+        with open(configs.atomic_facts_demons_path, "r") as file:
             demons = json.load(file)

         return demons

     def get_instructions(self) -> str:
         """
         Prepare instructions for the prompt generation.
-        Instructions include the examples given in the demons.json file.
+        Instructions include the examples given in the atomic_facts_demons.json file.

         Returns:
             str: The instructions for the prompt generation.

FactScoreLite/configs.py

Lines changed: 2 additions & 1 deletion
@@ -4,7 +4,8 @@
 data_path = importlib.resources.files("FactScoreLite") / "data"

 # Path to the data file within the package
-demons_path = data_path / "demons.json"
+atomic_facts_demons_path = data_path / "atomic_facts_demons.json"
+fact_scorer_demons_path = data_path / "fact_scorer_demons.json"

 # OpenAI API
 max_tokens = 1024

FactScoreLite/data/demons.json renamed to FactScoreLite/data/atomic_facts_demons.json

Lines changed: 0 additions & 1 deletion
@@ -1,4 +1,3 @@
-
 [
     {
         "Sentence": "The Turbo V6 engine boasts an impressive horsepower of 450 and a peak torque of 510 lb-ft, achieved between 2,500 and 5,500 rpm, equipped with a 10-speed automatic transmission and a dual-exhaust system, enhancing both performance and sound.",

FactScoreLite/data/fact_scorer_demons.json

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+[
+    {
+        "knowledge_source": "For the optimal operation of your 2022 Honda Accord, the engine oil should be replaced with 0W-20 synthetic oil every 7,500 miles under normal driving conditions.",
+        "fact": "The 2022 Honda Accord requires an oil change every 7,500 miles.",
+        "is_supported": true,
+        "reason": "The fact is directly supported as the manual specifies the oil change interval and the type of oil to use."
+    },
+    {
+        "knowledge_source": "Tire maintenance is crucial for the longevity of your tires and vehicle handling. Rotating your tires at recommended intervals helps distribute wear evenly and extends tire life.",
+        "fact": "Tire rotation for vehicles should be done every 10,000 miles.",
+        "is_supported": false,
+        "reason": "The knowledge source suggests the importance of regular tire rotation but does not specify the 10,000 miles interval."
+    },
+    {
+        "knowledge_source": "Ensure your vehicle's compatibility with connected services. The 2021 Ford Mustang supports Apple CarPlay, enabling a seamless integration with your Apple devices.",
+        "fact": "The 2021 Ford Mustang is compatible with Android Auto.",
+        "is_supported": false,
+        "reason": "The fact is not supported as the manual only mentions compatibility with Apple CarPlay."
+    },
+    {
+        "knowledge_source": "Your vehicle's climate control system is designed to maintain the cabin temperature for comfort.",
+        "fact": "The vehicle includes a heated steering wheel.",
+        "is_supported": false,
+        "reason": "This fact is irrelevant as the manual does not mention steering wheel heating."
+    },
+    {
+        "knowledge_source": "The fuel system should only be filled with unleaded gasoline, as specified in the manual to avoid engine damage.",
+        "fact": "The vehicle can be filled with both gasoline and diesel.",
+        "is_supported": false,
+        "reason": "This fact contradicts the knowledge source which clearly states that only unleaded gasoline should be used."
+    }
+]
+
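
The fact scorer reads this file positionally: `get_instructions` in `fact_scorer.py` takes the first entry as the supported example and draws the unsupported example from the remaining entries. A custom file therefore needs the same ordering. The following is a minimal sketch of a check for that layout; `validate_demons` and `REQUIRED_KEYS` are hypothetical names introduced here for illustration and are not part of FactScoreLite.

```python
import json

# Keys the prompt builder reads from each example; "reason" is optional
# because it is not used in prompt generation yet.
REQUIRED_KEYS = {"knowledge_source", "fact", "is_supported"}


def validate_demons(demons):
    """Return a list of problems found in a demons list (empty if none)."""
    if not isinstance(demons, list) or len(demons) < 2:
        return ["expected a JSON list with at least two examples"]

    problems = []
    for i, demon in enumerate(demons):
        if not isinstance(demon, dict):
            problems.append(f"example {i}: expected an object")
            continue
        missing = REQUIRED_KEYS - demon.keys()
        if missing:
            problems.append(f"example {i}: missing keys {sorted(missing)}")
        elif not isinstance(demon["is_supported"], bool):
            problems.append(f"example {i}: is_supported must be true or false")
    if problems:
        return problems

    # Positional contract: index 0 is the supported example,
    # every later entry is an unsupported one.
    if demons[0]["is_supported"] is not True:
        problems.append("example 0 must be the supported (true) example")
    for i, demon in enumerate(demons[1:], start=1):
        if demon["is_supported"] is not False:
            problems.append(f"example {i} must be unsupported (false)")
    return problems


# Placeholder data in the same shape as fact_scorer_demons.json
sample = json.loads("""
[
  {"knowledge_source": "knw 1", "fact": "fact 1", "is_supported": true},
  {"knowledge_source": "knw 2", "fact": "fact 2", "is_supported": false}
]
""")
print(validate_demons(sample))
```

Running a check like this before pointing `configs.fact_scorer_demons_path` at a new file would catch a misordered list, which would otherwise silently produce a prompt with two negative examples or a positive example labelled as negative.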

FactScoreLite/fact_scorer.py

Lines changed: 45 additions & 9 deletions
@@ -1,12 +1,53 @@
 import string
 from .openai_agent import OpenAIAgent
+from . import configs
+import json
+import random


 class FactScorer:
     def __init__(self):
+        # Examples (demonstrations) that are used in prompt generation
+        self.demons = self.load_demons()
         # To interact with OpenAI APIs
         self.openai_agent = OpenAIAgent()

+    def load_demons(self):
+        """
+        Load examples (demonstrations) from a JSON file.
+        This will be used in the prompt generation.
+
+        Returns:
+            list: A list of examples (demonstrations).
+        """
+        with open(configs.fact_scorer_demons_path, "r") as file:
+            demons = json.load(file)
+
+        return demons
+
+    def get_instructions(self) -> str:
+        """
+        Prepare instructions for the prompt generation.
+        Instructions include the examples given in the fact_scorer_demons.json file.
+
+        Returns:
+            str: The instructions for the prompt generation.
+        """
+
+        instructions = "Evaluate the truthfulness of the statement based solely on the provided context and provide the reason for your decision.\n\n"
+        instructions += "Instruction:\nOnly consider the statement true if it can be directly verified by the information in the context. If the information in the statement cannot be found in the context or differs from it, label it as false.\n\n"
+        true_example = self.demons[0]
+        false_example = random.choice(self.demons[1:])
+
+        for demon in [true_example, false_example]:
+            instructions += f"Context:\n{demon['knowledge_source']}\n"
+            instructions += f"Statement:\n{demon['fact']} True or False?\n"
+            instructions += f"Output:\n{demon['is_supported']}\n\n"
+            # TODO: add reason (+change parsing)
+            # instructions += f"Reason:\n{demon['reason']}\n\n"
+
+        return instructions
+
     def get_score(self, facts: list, knowledge_source: str) -> list:
         """
         Calculates the score of each atomic fact based on the knowledge source.
@@ -26,15 +67,10 @@ def get_score(self, facts: list, knowledge_source: str) -> list:
             atom = atom.strip()

             # Prompt that will be sent to GPT
-            prompt = "Answer the question based on the given context.\n\n"
-            prompt += f"Context:\n{knowledge_source}"
-
-            if not prompt[-1] in string.punctuation:
-                prompt += "."
-
-            prompt += "\n\n"
-
-            prompt += f"Input: {atom} True or False?\nOutput:\n"
+            prompt = self.get_instructions()
+            prompt += f"Context:\n{knowledge_source}\n"
+            prompt += f"Statement:\n{atom} True or False?\n"
+            prompt += "Output:\n"

             output = self.openai_agent.generate(prompt)
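
The prompt assembly in `get_instructions` and `get_score` can be sketched as standalone functions, which makes the random choice of the negative example easy to control in tests. The sketch below is not part of the commit: `build_instructions`, `build_prompt`, and the `rng` parameter are hypothetical names introduced for illustration, and the example data are placeholders.

```python
import random

# Fixed header text, copied from get_instructions in the diff above
HEADER = (
    "Evaluate the truthfulness of the statement based solely on the provided "
    "context and provide the reason for your decision.\n\n"
    "Instruction:\nOnly consider the statement true if it can be directly "
    "verified by the information in the context. If the information in the "
    "statement cannot be found in the context or differs from it, label it "
    "as false.\n\n"
)


def build_instructions(demons, rng=random):
    """Header plus one supported and one randomly chosen unsupported example."""
    true_example = demons[0]
    false_example = rng.choice(demons[1:])  # rng only needs a .choice method
    parts = [HEADER]
    for demon in (true_example, false_example):
        parts.append(f"Context:\n{demon['knowledge_source']}\n")
        parts.append(f"Statement:\n{demon['fact']} True or False?\n")
        parts.append(f"Output:\n{demon['is_supported']}\n\n")
    return "".join(parts)


def build_prompt(demons, knowledge_source, fact, rng=random):
    """Instructions followed by the target context and statement."""
    prompt = build_instructions(demons, rng)
    prompt += f"Context:\n{knowledge_source}\n"
    prompt += f"Statement:\n{fact.strip()} True or False?\n"
    prompt += "Output:\n"
    return prompt


# Placeholder examples: one supported, two unsupported
demo = [
    {"knowledge_source": "knw 1", "fact": "fact 1", "is_supported": True},
    {"knowledge_source": "knw 2", "fact": "fact 2", "is_supported": False},
    {"knowledge_source": "knw 3", "fact": "fact 3", "is_supported": False},
]

# Passing a seeded random.Random makes the selection reproducible
prompt = build_prompt(demo, "target knowledge", "target fact", rng=random.Random(0))
print(prompt)
```

Accepting the random source as a parameter is one possible design choice: production code keeps the module-level `random`, while a test can pass `random.Random(seed)` and compare prompts exactly even when several negative examples exist.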

README.md

Lines changed: 41 additions & 8 deletions
@@ -108,7 +108,7 @@ from FactScoreLite import FactScorer
 scores = FactScorer.get_scores(facts, knowledge_sources)
 ```

-## Prompt Engineering
+## Fact Extraction Prompt Engineering

 To instruct GPT on how to break each sentence into facts, we have included [examples](FactScoreLite/data/demons.json) (demonstrations, i.e., demons) that are included in the prompt. These demons are currently for the vehicle domain. However, you might want to create your own domain-specific demons. To do this, you can use GPT to create demons based on your requirements. We prompted GPT with [instructions](FactScoreLite/data/demons_generation_prompt.txt) on how to generate the demons required for the vehicle domain. However, you can alter it based on your needs.

@@ -117,7 +117,7 @@ Once you have your own demons.json file, you can include it in the program by se
 ```python
 import FactScoreLite

-FactScoreLite.configs.demons_path = "/path/to/your/json/file"
+FactScoreLite.configs.atomic_facts_demons_path = "/path/to/your/json/file"

 # rest of your code
 ```
@@ -149,21 +149,54 @@ target_sentence
 Independent Facts:
 ```

-### Facts Scoring Prompt
+### Facts Scoring Prompt Engineering

-The prompt used for scoring facts:
+We also use [example demonstrations](/FactScoreLite/data/fact_scorer_demons.json) in the scoring instructions prompt. The file contains one positive example and multiple negative examples. Each prompt includes the positive example plus one randomly selected negative example, which helps GPT score more accurately. The file also contains a reason for each label; these reasons are not currently used in prompt generation, but they could be used to improve GPT's scoring accuracy in the future.
+
+You can also supply your own domain-specific examples by setting the following:
+
+```python
+import FactScoreLite
+
+FactScoreLite.configs.fact_scorer_demons_path = "/path/to/your/json/file"
+
+# rest of your code
+```
+
+### Fact Scoring Prompt
+
+The following prompt template is used to instruct GPT to score facts:

 ```
 # fact_scorer.py

-Answer the question based on the given context.
+Evaluate the truthfulness of the statement based solely on the provided context and provide the reason for your decision.
+
+
+Instruction:
+Only consider the statement true if it can be directly verified by the information in the context. If the information in the statement cannot be found in the context or differs from it, label it as false.
+

 Context:
-knowledge_source
+knw 1
+Statement:
+fact 1 True or False?
+Output:
+True

-Input:
-fact True or False?
+Context:
+knw 2
+Statement:
+fact 2 True or False?
 Output:
+False
+
+Context:
+target_knowledge_source
+Statement:
+target_fact True or False?
+Output:
+
 ```

 ## Running the Tests
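
Creating a domain-specific examples file and pointing the scorer at it, as the README change above describes, might look like the following sketch. To keep it runnable without the package installed, `configs` here is a `SimpleNamespace` stand-in for `FactScoreLite.configs`, and `load_demons` mirrors the method added in `fact_scorer.py`; the file name, the helper `write_demons`, and the recipe-domain examples are all illustrative assumptions.

```python
import json
import tempfile
from pathlib import Path
from types import SimpleNamespace

# Stand-in for FactScoreLite.configs (assumption, so the sketch is self-contained)
configs = SimpleNamespace(fact_scorer_demons_path=None)

# Illustrative examples for a different domain: the supported example comes
# first, followed by unsupported ones. "reason" is kept for future use.
custom_demons = [
    {
        "knowledge_source": "Bake the bread at 220 C for 35 minutes.",
        "fact": "The bread is baked for 35 minutes.",
        "is_supported": True,
        "reason": "The baking time is stated directly.",
    },
    {
        "knowledge_source": "Let the dough rest before shaping.",
        "fact": "The dough rests for two hours.",
        "is_supported": False,
        "reason": "No resting duration is given.",
    },
]


def write_demons(demons, directory):
    """Write the examples as JSON and return the file path."""
    path = Path(directory) / "my_fact_scorer_demons.json"
    path.write_text(json.dumps(demons, indent=4), encoding="utf-8")
    return path


def load_demons(path):
    """Same loading logic as FactScorer.load_demons in the diff."""
    with open(path, "r") as file:
        return json.load(file)


with tempfile.TemporaryDirectory() as tmp:
    configs.fact_scorer_demons_path = write_demons(custom_demons, tmp)
    loaded = load_demons(configs.fact_scorer_demons_path)

print(loaded == custom_demons)
```

With the real package, the assignment to `FactScoreLite.configs.fact_scorer_demons_path` has to happen before `FactScorer()` is constructed, because the diff shows the examples being loaded in `__init__`.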

setup.cfg

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [metadata]
 name = FactScoreLite
-version = 1.0.1
+version = 1.1.0
 author = armingh2000
 author_email =
 license = MIT

tests/test_atomic_facts.py

Lines changed: 7 additions & 7 deletions
@@ -24,12 +24,10 @@ def generator(monkeypatch):


 # Sample data to be returned by the mock
-mock_demons_data = {
-    "demons": [
-        {"Sentence": "Example sentence 1.", "Independent Facts": ["Fact 1", "Fact 2"]},
-        {"Sentence": "Example sentence 2.", "Independent Facts": ["Fact 3", "Fact 4"]},
-    ]
-}
+mock_demons_data = [
+    {"Sentence": "Example sentence 1.", "Independent Facts": ["Fact 1", "Fact 2"]},
+    {"Sentence": "Example sentence 2.", "Independent Facts": ["Fact 3", "Fact 4"]},
+]


 # Test for the load_demons method
@@ -39,7 +37,9 @@ def test_load_demons(generator):
     # Use patch to mock open function within the context of your test
     with patch("builtins.open", mock_open(read_data=mock_json_str)):
         # Also mock configs.demons_path to avoid dependency on external config files
-        with patch.object(configs, "demons_path", "fake/path/to/demons.json"):
+        with patch.object(
+            configs, "atomic_facts_demons_path", "fake/path/to/atomic_facts_demons.json"
+        ):
             demons = generator.load_demons()
             # Assert that the returned data matches your mock data
             assert (

tests/test_fact_scorer.py

Lines changed: 51 additions & 1 deletion
@@ -1,6 +1,8 @@
 import pytest
-from unittest.mock import patch
+from unittest.mock import mock_open, patch
 from FactScoreLite.fact_scorer import FactScorer
+import json
+from FactScoreLite import configs


 @pytest.fixture
@@ -82,3 +84,51 @@ def test_complex_knowledge_source_and_atomic_facts(fact_scorer, mock_openai_agen
     assert all(
         isinstance(decision, dict) for decision in result
     ), "Each item in the returned list should be a dictionary."
+
+
+# Sample data to be returned by the mock
+mock_demons_data = [
+    {
+        "knowledge_source": "knw 1",
+        "fact": "fact 1",
+        "is_supported": True,
+    },
+    {
+        "knowledge_source": "knw 2",
+        "fact": "fact 2",
+        "is_supported": False,
+    },
+]
+
+
+# Test for the load_demons method
+def test_load_demons(fact_scorer):
+    # Convert your sample data to a JSON string for mocking
+    mock_json_str = json.dumps(mock_demons_data)
+    # Use patch to mock open function within the context of your test
+    with patch("builtins.open", mock_open(read_data=mock_json_str)):
+        # Also mock configs.fact_scorer_demons_path (the path load_demons reads)
+        # to avoid dependency on external config files
+        with patch.object(
+            configs, "fact_scorer_demons_path", "fake/path/to/fact_scorer_demons.json"
+        ):
+            demons = fact_scorer.load_demons()
+            # Assert that the returned data matches your mock data
+            assert (
+                demons == mock_demons_data
+            ), "The method should load and return the demons correctly."
+
+
+def test_get_instructions_true_false_demons(fact_scorer):
+    # Test case with one true and one false demon in self.demons
+    fact_scorer.demons = mock_demons_data
+    expected_instructions = (
+        "Evaluate the truthfulness of the statement based solely on the provided context and provide the reason for your decision.\n\n"
+        "Instruction:\nOnly consider the statement true if it can be directly verified by the information in the context. If the information in the statement cannot be found in the context or differs from it, label it as false.\n\n"
+        "Context:\nknw 1\n"
+        "Statement:\nfact 1 True or False?\n"
+        "Output:\nTrue\n\n"
+        "Context:\nknw 2\n"
+        "Statement:\nfact 2 True or False?\n"
+        "Output:\nFalse\n\n"
+    )
+    assert fact_scorer.get_instructions() == expected_instructions
