The implementation is written for [AutoGen](https://github.com/microsoft/autogen) in Python, although it can easily be adapted for C#.

**Still a work in progress; expect a lot of updates shortly**

**The provided AutoGen code only implements Iteration 5 (Agentic Approach)**
## Full Logical Flow for Agentic Vector Based Approach
The following diagram shows the logical flow within the multi-agent system. The flow begins with query rewriting to preprocess questions - this includes resolving relative dates (e.g., "last month" to "November 2024") and breaking down complex queries into simpler components. For each preprocessed question, if the query cache is enabled, the system checks the cache for previously asked similar questions. In an ideal scenario, the preprocessed questions will be found in the cache, leading to the quickest answer generation. In cases where the question is not known, the system will fall back to the other agents and generate the SQL query using the LLMs. The cache is then updated with the newly generated query and schemas.

Unlike the previous approaches, **gpt4o-mini** can be used, as each agent's prompt is small and focused on a single, simple task.

As the query cache is shared between users (no data is stored in the cache), a new user can benefit from the queries and schemas cached for previously asked questions.
## Agent Flow in Detail
The agent flow is managed by a sophisticated selector system in `autogen_text_2_sql.py`. Here's how it works:

1. **Initial Entry**
   - Every question starts with the Query Rewrite Agent
   - This agent processes dates and breaks down complex questions

2. **Post Query Rewrite**
   - If query cache is enabled (`Text2Sql__UseQueryCache=True`):
     - Flow moves to SQL Query Cache Agent
   - If cache is disabled:
     - Flow moves directly to Schema Selection Agent

3. **Cache Check Branch**
   - If cache hit found:
     - With pre-run results: Goes to SQL Query Correction Agent
     - Without pre-run results: Goes to SQL Query Generation Agent
   - If cache miss:
     - Goes to Schema Selection Agent

4. **Schema Selection Branch**
   - Schema Selection Agent finds relevant schemas
   - Always moves to SQL Disambiguation Agent
   - Disambiguation Agent clarifies any schema ambiguities
   - Then moves to SQL Query Generation Agent

5. **Query Generation and Correction Loop**
   - SQL Query Generation Agent creates the query
   - SQL Query Correction Agent verifies/corrects the query
   - Based on correction results:
     - If query needs execution: Returns to Correction Agent
     - If query needs fixes: Returns to Generation Agent
     - If answer and sources ready: Goes to Answer and Sources Agent
     - If error occurs: Returns to Generation Agent

6. **Final Answer Formatting**
   - Answer and Sources Agent formats the final response
   - Standardizes output format with markdown tables
   - Combines all sources and query results
   - Returns formatted answer to user

The flow uses termination conditions (see the sketch after this list):

- Explicit "TERMINATE" mention
- Presence of both "answer" and "sources"
- Maximum of 20 messages reached
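
The sketch below shows one way these termination conditions could be combined. It assumes the `autogen-agentchat` 0.4-style termination API (`TextMentionTermination`, `MaxMessageTermination` and the `|`/`&` operators); the exact wiring in `autogen_text_2_sql.py` may differ.

```python
# Minimal sketch only - assumes the autogen-agentchat 0.4 termination API,
# not the repository's exact implementation.
from autogen_agentchat.conditions import MaxMessageTermination, TextMentionTermination

termination_condition = (
    # Explicit TERMINATE mention from any agent.
    TextMentionTermination("TERMINATE")
    # Both "answer" and "sources" have appeared in the conversation.
    | (TextMentionTermination('"answer"') & TextMentionTermination('"sources"'))
    # Safety net: stop after 20 messages.
    | MaxMessageTermination(max_messages=20)
)
```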
## Provided Notebooks & Scripts
- `./Iteration 5 - Agentic Vector Based Text2SQL.ipynb` provides an example of how to utilize the Agentic Vector Based Text2SQL approach to query the database. The query cache plugin will be enabled or disabled depending on the environmental parameters.
## Agents
This approach builds on the Vector Based SQL Plugin approach, but adds an agentic approach to the solution.

The agentic system contains the following agents:
- **Query Rewrite Agent:** The first agent in the flow, responsible for two key preprocessing tasks:
  1. Resolving relative dates to absolute dates (e.g., "last month" to "November 2024")
  2. Decomposing complex questions into simpler sub-questions

  This preprocessing happens before cache lookup to maximize cache effectiveness.
- **Query Cache Agent:** (Optional) Responsible for checking the cache for previously asked questions. After preprocessing, each sub-question is checked against the cache if caching is enabled.
- **Schema Selection Agent:** Responsible for extracting key terms from the question and checking the index store for relevant database schemas. This agent is used when a cache miss occurs.
- **SQL Disambiguation Agent:** Responsible for clarifying any ambiguities in the schema selection and ensuring the correct tables and columns are selected for the query.
- **SQL Query Generation Agent:** Responsible for using the previously extracted schemas to generate SQL queries that answer the question. This agent can request more schemas if needed.
- **SQL Query Correction Agent:** Responsible for verifying and correcting the generated SQL queries, ensuring they are syntactically correct and will produce the expected results. This agent also handles the execution of queries and formatting of results.
- **Answer and Sources Agent:** Final agent in the flow that:
  1. Standardizes the output format across all responses
  2. Formats query results into markdown tables for better readability
  3. Combines all sources and results into a single coherent response
  4. Ensures consistent JSON structure in the final output

The combination of these agents allows the system to answer complex questions while staying under token limits when including database schemas. The query cache ensures that previously asked questions can be answered quickly to avoid degrading user experience.
## Project Structure
### autogen_text_2_sql.py
This is the main entry point for the agentic system. It configures the system with a sophisticated processing flow managed by a unified selector that handles agent transitions. The flow includes:
1. Initial query rewriting for preprocessing
2. Cache checking if enabled
3. Schema selection and disambiguation
4. Query generation and correction
5. Result verification and formatting
6. Final answer standardization

The system uses a custom transition selector that automatically moves between agents based on the previous agent's output and the current state. This allows for dynamic reactions to different scenarios, such as cache hits, schema ambiguities, or query corrections.
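
As an illustration of this behaviour, the sketch below shows a simplified transition selector in the style described above. The agent names and message attributes are assumptions made for the example; the real selector lives in `autogen_text_2_sql.py`.

```python
# Simplified sketch of a transition selector - agent names and message fields
# are illustrative, not the repository's exact implementation.
import os


def unified_selector(messages) -> str | None:
    """Pick the next agent based on the last speaker and the current state."""
    if not messages:
        return "query_rewrite_agent"  # every question starts here

    last = messages[-1]
    use_cache = os.environ.get("Text2Sql__UseQueryCache", "False") == "True"

    if last.source == "query_rewrite_agent":
        return "sql_query_cache_agent" if use_cache else "sql_schema_selection_agent"
    if last.source == "sql_query_cache_agent":
        # Cache hit -> straight to correction, cache miss -> schema selection.
        return (
            "sql_query_correction_agent"
            if "cached_queries" in last.content
            else "sql_schema_selection_agent"
        )
    if last.source == "sql_schema_selection_agent":
        return "sql_disambiguation_agent"
    if last.source == "sql_disambiguation_agent":
        return "sql_query_generation_agent"
    if last.source == "sql_query_generation_agent":
        return "sql_query_correction_agent"

    return None  # defer to the default selection logic for anything else
```

In AutoGen's `SelectorGroupChat`, returning `None` from a custom `selector_func` typically hands the choice back to the model-based selector, which matches the dynamic behaviour described above.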
### creators/
- **llm_agent_creator.py:** Creates the agents in the AutoGen framework based on configuration files
- **llm_model_creator.py:** Handles model connections and configurations for the agents
### custom_agents/
Contains specialized agent implementations:
- **sql_query_cache_agent.py:** Implements the caching functionality
- **sql_schema_selection_agent.py:** Handles schema selection and management
- **answer_and_sources_agent.py:** Formats and standardizes final outputs
## Configuration
The system behavior can be controlled through environment variables:
- `Text2Sql__UseQueryCache`: Enables/disables the query cache functionality
- `Text2Sql__PreRunQueryCache`: Controls whether to pre-run cached queries
- `Text2Sql__UseColumnValueStore`: Enables/disables the column value store
- `Text2Sql__DatabaseEngine`: Specifies the target database engine

Each agent can be configured with specific parameters and prompts to optimize its behavior for different scenarios.
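
As a rough illustration, the snippet below shows one way these flags might be read at startup; the helper name and defaults are assumptions for the example, not the repository's exact code.

```python
# Sketch only: reading the documented environment variables at startup.
# The helper and default values are illustrative assumptions.
import os


def env_flag(name: str, default: str = "False") -> bool:
    return os.environ.get(name, default).strip().lower() in {"1", "true", "yes"}


USE_QUERY_CACHE = env_flag("Text2Sql__UseQueryCache")
PRE_RUN_QUERY_CACHE = env_flag("Text2Sql__PreRunQueryCache")
USE_COLUMN_VALUE_STORE = env_flag("Text2Sql__UseColumnValueStore")
DATABASE_ENGINE = os.environ.get("Text2Sql__DatabaseEngine", "")
```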
## Query Cache Implementation Details
The vector-based approach with the query cache uses the `fetch_queries_from_cache()` method to fetch the most relevant previous query and inject it into the prompt before the initial LLM call. Auto-Function Calling is avoided here to reduce the response time, as the cache index will always be used first.

If the score of the top result is higher than the defined threshold, the query will be executed against the target data source and the results included in the prompt. This allows us to prompt the LLM to evaluate whether it can use these results to answer the question, **without further SQL query generation**, to speed up the process.
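
A minimal sketch of that decision is shown below. It assumes `fetch_queries_from_cache()` returns scored entries and that a query-running coroutine is available as described later in this document; the threshold value and field names are illustrative.

```python
# Sketch only: deciding whether a cached query can be pre-run.
# The threshold and the shape of the cache entries are illustrative assumptions.
from typing import Awaitable, Callable

MINIMUM_CACHE_SCORE = 1.5  # illustrative reranker score threshold


async def build_cache_context(
    question: str,
    fetch_queries_from_cache: Callable[[str], Awaitable[list[dict]]],
    run_sql_query: Callable[[str], Awaitable[str]],
) -> dict:
    """Check the cache and optionally pre-run the best match (sketch only)."""
    cached = await fetch_queries_from_cache(question)
    context = {"cached_queries": cached}

    if cached and cached[0]["score"] >= MINIMUM_CACHE_SCORE:
        # High-confidence hit: pre-run the cached SQL so the LLM can answer
        # directly from the results, without generating a new query.
        context["pre_run_results"] = await run_sql_query(cached[0]["sql_query"])

    return context
```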
The cache entries are rendered with Jinja templates before they are run. The following placeholders are prepopulated automatically:
- date
- datetime

Additional parameters passed at runtime, such as a user_id, are populated automatically if included in the request.
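
For illustration, a cache entry might be rendered like this before execution. The SQL text and table names are made up for the example, while the `date`, `datetime`, and `user_id` placeholders are the ones described above.

```python
# Sketch only: rendering a cached SQL template with the documented placeholders.
# The SQL text and table/column names below are purely illustrative.
from datetime import datetime, timezone

from jinja2 import Template

cached_sql = (
    "SELECT * FROM Orders "
    "WHERE UserId = '{{ user_id }}' AND OrderDate >= '{{ date }}'"
)

now = datetime.now(timezone.utc)
rendered_sql = Template(cached_sql).render(
    date=now.date().isoformat(),  # prepopulated automatically
    datetime=now.isoformat(),     # prepopulated automatically
    user_id="user-123",           # runtime parameter taken from the request
)
print(rendered_sql)
```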
### run_sql_query()
This method is called by the AutoGen framework automatically, when instructed to do so by the LLM, to run a SQL query against the given database. It returns a JSON string containing a row-wise dump of the results returned. These results are then interpreted to answer the question.

Additionally, if any of the cache functionality is enabled, this method will update the query cache index based on the SQL query run and the schemas used in execution.
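
As a rough sketch, the row-wise dump could be produced along these lines. The `pyodbc` connection handling and the connection-string variable are assumptions for the example and not the repository's actual implementation (which also updates the cache index afterwards).

```python
# Sketch only: returning query results as a row-wise JSON dump.
# The pyodbc usage and the connection-string variable are illustrative assumptions.
import json
import os

import pyodbc


def run_sql_query_sketch(sql: str) -> str:
    # Hypothetical connection-string variable, shown for the example only.
    connection = pyodbc.connect(os.environ["Text2Sql__ConnectionString"])
    try:
        cursor = connection.cursor()
        cursor.execute(sql)
        columns = [column[0] for column in cursor.description]
        rows = [dict(zip(columns, row)) for row in cursor.fetchall()]
    finally:
        connection.close()

    return json.dumps(rows, default=str)  # row-wise dump the agents can interpret
```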
## Output Format
The system produces standardized JSON output through the Answer and Sources Agent:
```json
{
    "answer": "The answer to the user's question",
    "sources": [
        {
            "sql_query": "The SQL query used",
            "sql_rows": ["Array of result rows"],
            "markdown_table": "Formatted markdown table of results"
        }
    ]
}
```
This consistent output format ensures:

1. Clear separation between answer and supporting evidence