feat: support web search #609

Merged (5 commits) on May 27, 2025
3 changes: 2 additions & 1 deletion service/package.json
@@ -28,8 +28,9 @@
"common:cleanup": "rimraf node_modules && rimraf pnpm-lock.yaml"
},
"dependencies": {
"@tavily/core": "^0.5.3",
"axios": "^1.8.4",
"dayjs": "^1.11.7",
"dayjs": "^1.11.13",
"dotenv": "^16.0.3",
"express": "^5.1.0",
"express-rate-limit": "^6.7.0",
32 changes: 28 additions & 4 deletions service/pnpm-lock.yaml

Some generated files are not rendered by default.

194 changes: 151 additions & 43 deletions service/src/chatgpt/index.ts
@@ -1,6 +1,8 @@
import * as dotenv from 'dotenv'
import OpenAI from 'openai'
import { HttpsProxyAgent } from 'https-proxy-agent'
import { tavily } from '@tavily/core'
import dayjs from 'dayjs'
import type { AuditConfig, KeyConfig, UserInfo } from '../storage/model'
import { Status, UsageResponse } from '../storage/model'
import { convertImageUrl } from '../utils/image'
@@ -10,11 +12,81 @@ import { getCacheApiKeys, getCacheConfig, getOriginConfig } from '../storage/con
import { sendResponse } from '../utils'
import { hasAnyRole, isNotEmptyString } from '../utils/is'
import type { ModelConfig } from '../types'
import { getChatByMessageId, updateRoomChatModel } from '../storage/mongo'
import { getChatByMessageId, updateChatSearchQuery, updateChatSearchResult } from '../storage/mongo'
import type { ChatMessage, RequestOptions } from './types'

dotenv.config()

function systemMessageWithSearchResult(currentTime: string): string {
return `You are an intelligent assistant that needs to answer user questions based on search results.

**Search Results Format Description:**
- Search results may contain irrelevant information; filter and use them accordingly

**Context Information:**
- Current time: ${currentTime}

**Response Requirements:**

1. **Content Processing**
- Screen and filter search results, selecting content most relevant to the question
- Synthesize information from multiple web pages, avoiding repetitive citations from a single source
- Do not mention specific sources or rankings of search results

2. **Response Strategy**
- **Listing questions**: Limit to at most 10 key points, prioritizing the most relevant and complete information
- **Creative questions**: Make full use of search results to generate in-depth, professional long-form answers
- **Objective Q&A**: Give a brief answer, optionally supplemented with 1-2 sentences of related information

3. **Format Requirements**
- Respond using markdown (LaTeX expressions start with $)
- Use structured, paragraph-based answer format
- When answering in points, limit to at most 5 points, merging related content
- Ensure answers are aesthetically pleasing and highly readable

4. **Language Standards**
- Keep answer language consistent with user's question language
- Do not change language unless specifically requested by the user

**Notes:**
- Not all search results are relevant; judge relevance based on the question
- For listing questions, inform users they can check search sources for complete information
- Creative answers need to be multi-perspective, information-rich, and thoroughly discussed`
}

function systemMessageGetSearchQuery(currentTime: string): string {
return `You are an intelligent search assistant.
Current time: ${currentTime}

Before formally answering user questions, you need to analyze the user's questions and conversation context to determine whether you need to obtain more information through internet search to provide accurate answers.

**Task Flow:**
1. Carefully analyze the user's question content and previous conversation history
2. Using the current time, determine whether the question involves time-sensitive information
3. Evaluate whether existing knowledge is sufficient to answer the question
4. If search is needed, generate a precise search query
5. If search is not needed, return empty result

**Output Format Requirements:**
- If search is needed: return <search_query>example search query keywords</search_query>
- If search is not needed: return <search_query></search_query>
- Do not include any other explanations or answer content
- Search query should be concise and clear, able to obtain the most relevant information

**Judgment Criteria:**
- Time-sensitive information (such as latest news, stock prices, weather, real-time data, etc.): search needed
- Latest policies, regulations, technological developments: may need search
- Common sense questions, historical facts, basic knowledge: usually no search needed
- Latest research or developments in professional fields: search recommended

**Notes:**
- Search query should target the core needs of user questions
- Consider the timeliness and accuracy requirements of information
- Prioritize obtaining the latest and most authoritative information sources

Please strictly return results according to the above format.`
}

const ErrorCodeMessage: Record<string, string> = {
401: '[OpenAI] 提供错误的API密钥 | Incorrect API key provided',
403: '[OpenAI] 服务器拒绝访问,请稍后再试 | Server refused to access, please try again later',
@@ -49,17 +121,16 @@ export async function initApi(key: KeyConfig) {
const processThreads: { userId: string; abort: AbortController; messageId: string }[] = []

async function chatReplyProcess(options: RequestOptions) {
const globalConfig = await getCacheConfig()
const model = options.room.chatModel
const searchEnabled = options.room.searchEnabled
const key = await getRandomApiKey(options.user, model)
const userId = options.user._id.toString()
const maxContextCount = options.user.advanced.maxContextCount ?? 20
const messageId = options.messageId
if (key == null || key === undefined)
throw new Error('没有对应的apikeys配置。请再试一次 | No available apikeys configuration. Please try again.')

// Add Chat Record
updateRoomChatModel(userId, options.room.roomId, model)

const { message, uploadFileKeys, parentMessageId, process, systemMessage, temperature, top_p } = options

try {
@@ -85,32 +156,59 @@ async function chatReplyProcess(options: RequestOptions) {
}

// Prepare the user message content (text and images)
let content: string | OpenAI.Chat.ChatCompletionContentPart[] = message

// Handle image uploads if present
if (uploadFileKeys && uploadFileKeys.length > 0) {
content = [
{
type: 'text',
text: message,
},
]
for (const uploadFileKey of uploadFileKeys) {
content.push({
type: 'image_url',
image_url: {
url: await convertImageUrl(uploadFileKey),
},
})
}
}
const content: string | OpenAI.Chat.ChatCompletionContentPart[] = await createContent(message, uploadFileKeys)

// Add the user message
messages.push({
role: 'user',
content,
})

let hasSearchResult = false
const searchConfig = globalConfig.searchConfig
if (searchConfig.enabled && searchConfig?.options?.apiKey && searchEnabled) {
messages[0].content = systemMessageGetSearchQuery(dayjs().format('YYYY-MM-DD HH:mm:ss'))
const completion = await openai.chat.completions.create({
model,
messages,
})
let searchQuery: string = completion.choices[0].message.content
const match = searchQuery.match(/<search_query>([\s\S]*)<\/search_query>/i)
if (match)
searchQuery = match[1].trim()
else
searchQuery = ''

if (searchQuery) {
await updateChatSearchQuery(messageId, searchQuery)

const tvly = tavily({ apiKey: searchConfig.options?.apiKey })
const response = await tvly.search(
searchQuery,
{
includeRawContent: true,
timeout: 300,
},
)

const searchResult = JSON.stringify(response)
await updateChatSearchResult(messageId, searchResult)

messages.push({
role: 'user',
content: `Additional information from web search engine.
search query: <search_query>${searchQuery}</search_query>
search result: <search_result>${searchResult}</search_result>`,
})

messages[0].content = systemMessageWithSearchResult(dayjs().format('YYYY-MM-DD HH:mm:ss'))
hasSearchResult = true
}
}

if (!hasSearchResult)
messages[0].content = systemMessage

// Create the chat completion with streaming
const stream = await openai.chat.completions.create({
model,
@@ -244,26 +342,7 @@ async function getMessageById(id: string): Promise<ChatMessage | undefined> {
}
else {
if (isPrompt) { // prompt
let content: string | OpenAI.Chat.ChatCompletionContentPart[] = chatInfo.prompt
if (chatInfo.images && chatInfo.images.length > 0) {
content = [
{
type: 'text',
text: chatInfo.prompt,
},
]
for (const image of chatInfo.images) {
const imageUrlBase64 = await convertImageUrl(image)
if (imageUrlBase64) {
content.push({
type: 'image_url',
image_url: {
url: await convertImageUrl(image),
},
})
}
}
}
const content: string | OpenAI.Chat.ChatCompletionContentPart[] = await createContent(chatInfo.prompt, chatInfo.images)
return {
id,
parentMessageId,
@@ -311,6 +390,35 @@ async function getRandomApiKey(user: UserInfo, chatModel: string): Promise<KeyCo
return randomKeyConfig(keys)
}

// Helper function to create content with text and optional images
async function createContent(text: string, images?: string[]): Promise<string | OpenAI.Chat.ChatCompletionContentPart[]> {
// If no images or empty array, return just the text
if (!images || images.length === 0)
return text

// Create content with text and images
const content: OpenAI.Chat.ChatCompletionContentPart[] = [
{
type: 'text',
text,
},
]

for (const image of images) {
const imageUrl = await convertImageUrl(image)
if (imageUrl) {
content.push({
type: 'image_url',
image_url: {
url: imageUrl,
},
})
}
}

return content
}

// Helper function to add previous messages to the conversation context
async function addPreviousMessages(parentMessageId: string, maxContextCount: number, messages: OpenAI.Chat.ChatCompletionMessageParam[]): Promise<void> {
// Recursively get previous messages
27 changes: 27 additions & 0 deletions service/src/index.ts
@@ -781,6 +781,33 @@
}
})

router.post('/setting-search', rootAuth, async (req, res) => {
try {
const config = req.body as import('./storage/model').SearchConfig

const thisConfig = await getOriginConfig()
thisConfig.searchConfig = config
const result = await updateConfig(thisConfig)
clearConfigCache()
res.send({ status: 'Success', message: '操作成功 | Successfully', data: result.searchConfig })
}
catch (error) {
res.send({ status: 'Fail', message: error.message, data: null })
}
})
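
The handler above casts req.body to SearchConfig from ./storage/model, but that model file is not part of this diff. Judging only from how the config is read in service/src/chatgpt/index.ts (searchConfig.enabled and searchConfig.options?.apiKey), a minimal shape would look roughly like the sketch below; any further fields are assumptions.

// Hypothetical sketch of the SearchConfig shape used above; the real definition
// lives in service/src/storage/model.ts and is not included in this diff.
interface SearchConfig {
  enabled: boolean // gates the web-search flow in chatReplyProcess
  options?: {
    apiKey: string // Tavily API key, passed to tavily({ apiKey })
  }
}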

router.post('/search-test', rootAuth, async (req, res) => {
try {
const { search, text } = req.body as { search: import('./storage/model').SearchConfig; text: string }

Check warning on line 801 in service/src/index.ts (GitHub Actions / lint): 'search' is assigned a value but never used. Allowed unused vars must match /^_/u
// TODO: Implement actual search test logic with Tavily API
// For now, just return a success response
res.send({ status: 'Success', message: '搜索测试成功 | Search test successful', data: { query: text, results: [] } })
}
catch (error) {
res.send({ status: 'Fail', message: error.message, data: null })
}
})
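
The /search-test route above intentionally returns a stub (see the TODO and the lint warning about the unused search variable). Below is a minimal sketch of how the TODO might later be filled in, reusing the same @tavily/core calls that chatReplyProcess already uses; the import of tavily into this file and the shape of the returned results are assumptions, not part of this PR.

// Hypothetical follow-up to the TODO above; not part of this PR.
// Assumes `import { tavily } from '@tavily/core'` has been added to this file.
router.post('/search-test', rootAuth, async (req, res) => {
  try {
    const { search, text } = req.body as { search: import('./storage/model').SearchConfig; text: string }
    // Run a single query against Tavily using the submitted (not yet saved) config
    const tvly = tavily({ apiKey: search.options?.apiKey })
    const response = await tvly.search(text, { includeRawContent: true, timeout: 300 })
    res.send({ status: 'Success', message: '搜索测试成功 | Search test successful', data: { query: text, results: response.results } })
  }
  catch (error) {
    res.send({ status: 'Fail', message: error.message, data: null })
  }
})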

router.post('/setting-advanced', auth, async (req, res) => {
try {
const config = req.body as {