Example of including an image + prompt to gpt-4o? #216

Open
tigran-iii opened this issue Jun 17, 2024 · 1 comment
@tigran-iii

Hi.

I'm pretty new to both Swift and this package.

Could anyone share an example of how to send an image (from the asset catalog, or one uploaded from the device) together with a prompt to the gpt-4o endpoint?

My current progress is below, but I feel like I'm doing something terribly wrong.

The only requirement at this point is making it work :D

Current Error:
[screenshot of the error attached]

Current Code:

private func sendDataToAPI() async {
        let washRoutine = UserDefaults.standard.string(forKey: "washRoutine") ?? ""
        let beforeBedRoutine = UserDefaults.standard.string(forKey: "beforeBedRoutine") ?? ""
        let otherHairCare = UserDefaults.standard.string(forKey: "otherHairCare") ?? ""
        
        var messages: [ChatQuery.ChatCompletionMessageParam] = [
            .user(.init(content: .string("Here are my current hair care routines:"))),
            .user(.init(content: .string("Wash Routine: \(washRoutine)"))),
            .user(.init(content: .string("Before Bed Routine: \(beforeBedRoutine)"))),
            .user(.init(content: .string("Other Hair Care: \(otherHairCare)")))
        ]
        
        // The image is encoded as a base64 data URL but then sent as plain
        // text (.string), so the model receives it as text rather than as an image.
        if let image = UIImage(named: "curly_1"),
           let imageData = image.jpegData(compressionQuality: 1.0) {
            let base64String = imageData.base64EncodedString()
            let imageUrl = "data:image/jpeg;base64,\(base64String)"
            let imageParam = ChatQuery.ChatCompletionMessageParam.ChatCompletionUserMessageParam(content: .string(imageUrl))
            messages.append(.user(imageParam))
        }
        
        let openAI = OpenAI(apiToken: "<api_token>")
        let query = ChatQuery(messages: messages, model: .gpt4_o)
        
        do {
            let result = try await openAI.chats(query: query)
            let content = result.choices.first?.message.content?.string
            let tokenCount = result.usage?.promptTokens ?? 0
            
            DispatchQueue.main.async {
                self.apiResult = "\nPrompt tokens: \(tokenCount)\n\n\(content ?? "No content")"
            }
            
        } catch {
            DispatchQueue.main.async {
                self.apiResult = "Error fetching chats: \(error.localizedDescription)"
            }
        }
    }
@ddaddy

ddaddy commented Jul 2, 2024

I got this working. You need to use the .vision message param and not .string.

guard let imageData = image.jpegData(compressionQuality: 1.0) else { return }

// Pass the raw image data as a .vision content part; the library encodes
// it into a base64 data URL for you.
let imgParam = ChatQuery.ChatCompletionMessageParam.ChatCompletionUserMessageParam(content:
    .vision([
        .chatCompletionContentPartImageParam(.init(imageUrl: .init(url: imageData, detail: .high)))
    ])
)

// `system` and `prompt` are String values defined elsewhere.
let query = ChatQuery(
    messages: [
        .system(.init(content: system)),
        .user(imgParam),
        .user(.init(content: .string(prompt)))
    ],
    model: .gpt4_o,
    maxTokens: 500
)
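
For completeness, here is a minimal end-to-end sketch that folds the .vision fix back into the sendDataToAPI function from the question. It reuses the names from this thread (the "curly_1" asset, the apiResult property, the UserDefaults keys, and the <api_token> placeholder); the maxTokens: 500 cap is borrowed from the snippet above and is arbitrary:

private func sendDataToAPI() async {
    let washRoutine = UserDefaults.standard.string(forKey: "washRoutine") ?? ""
    let beforeBedRoutine = UserDefaults.standard.string(forKey: "beforeBedRoutine") ?? ""
    let otherHairCare = UserDefaults.standard.string(forKey: "otherHairCare") ?? ""

    // Encode the asset image and wrap it in a .vision content part,
    // as in the answer above.
    guard let image = UIImage(named: "curly_1"),
          let imageData = image.jpegData(compressionQuality: 1.0) else { return }

    let imageParam = ChatQuery.ChatCompletionMessageParam.ChatCompletionUserMessageParam(content:
        .vision([
            .chatCompletionContentPartImageParam(.init(imageUrl: .init(url: imageData, detail: .high)))
        ])
    )

    // One text message carrying the routines, plus the image message.
    let messages: [ChatQuery.ChatCompletionMessageParam] = [
        .user(.init(content: .string("""
        Here are my current hair care routines:
        Wash Routine: \(washRoutine)
        Before Bed Routine: \(beforeBedRoutine)
        Other Hair Care: \(otherHairCare)
        """))),
        .user(imageParam)
    ]

    let openAI = OpenAI(apiToken: "<api_token>")
    let query = ChatQuery(messages: messages, model: .gpt4_o, maxTokens: 500)

    do {
        let result = try await openAI.chats(query: query)
        let content = result.choices.first?.message.content?.string
        let tokenCount = result.usage?.promptTokens ?? 0

        // Hop back to the main actor before touching UI state.
        await MainActor.run {
            self.apiResult = "\nPrompt tokens: \(tokenCount)\n\n\(content ?? "No content")"
        }
    } catch {
        await MainActor.run {
            self.apiResult = "Error fetching chats: \(error.localizedDescription)"
        }
    }
}

Splitting the prompt text and the image into separate user messages mirrors ddaddy's snippet; the key point is that the image must travel as a .vision content part, not as a base64 string inside a plain text message.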
