Swift Voice Agent starter app

This starter app template for LiveKit Agents provides a simple voice interface using the LiveKit Swift SDK. It supports voice, transcriptions, live video input, and virtual avatars.

This template is compatible with iOS, iPadOS, macOS, and visionOS and is free for you to use or modify as you see fit.

Getting started

First, you'll need a LiveKit agent to speak with. Try our starter agent for Python or Node.js, or create your own from scratch.

Second, you need a token server. The easiest way to set this up is with the Sandbox for LiveKit Cloud and the LiveKit CLI.

Create a new Sandbox Token Server for your LiveKit Cloud project. Then, run the following command to automatically clone this template and connect it to LiveKit Cloud. This will create a new Xcode project in the current directory.

lk app create --template agent-starter-swift --sandbox <token_server_sandbox_id>

Then, build and run the app from Xcode by opening VoiceAgent.xcodeproj. You may need to adjust your app signing settings to run the app on your device.

Note

To set up without the LiveKit CLI, clone the repository and then either create a VoiceAgent/.env.xcconfig with a LIVEKIT_SANDBOX_ID (if using a Sandbox Token Server), or open TokenService.swift and add your manually generated URL and token.

Feature overview

This starter app supports a number of Agents framework features. Each one can be enabled or disabled in code as you adapt this template to your own use case.

Text, video, and voice input

This app supports text, video, and/or voice input according to the needs of your agent. To update the features enabled in the app, edit VoiceAgent/VoiceAgentApp.swift and update AgentFeatures.current to include or exclude the features you need.

By default, only voice and text input are enabled.

Available input types:

.voice: Allows the user to speak to the agent using their microphone. Requires microphone permissions.
.text: Allows the user to type to the agent. See the docs for more details.
.video: Allows the user to share their camera or screen with the agent. This requires a supported model like the Gemini Live API. See the docs for more details.

If you have trouble with screen sharing, refer to the docs for more setup instructions.
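As a rough illustration of how such a feature set can be toggled, the sketch below models AgentFeatures as a Swift OptionSet. This is an assumption made for illustration: the actual declaration in VoiceAgent/VoiceAgentApp.swift may look different, and the helper function showsVideoControls is hypothetical.

```swift
/// Hypothetical sketch of a feature set; the template's real declaration may differ.
struct AgentFeatures: OptionSet {
    let rawValue: Int

    static let voice = AgentFeatures(rawValue: 1 << 0)
    static let text = AgentFeatures(rawValue: 1 << 1)
    static let video = AgentFeatures(rawValue: 1 << 2)

    /// Mirrors the default described above: voice and text only.
    static let current: AgentFeatures = [.voice, .text]
}

/// Hypothetical helper: decides whether camera and screen share controls should be shown.
func showsVideoControls(for features: AgentFeatures) -> Bool {
    features.contains(.video)
}

// To also enable camera and screen sharing input, `current` would be set to a value like this:
let allInputs: AgentFeatures = [.voice, .text, .video]
```

With a set like this, enabling or disabling an input type is a one-line change, and views can check membership with contains(_:) before showing the matching controls.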

Preconnect audio buffer

This app uses withPreConnectAudio to capture and buffer audio before the room connection completes. This allows the connection to appear "instant" from the user's perspective and makes your app more responsive. To disable this feature, remove the call to withPreConnectAudio as follows:

Location: VoiceAgent/App/AppViewModel.swift → connectWithVoice()
To disable preconnect buffering but keep voice input, replace the withPreConnectAudio { ... } block with a standard room.connect call and enable the microphone in one of two ways:
- Connect with connectOptions: .init(enableMicrophone: true), without wrapping the call in withPreConnectAudio, or
- Connect with the microphone disabled and call room.localParticipant.setMicrophone(enabled: true) after the connection completes.
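To make the idea behind preconnect buffering concrete, here is a minimal, self-contained sketch of the pattern: frames captured before the connection completes are held in memory and flushed in order once it does. This is not the LiveKit SDK's implementation. The type PreConnectBuffer, its methods, and the maxFrames limit are all hypothetical names chosen for illustration.

```swift
import Foundation

/// Hypothetical illustration of preconnect buffering; not the LiveKit SDK's implementation.
struct PreConnectBuffer {
    private(set) var pending: [Data] = []
    private(set) var isConnected = false
    let maxFrames: Int

    init(maxFrames: Int = 500) {
        self.maxFrames = maxFrames
    }

    /// Handles one captured audio frame.
    /// Returns the frames that can be sent immediately (empty while still connecting).
    mutating func capture(_ frame: Data) -> [Data] {
        if isConnected {
            return [frame]
        }
        pending.append(frame)
        if pending.count > maxFrames {
            pending.removeFirst()  // Drop the oldest frame to bound memory use.
        }
        return []
    }

    /// Marks the connection as complete and returns buffered frames in capture order.
    mutating func connectionDidComplete() -> [Data] {
        isConnected = true
        let flushed = pending
        pending = []
        return flushed
    }
}
```

Removing the buffer corresponds to the two options above: nothing the user says before the connection completes is retained, so the agent only hears audio captured after the microphone is enabled.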

Virtual avatar support

If your agent publishes a virtual avatar, this app will automatically render the avatar’s camera feed in AgentParticipantView when available.

Token generation in production

In a production environment, you are responsible for generating tokens for your users through a service that integrates with your own authentication system. You should disable your sandbox token server and modify TokenService.swift to use your own token server.
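The sketch below shows one possible shape for the client side of such a token server, using only Foundation. Everything specific here is an assumption: the JSON field names (roomName, participantName, serverUrl, participantToken), the bearer-token header, and the function names are illustrative and should be matched to whatever your own backend expects. The network call itself is left out.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives in this module on Linux.
#endif

/// Hypothetical response shape returned by your own token server.
struct ConnectionDetails: Decodable, Equatable {
    let serverUrl: String
    let participantToken: String
}

/// Builds an authenticated POST request asking your backend for a LiveKit token.
/// `sessionToken` stands in for whatever credential your authentication system issues.
func makeTokenRequest(
    endpoint: URL,
    roomName: String,
    participantName: String,
    sessionToken: String
) throws -> URLRequest {
    var request = URLRequest(url: endpoint)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.setValue("Bearer \(sessionToken)", forHTTPHeaderField: "Authorization")
    request.httpBody = try JSONEncoder().encode([
        "roomName": roomName,
        "participantName": participantName,
    ])
    return request
}

/// Decodes the server's JSON response into connection details.
func decodeConnectionDetails(from data: Data) throws -> ConnectionDetails {
    try JSONDecoder().decode(ConnectionDetails.self, from: data)
}
```

In an adapted TokenService.swift, the request would be sent with URLSession and the decoded URL and token passed on to the room connection. Keeping token signing on the server means your LiveKit API secret never ships inside the app.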

Running on Simulator

To use this template with video (or screen sharing) input, you need to run the app on a physical device. Testing on the Simulator will still support voice and text modes, as well as virtual avatars.
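If you switch between the Simulator and a device often, one option is to strip video input at compile time on the Simulator. The sketch below assumes a hypothetical InputFeatures option set and a hypothetical supportedFeatures(from:) helper. Neither is part of the template, and the template may handle this differently.

```swift
/// Hypothetical feature set used only for this illustration.
struct InputFeatures: OptionSet {
    let rawValue: Int

    static let voice = InputFeatures(rawValue: 1 << 0)
    static let text = InputFeatures(rawValue: 1 << 1)
    static let video = InputFeatures(rawValue: 1 << 2)
}

/// Returns the requested features, minus video when built for the Simulator.
func supportedFeatures(from requested: InputFeatures) -> InputFeatures {
    #if targetEnvironment(simulator)
    // Video and screen sharing input need a physical device.
    return requested.subtracting(.video)
    #else
    return requested
    #endif
}
```

Because targetEnvironment(simulator) is evaluated at compile time, device builds keep the full requested set and pay no runtime cost for the check.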

Submitting to the App Store

The LiveKitWebRTC.xcframework binary framework, which is part of the LiveKit Swift SDK, does not contain dSYMs. Submitting the app to the App Store will result in the following warning:

The archive did not include a dSYM for the LiveKitWebRTC.framework with the UUIDs [...]. Ensure that the archive's dSYM folder includes a DWARF file for LiveKitWebRTC.framework with the expected UUIDs.

This warning will not prevent the app from being submitted to the App Store or from passing the review process.

Contributing

This template is open source and we welcome contributions! Please open a PR or issue through GitHub, and don't forget to join us in the LiveKit Community Slack!
