Choose the LLM that powers your agent or connect your own endpoint.
Available models
PolyAI
OpenAI
Amazon Bedrock
Our default, proprietary models optimized for voice AI.

| Model | Best for |
|---|---|
| Raven V2 | Production-hardened, real-time voice interactions and high retrieval precision |
| Raven V3 | Improved grounding, paraphrasing, and robustness for enterprise voice use cases |
Raven models are purpose-built for conversational voice AI. Contact PolyAI for detailed benchmarks and guidance.
| Model | Best for |
|---|---|
| GPT-5.2 | High-quality interactions requiring nuance and strong reasoning |
| GPT-5.2 chat | Extended dialogue and conversational stability |
| GPT-5 mini | Lower latency and reduced cost for mid-complexity use cases |
| GPT-5 nano | Simple tasks and fast-response workloads |
| GPT-4o | Versatile balance of reasoning, speed, and cost |
| GPT-4o mini | Everyday queries and high-volume deployments |
| GPT-4.1 | Strong reasoning with improved cross-task performance |
| GPT-4.1 mini | Cost-effective, latency-focused option for lighter workloads |
| GPT-4.1 nano | Minimal compute and high throughput |
See OpenAI model documentation for detailed specifications.

| Model | Best for |
|---|---|
| Claude 3.5 Haiku | Simple, predictable tasks with strong safety alignment |
| Nova Micro | Efficiency with strong general-purpose performance |
See Anthropic Claude docs and Amazon Nova docs for more details.
Configuring the model
Open model settings
Navigate to Agent Settings > Large Language Model.
Select a model
Choose the desired model from the dropdown.
Save changes
Click Save to apply your changes.
- OpenAI models: Official OpenAI model reference
- Anthropic Claude: Claude model documentation
- Amazon Nova: Amazon Bedrock model details
Bring your own model (BYOM)
PolyAI supports bring-your-own-model (BYOM) with a simple API integration. If you run your own LLM, expose an endpoint that follows the OpenAI chat/completions schema and PolyAI will treat it like any other provider.
Overview
Expose an API endpoint
Accept and return data in the OpenAI chat/completions format.
Configure authentication
PolyAI can send either an x-api-key header or a Bearer token.
Enable streaming (optional)
Support streaming responses using stream: true for lower latency.
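When streaming is enabled, responses are sent as server-sent events in OpenAI's `chat.completion.chunk` format, where each event carries a `delta` rather than a full message. The sketch below shows one way to format such chunks; the helper name and default model ID are illustrative, not part of any PolyAI API.

```python
import json

def sse_chunk(content, model='my-llm', finish_reason=None):
    """Format one OpenAI-style streaming chunk as a server-sent event."""
    payload = {
        'object': 'chat.completion.chunk',
        'model': model,
        'choices': [{
            'index': 0,
            # Each chunk carries an incremental delta, not the full message.
            'delta': {'content': content} if content else {},
            'finish_reason': finish_reason,
        }],
    }
    return f'data: {json.dumps(payload)}\n\n'

# A streamed reply is a sequence of chunks followed by a sentinel, e.g.:
#   sse_chunk('Hel'), sse_chunk('lo'),
#   sse_chunk('', finish_reason='stop'), 'data: [DONE]\n\n'
```

In a Flask handler, these strings would be yielded from a generator returned with the `text/event-stream` content type.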
API endpoint
Authentication
| Method | Header sent by PolyAI |
|---|---|
| API Key | `x-api-key: YOUR_API_KEY` |
| Bearer | `Authorization: Bearer YOUR_TOKEN` |
Configure your server to accept one of the above.
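A minimal sketch of accepting either scheme, assuming a plain dict of headers (Flask's `request.headers` is case-insensitive, so both spellings of the key work there):

```python
def extract_credential(headers):
    """Return the API credential from either auth scheme PolyAI may use."""
    # API-key scheme: the credential arrives verbatim in x-api-key.
    api_key = headers.get('x-api-key')
    if api_key:
        return api_key
    # Bearer scheme: strip the "Bearer " prefix from Authorization.
    auth = headers.get('Authorization', '')
    if auth.startswith('Bearer '):
        return auth[len('Bearer '):]
    return None
```

Compare the returned value against your stored credential and reject the request with a 401 if it does not match.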
Sample implementation (Python / Flask)
```python
from flask import Flask, request, jsonify
import time, uuid

app = Flask(__name__)

@app.route('/chat/completions', methods=['POST'])
def chat_completions():
    data = request.json
    messages = data.get('messages', [])
    user_input = messages[-1]['content'] if messages else ''

    # TODO: insert your model inference here
    reply = f'You said: {user_input}'

    return jsonify({
        'id': f'chatcmpl-{uuid.uuid4().hex}',
        'object': 'chat.completion',
        'created': int(time.time()),
        'model': 'my-llm',
        'choices': [{
            'index': 0,
            'message': {'role': 'assistant', 'content': reply},
            'finish_reason': 'stop'
        }]
    })
```
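For reference, this is the round trip the sample server implements: a chat/completions request body on the way in, and a `chat.completion` object on the way out. The concrete values below (model ID, message text) are placeholders for illustration.

```python
import json

# A request in the chat/completions schema, as PolyAI would send it:
request_body = {
    'model': 'my-llm',
    'messages': [
        {'role': 'system', 'content': 'You are a helpful voice agent.'},
        {'role': 'user', 'content': 'What time do you open?'},
    ],
    'stream': False,
}

# The non-streaming response shape the sample server returns:
response_body = {
    'id': 'chatcmpl-example',
    'object': 'chat.completion',
    'created': 0,
    'model': 'my-llm',
    'choices': [{
        'index': 0,
        'message': {'role': 'assistant',
                    'content': 'You said: What time do you open?'},
        'finish_reason': 'stop',
    }],
}

def extract_reply(body):
    """Return the assistant text from a chat.completion response."""
    return body['choices'][0]['message']['content']
```

The last entry in `messages` is the current user turn; earlier entries carry the system prompt and prior dialogue.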
Final checklist
Before going live, verify that your endpoint accepts the chat/completions request format, that authentication succeeds with your chosen scheme, and that streaming works if you enabled it. Then send the following to your PolyAI contact:
Endpoint URL
Model ID
Auth method & credential