Choose the LLM that powers your agent or connect your own endpoint.

Available models

Our default, proprietary models optimized for voice AI.
| Model | Best for |
| --- | --- |
| Raven V2 | Production-hardened, real-time voice interactions and high retrieval precision |
| Raven V3 | Improved grounding, paraphrasing, and robustness for enterprise voice use cases |
Raven models are purpose-built for conversational voice AI. Contact PolyAI for detailed benchmarks and guidance.

Configuring the model

1. Open model settings: navigate to Agent Settings > Large Language Model.
2. Select a model: choose the desired model from the dropdown.
3. Save changes: click Save to apply your changes.

In addition to Raven, you can select models from third-party providers:

- OpenAI models: see the official OpenAI model reference
- Anthropic Claude: see the Claude model documentation
- Amazon Nova: see the Amazon Bedrock model details

Bring your own model (BYOM)

PolyAI supports bring-your-own-model (BYOM) with a simple API integration. If you run your own LLM, expose an endpoint that follows the OpenAI chat/completions schema and PolyAI will treat it like any other provider.

Overview

1. Expose an API endpoint: accept and return data in the OpenAI chat/completions format.
2. Configure authentication: PolyAI can send either an x-api-key header or a Bearer token.
3. Enable streaming (optional): support streaming responses with stream: true for lower latency.

API endpoint

PolyAI sends POST requests whose JSON body follows the OpenAI chat/completions schema:

```json
{
  "model": "your-model-id",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "What's the weather today?" }
  ],
  "temperature": 0.7,
  "top_p": 1.0,
  "stream": false
}
```

Your endpoint may also receive additional OpenAI-style fields such as frequency_penalty and presence_penalty.
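If you enable streaming, OpenAI-compatible endpoints respond with server-sent events: one `data:` line per `chat.completion.chunk`, terminated by a `data: [DONE]` sentinel. The sketch below illustrates the chunk shape; `sse_chunks` is a hypothetical helper, and splitting the reply on whitespace stands in for real token-by-token generation.

```python
import json
import time
import uuid


def sse_chunks(reply: str, model: str = 'my-llm'):
    """Yield OpenAI-style chat.completion.chunk events for a reply string."""
    chunk_id = f'chatcmpl-{uuid.uuid4().hex}'
    created = int(time.time())
    # Stream the reply word by word; a real server would stream model tokens.
    for word in reply.split():
        chunk = {
            'id': chunk_id,
            'object': 'chat.completion.chunk',
            'created': created,
            'model': model,
            'choices': [{'index': 0,
                         'delta': {'content': word + ' '},
                         'finish_reason': None}],
        }
        yield f'data: {json.dumps(chunk)}\n\n'
    # The final chunk carries finish_reason, followed by the [DONE] sentinel.
    final = {
        'id': chunk_id,
        'object': 'chat.completion.chunk',
        'created': created,
        'model': model,
        'choices': [{'index': 0, 'delta': {}, 'finish_reason': 'stop'}],
    }
    yield f'data: {json.dumps(final)}\n\n'
    yield 'data: [DONE]\n\n'
```

In a Flask handler you would return this generator with a `text/event-stream` content type when the request sets `stream: true`.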

Authentication

| Method | Header sent by PolyAI |
| --- | --- |
| API Key | `x-api-key: YOUR_API_KEY` |
| Bearer | `Authorization: Bearer YOUR_TOKEN` |

Configure your server to accept one of the above.
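A server-side check for either header scheme can be as simple as the sketch below. `is_authorized` is an illustrative helper, and the credential values are placeholders; in production, load the expected secret from configuration and compare it safely.

```python
def is_authorized(headers: dict,
                  api_key: str = 'YOUR_API_KEY',
                  token: str = 'YOUR_TOKEN') -> bool:
    """Accept either header scheme: x-api-key or a Bearer token.

    `headers` is a plain dict here for illustration; real frameworks
    (e.g. Flask's request.headers) match header names case-insensitively.
    """
    if headers.get('x-api-key') == api_key:
        return True
    auth = headers.get('Authorization', '')
    return auth == f'Bearer {token}'
```

Reject requests that match neither scheme with an HTTP 401 response.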

Sample implementation (Python / Flask)

```python
from flask import Flask, request, jsonify
import time, uuid

app = Flask(__name__)

@app.route('/chat/completions', methods=['POST'])
def chat_completions():
    data = request.get_json(silent=True) or {}
    messages = data.get('messages', [])
    user_input = messages[-1].get('content', '') if messages else ''

    # TODO: insert your model inference here
    reply = f'You said: {user_input}'

    # Respond in the OpenAI chat.completion shape PolyAI expects.
    return jsonify({
        'id': f'chatcmpl-{uuid.uuid4().hex}',
        'object': 'chat.completion',
        'created': int(time.time()),
        'model': 'my-llm',
        'choices': [{
            'index': 0,
            'message': { 'role': 'assistant', 'content': reply },
            'finish_reason': 'stop'
        }]
    })

if __name__ == '__main__':
    app.run(port=8000)
```
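Before wiring the endpoint into PolyAI, you can exercise the handler in-process with Flask's built-in test client; no server or network is needed. This sketch repeats a minimal version of the handler so it runs standalone:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route('/chat/completions', methods=['POST'])
def chat_completions():
    data = request.get_json(silent=True) or {}
    messages = data.get('messages', [])
    user_input = messages[-1].get('content', '') if messages else ''
    return jsonify({
        'object': 'chat.completion',
        'choices': [{
            'index': 0,
            'message': {'role': 'assistant', 'content': f'You said: {user_input}'},
            'finish_reason': 'stop'
        }]
    })

# Exercise the endpoint in-process with the test client.
client = app.test_client()
resp = client.post('/chat/completions', json={
    'model': 'my-llm',
    'messages': [{'role': 'user', 'content': 'hello'}],
})
print(resp.get_json()['choices'][0]['message']['content'])
```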

Final checklist

Before going live, verify all of the following:
  • Endpoint reachable with POST.
  • Request/response match OpenAI chat/completions schema.
  • Authentication header configured (API Key or Bearer token).
  • (Optional) Streaming supported if needed.
Send to your PolyAI contact:
  • Endpoint URL
  • Model ID
  • Auth method & credential
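The checklist can be partly automated with a short probe script using only the standard library. The helpers below are illustrative (URL, key, and model ID are placeholders): `build_probe` constructs a minimal chat/completions request and `validate_completion` checks the response for the fields PolyAI expects.

```python
import json
import urllib.request


def build_probe(url: str, api_key: str, model_id: str) -> urllib.request.Request:
    """Build a minimal POST request in the OpenAI chat/completions format."""
    body = json.dumps({
        'model': model_id,
        'messages': [{'role': 'user', 'content': 'ping'}],
        'stream': False,
    }).encode()
    return urllib.request.Request(
        url,
        data=body,
        headers={'Content-Type': 'application/json', 'x-api-key': api_key},
        method='POST',
    )


def validate_completion(payload: dict) -> bool:
    """Check that a response carries the chat.completion fields PolyAI expects."""
    try:
        choice = payload['choices'][0]
        return (
            payload['object'] == 'chat.completion'
            and choice['message']['role'] == 'assistant'
            and isinstance(choice['message']['content'], str)
        )
    except (KeyError, IndexError, TypeError):
        return False
```

To run the probe against your deployed endpoint: `payload = json.load(urllib.request.urlopen(build_probe(url, key, model_id), timeout=10))`, then confirm `validate_completion(payload)` is true.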