Core Concepts
LLM Configuration
Control model selection, sampling parameters, and custom tool schemas to shape how your agent reasons and acts.
Available models
gpt-4o: Best balance of quality and speed. Supports function calling, streaming, and vision. Recommended for production.
gpt-4o-mini: Faster and cheaper. Good for simple scripts, short calls, or high-volume outbound campaigns where quality matters less than throughput.
gpt-4-turbo: GPT-4 Turbo. Higher quality on complex reasoning but ~30% slower than gpt-4o. Use only if gpt-4o underperforms on your use case.
Sampling parameters
llmTemperature: Controls randomness. Use 0.2 for factual, scripted agents (support, bookings) and 0.7–1.0 for conversational, creative personas (sales, concierge).
llmMaxTokens: Maximum tokens in the agent's response. Default 200. Keep this low, because long responses add TTS latency. Agents rarely need more than 150 tokens per turn.
llmTopP: Nucleus sampling threshold. Default 1.0 (disabled). Leave it at the default unless you have a specific reason to change it.
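As an illustration of the guidance above, the following Python sketch builds a settings object from two presets. The three llm* field names and their values come from this page; the "llmModel" key, the preset names, and the build_llm_settings helper are assumptions made for the example, not a documented Talknex API.

```python
# Presets that follow the guidance above. The 0.8 value is one arbitrary
# point inside the suggested 0.7-1.0 range for conversational personas.
FACTUAL = {"llmTemperature": 0.2, "llmMaxTokens": 150, "llmTopP": 1.0}
CONVERSATIONAL = {"llmTemperature": 0.8, "llmMaxTokens": 150, "llmTopP": 1.0}


def build_llm_settings(model, conversational=False, **overrides):
    """Return a settings dict for one agent.

    "llmModel" is an assumed key name for the model choice; check the
    field name your agent configuration actually uses.
    """
    settings = {"llmModel": model}
    settings.update(CONVERSATIONAL if conversational else FACTUAL)
    settings.update(overrides)  # explicit overrides win over the preset
    return settings


support_agent = build_llm_settings("gpt-4o")
sales_agent = build_llm_settings("gpt-4o-mini", conversational=True, llmMaxTokens=100)
```

Keeping the presets in one place makes it harder for individual agents to drift toward long, slow responses by accident.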
Custom function tools
Custom tools let you extend agent behavior beyond the built-in book_appointment and transfer_to_human tools. When the LLM decides to call a custom tool, Talknex POSTs the arguments to your configured webhook URL and injects the response back into the conversation.
Example tool definition (JSON Schema):
{
  "name": "lookup_order",
  "description": "Look up an order by order number and return shipping status.",
  "parameters": {
    "type": "object",
    "properties": {
      "orderNumber": {
        "type": "string",
        "description": "The order number the caller provided, e.g. #82144"
      }
    },
    "required": ["orderNumber"]
  },
  "webhookUrl": "https://your-api.com/webhooks/talknex/lookup-order"
}
Expected webhook response:
{
  "result": "Order #82144 shipped 2026-05-04. Expected delivery: 2026-05-07. Carrier: FedEx. Tracking: 794644792798."
}
Tool webhook calls must respond within 3 seconds or the agent will fall back to a generic "I couldn't retrieve that information" response.
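A webhook handler for the lookup_order tool might look roughly like the Python sketch below. It assumes the POST body is the JSON arguments object itself (for example {"orderNumber": "#82144"}); the exact request envelope is not specified here, so verify it against a real request. The ORDERS store, the find_order and handle_lookup_order names, and the 2.5-second internal deadline are all hypothetical choices for illustration.

```python
import json
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

# Hypothetical in-memory store standing in for a real database or API.
ORDERS = {
    "82144": {"shipped": "2026-05-04", "eta": "2026-05-07", "carrier": "FedEx"},
}

# Internal budget chosen to leave headroom under the 3-second limit.
DEADLINE_SECONDS = 2.5

_pool = ThreadPoolExecutor(max_workers=4)


def find_order(order_number):
    """Return a short, speakable status sentence for one order."""
    key = order_number.strip().lstrip("#")
    order = ORDERS.get(key)
    if order is None:
        return "No order found with number " + key + "."
    return (
        "Order #" + key + " shipped " + order["shipped"] + ". "
        "Expected delivery: " + order["eta"] + ". Carrier: " + order["carrier"] + "."
    )


def handle_lookup_order(raw_body, lookup=find_order, deadline=DEADLINE_SECONDS):
    """Turn a raw POST body into the {"result": ...} response shape.

    Assumes the body is the tool's JSON arguments object.
    """
    args = json.loads(raw_body)
    future = _pool.submit(lookup, args["orderNumber"])
    try:
        return {"result": future.result(timeout=deadline)}
    except FutureTimeout:
        # Answer in time with our own message rather than letting the
        # platform's generic fallback fire.
        return {"result": "The order system is slow right now. Please try again shortly."}
```

Running the lookup under an internal deadline means the caller hears a message you wrote even when the backend is slow. The function can be wired into any web framework by passing it the request body and serializing the returned dict as JSON.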
Prompt engineering tips
Be explicit about when to use tools
The LLM needs clear cues. "When the caller gives their order number, call lookup_order immediately" outperforms implicit expectations.
Constrain output length in the prompt
Add "Keep all responses under 30 words" or "Be brief." to the system prompt. LLMs left unconstrained tend toward long replies that take more time to speak.
Use few-shot examples for edge cases
Append 2–3 Q&A examples to the system prompt for calls you know are tricky. Few-shot beats extensive instructions for specific patterns.
Separate facts from behavior
Keep product and policy facts in the knowledge base (RAG), not the system prompt. Shorter prompts mean lower latency and easier updates.
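The tips above can be combined when assembling a system prompt. The Python sketch below is one possible way to do it: the build_system_prompt helper, the persona text, and the sample exchanges are hypothetical, and product facts are deliberately left out so they can live in the knowledge base.

```python
def build_system_prompt(persona, tool_rules, examples, max_words=30):
    """Assemble a short system prompt from behavior rules only."""
    lines = [persona, "", "Tool use:"]
    # Explicit cues for when each tool should be called.
    lines.extend("- " + rule for rule in tool_rules)
    lines.append("")
    # Length constraint, so spoken replies stay short.
    lines.append("Keep all responses under " + str(max_words) + " words.")
    if examples:
        lines.append("")
        lines.append("Examples:")
        # Few-shot pairs for calls known to be tricky.
        for caller, agent in examples:
            lines.append("Caller: " + caller)
            lines.append("Agent: " + agent)
    return "\n".join(lines)


prompt = build_system_prompt(
    persona="You are a support agent for an online store.",
    tool_rules=[
        "When the caller gives their order number, call lookup_order immediately.",
    ],
    examples=[
        ("Where's my stuff?", "Happy to check. What's your order number?"),
        ("It's 82144, I think.", "Thanks, let me look that up."),
    ],
)
```

Generating the prompt from structured inputs also makes it easy to add or remove a few-shot example without touching the rest of the text.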