Perplexity Search

Perplexity Sonar models generate web-search-based responses. The gateway lets you control search behavior in two ways:

  1. Model slug suffix: perplexity/sonar:high, perplexity/sonar-pro:pro
  2. Body field: { "web_search_options": { "search_context_size": "high", "search_type": "pro" } }

If both are provided, the body field wins.
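As a minimal sketch of the body-field form, the helper below builds a chat-completions body and attaches web_search_options only when a value is given. The function name build_chat_request and the prompt text are illustrative assumptions, not part of the gateway; the field names come from the list above. Nothing is sent over the network here.

```python
import json


def build_chat_request(model, prompt, search_context_size=None, search_type=None):
    """Build a chat-completions body, adding web_search_options only when set.

    Hypothetical helper for illustration; only the JSON field names are
    taken from the documentation above.
    """
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    options = {}
    if search_context_size is not None:
        options["search_context_size"] = search_context_size
    if search_type is not None:
        options["search_type"] = search_type
    if options:
        body["web_search_options"] = options
    return body


# The slug asks for :low while the body asks for "high". According to the
# rule stated above, the body field is the one that should take effect.
request = build_chat_request(
    "perplexity/sonar:low",
    "Summarize this week's central bank decisions.",
    search_context_size="high",
)
print(json.dumps(request, indent=2))
```

The printed JSON can be passed as the -d payload of a curl call to /v1/chat/completions. Leaving both keyword arguments unset produces a body with no web_search_options key, so only the slug suffix (if any) applies.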

Every Sonar request runs with a search_context_size, which controls search depth and how much retrieved web context the model uses.

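| Size   | Suffix alias  | Description                        |
|--------|---------------|------------------------------------|
| low    | :low          | Default, fastest and cheapest      |
| medium | :medium, :med | Balanced depth/cost                |
| high   | :high         | Maximum search depth, for research |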
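The alias table can be mirrored on the client as a small lookup, which is handy for logging or cost estimates. This is a sketch under assumptions: SIZE_ALIASES and context_size_from_slug are hypothetical names, and the actual parsing happens inside the gateway, whose implementation is not shown here.

```python
# Suffix aliases from the table above, mapped to the canonical size name.
SIZE_ALIASES = {
    "low": "low",
    "medium": "medium",
    "med": "medium",
    "high": "high",
}

DEFAULT_SIZE = "low"  # the table lists low as the default


def context_size_from_slug(slug):
    """Return (base_model, size) for a slug with a size suffix.

    If the slug has no suffix, or the suffix is not a size alias
    (for example :pro), the slug is returned unchanged with size None.
    """
    base, sep, suffix = slug.rpartition(":")
    if not sep:
        return slug, None
    size = SIZE_ALIASES.get(suffix)
    if size is None:
        return slug, None
    return base, size


base, size = context_size_from_slug("perplexity/sonar:med")
print(base, size or DEFAULT_SIZE)
```

A slug without a size suffix yields None, in which case the default from the table applies.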

Pricing can vary by model and mode. Confirm the current price before production use.

sonar-pro offers a deeper search mode for more complex questions. Trigger it with a suffix or body field:

curl https://llmtr.com/v1/chat/completions \
  -H "Authorization: Bearer sk_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "perplexity/sonar-pro:pro",
    "messages": [
      {"role": "user", "content": "What was Turkey's annual inflation in 2025, with sources?"}
    ]
  }'

Pro Search is only supported on sonar-pro; using :pro on another model returns 400 unsupported_capability.

Deep Research is meant for broader research and source gathering. Cost and latency can differ noticeably from standard Sonar calls.

Unknown suffixes such as perplexity/sonar:turbo return 400 invalid_request_error. They are rejected rather than silently ignored, so a typo in the slug cannot quietly change what you are billed for.
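The two rejection rules on this page can be approximated with a client-side pre-flight check, so obvious mistakes are caught before a request is sent. This is only a sketch: preflight, KNOWN_SIZE_SUFFIXES, and PRO_MODELS are hypothetical names, the suffix list covers only the suffixes mentioned on this page (the gateway may accept others), and the check does not verify that the base model exists.

```python
# Suffixes named on this page; the gateway's real list may be longer.
KNOWN_SIZE_SUFFIXES = {"low", "medium", "med", "high"}
PRO_SUFFIX = "pro"
# Per the text, Pro Search is only supported on sonar-pro.
PRO_MODELS = {"perplexity/sonar-pro"}


def preflight(slug):
    """Return None if the slug passes, else the error code the page describes."""
    base, sep, suffix = slug.rpartition(":")
    if not sep:
        return None  # no suffix, nothing to check
    if suffix in KNOWN_SIZE_SUFFIXES:
        return None
    if suffix == PRO_SUFFIX:
        # :pro on any other model is documented as 400 unsupported_capability
        return None if base in PRO_MODELS else "unsupported_capability"
    # anything else is documented as 400 invalid_request_error
    return "invalid_request_error"


for candidate in (
    "perplexity/sonar:high",
    "perplexity/sonar-pro:pro",
    "perplexity/sonar:pro",
    "perplexity/sonar:turbo",
):
    print(candidate, "->", preflight(candidate) or "ok")
```

The gateway remains the source of truth; a local check like this only shortens the feedback loop for typos.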

Perplexity also offers pplx-embed-v1-* and pplx-embed-context-v1-* embedding models. Standard embeddings call /v1/embeddings; contextualized embeddings use the same gateway endpoint with a nested array input:

curl https://llmtr.com/v1/embeddings \
  -H "Authorization: Bearer sk_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "perplexity/pplx-embed-context-v1-0.6b",
    "input": [
      ["Document 1, chunk 1.", "Document 1, chunk 2."],
      ["Document 2, single chunk."]
    ]
  }'

In contextualized embeddings, the outer array represents documents and the inner array represents chunks for each document.
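To make the nesting concrete, the sketch below validates a list of documents, builds the request body, and lists each chunk's (document, chunk) position in input order. The helper names build_contextual_input and chunk_positions are assumptions for illustration. The response format is not described here, so mapping returned vectors back to chunks is deliberately left out.

```python
import json


def build_contextual_input(documents):
    """Return the nested input: one inner list of chunk strings per document."""
    nested = []
    for doc_index, chunks in enumerate(documents):
        # A bare string would be iterated character by character, so reject it.
        if isinstance(chunks, str) or not chunks:
            raise ValueError(
                f"document {doc_index} must be a non-empty list of chunks"
            )
        nested.append(list(chunks))
    return nested


def chunk_positions(nested):
    """List (document_index, chunk_index) pairs in input order."""
    return [
        (doc_index, chunk_index)
        for doc_index, chunks in enumerate(nested)
        for chunk_index in range(len(chunks))
    ]


documents = [
    ["Document 1, chunk 1.", "Document 1, chunk 2."],
    ["Document 2, single chunk."],
]
payload = {
    "model": "perplexity/pplx-embed-context-v1-0.6b",
    "input": build_contextual_input(documents),
}
print(json.dumps(payload, indent=2))
print(chunk_positions(payload["input"]))
```

A document with a single chunk still needs its own inner list, as in the second entry above.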