AI Model Inference
Perceptron provides an OpenAI-compatible API for AI model inference. No subscription or credit card needed. You pay per token using credits purchased with USDT.
The API also supports streaming. Point the OpenAI SDK, or any other OpenAI-compatible inference library, at Perceptron to use it as a drop-in replacement for the OpenAI API or OpenRouter.
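As a rough illustration of the drop-in usage described above, the sketch below configures the official OpenAI Python SDK with Perceptron's base URL. The environment variable name PERCEPTRON_API_KEY and the helper names make_client and chat_kwargs are assumptions made for this example, not part of Perceptron's documentation.

```python
import os

# Base URL taken from this page.
PERCEPTRON_BASE_URL = "https://perceptron.cloud/api/v1"


def make_client(api_key=None):
    """Return an OpenAI SDK client that sends its requests to Perceptron."""
    # Imported lazily so this module loads even when the SDK is missing
    # (install it with: pip install openai).
    from openai import OpenAI

    return OpenAI(
        base_url=PERCEPTRON_BASE_URL,
        # PERCEPTRON_API_KEY is an assumed variable name, not an official one.
        api_key=api_key or os.environ["PERCEPTRON_API_KEY"],
    )


def chat_kwargs(model, prompt, stream=False):
    """Build keyword arguments for client.chat.completions.create()."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }


# Usage (performs a network call, so it is left commented out):
# client = make_client()
# reply = client.chat.completions.create(**chat_kwargs("moonshotai/kimi-k2.6", "Hello!"))
# print(reply.choices[0].message.content)
```

Only the base URL and API key differ from a stock OpenAI setup, so existing code built on the SDK should need no other changes, provided the model ID is one that Perceptron serves.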
Base URL
https://perceptron.cloud/api/v1
Authentication
Include your API key in the Authorization header:
Authorization: Bearer YOUR_API_KEY
Get an API key from the TUI Keys page.
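One way to keep the key out of source code is to read it from the environment and build the header in a single place. This is a minimal sketch: the function name auth_headers and the variable name PERCEPTRON_API_KEY are assumptions chosen for illustration.

```python
import os


def auth_headers(api_key=None):
    """Return request headers carrying the Bearer token described above."""
    # Fall back to an environment variable (assumed name) when no key is passed.
    key = api_key or os.environ.get("PERCEPTRON_API_KEY", "")
    if not key:
        raise ValueError("No API key given and PERCEPTRON_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

The returned dictionary can be passed as the headers argument of most HTTP clients.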
Available Models
curl https://perceptron.cloud/api/models
Common options:
| Model | Context | Input Price | Output Price |
|---|---|---|---|
| Deepseek V4 Flash | 1M | $0.06/1M tokens | $0.28/1M tokens |
| Kimi K2.6 | 1M | $0.27/1M tokens | $3.80/1M tokens |
Chat Completions
POST /api/v1/chat/completions
Request
{
  "model": "moonshotai/kimi-k2.6",
  "messages": [
    {
      "role": "user",
      "content": "What is the capital of Japan?"
    }
  ]
}
Response
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "moonshotai/kimi-k2.6",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "The capital of Japan is Tokyo."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 10,
    "total_tokens": 20
  }
}
Streaming
Set "stream": true in the request body to receive a Server-Sent Events (SSE) stream. Each event carries a delta with the next fragment of the response, and the stream ends with data: [DONE].
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"},"index":0}]}
data: [DONE]
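The event format above can be consumed with a small parser. The following is a sketch that assumes every event arrives as a single "data:" line, as in the sample; the function name iter_deltas is made up for this example.

```python
import json


def iter_deltas(lines):
    """Yield content fragments from an iterable of SSE lines (str or bytes)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # Skip blank separator lines and anything that is not data.
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # End-of-stream marker.
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# The two events shown above, separated by a blank line as in an SSE stream.
sample = [
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk",'
    '"choices":[{"delta":{"content":"Hello"},"index":0}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_deltas(sample)))  # prints: Hello
```

With the requests library, the same function should accept response.iter_lines() from a call made with stream=True, since that iterator yields bytes lines.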
curl Example
curl https://perceptron.cloud/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-coder-next",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'
Python Example
import requests

response = requests.post(
    url="https://perceptron.cloud/api/v1/chat/completions",
    headers={
        "Authorization": "Bearer <PERCEPTRON_API_KEY>",
        "HTTP-Referer": "<YOUR_SITE_URL>",  # Optional site URL for rankings on Perceptron Store.
        "X-Perceptron-Title": "<YOUR_SITE_NAME>",  # Optional site title for rankings on Perceptron Store.
    },
    # json= serializes the body and sets Content-Type: application/json.
    json={
        "model": "deepseek/deepseek-v4-pro",
        "messages": [
            {
                "role": "user",
                "content": "What is the purpose of life?"
            }
        ]
    },
    timeout=60,
)
response.raise_for_status()  # Raise an exception on HTTP 4xx/5xx responses.
print(response.json()["choices"][0]["message"]["content"])
Pricing
Inference is billed per request, based on the tokens used at each model's per-million-token rate. Charges are deducted from your credit balance. See Billing for more details.
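To make the arithmetic concrete, the sketch below estimates the cost of one request from the usage object of a response and the prices in the Available Models table. It assumes a charge is simply tokens times the listed rate, with no rounding, minimums, or fees; the actual billing rules are on the Billing page. The names PRICES and estimate_cost are illustrative only.

```python
from decimal import Decimal

# USD per 1M tokens (input, output), copied from the Available Models table.
PRICES = {
    "Deepseek V4 Flash": (Decimal("0.06"), Decimal("0.28")),
    "Kimi K2.6": (Decimal("0.27"), Decimal("3.80")),
}
MILLION = Decimal(1_000_000)


def estimate_cost(model_name, usage):
    """Estimate the USD cost of one request from its usage object."""
    input_price, output_price = PRICES[model_name]
    total = (
        usage["prompt_tokens"] * input_price
        + usage["completion_tokens"] * output_price
    )
    return total / MILLION


# Usage object from the sample Chat Completions response on this page.
usage = {"prompt_tokens": 10, "completion_tokens": 10, "total_tokens": 20}
print(estimate_cost("Kimi K2.6", usage))
```

Decimal is used instead of float so that very small per-request amounts add up without binary rounding error.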