AI Model Inference

Perceptron provides an OpenAI-compatible API for AI model inference. No subscription or credit card needed. You pay per token using credits purchased with USDT.

The API also supports streaming. You can point the OpenAI SDK or other LLM inference libraries at Perceptron and use it as a drop-in replacement for the OpenAI API or OpenRouter.
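As a rough sketch of the drop-in usage described above, the `openai` Python package can be pointed at the Perceptron base URL. This assumes the package is installed and that the key is stored in an environment variable named `PERCEPTRON_API_KEY` (a name chosen here for illustration). `make_client` and `ask` are hypothetical helpers, not part of any SDK.

```python
import os

# Base URL taken from the "Base URL" section of this page.
PERCEPTRON_BASE_URL = "https://perceptron.cloud/api/v1"


def make_client(api_key=None):
    """Return an OpenAI SDK client that talks to Perceptron instead of OpenAI."""
    # Imported lazily so this module still loads where `openai` is not installed.
    from openai import OpenAI

    # PERCEPTRON_API_KEY is an assumed environment variable name.
    key = api_key or os.environ["PERCEPTRON_API_KEY"]
    return OpenAI(base_url=PERCEPTRON_BASE_URL, api_key=key)


def ask(client, model, prompt):
    """Send a single user message and return the assistant's reply text."""
    completion = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return completion.choices[0].message.content
```

A call such as `ask(make_client(), "moonshotai/kimi-k2.6", "What is the capital of Japan?")` would then mirror the Chat Completions request shown later on this page.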

Base URL

https://perceptron.cloud/api/v1

Authentication

Include your API key in the Authorization header:

Authorization: Bearer YOUR_API_KEY

Get an API key from the TUI Keys page.
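For callers who prefer not to use an SDK, the header above can be attached by hand. The sketch below uses only the Python standard library. `build_request` is a hypothetical helper and `PERCEPTRON_API_KEY` is an assumed environment variable name. The function builds the request without sending it.

```python
import json
import os
import urllib.request

# Base URL taken from the "Base URL" section of this page.
BASE_URL = "https://perceptron.cloud/api/v1"


def build_request(path, payload, api_key=None):
    """Build (but do not send) an authenticated JSON POST request."""
    # PERCEPTRON_API_KEY is an assumed environment variable name.
    key = api_key or os.environ["PERCEPTRON_API_KEY"]
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request. Reading the key from the environment keeps it out of source control.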

Available Models

curl https://perceptron.cloud/api/models

Common options:

Model               Context   Input Price       Output Price
Deepseek V4 Flash   1M        $0.06/1M tokens   $0.28/1M tokens
Kimi K2.6           1M        $0.27/1M tokens   $3.80/1M tokens

Chat Completions

POST /api/v1/chat/completions

Request

{
  "model": "moonshotai/kimi-k2.6",
  "messages": [
    {
      "role": "user",
      "content": "What is the capital of Japan?"
    }
  ]
}

Response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "moonshotai/kimi-k2.6",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "The capital of Japan is Tokyo."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 10,
    "total_tokens": 20
  }
}
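To make the response shape concrete, here is a minimal sketch that pulls the reply text, finish reason, and token count out of a body like the one above. `extract_reply` is a hypothetical helper. It assumes exactly the fields shown in the sample and does no error handling.

```python
import json


def extract_reply(response_body):
    """Return (assistant_text, finish_reason, total_tokens) from a response body."""
    data = json.loads(response_body)
    choice = data["choices"][0]  # the sample response contains a single choice
    return (
        choice["message"]["content"],
        choice["finish_reason"],
        data["usage"]["total_tokens"],
    )


# Trimmed copy of the sample response, used only for illustration.
SAMPLE = """
{
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "The capital of Japan is Tokyo."},
    "finish_reason": "stop"
  }],
  "usage": {"prompt_tokens": 10, "completion_tokens": 10, "total_tokens": 20}
}
"""

text, reason, tokens = extract_reply(SAMPLE)
```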

Streaming

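Set "stream": true in the request body to receive a Server-Sent Events (SSE) stream. Each event carries a chunk whose delta holds the next piece of the reply, and the stream ends with a [DONE] event: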

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"},"index":0}]}

data: [DONE]
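A client has to stitch the deltas back together. The sketch below shows one way to do that for lines that have already been decoded to text. `collect_stream` is a hypothetical helper. It assumes every data line other than `[DONE]` is a JSON chunk shaped like the example above.

```python
import json


def collect_stream(lines):
    """Concatenate the content deltas from an iterable of SSE text lines."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            piece = choice.get("delta", {}).get("content")
            if piece:
                parts.append(piece)
    return "".join(parts)
```

Any HTTP client that yields the response line by line can feed this function. If the client yields bytes, decode them to text first.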

curl Example

curl https://perceptron.cloud/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-coder-next",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'

Python Example

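import requests

response = requests.post(
    url="https://perceptron.cloud/api/v1/chat/completions",
    headers={
        "Authorization": "Bearer <PERCEPTRON_API_KEY>",
        "HTTP-Referer": "<YOUR_SITE_URL>",  # Optional site URL for rankings on Perceptron Store.
        "X-Perceptron-Title": "<YOUR_SITE_NAME>",  # Optional site title for rankings on Perceptron Store.
    },
    # json= serializes the payload and sets Content-Type: application/json.
    json={
        "model": "deepseek/deepseek-v4-pro",
        "messages": [
            {
                "role": "user",
                "content": "What is the purpose of life?"
            }
        ]
    },
    timeout=60,
)
response.raise_for_status()  # Raise on 4xx/5xx instead of failing silently.

# Print the assistant's reply from the first choice.
print(response.json()["choices"][0]["message"]["content"])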

Pricing

Inference is billed per request: each request is charged for its input and output tokens at the model's per-million-token prices. Charges are deducted from your credit balance. See Billing for more details.
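As a worked example of the arithmetic, the sketch below estimates the charge for one request from its `usage` counts. The prices are copied from the table under Available Models. `estimate_cost` is a hypothetical helper, and actual billing may round or convert to credits differently.

```python
from decimal import Decimal

# (input, output) price in USD per 1M tokens, from the Available Models table.
PRICES = {
    "Deepseek V4 Flash": (Decimal("0.06"), Decimal("0.28")),
    "Kimi K2.6": (Decimal("0.27"), Decimal("3.80")),
}


def estimate_cost(model_name, prompt_tokens, completion_tokens):
    """Return the estimated USD cost of one request as a Decimal."""
    input_price, output_price = PRICES[model_name]
    total = prompt_tokens * input_price + completion_tokens * output_price
    return total / Decimal(1_000_000)


# The sample response above reports 10 prompt and 10 completion tokens.
cost = estimate_cost("Kimi K2.6", 10, 10)
```

`Decimal` is used instead of floats so that very small per-request amounts do not pick up binary rounding noise.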