If you've subscribed to the Pro plan or started a free trial, you automatically get access to these services.
## Included Services
Pro users get access to managed cloud services that work out of the box:
| Service | Description | Status |
|---|---|---|
| `/chat/completions` | LLM endpoint for AI features (summaries, notes, chat) | Available |
| `/mcp` | MCP server with web search and URL reading tools | Available |
Pro includes a curated set of AI models, so there is nothing to configure: your requests are proxied through our servers, and API keys are managed automatically. If you want to use a specific LLM provider instead, you can bring your own API key (BYOK) under Settings > Intelligence.
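For example, hitting the `/chat/completions` endpoint directly looks roughly like this. This is a minimal sketch, not an official client: the request body is assumed to be OpenAI-compatible, and `CHAR_JWT` is a hypothetical stand-in for wherever your session token lives.

```rust
use serde_json::json;

// Minimal sketch only: POST an OpenAI-style chat completion to the
// Pro endpoint, authenticated with your Supabase JWT.
#[tokio::main]
async fn main() -> Result<(), reqwest::Error> {
    // Hypothetical: in the real app the token comes from your session.
    let jwt = std::env::var("CHAR_JWT").expect("set CHAR_JWT to your session token");

    let resp = reqwest::Client::new()
        .post("https://pro.hyprnote.com/chat/completions")
        .bearer_auth(jwt)
        .json(&json!({
            // No "model" field: the server picks a pool and OpenRouter
            // picks the concrete model (see below).
            "messages": [
                {"role": "user", "content": "Summarize this meeting note: ..."}
            ],
            "stream": false
        }))
        .send()
        .await?;

    println!("{}", resp.text().await?);
    Ok(())
}
```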
## Which LLM Models Are Used
When you use Pro's curated intelligence, Char's server selects from these models automatically. You don't choose a specific model — the server decides which pool of models to use based on the type of request, then OpenRouter picks the fastest available model from that pool.
There are two pools of models, and the server picks one based on a single condition: does your request need tool calling?
### When tool calling is needed
If the desktop app sends tool definitions with the request (e.g., for web search or URL reading during note generation) and `tool_choice` is not set to `"none"`, the server uses the tool-calling model pool. This happens when:
- You have MCP tools enabled and AI is generating notes that may need to look things up online
- The chat feature invokes a tool such as `exa-search` or `read-url`
| Model | Provider |
|---|---|
| `anthropic/claude-haiku-4.5` | Anthropic (via OpenRouter) |
| `openai/gpt-oss-120b:exacto` | OpenAI (via OpenRouter) |
| `moonshotai/kimi-k2-0905:exacto` | Moonshot AI (via OpenRouter) |
### When tool calling is not needed
For standard requests without tools — such as generating summaries, enhancing notes, or regular chat completions — the server uses the default model pool:
| Model | Provider |
|---|---|
| `anthropic/claude-sonnet-4.5` | Anthropic (via OpenRouter) |
| `openai/gpt-5.2-chat` | OpenAI (via OpenRouter) |
| `moonshotai/kimi-k2-0905` | Moonshot AI (via OpenRouter) |
### How the specific model is chosen
Within each pool, you don't get a fixed model. All models in the pool are sent to OpenRouter, which picks the one with the lowest latency at that moment. This means the actual model serving your request can vary between calls — if Anthropic's endpoint is fastest right now, you'll get Claude; if OpenAI responds faster, you'll get GPT.
Here is the routing condition in the server — it checks whether the request includes tool definitions:
```rust
let needs_tool_calling = request.tools.as_ref().is_some_and(|t| !t.is_empty())
    && !matches!(&request.tool_choice, Some(ToolChoice::String(s)) if s == "none");

let models = if needs_tool_calling {
    state.config.models_tool_calling.clone()
} else {
    state.config.models_default.clone()
};
```
And here are the two model pools defined in the server config:
```rust
models_tool_calling: vec![
    "anthropic/claude-haiku-4.5".into(),
    "openai/gpt-oss-120b:exacto".into(),
    "moonshotai/kimi-k2-0905:exacto".into(),
],
models_default: vec![
    "anthropic/claude-sonnet-4.5".into(),
    "openai/gpt-5.2-chat".into(),
    "moonshotai/kimi-k2-0905".into(),
],
```
## How the Request Flows
```
Your Device ──HTTPS──▶ Char API Server ──HTTPS──▶ OpenRouter ──▶ Model Provider
                       (pro.hyprnote.com)        (openrouter.ai) (OpenAI/Anthropic/Moonshot)
```
1. Your device sends a chat completion request to the Char API server, authenticated with your Supabase JWT token.
2. The Char API server validates your Pro subscription, then forwards the request to OpenRouter.
3. OpenRouter routes to the fastest available model from the configured list (sorted by latency).
4. The model provider processes your request and streams the response back through the same chain.
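Step 2 is the only authenticated hop. As a rough illustration (not the server's actual code), validating a Supabase JWT with the `jsonwebtoken` crate could look like the sketch below; the claim fields and secret handling are assumptions:

```rust
use jsonwebtoken::{decode, Algorithm, DecodingKey, Validation};
use serde::Deserialize;

// Hypothetical claim shape; the real server may read more fields.
#[derive(Deserialize)]
struct Claims {
    sub: String, // Supabase user id
    exp: usize,  // expiry, checked automatically by Validation
}

fn validate_supabase_jwt(
    token: &str,
    jwt_secret: &[u8],
) -> Result<Claims, jsonwebtoken::errors::Error> {
    // Supabase signs access tokens with HS256 using the project's JWT secret.
    let data = decode::<Claims>(
        token,
        &DecodingKey::from_secret(jwt_secret),
        &Validation::new(Algorithm::HS256),
    )?;
    // The real server additionally checks for an active Pro subscription
    // before forwarding the request.
    Ok(data.claims)
}
```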
The server sends your request to OpenRouter with `provider.sort = "latency"` to pick the fastest available model:
```rust
fn build_request(
    &self,
    request: &ChatCompletionRequest,
    models: Vec<String>,
    stream: bool,
) -> Result<serde_json::Value, ProviderError> {
    let mut body = serde_json::to_value(request)?;
    let obj = body.as_object_mut().unwrap();

    obj.remove("model");
    // ...
```
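The excerpt above is truncated. Based on the behavior described, the rest of `build_request` presumably attaches the model pool and the latency hint; a hedged sketch of that continuation, not the verbatim server code:

```rust
    // Sketch only: replace the single "model" with the whole pool and ask
    // OpenRouter to sort candidate providers by latency. Both "models" and
    // "provider.sort" are real OpenRouter request fields.
    obj.insert("models".to_string(), serde_json::json!(models));
    obj.insert("provider".to_string(), serde_json::json!({ "sort": "latency" }));
    obj.insert("stream".to_string(), serde_json::json!(stream));

    Ok(body)
}
```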
## What Data Is Sent
Sent to OpenRouter / model provider:
- Your conversation messages (system prompt, user messages, assistant responses)
- Tool definitions and tool call results (if applicable)
- Parameters: `temperature`, `max_tokens`, `stream`
NOT sent to OpenRouter / model provider:
- Your user ID, email, or name
- Your device fingerprint
- Your JWT token (used only for Char API authentication — not forwarded)
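Putting those rules together, the forwarded payload is roughly shaped like the following. This is illustrative only, not a captured request; the values are made up:

```rust
use serde_json::json;

fn main() {
    // Roughly what reaches OpenRouter after the proxy strips auth and
    // attaches routing. Note: no user id, email, device fingerprint,
    // or JWT appears anywhere in the body.
    let forwarded = json!({
        "messages": [
            {"role": "system", "content": "You are a meeting-notes assistant."},
            {"role": "user", "content": "Summarize: ..."}
        ],
        "temperature": 0.7,
        "max_tokens": 1024,
        "stream": true,
        "models": [
            "anthropic/claude-sonnet-4.5",
            "openai/gpt-5.2-chat",
            "moonshotai/kimi-k2-0905"
        ],
        "provider": { "sort": "latency" }
    });
    println!("{}", serde_json::to_string_pretty(&forwarded).unwrap());
}
```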
## What Char Logs (Analytics)
Char logs metadata about each LLM request to PostHog for usage tracking and billing. No message content is ever logged.
```rust
let payload = AnalyticsPayload::builder("$ai_generation")
    .with("$ai_provider", event.provider_name.clone())
    .with("$ai_model", event.model.clone())
    .with("$ai_input_tokens", event.input_tokens)
    .with("$ai_output_tokens", event.output_tokens)
    .with("$ai_latency", event.latency)
    .with("$ai_trace_id", event.generation_id.clone())
    .with("$ai_http_status", event.http_status)
    .with("$ai_base_url", event.base_url.clone());
```
Logged: provider name, model name, token counts, latency, cost, HTTP status. Not logged: message content, conversation history, user prompts.
## MCP Tools
The MCP server provides two built-in tools:
- `exa-search` - Search the web via Exa and get page text and highlights in the results. Useful for researching topics mentioned in your meetings.
- `read-url` - Visit any URL and return its content as markdown. Great for pulling in context from links shared during meetings.
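Under the MCP protocol, tools are invoked with a JSON-RPC `tools/call` request. A sketch of what calls to these two tools could look like; the argument names (`query`, `url`) are assumptions, so check the tool schemas the server advertises:

```rust
use serde_json::json;

fn main() {
    // MCP tool invocation is JSON-RPC 2.0: method "tools/call" with a
    // tool name and an arguments object.
    let search_call = json!({
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {
            "name": "exa-search",
            // "query" is an assumed argument name.
            "arguments": { "query": "quarterly roadmap planning best practices" }
        }
    });

    let read_call = json!({
        "jsonrpc": "2.0",
        "id": 2,
        "method": "tools/call",
        "params": {
            "name": "read-url",
            // "url" is an assumed argument name.
            "arguments": { "url": "https://example.com/shared-doc" }
        }
    });

    println!("{search_call}\n{read_call}");
}
```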
## Why Use Cloud Services?
While Char aims to be fully transparent and controllable, cloud services help in two ways:
- Faster time-to-value - Start using AI features immediately without configuring API keys or running local models.
- Managed complexity - Get the benefits of multiple AI providers without managing each one yourself.
## Privacy & Security
The cloud server (pro.hyprnote.com) is open-source and deployed in our Kubernetes cluster on AWS via GitHub Actions.
Data handling:
- Nothing is stored by us — the server proxies requests and discards them
- Your user identity (email, name) is never sent to external AI providers
- Only the content needed for processing (messages, tools, parameters) is forwarded
- Current providers: OpenRouter (LLM routing), Exa (web search), Jina AI (URL reading)
All requests are rate-limited and authenticated using your Pro subscription.
### OpenRouter Privacy Policy
All Pro LLM requests go through OpenRouter, which routes to the actual model provider (OpenAI, Anthropic, Moonshot AI).
| Policy | Details |
|---|---|
| Data retention | Zero by default — prompts and completions are not stored unless you opt in on your OpenRouter account |
| Training | Does not train on API data |
| Compliance | SOC 2 |
| Data location | US (default) |
"OpenRouter does not store your prompts or responses, unless you have explicitly opted in to prompt logging in your account settings."
Official docs: Privacy Policy · Data Collection · Logging Policies · Zero Data Retention Guide
If you prefer to run AI locally instead, see Local LLM Setup for LLMs and Local Models for speech-to-text.