Z.ai

GLM 5.1

GLM-5.1 delivers a major leap in coding capability, with particularly significant gains in handling long-horizon tasks. Unlike previous models built around minute-level interactions, GLM-5.1 can work independently and continuously on...

Apr 07, 2026 202.8K context 16,384 tokens output

Text Tools Structured Output Reasoning

Overview ↓ About ↓ Capabilities ↓ Pricing ↓ Price Comparison ↓ Providers ↓ Parameters ↓ Compare ↓ Tools ↓ Resources ↓ Community ↓

Model Overview

High-signal model metadata in a structured two-column overview table.

Provider

The entity that provides this model.

Z.ai

Model ID

The routed model identifier exposed by upstream providers.

z-ai/glm-5.1

Input Context Window

The number of tokens supported by the input context window.

202.8K tokens

Maximum Output Tokens

The number of tokens that can be generated by the model in a single request.

16,384 tokens tokens

Open Source

Whether the model's code is available for public use.

Yes

Release Date

When the model was first released.

Apr 07, 2026 3 months ago

Knowledge Cut-off Date

When the model's knowledge was last updated.

Unknown

API Providers

The providers that offer this model. This is not an exhaustive list.

Baidu, StreamLake, DigitalOcean, Chutes, GMICloud, Wafer, DeepInfra, SiliconFlow, Crusoe, Phala, AtlasCloud, Novita, Nebius, Parasail, CoreWeave, Friendli, Z.AI, Venice

Modalities

Types of data this model can process.

Text

What is GLM 5.1

A fuller summary of positioning, capabilities, and source-specific details for GLM 5.1.

Capabilities

What GLM 5.1 supports

Reasoning Controls

OpenRouter lists GPT-5.5 with reasoning support and explicit reasoning-related request parameters.

JSON

Structured Outputs

Structured output settings are exposed through OpenRouter for schema-driven or format-controlled responses.

Tool Calling

Tool invocation and tool selection are supported in the routed OpenRouter interface for this model.

Multimodal I/O

This model accepts text input and returns text output.

CTX

Large Context Window

OpenRouter currently lists a context window of 202.8K with up to 16,384 tokens maximum output tokens.

Pricing for GLM 5.1

Primary API pricing shown in the same “quick compare” spirit as the reference page.

Input tokens $1.40 Per million tokens

Output tokens $3.08 Per million tokens

Price Comparison

Additional usage-cost dimensions synced into the project for this model.

Cache read $0.18

maxTemperature 1

maxResponseSize 16,384 tokens

API Access & Providers

Places where this model is available, based on the synced detail-page metadata.

Baidu StreamLake DigitalOcean Chutes GMICloud Wafer DeepInfra SiliconFlow Crusoe Phala AtlasCloud Novita Nebius Parasail CoreWeave Friendli Z.AI Venice

Provider Endpoints

Endpoint-level provider data currently available for this model.

Baidu

Max output: 131,072 1d uptime: 99.9% Supported params: 13 Implicit caching: No

StreamLake

Max output: 128,000 1d uptime: 99.4% Supported params: 13 Implicit caching: No

DigitalOcean

1d uptime: 95.1% Supported params: 11 Implicit caching: No

Chutes

Max output: 65,535 1d uptime: 80.8% Supported params: 15 Implicit caching: No

GMICloud

1d uptime: 99.5% Supported params: 8 Implicit caching: No

Wafer

Max output: 65,536 1d uptime: 99.4% Supported params: 19 Implicit caching: No

DeepInfra

Max output: 65,536 1d uptime: 99.7% Supported params: 17 Implicit caching: No

SiliconFlow

Max output: 131,072 1d uptime: 98.3% Supported params: 9 Implicit caching: No

Crusoe

1d uptime: 99.7% Supported params: 16 Implicit caching: No

Phala

Max output: 128,000 1d uptime: 79.3% Supported params: 18 Implicit caching: No

AtlasCloud

Max output: 202,752 1d uptime: 98.0% Supported params: 17 Implicit caching: No

Novita

Max output: 131,072 1d uptime: 99.9% Supported params: 13 Implicit caching: No

Nebius

1d uptime: 93.2% Supported params: 16 Implicit caching: No

Parasail

Max output: 131,072 1d uptime: 96.8% Supported params: 18 Implicit caching: No

CoreWeave

Max output: 202,752 1d uptime: 99.7% Supported params: 14 Implicit caching: No

Friendli

Max output: 202,752 1d uptime: 100.0% Supported params: 16 Implicit caching: No

Z.AI

Max output: 131,072 1d uptime: 99.6% Supported params: 9 Implicit caching: No

Venice

Max output: 80,000 1d uptime: 99.5% Supported params: 13 Implicit caching: No

Configuration & Parameters

The configurable options currently documented for this model.

Reasoning Effort

Toggle Group

Default: medium

Supported Request Parameters

Parameters currently listed by OpenRouter or the local catalog for this model.

Reasoning Effort

Resources & Documentation

Official model cards, release notes, docs, and other references synced from the source page.

OpenRouter Model Page OpenRouter

→