
Product Pricing

Note: All prices listed below include tax. Prices in the table are the cost per 1M tokens consumed, where 1M = 1,000,000.

Model Description

  • Kimi K2 is a Mixture-of-Experts (MoE) foundation model with exceptional coding and agent capabilities, featuring 1 trillion total parameters and 32 billion activated parameters. In benchmark evaluations covering general knowledge reasoning, programming, mathematics, and agent-related tasks, the K2 model outperforms other leading open-source models
  • kimi-k2-0905-preview: Context length 256k. Based on kimi-k2-0711-preview, with enhanced agentic coding abilities, improved frontend code quality and practicality, and better context understanding
  • kimi-k2-turbo-preview: Context length 256k. High-speed version of Kimi K2, always aligned with the latest Kimi K2 (kimi-k2-0905-preview). Same model parameters as Kimi K2, with an output speed of 60 tokens/sec, up to a maximum of 100 tokens/sec
  • kimi-k2-0711-preview: Context length 128k
  • kimi-k2-thinking: Context length 256k. A thinking model with general agentic and reasoning capabilities, specializing in deep reasoning tasks
  • kimi-k2-thinking-turbo: Context length 256k. High-speed version of kimi-k2-thinking, suitable for scenarios requiring both deep reasoning and extremely fast responses
  • Supports ToolCalls, JSON Mode, Partial Mode, and internet search functionality
  • Does not support vision functionality
  • Supports automatic context caching. Cached tokens are billed at the cache-hit input price. You can view “context caching” cost details in the console
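
Since prices are quoted per 1M tokens and cached input tokens are billed at the cache-hit rate, the cost of a request can be estimated with simple arithmetic. The following is a minimal sketch only: the function name `estimate_cost`, its parameters, and the prices in the example call are hypothetical placeholders, not values from the pricing table. Substitute the per-1M-token prices listed for the model you use.

```python
from decimal import Decimal

# Prices are quoted per 1M tokens, where 1M = 1,000,000.
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(cache_hit_tokens, cache_miss_tokens, output_tokens,
                  price_cache_hit, price_cache_miss, price_output):
    """Estimate the cost of one request.

    Token counts are integers; prices are per-1M-token prices given as
    strings or Decimals. Input tokens served from the context cache use
    the cache-hit input price; the remaining input tokens use the
    regular (cache-miss) input price.
    """
    total = (
        Decimal(cache_hit_tokens) * Decimal(price_cache_hit)
        + Decimal(cache_miss_tokens) * Decimal(price_cache_miss)
        + Decimal(output_tokens) * Decimal(price_output)
    )
    return total / TOKENS_PER_MILLION


# Hypothetical placeholder prices, NOT taken from the pricing table.
cost = estimate_cost(
    cache_hit_tokens=800_000,
    cache_miss_tokens=200_000,
    output_tokens=50_000,
    price_cache_hit="0.10",
    price_cache_miss="1.00",
    price_output="4.00",
)
print(cost)
```

`Decimal` is used instead of floating point so that small per-token amounts are not distorted by binary rounding when summed.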