Introduction

As an AI enthusiast and developer, I’m always exploring ways to integrate powerful tools into my workflow. My primary coding assistant, Claude Code, is already configured with several models, including DeepSeek and GLM. In the spirit of endless tinkering, I decided to see how the much-talked-about Kimi large model from Moonshot AI would perform.

This article chronicles my experience integrating the Kimi-k2 model into Claude Code, from initial setup to real-world usage and my final thoughts.

First Contact: A ¥15 Welcome Gift

The first step was signing up. The Kimi website is clean, and the registration process was seamless. Upon completion, I was pleasantly surprised to find a ¥15 credit in my account. It was a friendly gesture that made a positive first impression and motivated me to proceed with the integration and testing.

The Setup: Clear Instructions

Next came the crucial integration step. I followed the official documentation from Moonshot, and the process was simpler than I expected.

In Claude Code’s model configuration, I only needed to set three pieces of information (a runnable sketch follows the list):

  1. API Key: Generated from the Moonshot platform backend.
  2. Model Name: Set to kimi-k2-0905-preview.
  3. API Base: Set to https://api.moonshot.cn/anthropic.
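For anyone who prefers not to edit global settings, the same three values can be supplied as environment variables at launch time. Here is a minimal sketch, assuming Claude Code honors the usual ANTHROPIC_BASE_URL, ANTHROPIC_AUTH_TOKEN, and ANTHROPIC_MODEL overrides, that the claude CLI is on your PATH, and that the key is exported as MOONSHOT_API_KEY (a variable name I chose for illustration):

```python
import os
import subprocess

# Point Claude Code at Moonshot's Anthropic-compatible endpoint.
# MOONSHOT_API_KEY is assumed to hold the key generated on the
# Moonshot platform backend; adjust the variable name to taste.
env = os.environ.copy()
env["ANTHROPIC_BASE_URL"] = "https://api.moonshot.cn/anthropic"
env["ANTHROPIC_AUTH_TOKEN"] = os.environ["MOONSHOT_API_KEY"]
env["ANTHROPIC_MODEL"] = "kimi-k2-0905-preview"

# Launch Claude Code with the overridden environment.
subprocess.run(["claude"], env=env)
```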

After saving the settings, the Kimi model appeared in my list of available models. With everything ready, I eagerly started my first conversation.
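For anyone reproducing this, the quickest way to confirm the wiring is a one-off request outside Claude Code. A minimal smoke test, assuming the official anthropic Python SDK (which accepts a custom base_url) and the same MOONSHOT_API_KEY variable as above:

```python
import os
import anthropic

# Talk to Moonshot's Anthropic-compatible endpoint directly.
client = anthropic.Anthropic(
    api_key=os.environ["MOONSHOT_API_KEY"],
    base_url="https://api.moonshot.cn/anthropic",
)

message = client.messages.create(
    model="kimi-k2-0905-preview",
    max_tokens=64,
    messages=[{"role": "user", "content": "Reply with one word: pong"}],
)
print(message.content[0].text)
```

If this prints a reply, the key, endpoint, and model name are all correct, and any later failures are something else entirely. In my case, they were.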

The Real-World Test: From Frustration to Acceptance

The Initial Wall: Unusable Rate Limits

However, reality quickly set in. Almost immediately after starting a conversation, I was hit with a “Rate Limits Exceeded” error.

I consulted Kimi’s rate limit documentation and found the problem. The Free Tier has the following limits:

  • RPM (Requests Per Minute): 3
  • TPM (Tokens Per Minute): 32,000

For a use case like programming, which involves frequent and rapid interactions, a limit of 3 requests per minute is far too low. Claude Code can be quite chatty in the background, sending multiple requests to fetch context and generate code. This made the complimentary ¥15 credit practically unusable for my coding workflow.
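Client-side backoff can soften the 429 errors, though nothing on the client can turn 3 requests per minute into a usable coding loop. For completeness, here is a sketch of the standard retry-with-exponential-backoff pattern, reusing the assumed setup from the smoke test above:

```python
import os
import time
import anthropic

client = anthropic.Anthropic(
    api_key=os.environ["MOONSHOT_API_KEY"],
    base_url="https://api.moonshot.cn/anthropic",
)

def ask_with_backoff(prompt: str, retries: int = 5) -> str:
    """Retry on rate-limit errors, doubling the pause each time."""
    delay = 20.0  # at 3 RPM, one request per ~20 s is the theoretical best
    for _ in range(retries):
        try:
            message = client.messages.create(
                model="kimi-k2-0905-preview",
                max_tokens=512,
                messages=[{"role": "user", "content": prompt}],
            )
            return message.content[0].text
        except anthropic.RateLimitError:
            # Wait out the window before trying again.
            time.sleep(delay)
            delay *= 2
    raise RuntimeError("Still rate limited after all retries")
```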

A Second Chance with a Top-Up

To give Kimi a fair evaluation, I decided to top up my account with ¥50. As soon as the payment went through, the rate limit issues disappeared, and I could finally use the model smoothly.

Performance and Quality Assessment

With the limits out of the way, I ran several real-world programming tests. Here’s what I found:

  1. Response Speed: The kimi-k2-0905-preview model felt noticeably slower than both DeepSeek and GLM-4.5, which I use regularly. In a fast-paced coding session, this latency is a real drag on productivity. (A rough way to measure this yourself is sketched after this list.)

  2. Output Quality: Kimi’s code generation quality is acceptable. It can handle most basic tasks, but compared to DeepSeek and GLM, the results felt slightly inferior and often required more manual correction. I’d classify it as “usable, but not impressive.”

  3. Cost: In terms of pricing, Kimi is more expensive than DeepSeek. When factoring in its speed and quality, the overall value proposition is less compelling.
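To be clear, none of the speed impressions above came from a formal benchmark; they reflect day-to-day feel. If you want to sanity-check the latency claim yourself, a rough single-request timing harness looks like the sketch below; swapping base_url, model, and key lets you run the same prompt against other providers:

```python
import os
import time
import anthropic

# Rough wall-clock latency for one completion against the Kimi endpoint.
client = anthropic.Anthropic(
    api_key=os.environ["MOONSHOT_API_KEY"],
    base_url="https://api.moonshot.cn/anthropic",
)

prompt = "Write a Python function that reverses a singly linked list."

start = time.perf_counter()
message = client.messages.create(
    model="kimi-k2-0905-preview",
    max_tokens=512,
    messages=[{"role": "user", "content": prompt}],
)
elapsed = time.perf_counter() - start

print(f"{elapsed:.1f} s total, {message.usage.output_tokens} output tokens")
```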

Conclusion and Final Thoughts

Overall, this experiment was an interesting exploration that gave me a clearer understanding of where different models stand.

While the initial ¥15 credit is a nice welcome gift, the Free Tier’s restrictive rate limits make it unsuitable for high-frequency tools like Claude Code. And even after topping up, the model’s speed and output quality didn’t quite justify its higher price point compared to the competition.

Therefore, I likely won’t be topping up my Kimi account again. I will continue to use DeepSeek and GLM as my primary models for coding. However, I will keep Kimi configured as a backup. It never hurts to have another option available, whether for specific scenarios or as a fallback if my primary models are unavailable.

For casual users or those with low-frequency API needs, Kimi might be a good choice. But for developers who heavily rely on AI for programming, there are currently more efficient and cost-effective alternatives available.

In the next post, I will share the script I use to switch between different models when launching Claude Code.