Stream of Consciousness

Mark Eschbach's random writings on various topics.

Playing with OpenCode + OpenRouter: Lessons from the Free Tier

Categories: ai

Tags: ai tools openrouter opencode

It happened again. Just as I was hitting a flow state, refactoring the internal/slacker package to meet my new golangci-lint standards, the dreaded message appeared: “Tokens exhausted, retry in 16 hours.”

When you’re pushing “Big Pickle” models (like GLM-4) to their limits in OpenCode, you quickly realize that code quality is expensive. Updating legacy code to modern standards—fixing context propagation, tightening error checking, and tackling complex code de-duplication—is incredibly token-heavy. Simple fixes are cheap, but architectural re-organization requires the model to hold the world in its head.

To keep the momentum going without waiting 16 hours, I decided to bridge OpenCode with OpenRouter and see if I could find a “free” path to productivity.

The OpenRouter Learning Curve

Setting up OpenRouter is straightforward, but it comes with a few “gotchas” if you aren’t careful. My first mistake was an expensive one: OpenRouter defaulted to a paid Claude model, which resulted in an immediate charge to my account.

Pro-tip for OpenCode users: Models are “sticky” for both the Plan and Build phases. If you change your model for planning, you must remember to manually switch it for the execution phase as well. If you don’t, you might find yourself accidentally paying for a premium model to do basic boilerplate work.

Testing the Free Contenders

With wallet guardrails in place, I started testing the free models available via OpenRouter. The results were a classic case of Reasoning vs. Speed.

Arcee AI: Trinity Large Preview (Free)

My first stop was Trinity. If you can stomach the latency, this model is a heavyweight in terms of logic. The reasoning is remarkably solid—it understands the “why” behind an architectural change much better than the faster alternatives.

However, the delivery is an agonizing crawl, often hovering at just a few tokens per second. It’s like working with a brilliant senior engineer who speaks at five words per minute. While its insights are valuable, it eventually got stuck on a complex edit during my session, forcing me to switch gears.

Stepfun: Step-3.5-Flash (Free)

Next, I tried stepfun/step-3.5-flash. This was a revelation in raw output, peaking at 198 tokens per second. It’s a firehose of code, but one that often misses the target. Stepfun is a speed demon with a short attention span and a flexible relationship with your project rules. It can generate 32,000 tokens in a burst, but it regularly ignores established standards and loses focus during complex edits.

Here is what I observed:

  • The “Thinking” Tax: It generates a massive amount of “internal thought” before taking action. It’s fascinating to watch, but it feels like it’s trying to convince itself of a plan it might not fully follow.
  • The Loop Trap: It occasionally gets stuck in logic loops. A quick follow-up prompt snaps it back, but you have to watch it like a hawk.
  • Unix Wizardry: To avoid complex quoting issues in multi-file edits, it frequently drops back to using standard Unix tools (sed, cat, etc.). This is its saving grace—it’s more reliable when it’s piping text than when it’s rewriting a whole block.
  • The “nolint” Cop-out: Initially, I was impressed when it added nolint directives with detailed explanations. However, I soon realized this was a failure of capability, not a design choice. It resorted to suppressing errors because it couldn’t figure out how to actually improve the code to meet the standards.

Maintenance is Key

When using these free-tier models through OpenRouter, you have to be an active manager. I found that I needed to run a /compact command and a “Let’s resume!” prompt roughly every 10 minutes to keep the context from becoming a tangled mess, especially with Stepfun.

Final Verdict

If you’re hitting rate limits on your primary tools, OpenRouter’s free tier is a viable “overflow” valve, but you have to pick the right tool for the task:

  • Need deep architectural reasoning? Trinity Large Preview is the choice, if you have the patience.
  • Need thousands of lines of boilerplate or quick Unix-style edits? Step-3.5-Flash will win on speed, but keep your git checkout finger ready for when it loses focus or starts ignoring your rules.