Rendered at 19:27:19 GMT+0000 (Coordinated Universal Time) with Cloudflare Workers.
simonw 1 days ago [-]
It took me quite a while to come round to OpenRouter. Originally I didn't understand why anyone would put a proxy between them and an LLM, but it actually adds some quite significant value:
1. By far the lowest friction way to support and try out all the models.
2. They offer billing caps! Most model providers still don't do this [EDIT: maybe they do, see reply comment], but if you're going to run anything in public it's very useful to have hard limits so it doesn't cost you $1m overnight because someone started abusing it.
3. Their rankings are one of the more interesting signals for which models are popular, despite their flaws (most OpenAI and Anthropic users don't go via OpenRouter, it's currently not possible to tell the difference between many users switching v.s. one "whale" changing their preferred model)
Given how API costs are becoming meaningful for a lot of companies now, having a provider like OpenRouter to help measure your spend and easily experiment with and switch providers feels like a valuable service.
GodelNumbering 1 days ago [-]
Another neat thing is, they publish hourly caching states for ALL model/provider combinations. I did some research on it to come up with a provider tiers list and found a bunch of open-source 3rd party hosts are simply trash tier https://dirac.run/posts/cache-hit-rates-agents
kflansburg 23 hours ago [-]
I would recommend tracking this data over time. I work on Cloudflare's KV cache for Kimi K2.6, and while there are periods where our cache rate is low, we are frequently in the 80-90% range. OpenRouter shows us at 87.3% at the time of this post. We observe cache rates change quite a bit from hour to hour.
GodelNumbering 22 hours ago [-]
True for Kimi, but the results I published are average across the models (CF has over 10 models on openrouter). Your current Kimi K2.6 is over 80% but Gemma 4 26B A4B is 0%. https://openrouter.ai/google/gemma-4-26b-a4b-it
This is also the reason providers like Anthropic scored lower because while Opus 4.7 is close to 90%, Opus 4.5 is 45%
kflansburg 15 hours ago [-]
My point was not about our ranking specifically, but the methodology of taking a point-in-time sample.
gnulinux 1 days ago [-]
Thank you so much for this! I've been working on exactly this problem this week (which OpenRouter providers have the highest cache rate on average) because cache cost is sometimes half your cost: I'd much rather use a provider with more input caching with a more expensive/better LLM. Your results and lists seem more comprehensive than what I've done so far. Very helpful!
rkagerer 23 hours ago [-]
Agents push the full conversation history into context every turn
Why?
Maybe this is a dumb question, but why wouldn't an agent "keep the conversation going", like I do when interacting with an LLM through a web page? (I understand how it's impractical for long-running tasks where the agent has to wait days for the next input, but assume that's not the majority of use cases)
sosodev 23 hours ago [-]
I’m not sure I understand your question. Every interaction you have with a model in a web page does the same thing in the backend. It feeds the whole conversation history, perhaps with a bit of processing, into the model so it can process the next generation. Filling the context window is how these models retain coherence.
isbvhodnvemrwvn 23 hours ago [-]
LLMs are stateless, to predict next tokens they need the history. When you write your own agents you will be very selective and might trim context and heavily segment different tasks, but generic ones don't do that (at best they spawn subjects to handle smaller tasks)
lxgr 21 hours ago [-]
That said, the KV cache is very much not stateless, so internally inference APIs will be highly incentivized to route requests to instances with as much a shared prefix cached as possible.
rkagerer 10 hours ago [-]
Thanks. If I ran it local, presumably I could keep the state cached forever. Can you "reserve" resources from a frontier provider to guarantee your state stays "hot"? (Analogous to reserving a whole VM instead of a slice)
For OpenAI, it seems like you can't prolong the caching duration for money. Duration is longer during off-peak hours for in-memory caching and up to 24 hours for extended prompt caching. https://developers.openai.com/api/docs/guides/prompt-caching
BTW, the openai responses api has a store parameter and a thread id input. Makes it possible to send a thread id and append a new message, ask for completion. So it feels like keeping the conversation going.
Technically it does retrieve the entire history and reevaulate it since the LLM is stateless. Just more ergonomic for the developer.
And prompt caching helps cut the costs down when a conversation drags on.
drewnick 18 hours ago [-]
Wow, this is refreshing DX compared to iterating all messages like we did back in '24.
ghrl 5 hours ago [-]
I would disagree. Having all the messages locally and sending them with the request means you can switch inference providers or even models mid-conversation. It also means that the provider doesn't store the entire context, which often contains massive parts of proprietary codebases, secrets and PII and instead the agent harness manages all that.
While a simple `continue thread` API field might seem more convenient, the cost is still determined by the input token count and cache rate, so it just abstracts this crucial implementation detail away.
BoredPositron 23 hours ago [-]
The "web page" does the same you just don't see it.
jampekka 21 hours ago [-]
The main friction reduction, for me at least, is the consolidated billing that avoids extra bureaucracy in corporate environments. The API-translation/abstraction tends to cause more problems than it solves.
I’d prefer something that consolidates billing, but still lets me use providers' APIs directly (or via some "raw HTTP" proxy). There are plenty of unified API gateways, but I haven’t seen one that is just billing/auth in front of the native provider APIs.
voidstitch 12 hours ago [-]
[flagged]
Aurornis 1 days ago [-]
Good points. The easy experimentation factor is helpful for development, though I would gently encourage everyone to migrate to the 1st party APIs for pricing at scale.
OpenRouter is also a good place to find free LLM access with a catch: You should expect that any inputs and outputs are going into someone's training database. Clearly anyone who can pay should be using paid models with privacy protections, but the free models have been great for learning and experimenting. Especially for younger people learning API programming and LLMs who may not have access to a credit card or funds.
bix6 1 days ago [-]
It’s interesting all the focus on opt-out from training. Sometimes I worry there is an intentional focus on that so people don’t think about the other ways the company might be profiting off our data. Like I pay for Anthropic and they don’t train on that but are they selling my “anonymized” usage data in some other way?
derefr 1 days ago [-]
From what I recall, these companies don't offer any option to opt out of your session transcript data being used (and sold!) for "regular" adtech targeting purposes.
nl 17 hours ago [-]
Anthropic explicitly state that they don't do this, even if you use the free plan and even if you don't opt-out of letting them use your data for training:
That answers for the "sold" part but not for the "used" part.
I.e. nothing about this statement prevents Anthropic from running ads within Claude, as long as they run the ad-placement auctions themselves, and so aren't leaking any of the data they're using to decide which placements are relevant to which users+sessions. (This is the same thing Google does for SERP ad auctions.)
But actually, and perhaps more interestingly, nothing about this statement prevents Anthropic from building a Google AdSense competitor either. Other sites (or mobile apps, etc) could plop in an Anthropic ad iframe; and it'd be Anthropic's knowledge of your interactions with Claude that would drive what ads would show up in that iframe. The embedding site doesn't know what ads the users are seeing, so that's still not "selling users' data to third parties", per se.
nl 17 hours ago [-]
> You should expect that any inputs and outputs are going into someone's training database.
> You should expect that any inputs and outputs are going into someone's training database.
True enough, in theory; but what exactly are you imagining would be a useful-enough signal in the OpenRouter request+response stream, that any company would want their data as training material?
Even a single OpenRouter-API-key-identified subscriber's traffic, may consist of an mixture of traffic from multiple different sessions, under potentially multiple different end-users. (Where, if the subscriber is doing security correctly, then their OpenRouter key lives on a gateway rather than in a frontend app; and so the only IP address / UA / etc OpenRouter sees is that of the gateway itself.)
And the traffic stream may also invoke multiple models, and provide multiple different system prompts for those models; which, while marked in the traffic (i.e. conveyed as part of each request), makes the resulting data much less useful in aggregate, than if it were all training data for one model with one system prompt.
Plus, there are no RLHF signals in OpenRouter data. Even if OpenRouter wanted to build a general model-neutral framework for collecting RLHF-type data, it can't force subscriber apps to do the UI-level stuff necessary to collect it (i.e. the things ChatGPT/Claude do, with "thumbs-down" buttons, A/B tested responses, etc.) Analysis would have to rely on pure transcript-level user sentiment extraction.
reed1234 23 hours ago [-]
You get a 1% discount if you give OpenRouter your traces so at least they think there's some (a lot) of value.
nl 17 hours ago [-]
> Plus, there are no RLHF signals in OpenRouter data. Even if OpenRouter wanted to build a general model-neutral framework for collecting RLHF-type data, it can't force subscriber apps to do the UI-level stuff necessary to collect it (i.e. the things ChatGPT/Claude do, with "thumbs-down" buttons, A/B tested responses, etc.)
The majority of RLHF data doesn't need this. The majority is software development and/or tool calling where the agent gets a signal back as to if it succeeded (eg compilation errors, test errors). It's true that end-of-trajectory signals (eg, did this task do what you wanted) are even more useful but even partial signals are great for RL training.
lxgr 21 hours ago [-]
> what exactly are you imagining would be a useful-enough signal in the OpenRouter request+response stream, that any company would want their data as training material?
Isn't this a treasure trove for any model distillation effort?
gbro3n 1 days ago [-]
I've wondered this too - exactly how are our inputs and outputs useful as training data? So I asked Gemini. Apparently using negative sentiment in user or llm responses can serve as RLHF, and the human prompts can also serve as useful data for what problems the llms need to be able to solve. There's also that smaller models can train on and improve from data from larger models but that's less relevant when not switching models in context.
mannanj 21 hours ago [-]
How about protection of intellectual property? Doesn’t have to be patented to be valuable.
dghlsakjg 1 days ago [-]
[dead]
tasuki 1 days ago [-]
> Clearly anyone who can pay should be using paid models with privacy protections
Clearly, anyone who needs privacy should be using models with privacy protections. Some people build open source and the models will get the code anyway.
derac 1 days ago [-]
I recommend nvidia nim for completely free dev access for young people.
acka 24 hours ago [-]
It's free, but not unlimited. Besides rate limits, new sign-ups get 1000 credits (requests), and once those are gone, they're gone for good. Only business accounts might get a couple of free refills.
ssivark 17 hours ago [-]
Is there a way to check/track your available credits?
mrtksn 14 hours ago [-]
Did you know that if you put some money into your OpenAI account it expires after a year? I was very annoyed when that happened, no refund no warning it’s just gone as if it was a promo credit.
Openrouter is very nice since it puts a barrier between you and those suppliers that were supposed to be like utilities. I got the feeling that if OpenAI was left alone they would be nice as a telco.
alecco 1 days ago [-]
At the moment for DeepSeek V4 it messes up caching and that's a key pricing feature for V4.
The way how you manage the caps in OpenRouter is how every metered API provider should do it: keys have limits, and you can change the limits, and you set the limits to refill periodically, and you can create as many keys as you want.
michaelbuckbee 7 hours ago [-]
It's not just comparing all the models, it's also comparing all the providers and configurations of those models.
If you're doing any kind of production AI work you'll end up with outages caused by calling a single provider, OpenRouter seamlessly switching between providers is a godsend for uptime.
But even more than that there's meaningful cost+speed differences.
Here's Sonnet 4.6 being served direct, via Amazon and via Google
I love their product and use them myself. But where's the value proposition for investors? Unless they get purchased by one of the large cloud providers, they will get pushed out of the market sooner or later.
What's the value proposition for the typical AWS startup to go with openrouter, if Amazon offers similar rates with direct integration into all their other offerings?
The only reason OpenRouter can exist at the moment is because we are in the wild-west phase of this technology, and lots of people and companies are exploring. In 5 years they will have to have transformed their business fundamentally, or go the way of the dinosaurs.
sowbug 22 hours ago [-]
If you believe there will be lots of LLM providers in the future, then OpenRouter could be a DoorDash play.
Established restaurants didn't need DoorDash because they were already on everyone's speed dial. But new or small restaurants couldn't afford to advertise or maintain a team of delivery people. DoorDash created a two-sided marketplace that made it a lot easier for new entrants to bootstrap. Today even the established restaurants have to pay them their tithe because hungry people have learned to start with the DoorDash app. A bit of a prisoner's dilemma.
If OpenRouter plays its cards right and gets very lucky, a large number of people will configure their hungry LLM clients to start with OpenRouter, and then LLM providers will have to join the marketplace or else miss out on all those customers.
remexre 17 hours ago [-]
not sure that works as well when they don't own their API though; how much software is openrouter-only in a way that's not 5min of deepseek to patch the source for, or 15min of opus to patch the binary instead
sowbug 3 hours ago [-]
I agree that technical lock-in wouldn't cause the consolidation. Instead, if it happened, it would be because of the network effects of the two-sided platform.
People could email cat photos and resumes. But Facebook and LinkedIn are where everyone already is, so that's what they use instead.
yencabulator 4 hours ago [-]
Everyone (except Anthropic) seems to be settling on the same API, so nobody "owns it" anymore. I expect there to be practically no software that's OpenRouter-only.
DoorDash is viable only because the restaurant business (minus national chains) is extremely balkanized. Restauranteurs have very little power.
rat9988 24 hours ago [-]
They never claimed it was technically hard. Brand recognition is their forte. They found out there is a need, developped a product around it.
pizzly 22 hours ago [-]
AWS does not provide nearly as many different models as OpenRouter. Perhaps they have an incentive to not do that, move slower as a big company or more legal risks to consider. If AI model outputs becomes commoditized then having one place where you can switch effortlessly from one to the next based on price might just justify OpenRouter. It could become a commodity marketplace/exchange.
rsalus 23 hours ago [-]
functionally they operate as a marketplace for cloud providers. I feel like there is value there, especially as API costs rise and companies explore cost-saving/efficiency. IMO, this is a particularly attractive value prop in the SMB space, where it is common to interoperate between multiple SaaS/software stacks.
brianwawok 24 hours ago [-]
Yah I don’t think they have a long term play without a pivot
scosman 17 hours ago [-]
They also do a good job working over the little differences between APIs. Tool calling sometimes breaks on major providers, and OR will patch it before the provider does. Libraries like LiteLLM do this too, but OR is faster.
MillionOClock 24 hours ago [-]
Billing caps are underrated! I don't understand why they aren't present everywhere. As an indie dev there are some services I'm really hesitant on trying by fear of getting an enormous bill for a mistake, this is even more true with vibe coding IMO.
brianwawok 24 hours ago [-]
I’m just not sure they have a moat or a long term play? I put $20 in and tried a few models. Then I went right to the model provider to put in $1000 and avoid the middleman tax. Now imagine a big corp spending millions on AI. That’s a lot of middleman tax.
kristianp 14 hours ago [-]
The value of openrouter isn't as a middleman for users of claude, gemini or chatgpt, it's for those looking to find a model that fills the use case at a lower price than the top 3.
Art9681 7 hours ago [-]
Except the latency is significant and not suitable for clients with advanced agent features. The experience between using a frontier model via first party API and the best open weight models via OpenRouter is night and day. Can't get any real work done with it.
brianjking 24 hours ago [-]
I tend to agree, but there's also a lot of tax to build and maintain the different provider abstractions that OpenRouter eliminates.
Everything has a cost of some sort. It's just who you're going to pay and what the currency is.
TurdF3rguson 22 hours ago [-]
The top model / prices are changing all the time though. Lately I've been auditioning 4-5 models before a big ingest and I wouldn't be able to do that easily without OR.
polski-g 22 hours ago [-]
And what do you do when Fireworks is down? If you stuck with Openrouter, when Fireworks is down it would auto route you to Friendli.
What if Fireworks stops offering your preferred model?
brianwawok 20 hours ago [-]
Honestly I am 98% on Claude, and when claude is down I suffer through GPT.
BoredPositron 23 hours ago [-]
There are enough services that don't want the model provider to know who they are.
a13n 1 days ago [-]
Both OpenAI and Anthropic have billing caps… who doesn’t?
> Long-running tasks like batch mode completions and agent sessions may incur overages beyond your project spend cap.
> Billing data processing times can be delayed in AI Studio, up to around 10 minutes. You may experience overages beyond your project cap if billing data hasn't processed before more charges are accrued.
I spent two hours the other day trying to figure out how to manage spend on gcp, i gave up and used openrouter and cloudflare.
movedx01 12 hours ago [-]
AI studio added it recently, Vertex not.
23 hours ago [-]
1 days ago [-]
BoredPositron 23 hours ago [-]
There is a scheme to send gifts with a compromised anthropic key even if the limit is reached.
wahnfrieden 23 hours ago [-]
Microsoft
scrollop 8 hours ago [-]
Though you pay 5% fees? Not worth it for me with the volume of tokens used.
JumpCrisscross 23 hours ago [-]
> By far the lowest friction way to support and try out all the models
Check out Kagi Ultimate.
MicrosoftShill 23 hours ago [-]
Would you recommend Kagi Ultimate over OpenRouter? I'm already a customer of Kagi and would rather give them my money, but only if I'm not really compromising.
JumpCrisscross 21 hours ago [-]
> Would you recommend Kagi Ultimate over OpenRouter?
For personal use, yes. The all-in pricing model encourages experimentation. And the privacy pitch seems tighter.
MicrosoftShill 16 hours ago [-]
The privacy part is definitely important. Appreciate you!
Maybe someday the VM I run agents in will have a dedicated GPU so that I can stop using APIs altogether. One can dream...
fontain 1 days ago [-]
Out of interest, why OpenRouter over a free option like Cloudflare’s AI gateway or another paid option like Vercel’s — any specific benefit to OpenRouter you’ve found, or just first you used that’s good enough?
simonw 1 days ago [-]
I'll be honest, I hadn't clocked that Cloudflare and Vercel were offering equivalent products.
I didn't know about these options either. I am using Cline: Cloudflare isn't an option but Vercel is. My spending is pretty low overall now that I'm using local models much more but good to know that there are cheaper alternatives to try or at least suggest to others.
Other features I've just noticed:
- configurable prompt injection protection using OWASP regex (https://cheatsheetseries.owasp.org/cheatsheets/LLM_Prompt_In...)
- configurable PIM protection for outbound prompts
- input/output logging
- "JSON healing" to auto-correct minor hallucinations
Lots of other stuff too. The business model seems pretty simple and the value-add features don't look particularly expensive or difficult to copy.
js4ever 23 hours ago [-]
Separation, I don't want to have my domains blocked the day AI bill go Brrr
wahnfrieden 23 hours ago [-]
Make two accounts...
1 days ago [-]
stymaar 1 days ago [-]
And what is their business model?
minimaxir 1 days ago [-]
Credit is prepaid at a 5% surcharge.
MangoCoffee 1 days ago [-]
middle man? Model providers/hyperscalers -> OpenRouter -> consumers?
coffee farmers -> middle man -> you
behnamoh 1 days ago [-]
Commission on API calls extracted from you when you charge your account.
sarjann 20 hours ago [-]
There is also the ability to fallback is one of the clouds degrades in performance.
maxloh 1 days ago [-]
OpenRouter is merely only a proxy. They also host some open-weight models
simonw 1 days ago [-]
I don't think they do. They proxy to a bunch of open-weight model hosts, but I've not seen that they host them themselves.
Unfortunately the model companies will simply reinject the friction by mandating BYOK (Bring Your Own Key -- i.e. the end user must onboard with each model company individually).
OpenAI and Anthropic have already done this.
Mandated BYOK will sink OpenRouter.
what 18 hours ago [-]
What kind of compensation are they giving you?
simonw 7 hours ago [-]
None. It's sometimes possible for people to say something positive about a project without being paid!
(If they were paying me they got a bad deal, since I called out the flaws in their leaderboard approach half way through my post.)
SilverElfin 1 days ago [-]
The biggest benefit is that it creates competition among models. If more people use open weight models or models from other providers, it’ll be harder to ban them. Which is what OpenAI and Anthropic will try to accomplish. OpenAI by lobbying the Trump administration for favorable treatment (see Brockman’s MAGA PAC donations), Anthropic by using religious leaders and nonprofits to push “safety” justifications for difficult regulations.
numlocked 22 hours ago [-]
Hi HN! OpenRouter co-founder and COO here. Lots of questions about why we raised!
First off: We remain founder-led and founder-controlled, and intend on being here for a long time, creating awesome products for builders all over the world. We are basically a bunch of tinkerers who like building things, and try to make stuff that we would like, when building with AI.
Since this is about the raise though, happy to share perspective on it.
We believe that strong companies should have a strong balance sheets. We touch large volumes of spend, and have large spend commits across the ecosystem; having the cash to withstand what may come is a responsible buy-down of risk, and makes the company extremely durable.
It also tells our larger customers and provider partners that we will be able to continue to serve them (and pay our bills) for a long time to come. We don't need venture dollars to continue scaling (indeed the business is healthy) but you know when you don't want to raise $100m? When you really need it!
This is also good validation to employees (current and future) that the value we are creating together is real. We also take seriously our obligation to make a return for anyone who invests; we aren't valuationmaxxing and have the privilege of getting to pick who we work with. I don't think that gets a lot of airtime in the overall start-up world, but I think it's important!
Happy to answer questions and THANK YOU to everyone here who uses OpenRouter, and to everyone who has feedback for how we can improve!
potamic 14 hours ago [-]
> We don't need venture dollars to continue scaling (indeed the business is healthy) but you know when you don't want to raise $100m? When you really need it!
That's a nice narrative but I suspect you're not touching upon the investor pressure side of things. Your earlier investors would be upon you to show a multiple in valuation beyond what the balance sheets can show. The only way to do that is to raise more money.
The problem with this is that you're now beholden to another set of investors who will also expect a multiple on their investment which makes increasing valuation your primary objective, even to the detriment of the business. With a margin business you could sustain for a long time even when the market stagnates, but you've lost that option when you first took money from someone. It's an all or nothing play now.
numlocked 7 hours ago [-]
Great investors are helpful, not harmful :) You want accountability from smart, experienced partners!
potamic 7 hours ago [-]
Sure, but only to the means where they see a multiple. And sometimes that be at odds with the fundamental value prop of a business.
numlocked 6 hours ago [-]
By default (and in most cases) investors and operators are aligned. When we diligence our investors, we call companies they worked with where things didn’t go well, and speak to those founders. Understanding how investors operate when it’s not all up-and-to-the-right is important when picking partners!
cheschire 10 hours ago [-]
I’m interested what you believe the intent of your message to be. You’re talking to a COO that just raised money as if you’re mentoring someone about to approach VCs for the first time. Hugely patronizing attitudes often just get a pass here on HN, but what is your purpose for using one here?
potamic 7 hours ago [-]
I think you misread my comment, I might've been lazy in constructing it. I don't mean to mentor anyone, rather I'm putting out my read of the situation so there's a common ground over which to discuss.
For me, raising $100m when it's not needed doesn't add up. Nobody lends money with the idea to "keep it, just in case". There are always commitments and expectations and obligations to meet those expectations. So when they said they didn't really need to raise, while also not talking about investor expectations, feels there's more to the situation than is being let on.
humam_alhusaini 21 hours ago [-]
What will OpenRouter use the $100m for? You say that it "makes the company extremely durable" and is "good validation to employees", but I'd imagine that there are more interesting things to do with 100 million dollars.
Nav_Panel 16 hours ago [-]
Think about what OpenRouter primarily traffics in and what you can do with that raw material.
laweijfmvo 17 hours ago [-]
seriously. you don’t raise $100m just to be safe unless you’re bleeding cash or you lied to the investors.
tehlike 15 hours ago [-]
there are companies who raised xxxM$ and never had to touch the money because they were already profitable.
numlocked 7 hours ago [-]
Yep!
Everyone wants a conspiracy, but what I originally posted is in fact the boring truth. Having a bunch of cash in the bank makes for a durable business!
CMay 7 hours ago [-]
The Openrouter website says that y'all do not train on the data, but it does not make it clear that the data is not shared with any 3rd parties (other than the LLM provider) who might train on it.
There is the example of Apple and Google providing transport for push notifications, but claiming to delete the content and only preserve the metadata.
What is Openrouter's policy on this? Is the logging of user data an essential part of the business model, or is the primary business model really facilitating a proxy between multiple services and nothing beyond that? If everything is logged, do y'all store it securely so that if one database is stolen (by China for example) then it's not useful on its own?
With the race for AGI and everyone training on each other's outputs, Openrouter is clearly in a position to abuse all of that even though the major providers weaken their output to limit the value of distilling them.
numlocked 7 hours ago [-]
We have never sold any prompt data to anyone, in any form, and have no plans to do so. Full stop.
segmondy 7 hours ago [-]
Can you also confirm that you do not log/retain it. 100% pass through. If you are logging it, you could one day change your position on that.
numlocked 7 hours ago [-]
We have two mechanisms whereby we retain data. Both are opt-in and off by default.
One mechanism where you get a discount and we can use the data (in theory this does mean sell it; but our intent is to use it to make efficient dynamic routing solutions. But absolutely we could one day sell it) and another where we retain it for you so you can see it in your logs. We have no rights to this data in any way. This is similar to how any tracing/logging solution works.
Both and opt-in. If you don’t opt in, we don’t retain anything and are a pass through with regards to your prompt data.
All of this is carefully documented and I encourage you to explore and chat with the docs.
jampekka 20 hours ago [-]
Would it be possible to get "raw" access to the provider APIs, but still keep the consolidated billing? The unified API is great when it works, but it often causes hassle with more exotic use cases and new API features.
matja 4 hours ago [-]
+1 this. Example: Using Mistral TTS voice cloning appears to be not possible via the "providers" pass-through object in the OpenRouter API because some parameters are always forwarded which conflict with the provider's parameters.
numlocked 7 hours ago [-]
Interesting. Will look into it! We are releasing pass through API params soon which might hit the bid, but is a bit different than what you are describing.
jampekka 4 hours ago [-]
API param passthrough will probably help with many of the cases. Things like sampling params and constrained decoding and returning logits tend to be very finicky with the translated params. But the return value translation also makes debugging these harder.
While I'm at it, another annoyance is that OpenRouter doesn't seem to have a very good API playground. The chat does work, but the params exposed there are quite limited and it's not clear how the GUI fields map to API params. I now have resorted to exporting the chat and figure out the params from the export JSON. Just having an option to get a curl command for the chat call would help a lot, and shouldn't be hard to implement.
Edit: I think the ideal implementation for the direct API access would be that I could generate API keys for the provider at OpenRouter that I would give in the provider API calls, but that would get billed through OpenRouter. Second best would probably be a raw HTTP proxy/tunnel that injects OpenRouter's own keys (or however it is that you call the providers). I don't really know though how you call the providers and what kind of new provider integrations these would require.
Oli_dev 20 hours ago [-]
i second this, And I think this will only get worse as the bigger companies seek to decommoditize their models and make moats
Oli_dev 20 hours ago [-]
Heya! First off, I love your product. consolidated billing/auth solves a big pain-point, so thank you.
Less about the funding and more about the long game: where do you see OpenRouter in 3-5 years, and which product bets are you most excited about right now? Do you guys think with this new raise you'll branch out into other adjacent verticals?
numlocked 7 hours ago [-]
Our general theory of the case is that, in the not so distant future, inference will be the second largest opex line item for most companies (behind headcount) and that sourcing, measuring, and governing those tokens is a massive horizontal opportunity.
We will inevitably expand into adjacencies because we like building things and experimenting and we have a lot of people with great taste who are likely to ship cool things that customers want to use!
Edit: also - THANK YOU!
treme 8 hours ago [-]
how come cancelling api keys with left over prepaid credit isn't refunded?
numlocked 6 hours ago [-]
Refund policies are clearly documented in our terms. We actually DO offer refunds within 24hrs of credit purchase, which is significantly more flexible than most companies that operate in a similar way. And we try to use good judgement when there are extenuating circumstances.
treme 5 hours ago [-]
I understand that the refund policy is documented. But “clearly documented” and “fair to the customer” are separate questions.
If a user cancels API access while still holding prepaid credits, that unused balance represents compute they never consumed. Unlike a shipped physical product or a fixed one-time service cost, unused API credit does not seem to impose much marginal cost on OpenRouter to reallocate or refund.
So the issue isn’t whether the policy is disclosed. It’s whether keeping unused prepaid credit after cancellation is the right default, especially when the user is no longer able or willing to use the service.
Something1234 17 hours ago [-]
The biggest missing feature for me is the differentiation on zero data retention providers and if a model works for the rules I defined. Right now there’s no way to hide the providers who don’t work for the zdr rules
numlocked 7 hours ago [-]
What differentiation are looking for? We have good documentation of every provider and what their data retention stance is, and you can figure allow/blocklists for all providers.
Check out the Guardrails section under settings and tell me what’s missing!
VitaliyKorbut 20 hours ago [-]
Thank you for Openrouter, used it briefly. Tested the product a year ago or so, and wasn't able to get structured output from google's gemini model via openrouter.
pclark 20 hours ago [-]
Are you thinking of hiring any PMs? love your product!
numlocked 6 hours ago [-]
Just sent you a DM on twitter!
minimaxir 1 days ago [-]
As someone who uses OpenRouter extensively (and wrote an unintentional adjacent PR piece a few days ago: https://news.ycombinator.com/item?id=48317294 ), it's definitely the best way to try out new models without fiddling with each providers distinct APIs which is becoming a recurring concern as of late.
That said, I don't understand the people who use something a full agentic backbone with expensive models like Claude Opus with OpenRouter because that 5% surcharge is meaningful at that level of cost instead of going with the source API providers. But people are clearly doing it, and it's pure revenue.
01100011 21 hours ago [-]
IDK, but that sounds like something that would be better implemented with an open-source library to which providers supply support patches. Why do I need a company to act as a proxy and not just run a relatively simple shim layer on my machine?
I'm just a stupid systems programmer working in the bowels of AI and I understand there is a lot of seemingly pointless software which exists solely to provide a slight boost to convenience in exchange for money. Is OpenRouter just that? Do they actually host models themselves or centralize billing amongst various providers?
542458 20 hours ago [-]
A library with a bunch of different providers doesn’t solve the payment/billing problem (which is one of the main openrouter benefits). IMO being able to buy credits and not have them locked to one provider is worth the 5% to me.
bwfan123 1 days ago [-]
There is a lot of dumb token spend right now - tokenmaxing and such. Economic cost of token is not being evaluated carefully because there is fomo and no one wants to be left behind. But folks are waking up to it, and dumb token spending is not sustainable and will revert.
furyofantares 1 days ago [-]
Better uptime? Given it will be routed to one of Anthropic, Amazon Bedrock, Claude Platform on AWS, Google Vertex (Europe) or Google Vertex
yencabulator 3 hours ago [-]
That seems also very possible to do in a client library. (Though that would benefit from an across-my-nodes gossip of backend statuses.)
Oli_dev 20 hours ago [-]
im happy to pay 5% extra for consolidated billing and usage limits. It just makes it easier
nadermx 1 days ago [-]
Convenience has a markup
enraged_camel 1 days ago [-]
>> it's definitely the best way to try out new models without fiddling with each providers distinct APIs which is becoming a recurring concern as of late
Why not... Cursor?
827a 1 days ago [-]
Cursor only supports a single model (Kimi K2.5) not made by the Big 4 labs (OpenAI, Anthropic, Google, xAI). Cursor is actually extremely bad at wide model support.
OpenCode is much better at it.
anon7000 1 days ago [-]
And its own model (Composer 2.5)
ascorbic 24 hours ago [-]
Which is finetuned Kimi 2.5
827a 16 hours ago [-]
Cursor is functionally "one of the big 4 labs" (SpaceX).
senordevnyc 8 hours ago [-]
First, calling xAI one of the big labs is pretty funny.
Second, Cursor hasn’t been acquired by SpaceX yet, and there’s a good chance they never will be.
largbae 1 days ago [-]
I use Cursor with OpenRouter for some projects and it's great. Most of the time I just use Auto and let Cursor use its model or choose. If I run out of quota, or I'm not getting what I want, I switch off Auto and use OpenRouter to pick Opus, Codex, or whoever(all are available). Can continue the same context if you want, type "please continue" in the agent prompt, and on you go.
zenoprax 1 days ago [-]
Cursor has limits even when using your own key. I was even cut off using a local model. I guess they use some sort of harness that requires non-local resources? I'm not sure I've actually tried to use Cursor in a fully-offline scenario yet. Cline works well enough and doesn't require any sign-up.
ajyoon 1 days ago [-]
Cursor's coverage on open weight models is very minimal, and it's irrelevant for testing models in your actual application.
nickrj 23 hours ago [-]
[dead]
throw10920 1 days ago [-]
I think that OpenRouter will continue to be very popular while there lots of experimentation in the LLM space, and while the "current favorite" model continues to change between various frontier labs.
After things begin to settle down, we'll probably see a consolidation of both frontier and open-source models - and then OpenRouter will become less useful, because that 5% overhead is well worth it when you want to try 20 models from 10 labs, but harder to stomach when you only need 5 models from 2 providers, and each of those providers has its own API knobs that you can tune to make things even cheaper.
scrollop 8 hours ago [-]
Agreed - 5% fees is quite high considering the volumes of tokens involved.
tom1337 1 days ago [-]
Is the Open in OpenRouter the same as in Open AI? I couldn’t find any repository or hosted code. Thought it'd be a open source, self hostable tool with a cloud offering but seems its just the latter?
alecco 1 days ago [-]
I assumed they were open source but now that I checked they are not, they say "Open" because they route to third-party open models. Yikes. Another VC crap layer?
omneity 24 hours ago [-]
The Open in OpenRouter is the same as in OpenSea, as it's the same founder. Make of that what you will.
dnnddidiej 18 hours ago [-]
I make of it they are good at riding hype cycles
bijowo1676 1 days ago [-]
tbh anyone can rig up something like openrouter in a few nights with claude code.
its just a proxy
kvirani 21 hours ago [-]
Today your statement is a little too ambitious but I agree with the overall point that the inherent effort based moat in SaaS is mostly gone and now it is really about personalizing your own.
The most common counter-argument that I've seen here is " Yes, but no organization wants to manage all of their different operational tools. They would rather just outsource that responsibility to third-party entities".
I'm not sure I fully agree with that counter. Because agents can be viewed as third party entities in some sense. If not today then maybe soon.
staticshock 20 hours ago [-]
An agent you can sue would count as a legitimate third party entity. Suing today's agents won't get you very far.
m1keil 1 days ago [-]
Open as in single API layer which allows you to swap the model under it.
weiliddat 1 days ago [-]
One thing that OpenRouter makes easy is the ability to manage API keys (mint new ones, expiry/limits per key, etc.) that I wish that other providers would make possible/easier.
So many use cases, like sharing AI/assisted features externally, with the ability to use those features but also limit the fallout if its shared / used for other purposes, without jumping through more fallible hoops like safeguards etc.
dvratil 1 days ago [-]
One thing I haven't seen mentioned here yet and really like about OpenRouter is their openrouter "meta" model, that automatically routes the prompt to an appropriately capable model. Saves me a ton of money on not routing everything through Opus, but not giving me bad results when I ask something more complex, which gets autorouted to Opus.
mikalauskas 13 hours ago [-]
Openrouter is popular because China, Russia, Belarus, Iran are blocked in Gemini, anthropic and openai. And you are able to pay with Bitcoin.
That's a lot of paying users
freakynit 24 hours ago [-]
"Over the last six months, weekly volume on OpenRouter has grown from 5 trillion to 25 trillion tokens".
DAMN!!
That's 41+ million tokens every second. That scale is crazy for such a small team of 48-50 people overall.
wahern 21 hours ago [-]
Assuming that's token cost upstream, and given the multiplication factor in tokens processed per query, that seems like maybe a few thousand requests per second at most? It's impressive but for a 50 person startup team expending millions per month, that seems about on par.
Would it be as impressive if the context were an email provider accepting thousands of message per second, or even one accepting thousands of messages per second and submitting them upstream for spam detection? The token count might even be higher in that case, but rightly or wrongly I think it would get a yawn on HN.
It says more about how far the industry has come these days in terms of scale on the one hand, but also on the other hand the huge blowup in data and processing for nominally simple requests. Nonetheless I'm sure the team is exceptionally skilled and it's certainly a laudable accomplishment.
plaidfuji 18 hours ago [-]
To put that in perspective, if you assume a token is 4 bytes, that’s about 164 MB/s of traffic, which sounds a bit less staggering.
dnnddidiej 18 hours ago [-]
Dealing with a lot of traffic for a small team isn't hard in itself. If it is easy to parallelise you just need to horizontally scale. And most concerns can be added as sidecars or middleware. Rate limits, auth, etc. Basically this is a kubernetes cluster. For a 3 person startup hard. For 50 pretty managable.
wg0 22 hours ago [-]
Do you really need VC money to put a proxy in front of other APIs? For what exactly? Marketing? What exactly you want to maket? You're already known.
Infrastructure? For proxying requests more infrastructure? You could just pay Cloudflare.
More engineers? But you yourself are the stret seller for the same snake oil that engineers aren't anymore necessary
So what that 100 million dollars are for?
svnt 19 hours ago [-]
They are because they can, because it serves as social proof, which convinces their customers that they are doing something of deeper value. Then in reality they will use it to develop channels preparing to use their customers (and the data customers trust them with) as the product in the future.
wg0 8 hours ago [-]
That signals the reverse that they might jack up prices at any time for their 10x returns for the investors. How does that instil any confidence at all?
I’m still pretty skeptical about OpenRouter. I have a client implemented for them so I can use them with my harnesses, but at the same time that client was generated and tested in an hour or so just like all of the other llm provider clients that I have. Using these services interchangeably by just swapping out clients has so far been working well for me. I think when it comes down to it, the only real inconvenience that they’re solving is where I put my credit card number. Is there something key that I’m missing about this service (besides it being a nexus of attention) that warrants this kind of investment? Or is this truly the bar for starting a successful AI company :P
Scene_Cast2 1 days ago [-]
I was sort of hoping that they were bootstrapped or at least non-VC funded. I'm wary of them introducing consumer-unfriendly revenue-generating schemes.
brcmthrowaway 23 hours ago [-]
How about HuggingFace?
minimaxir 22 hours ago [-]
Hugging Face is most definitely not bootstrapped.
drob518 21 hours ago [-]
Hm. Would be interesting to see the finance spreadsheets for this. Typically the B-round guys are looking for something approaching a 10x return. Can anyone justify OpenRouter being worth $1.1Bn? That seems really high for a “management”/man-in-the-middle play. But sure, AI and all. But I’m old enough to remember when every dot-com was a billion dollar valuation, too.
dnnddidiej 18 hours ago [-]
Well yeah if they route most of the worlds tokens easily. What if we get to a point where the 5% is paid by the supplier and they take over more of the infra /routing side that they do. Lots of ways it could be a 10B company.
drob518 12 hours ago [-]
But is that likely, particularly as the market matures? That seems unlikely to me. We had some of the same sorts of management middleman tools and organizations ideas 15 years or so ago in the cloud computing world and all that pretty much went away.
svnt 19 hours ago [-]
The play is the same as it always was, I assume: your data is the long term product.
mmarian 1 days ago [-]
An amazing service. I use its 20+ free LLM options to allow completely free usage of LibreOffice AI extension with no signup https://librethinker.com .
bbg2401 1 days ago [-]
I'm banned from using the free options. At some point they flagged my account as having engaged in model training against their ToS. This despite my account using around £15 worth of tokens over several months, nearly entirely through BYOK providers.
The handful of times I did try a free model is when I used their chat interface to quickly compare a few open weight models with a single prompt. That's the only usage I can think which could have triggered the block on my account. Even still, what's the point in have the simultaneous chat feature if using it veers so quickly into a ToS violation.
Their support is beyond useless in helping understand the situation. I don't think I managed to speak to anyone other than Tony Bot (or whatever it was named).
Since you're BYOK, have you tried self-hosting LiteLLM? It's what I use for the BYOK option on my service.
bbg2401 19 hours ago [-]
Good shout, that's what I use.
My usage of OpenRouter was limited to casual throwaway experiments with coding agents and quick, surface-level exploration of new and unfamiliar models in the chat interface.
My comment is just an expression of a festering grudge over the unannounced, unexplained sanction on my account and the lack of transparency and feedback from the non-existent support team. There's no OpenRouter shaped hole in my personal workflow, fortunately.
drak0n1c 17 hours ago [-]
If you want to avoid bans, Venice is another good option since their focus is uncensored and privacy. They run models themselves alongside offering OpenRouter-style routing for frontier and niche models - but at least they fully anonymize the user and never ban.
frankest 1 days ago [-]
Using Tinfoil, Replicate, Cerebras, and OpenRouter. Competition is good.
nkmak 1 days ago [-]
OpenRouter’s biggest value to me is reducing switching costs between models. The markup matters at scale, but for exploration and early-stage development, the convenience is hard to beat.
sailfast 23 hours ago [-]
At what valuation?
I’m a user and I like the routing layer and not having to change things up too much, but I’m not sure why a solid business model for this product would require this much money at this kind of valuation unless they’re trying to buy data center capacity to self-host models eventually?
asmosoinio 21 hours ago [-]
1.3B$ according to NYT, according to Techcrunch:
> While the startup didn’t disclose its new valuation, The New York Times reports that it landed at about $1.3 billion post-money.
shostack 22 hours ago [-]
Yes. Happy for the team but I do not like that this likely means for the future growth expectations as a customer. Hint: probably higher costs and being squeezed more.
Suppafly 16 hours ago [-]
OpenRouter is such a weird name for this. I thought it was going to be something like the Tomato firmware for routers, not some AI interface thing.
spaceman_2020 13 hours ago [-]
OpenRouter’s founder jumping from NFTs to AI has to be studied. Seriously the goat of sector rotation
zero-dark 1 days ago [-]
Congrats to the OpenRouter team for securing this round of funding.
The 5% surcharge for their pricing model may not be palatable to enterprises. In fact, the OpenRouter team could be a pivotal part of the enterprise GenAI stack if they can allow configurable, pluggable endpoints for routing directly to enterprise vetted endpoints to 1P/3P LLM APIs. A couple of large companies I’ve worked so far kinda have this system in place, albeit the dev and maintenance cost and of setting up such an “LLM gateway” could be significantly reduced with OpenRouter. I feel that this is largely an ignored, forgotten part of operating GenAI apps at scale.
dmurray 24 hours ago [-]
> The 5% surcharge for their pricing model may not be palatable to enterprises
Enterprises appear to be paying the API rates which are 10x (1000%) what are available to individuals, so I would not be confident they are sensitive to a 5% price change.
That said, the attraction of OpenRouter to enterprise customers should be that they save you >5% on average for a product <5% worse.
zero-dark 23 hours ago [-]
[dead]
antonkochubey 23 hours ago [-]
Enterprises are paying 500% - 20000% markup for AWS services so why do you think 5% will be a problem?
1 days ago [-]
zuzululu 24 hours ago [-]
I still don't get the value proposition: You rarely have to use all the models, you will likely end up with a few for your workflow but there is a way to use them/try all if you wanted to, neato.
Also one scary issue I had with OpenRouter in the early days, I think I saw somebody else's context and there were weird Chinese characters, haven't touched it since.
iqihs 24 hours ago [-]
agreed, unless you need to use all models i'm sitting here wondering why orgs would want to introduce third party risk into their pipelines for marginal cost and time savings
gertlabs 1 days ago [-]
OpenRouter is our primary provider for evaluation data, and we've been really happy with them!
I'm sure they're experiencing growing pains, but a larger model selection (and faster releases for open weights models), would keep us from using other providers. For example, it took much longer than it should have to get Qwen 3.6 ~30B class models released (almost 2 weeks if I recall)
yair99dd 22 hours ago [-]
U can broadcast your tokens
I wonder how to do something with such data. Aside from greping for secrets
Is they’re siphoning data that’s worth millions but the product is worth nothing.
armcat 23 hours ago [-]
How quickly they get new models supported on the API and it just works, is insane!
numlocked 22 hours ago [-]
Thanks! We work really hard to make sure we are ready at launch :)
sgt 1 days ago [-]
I honestly thought this was some kind of OpenWRT firmware for routers until I clicked the link. "Ahhh, AI. Of course."
kundi 15 hours ago [-]
Tried using the service and immediately switched off noticing the delay and unreliability of the proxy
CSMastermind 23 hours ago [-]
What's the business model? Their core functionality, while useful, seems like something that will just be an open-source package. I assume there will be some Saas layer on top of it?
gordonhart 22 hours ago [-]
Collect and sell data would be my guess. Without ZDR by default they are in a position to collect a crazy amount of data that I’m sure various buyers would be interested in (not just the big labs).
amazingamazing 1 days ago [-]
Too bad api use is like 100x more expensive than subscriptions for the big 3.
anon7000 1 days ago [-]
I think subscriptions are not going to last for serious users. Great to use them while we can, but AI does not fit the “power user subsidizes free/cheap users” model, nor the “support tens of thousands of customers from a small number of cheap servers” model. Everyone is a power user, and everything is computationally expensive.
cesarvarela 22 hours ago [-]
I think the trick is to limit programmatic usage as Anthropic did. This way, power users' usage is uneven, allowing the model you describe.
willis936 1 days ago [-]
Chatbot windows are a waste of time compared to API tools when trying to make stuff.
Subscribing to a vendor locks you in to sudden price swings that the big 3 are happy to do. The market needs lubrication for competition and provider routers offer that.
vasco 1 days ago [-]
> ... with participation from NVentures (NVIDIA's venture capital arm), ServiceNow Ventures, MongoDB Ventures, Snowflake Ventures, Databricks Ventures ...
Are tech companies FOMOing so hard that they're now all running AI venture arms themselves instead of you know, developing their own products? Except for NVIDIA who needs to keep pumping the bubble I didn't expect the others.
missedthecue 18 hours ago [-]
Companies with negative net income running internal VC funds is just so funny.
mschuster91 1 days ago [-]
> ServiceNow Ventures
Well, at least for them, investing into AI is actually developing their own product. The push to replace "Actually Indians" [1] with LLMs is huge because large Western companies want to save even the pittances they're paying Indian body shops.
Maybe I'm just a casual user, but I'm a bit surprised at the negativity in this thread.
The 5% fee is a rounding error for your average small time user, and it makes testing new models as simple as changing one string.
Can spin up separate models for separate budgets too.
It's a really simple product that just works
dcreater 24 hours ago [-]
Why does a company with a seemingly health business model that is already churning profits and doesnt require large CapEx, taking losses to capture users, need to be raising this kind of capital?
heldrida 22 hours ago [-]
Maybe they want to build their own cloud infrastructure, host open source models themselves as an inference service provider?
iqihs 24 hours ago [-]
i wonder if it's partially because it's not a unique business model and subject to yet another VC-subsidized race to the bottom on things like token prices
vinayaksodar 1 days ago [-]
Every tech company now seems to be invested in every AI startup.
robmn 22 hours ago [-]
The moat will disappear in 1-2 years
TurdF3rguson 22 hours ago [-]
The real moat will be the user base they've built during that time. I could copy every feature of their website within a few weeks, but I couldn't copy that.
vinni2 21 hours ago [-]
Users can easily switch if they find better alternatives.
gigatexal 22 hours ago [-]
Ok I’ve read the comments I still don’t see it. What’s the 113M in value add here?
vinni2 20 hours ago [-]
They want to have enough cash to pay the LLM bills.
croes 1 days ago [-]
Aren’t they totally dependent on the good will of the model providers?
willis936 1 days ago [-]
In what way? They're just an API customer like any other and charge a bit more on top. Providers would have to carve out their usage terms to not allow resell, which does nothing besides lose customers to competitors. If they all did that then you would tap on the FTC's shoulder and suggest they do their job.
TurdF3rguson 22 hours ago [-]
I think it's more likely the other way around. If someone is offering low friction access to your model, you don't want to piss them off.
SuaveSteve 21 hours ago [-]
really sucks that you guys use discord over a forum
aussieguy1234 19 hours ago [-]
One well known problem with OpenRouter is routing to poor quality model providers who quant the models.
So you think for example you're using Kimi k2.6....but behind the scenes, it's the 4b or 8b quantized versions.
So for open source models, I've started using the providers own service. In the case of Kimi, I don't trust any provider other than moonshot not to quant the model. So far this seems to be getting better results.
If I see a provider not specifiying if they quant or not, I assume that they do.
I'll still use OpenRouter to try new models out, but not for any real work.
simianwords 1 days ago [-]
I like OpenRouter - lets me test out new model quickly and easily. I would still need a good functioning mobile application for it.
I think they should go in this direction: they should make their own Model Agnostic versions of whatever functionalities other AI companies are making. Examples
1. personal chat app
2. the chat app working with their own implementation of memory
3. coding harnesses that are model agnostic
When I think of OpenRouter, I should think of "model agnostic LLM tools".
24 hours ago [-]
owler 9 hours ago [-]
[flagged]
stanleydupreez 19 hours ago [-]
[flagged]
CoderAshton 1 days ago [-]
[dead]
oddboii 1 days ago [-]
[flagged]
Sasisundar09 1 days ago [-]
[dead]
donbventures 21 hours ago [-]
[flagged]
IamCompliant 22 hours ago [-]
[flagged]
leeeeep101 15 hours ago [-]
[dead]
ihsw 1 days ago [-]
[dead]
lee101k 15 hours ago [-]
[dead]
georgefloid 10 hours ago [-]
[dead]
codemog 23 hours ago [-]
[flagged]
drak0n1c 17 hours ago [-]
Venice (uncensored privacy AI API and app co) took a year to expand their self-hosted model selection to routing hundreds of other models. It's harder than it looks to get customers. But they did grow to 3M users and >$50M ARR as of a few weeks ago. So go for it if you've found an easy way to do it.
heldrida 22 hours ago [-]
Just curious, given the $113M investment, and the state of the economy as you described. Why don’t you build it then?
minimaxir 23 hours ago [-]
There's more going on here than just the "routing" part.
codemog 23 hours ago [-]
That's why I gave it a month and not an afternoon.
ljlolel 1 days ago [-]
I’m offering a fully end to end encrypted open source version and hosted version of open router : https://trustedrouter.com/
codemog 11 hours ago [-]
Someone give this man one hundred million dollars asap
Squarex 24 hours ago [-]
The link is not working.
latchkey 20 hours ago [-]
I'm reading the website and nothing about this addresses the compute running the models. If that's going to a third party (just like openrouter is), then there are no guarantees, other than words on paper.
Proving my point. Your prompt gets sent through TR, to another provider on the other end.
There are zero guarantees beyond "trust me bro" that the inference provider isn't taking your prompts and selling them to promptbase or one of a dozen other similar services.
Venice claims no logs, which may or may not be true, but what happens to your prompt after they proxy it to the service running the GPUs?
From their website:
"The GPUs that process your inference requests come from multiple decentralized providers, and while each specific provider can see the text of one specific conversation, it never sees your entire history, nor knows your identity."
Which is an absurd claim if your prompt has your company name in it.
It doesn't matter if it is encrypted in transport, once it hits the company running the GPUs, it is open season for them.
1. By far the lowest friction way to support and try out all the models.
2. They offer billing caps! Most model providers still don't do this [EDIT: maybe they do, see reply comment], but if you're going to run anything in public it's very useful to have hard limits so it doesn't cost you $1m overnight because someone started abusing it.
3. Their rankings are one of the more interesting signals for which models are popular, despite their flaws (most OpenAI and Anthropic users don't go via OpenRouter, it's currently not possible to tell the difference between many users switching v.s. one "whale" changing their preferred model)
Given how API costs are becoming meaningful for a lot of companies now, having a provider like OpenRouter to help measure your spend and easily experiment with and switch providers feels like a valuable service.
This is also the reason providers like Anthropic scored lower because while Opus 4.7 is close to 90%, Opus 4.5 is 45%
Why?
Maybe this is a dumb question, but why wouldn't an agent "keep the conversation going", like I do when interacting with an LLM through a web page? (I understand how it's impractical for long-running tasks where the agent has to wait days for the next input, but assume that's not the majority of use cases)
For OpenAI, it seems like you can't prolong the caching duration for money. Duration is longer during off-peak hours for in-memory caching and up to 24 hours for extended prompt caching. https://developers.openai.com/api/docs/guides/prompt-caching
For DeepSeek, caching duration of at least 12 hours (and likely longer) have been observed. Cache writes are free. https://zhuanlan.zhihu.com/p/2035737726952194774
Technically it does retrieve the entire history and reevaulate it since the LLM is stateless. Just more ergonomic for the developer.
And prompt caching helps cut the costs down when a conversation drags on.
I’d prefer something that consolidates billing, but still lets me use providers' APIs directly (or via some "raw HTTP" proxy). There are plenty of unified API gateways, but I haven’t seen one that is just billing/auth in front of the native provider APIs.
OpenRouter is also a good place to find free LLM access with a catch: You should expect that any inputs and outputs are going into someone's training database. Clearly anyone who can pay should be using paid models with privacy protections, but the free models have been great for learning and experimenting. Especially for younger people learning API programming and LLMs who may not have access to a credit card or funds.
"We do not sell users’ data to third parties."
https://www.anthropic.com/news/updates-to-our-consumer-terms
I.e. nothing about this statement prevents Anthropic from running ads within Claude, as long as they run the ad-placement auctions themselves, and so aren't leaking any of the data they're using to decide which placements are relevant to which users+sessions. (This is the same thing Google does for SERP ad auctions.)
But actually, and perhaps more interestingly, nothing about this statement prevents Anthropic from building a Google AdSense competitor either. Other sites (or mobile apps, etc) could plop in an Anthropic ad iframe; and it'd be Anthropic's knowledge of your interactions with Claude that would drive what ads would show up in that iframe. The embedding site doesn't know what ads the users are seeing, so that's still not "selling users' data to third parties", per se.
OpenRouter explicitly lets you filter by zero-data-retention providers: https://openrouter.ai/models?zdr=true
True enough, in theory; but what exactly are you imagining would be a useful-enough signal in the OpenRouter request+response stream, that any company would want their data as training material?
Even a single OpenRouter-API-key-identified subscriber's traffic, may consist of an mixture of traffic from multiple different sessions, under potentially multiple different end-users. (Where, if the subscriber is doing security correctly, then their OpenRouter key lives on a gateway rather than in a frontend app; and so the only IP address / UA / etc OpenRouter sees is that of the gateway itself.)
And the traffic stream may also invoke multiple models, and provide multiple different system prompts for those models; which, while marked in the traffic (i.e. conveyed as part of each request), makes the resulting data much less useful in aggregate, than if it were all training data for one model with one system prompt.
Plus, there are no RLHF signals in OpenRouter data. Even if OpenRouter wanted to build a general model-neutral framework for collecting RLHF-type data, it can't force subscriber apps to do the UI-level stuff necessary to collect it (i.e. the things ChatGPT/Claude do, with "thumbs-down" buttons, A/B tested responses, etc.) Analysis would have to rely on pure transcript-level user sentiment extraction.
The majority of RLHF data doesn't need this. The majority is software development and/or tool calling where the agent gets a signal back as to if it succeeded (eg compilation errors, test errors). It's true that end-of-trajectory signals (eg, did this task do what you wanted) are even more useful but even partial signals are great for RL training.
Isn't this a treasure trove for any model distillation effort?
Clearly, anyone who needs privacy should be using models with privacy protections. Some people build open source and the models will get the code anyway.
Openrouter is very nice since it puts a barrier between you and those suppliers that were supposed to be like utilities. I got the feeling that if OpenAI was left alone they would be nice as a telco.
https://news.ycombinator.com/item?id=48319827
If you're doing any kind of production AI work you'll end up with outages caused by calling a single provider, OpenRouter seamlessly switching between providers is a godsend for uptime.
But even more than that there's meaningful cost+speed differences.
Here's Sonnet 4.6 being served direct, via Amazon and via Google
https://la9q13gg8w.evvl.io/
(spoiler: Google was both fastest and cheapest)
What's the value proposition for the typical AWS startup to go with openrouter, if Amazon offers similar rates with direct integration into all their other offerings?
The only reason OpenRouter can exist at the moment is because we are in the wild-west phase of this technology, and lots of people and companies are exploring. In 5 years they will have to have transformed their business fundamentally, or go the way of the dinosaurs.
Established restaurants didn't need DoorDash because they were already on everyone's speed dial. But new or small restaurants couldn't afford to advertise or maintain a team of delivery people. DoorDash created a two-sided marketplace that made it a lot easier for new entrants to bootstrap. Today even the established restaurants have to pay them their tithe because hungry people have learned to start with the DoorDash app. A bit of a prisoner's dilemma.
If OpenRouter plays its cards right and gets very lucky, a large number of people will configure their hungry LLM clients to start with OpenRouter, and then LLM providers will have to join the marketplace or else miss out on all those customers.
People could email cat photos and resumes. But Facebook and LinkedIn are where everyone already is, so that's what they use instead.
https://openresponses.org/
Everything has a cost of some sort. It's just who you're going to pay and what the currency is.
What if Fireworks stops offering your preferred model?
Anthropic: https://support.claude.com/en/articles/8977456-how-do-i-pay-... - you can pre-pay and get a hard cutoff.
OpenAI: https://community.openai.com/t/how-to-set-billing-limits-and... - last time I looked OpenAI had a soft but not hard limit, I guess they fixed that last year.
I remember bugging them both about this last year, I need to update my mental model!
Deepseek has a prepaid model. (Pretty impressive, what fits into 10 Dollar)
> Billing data processing times can be delayed in AI Studio, up to around 10 minutes. You may experience overages beyond your project cap if billing data hasn't processed before more charges are accrued.
https://ai.google.dev/gemini-api/docs/billing#project-spend-...
That's a soft cap, not a hard cap
Check out Kagi Ultimate.
For personal use, yes. The all-in pricing model encourages experimentation. And the privacy pitch seems tighter.
Maybe someday the VM I run agents in will have a dedicated GPU so that I can stop using APIs altogether. One can dream...
Looks like Vercel even have their own leaderboard: https://vercel.com/ai-gateway/leaderboards/models
Surprising that they have Opus 4.8 and 4.6 listed on the leaderboard but not Opus 4.7.
Free? They take the same 5% fee as OpenRouter does.
https://developers.cloudflare.com/ai-gateway/features/unifie...
Other features I've just noticed: - configurable prompt injection protection using OWASP regex (https://cheatsheetseries.owasp.org/cheatsheets/LLM_Prompt_In...) - configurable PIM protection for outbound prompts - input/output logging - "JSON healing" to auto-correct minor hallucinations
Lots of other stuff too. The business model seems pretty simple and the value-add features don't look particularly expensive or difficult to copy.
coffee farmers -> middle man -> you
They don't list themselves on https://openrouter.ai/providers
https://openrouter.ai/openrouter/owl-alpha
OpenAI and Anthropic have already done this.
Mandated BYOK will sink OpenRouter.
(If they were paying me they got a bad deal, since I called out the flaws in their leaderboard approach half way through my post.)
First off: We remain founder-led and founder-controlled, and intend on being here for a long time, creating awesome products for builders all over the world. We are basically a bunch of tinkerers who like building things, and try to make stuff that we would like, when building with AI.
Since this is about the raise though, happy to share perspective on it.
We believe that strong companies should have a strong balance sheets. We touch large volumes of spend, and have large spend commits across the ecosystem; having the cash to withstand what may come is a responsible buy-down of risk, and makes the company extremely durable.
It also tells our larger customers and provider partners that we will be able to continue to serve them (and pay our bills) for a long time to come. We don't need venture dollars to continue scaling (indeed the business is healthy) but you know when you don't want to raise $100m? When you really need it!
This is also good validation to employees (current and future) that the value we are creating together is real. We also take seriously our obligation to make a return for anyone who invests; we aren't valuationmaxxing and have the privilege of getting to pick who we work with. I don't think that gets a lot of airtime in the overall start-up world, but I think it's important!
Happy to answer questions and THANK YOU to everyone here who uses OpenRouter, and to everyone who has feedback for how we can improve!
That's a nice narrative but I suspect you're not touching upon the investor pressure side of things. Your earlier investors would be upon you to show a multiple in valuation beyond what the balance sheets can show. The only way to do that is to raise more money.
The problem with this is that you're now beholden to another set of investors who will also expect a multiple on their investment which makes increasing valuation your primary objective, even to the detriment of the business. With a margin business you could sustain for a long time even when the market stagnates, but you've lost that option when you first took money from someone. It's an all or nothing play now.
For me, raising $100m when it's not needed doesn't add up. Nobody lends money with the idea to "keep it, just in case". There are always commitments and expectations and obligations to meet those expectations. So when they said they didn't really need to raise, while also not talking about investor expectations, feels there's more to the situation than is being let on.
Everyone wants a conspiracy, but what I originally posted is in fact the boring truth. Having a bunch of cash in the bank makes for a durable business!
There is the example of Apple and Google providing transport for push notifications, but claiming to delete the content and only preserve the metadata.
What is Openrouter's policy on this? Is the logging of user data an essential part of the business model, or is the primary business model really facilitating a proxy between multiple services and nothing beyond that? If everything is logged, do y'all store it securely so that if one database is stolen (by China for example) then it's not useful on its own?
With the race for AGI and everyone training on each other's outputs, Openrouter is clearly in a position to abuse all of that even though the major providers weaken their output to limit the value of distilling them.
One mechanism where you get a discount and we can use the data (in theory this does mean sell it; but our intent is to use it to make efficient dynamic routing solutions. But absolutely we could one day sell it) and another where we retain it for you so you can see it in your logs. We have no rights to this data in any way. This is similar to how any tracing/logging solution works.
Both and opt-in. If you don’t opt in, we don’t retain anything and are a pass through with regards to your prompt data.
All of this is carefully documented and I encourage you to explore and chat with the docs.
While I'm at it, another annoyance is that OpenRouter doesn't seem to have a very good API playground. The chat does work, but the params exposed there are quite limited and it's not clear how the GUI fields map to API params. I now have resorted to exporting the chat and figure out the params from the export JSON. Just having an option to get a curl command for the chat call would help a lot, and shouldn't be hard to implement.
Edit: I think the ideal implementation for the direct API access would be that I could generate API keys for the provider at OpenRouter that I would give in the provider API calls, but that would get billed through OpenRouter. Second best would probably be a raw HTTP proxy/tunnel that injects OpenRouter's own keys (or however it is that you call the providers). I don't really know though how you call the providers and what kind of new provider integrations these would require.
Less about the funding and more about the long game: where do you see OpenRouter in 3-5 years, and which product bets are you most excited about right now? Do you guys think with this new raise you'll branch out into other adjacent verticals?
We will inevitably expand into adjacencies because we like building things and experimenting and we have a lot of people with great taste who are likely to ship cool things that customers want to use!
Edit: also - THANK YOU!
If a user cancels API access while still holding prepaid credits, that unused balance represents compute they never consumed. Unlike a shipped physical product or a fixed one-time service cost, unused API credit does not seem to impose much marginal cost on OpenRouter to reallocate or refund.
So the issue isn’t whether the policy is disclosed. It’s whether keeping unused prepaid credit after cancellation is the right default, especially when the user is no longer able or willing to use the service.
Check out the Guardrails section under settings and tell me what’s missing!
That said, I don't understand the people who use something a full agentic backbone with expensive models like Claude Opus with OpenRouter because that 5% surcharge is meaningful at that level of cost instead of going with the source API providers. But people are clearly doing it, and it's pure revenue.
I'm just a stupid systems programmer working in the bowels of AI and I understand there is a lot of seemingly pointless software which exists solely to provide a slight boost to convenience in exchange for money. Is OpenRouter just that? Do they actually host models themselves or centralize billing amongst various providers?
Why not... Cursor?
OpenCode is much better at it.
Second, Cursor hasn’t been acquired by SpaceX yet, and there’s a good chance they never will be.
After things begin to settle down, we'll probably see a consolidation of both frontier and open-source models - and then OpenRouter will become less useful, because that 5% overhead is well worth it when you want to try 20 models from 10 labs, but harder to stomach when you only need 5 models from 2 providers, and each of those providers has its own API knobs that you can tune to make things even cheaper.
its just a proxy
The most common counter-argument that I've seen here is " Yes, but no organization wants to manage all of their different operational tools. They would rather just outsource that responsibility to third-party entities".
I'm not sure I fully agree with that counter. Because agents can be viewed as third party entities in some sense. If not today then maybe soon.
So many use cases, like sharing AI/assisted features externally, with the ability to use those features but also limit the fallout if its shared / used for other purposes, without jumping through more fallible hoops like safeguards etc.
That's a lot of paying users
DAMN!!
That's 41+ million tokens every second. That scale is crazy for such a small team of 48-50 people overall.
Would it be as impressive if the context were an email provider accepting thousands of message per second, or even one accepting thousands of messages per second and submitting them upstream for spam detection? The token count might even be higher in that case, but rightly or wrongly I think it would get a yawn on HN.
It says more about how far the industry has come these days in terms of scale on the one hand, but also on the other hand the huge blowup in data and processing for nominally simple requests. Nonetheless I'm sure the team is exceptionally skilled and it's certainly a laudable accomplishment.
Infrastructure? For proxying requests more infrastructure? You could just pay Cloudflare.
More engineers? But you yourself are the stret seller for the same snake oil that engineers aren't anymore necessary
So what that 100 million dollars are for?
The handful of times I did try a free model is when I used their chat interface to quickly compare a few open weight models with a single prompt. That's the only usage I can think which could have triggered the block on my account. Even still, what's the point in have the simultaneous chat feature if using it veers so quickly into a ToS violation.
Their support is beyond useless in helping understand the situation. I don't think I managed to speak to anyone other than Tony Bot (or whatever it was named).
Edit:
Total usage over 1 year:
Claude Sonnet 4.6 $8.80
Gemini 3.1 Pro Preview $6.71
Claude Opus 4 $6.19
Claude Opus 4.1 $7.49
Gemini 2.5 Pro $10.06
Claude Sonnet 4.5 $12.74
GPT-5 Codex $2.56
Grok 4 $4.39
Gemini 2.5 Flash Image Preview (Nano Banana) $1.88
GPT-5 $7.30
Others $7.99
My usage of OpenRouter was limited to casual throwaway experiments with coding agents and quick, surface-level exploration of new and unfamiliar models in the chat interface.
My comment is just an expression of a festering grudge over the unannounced, unexplained sanction on my account and the lack of transparency and feedback from the non-existent support team. There's no OpenRouter shaped hole in my personal workflow, fortunately.
I’m a user and I like the routing layer and not having to change things up too much, but I’m not sure why a solid business model for this product would require this much money at this kind of valuation unless they’re trying to buy data center capacity to self-host models eventually?
> While the startup didn’t disclose its new valuation, The New York Times reports that it landed at about $1.3 billion post-money.
Enterprises appear to be paying the API rates which are 10x (1000%) what are available to individuals, so I would not be confident they are sensitive to a 5% price change.
That said, the attraction of OpenRouter to enterprise customers should be that they save you >5% on average for a product <5% worse.
Also one scary issue I had with OpenRouter in the early days, I think I saw somebody else's context and there were weird Chinese characters, haven't touched it since.
I'm sure they're experiencing growing pains, but a larger model selection (and faster releases for open weights models), would keep us from using other providers. For example, it took much longer than it should have to get Qwen 3.6 ~30B class models released (almost 2 weeks if I recall)
https://openrouter.ai/docs/guides/features/broadcast
Subscribing to a vendor locks you in to sudden price swings that the big 3 are happy to do. The market needs lubrication for competition and provider routers offer that.
Are tech companies FOMOing so hard that they're now all running AI venture arms themselves instead of you know, developing their own products? Except for NVIDIA who needs to keep pumping the bubble I didn't expect the others.
Well, at least for them, investing into AI is actually developing their own product. The push to replace "Actually Indians" [1] with LLMs is huge because large Western companies want to save even the pittances they're paying Indian body shops.
[1] for those OOTL: https://www.reddit.com/r/ProgrammerHumor/comments/1l3rpow/ac...
The 5% fee is a rounding error for your average small time user, and it makes testing new models as simple as changing one string.
Can spin up separate models for separate budgets too.
It's a really simple product that just works
So you think for example you're using Kimi k2.6....but behind the scenes, it's the 4b or 8b quantized versions.
So for open source models, I've started using the providers own service. In the case of Kimi, I don't trust any provider other than moonshot not to quant the model. So far this seems to be getting better results.
If I see a provider not specifiying if they quant or not, I assume that they do.
I'll still use OpenRouter to try new models out, but not for any real work.
I think they should go in this direction: they should make their own Model Agnostic versions of whatever functionalities other AI companies are making. Examples
1. personal chat app
2. the chat app working with their own implementation of memory
3. coding harnesses that are model agnostic
When I think of OpenRouter, I should think of "model agnostic LLM tools".
There are zero guarantees beyond "trust me bro" that the inference provider isn't taking your prompts and selling them to promptbase or one of a dozen other similar services.
Venice claims no logs, which may or may not be true, but what happens to your prompt after they proxy it to the service running the GPUs?
From their website:
"The GPUs that process your inference requests come from multiple decentralized providers, and while each specific provider can see the text of one specific conversation, it never sees your entire history, nor knows your identity."
Which is an absurd claim if your prompt has your company name in it.
It doesn't matter if it is encrypted in transport, once it hits the company running the GPUs, it is open season for them.
Boggles my mind that people are ok with this.