The first subquadratic LLM is here: 12M-token context, 52x faster attention, and roughly 1/5th the cost of frontier models
Subquadratic came out of stealth with $29M in seed funding and launched SubQ, the first commercial LLM built on a fully subquadratic sparse attention architecture. Its SSA engine scales linearly with context length rather than quadratically, cutting attention compute by roughly 1,000x at 12M tokens. The production API ships with a 1M-token window today; the 12M-token window is available in the research preview. Early benchmarks show coding performance competitive with Opus at around 1/20th the cost.
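
For a rough sense of what linear versus quadratic scaling means at these context lengths, here is a back-of-the-envelope sketch. It assumes dense attention scores every token against every other token (about n^2 pairs) and that a sparse scheme gives each token a fixed budget of k keys (about n*k pairs). The function names and the budget k are hypothetical and say nothing about how the SSA engine actually works; k = 12,000 is picked only because it reproduces a roughly 1,000x gap at 12M tokens.

```python
def dense_attention_pairs(n_tokens: int) -> int:
    """Query-key pairs scored by dense attention: every token against every token."""
    return n_tokens * n_tokens


def sparse_attention_pairs(n_tokens: int, keys_per_token: int) -> int:
    """Query-key pairs when each token attends to a fixed budget of keys.

    The budget is capped at n_tokens, since a token cannot attend to more
    keys than exist in the sequence.
    """
    return n_tokens * min(keys_per_token, n_tokens)


def reduction_factor(n_tokens: int, keys_per_token: int) -> float:
    """How many times fewer pairs the sparse scheme scores than dense attention."""
    return dense_attention_pairs(n_tokens) / sparse_attention_pairs(
        n_tokens, keys_per_token
    )


# Assumed per-token key budget, chosen for illustration only (not a SubQ figure).
ASSUMED_KEYS_PER_TOKEN = 12_000

for n in (128_000, 1_000_000, 12_000_000):
    factor = reduction_factor(n, ASSUMED_KEYS_PER_TOKEN)
    print(f"{n:>12,} tokens: about {factor:,.0f}x fewer attention pairs")
```

Under this assumption the gap grows in proportion to context length, so the saving at the 1M-token production window is much smaller than at 12M tokens. This counts attention pairs only; end-to-end speed and price also depend on the rest of the model and on hardware.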




