What if “Attention Is All You Need”… isn’t entirely true?
At Ankpal, we spend a lot of time thinking about systems – how they behave, how they scale, and more importantly, how they can be improved at a fundamental level.
While most of the world is building on top of AI – agents, wrappers, workflows – we believe it’s equally important to question the foundations themselves.
Because sometimes, the biggest breakthroughs don’t come from optimization…
they come from rethinking the assumptions.
📄 A New Direction in Sequence Modeling
We’re proud to share a recent research paper by our CTO, Gowrav Vishwakarma:
👉 https://arxiv.org/abs/2604.05030?context=cs.AI
This work introduces Phase-Associative Memory (PAM) – a sequence modeling approach that explores an alternative to the dominant transformer architecture.
Co-authored with Christopher J. Agostino, the work brings together perspectives from machine learning, physics, and quantum-inspired semantics.
🧠 Rethinking How We Model Language
Modern AI systems largely rely on a core assumption:
Language can be modeled as patterns in a real-valued vector space, optimized through attention and next-token prediction.
PAM explores a different possibility:
👉 What if meaning is not just about predicting the next token…
👉 but about interpreting a state in a richer mathematical space?
Instead of operating purely in real-valued representations, PAM works in a complex-valued Hilbert space, where:
- representations carry phase, not just magnitude
- memory is built through associative accumulation (outer products)
- retrieval depends on phase alignment, rather than softmax attention
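To make the three ideas above concrete, here is a minimal NumPy sketch of a generic complex-valued associative memory. It is a toy illustration, not the paper's implementation: the dimension, the number of stored pairs, the random unit-modulus vectors, and the names `random_phase_vector` and `retrieve` are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 256        # hypothetical state dimension
n_pairs = 8    # hypothetical number of stored associations


def random_phase_vector(rng, d):
    """Unit-norm complex vector whose entries differ only in phase."""
    phases = rng.uniform(0.0, 2.0 * np.pi, size=d)
    return np.exp(1j * phases) / np.sqrt(d)


keys = [random_phase_vector(rng, d) for _ in range(n_pairs)]
values = [random_phase_vector(rng, d) for _ in range(n_pairs)]

# Associative accumulation: M = sum_i v_i k_i^H (a sum of outer products).
M = np.zeros((d, d), dtype=np.complex128)
for k, v in zip(keys, values):
    M += np.outer(v, np.conj(k))


def retrieve(M, query):
    """Linear readout: each stored value is weighted by <key, query>."""
    return M @ query


# Querying with a stored key returns its value plus small crosstalk.
out = retrieve(M, keys[3])
similarities = [abs(np.vdot(v, out)) for v in values]
best_match = int(np.argmax(similarities))

# Rotating the query by a global phase rotates the readout by the same
# phase, so a score based on the real part depends on phase alignment.
aligned_score = np.real(np.vdot(values[3], out))
flipped = retrieve(M, np.exp(1j * np.pi) * keys[3])
flipped_score = np.real(np.vdot(values[3], flipped))
```

In this toy setup, `best_match` picks out the stored pair that was queried, while `aligned_score` and `flipped_score` have opposite signs: the same key, rotated by a phase of π, reads out the negated value. No softmax is involved; the weighting comes entirely from complex inner products.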
⚙️ Early Results (With Honest Context)
At ~100M parameters on WikiText-103:
- PAM reaches a validation perplexity of ~30
- A matched transformer reaches ~27
So no – this is not outperforming transformers.
But here’s what makes it interesting:
- It operates with ~4× computational overhead
- Yet stays within roughly 10% of the transformer’s perplexity (~30 vs. ~27)
- And does so without heavy optimization or custom kernels
This suggests something deeper:
The underlying representation may be capturing structure efficiently – even in an early, unoptimized state.
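For readers who want to check the size of the gap, the short calculation below works it out from the two rounded perplexities quoted above. It assumes the usual definition of perplexity as the exponential of the mean per-token cross-entropy in nats, and it treats ~30 and ~27 as exact values purely for illustration.

```python
import math

# Rounded validation perplexities quoted in the post (treated as exact here).
ppl_pam = 30.0
ppl_transformer = 27.0

# Relative gap in perplexity.
relative_gap = (ppl_pam - ppl_transformer) / ppl_transformer

# Assuming perplexity = exp(mean cross-entropy), the same gap in loss terms.
loss_pam = math.log(ppl_pam)
loss_transformer = math.log(ppl_transformer)
loss_gap_nats = loss_pam - loss_transformer

print(f"relative perplexity gap: {relative_gap:.1%}")    # 11.1%
print(f"cross-entropy gap: {loss_gap_nats:.3f} nats")    # 0.105 nats
```

On these rounded figures the gap is about 11% in perplexity, or roughly 0.1 nats per token in cross-entropy.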
🔍 Why This Matters
The industry today is heavily focused on scaling:
- bigger models
- more data
- more compute
But there’s a growing question:
Are we scaling the right abstraction?
If language and meaning are fundamentally contextual and non-separable, then representing them purely in real-valued space might be an approximation – not a complete solution.
PAM doesn’t claim to replace transformers.
But it opens a direction:
👉 There may be other mathematical frameworks better suited for modeling language.
🚀 What This Says About Ankpal
This research is independent of Ankpal’s core product – but it reflects something fundamental about how we think as a team.
At Ankpal, we don’t just focus on building solutions.
We value:
- first-principles thinking
- questioning established norms
- and exploring ideas that may not pay off immediately but could have long-term impact
Because innovation doesn’t always come from doing things faster –
sometimes it comes from asking:
“Are we even solving this the right way?”
🌱 Looking Ahead
This is early work.
It’s not optimized.
It’s not scaled.
It’s not production-ready.
But it points toward something worth exploring.
And if there’s even a small chance that language is better modeled beyond real-valued space…
Then there’s a lot more left to discover.
💬 We’d Love to Hear Your Thoughts
If you’re working on:
- alternative architectures
- interpretability
- or new mathematical approaches to AI
We’d love to connect and discuss your work.
– Gowrav Vishwakarma,
CTO,
Ankpal https://ankpal.com
https://www.linkedin.com/in/gowravvishwakarma/