While AI labs race to build more powerful models, a fundamental problem threatens progress: we’re no longer able to meaningfully measure what these systems can actually do. Traditional benchmarks have become saturated, creating a challenge for businesses that want to measure model capability, adapt as models change, and communicate results to stakeholders.
A team from the University of Cambridge has launched Trismik, which applies psychometric methods developed for measuring human intelligence to AI evaluation. Its platform uses Item Response Theory and Computerised Adaptive Testing to adjust question difficulty in real time, mapping model capabilities precisely.
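To make the idea concrete, here is a minimal, illustrative sketch of how Computerised Adaptive Testing works with a two-parameter IRT model: each round selects the unseen question that is most informative at the current ability estimate, queries the model under test, and re-estimates ability. The item bank, parameters, and grid-based estimator below are assumptions for illustration, not Trismik’s implementation.

```python
# Illustrative sketch of Computerised Adaptive Testing with a 2PL IRT model.
# All item parameters and the grid-based ability update are hypothetical.
import numpy as np

def p_correct(theta, a, b):
    """2PL probability that a model of ability theta answers an item correctly."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def fisher_information(theta, a, b):
    """Item information at ability theta; higher means a more diagnostic item."""
    p = p_correct(theta, a, b)
    return a**2 * p * (1.0 - p)

def run_adaptive_test(item_bank, answer_fn, n_items=20):
    """Each round: pick the most informative unseen item, record the response,
    then re-estimate ability by maximising the likelihood over a theta grid."""
    theta_grid = np.linspace(-4, 4, 161)
    log_lik = np.zeros_like(theta_grid)
    theta_hat = 0.0
    administered = set()

    for _ in range(n_items):
        # Select the unseen item with maximum information at the current estimate.
        candidates = [i for i in range(len(item_bank)) if i not in administered]
        idx = max(candidates,
                  key=lambda i: fisher_information(theta_hat, *item_bank[i]))
        administered.add(idx)

        a, b = item_bank[idx]
        correct = answer_fn(idx)  # query the model under test on this item

        # Update the likelihood over the grid and take the maximising theta.
        p = p_correct(theta_grid, a, b)
        log_lik += np.log(p if correct else 1.0 - p)
        theta_hat = theta_grid[np.argmax(log_lik)]

    return theta_hat

# Example: a synthetic item bank (discrimination a, difficulty b) and a
# simulated model whose true ability is 1.0.
rng = np.random.default_rng(0)
bank = [(rng.uniform(0.8, 2.0), rng.uniform(-3, 3)) for _ in range(200)]
estimate = run_adaptive_test(
    bank, lambda i: rng.random() < p_correct(1.0, *bank[i]))
print(f"estimated ability: {estimate:.2f}")
```

Because each question is chosen to be maximally informative about the current estimate, the test converges on a stable ability score with far fewer items than a fixed benchmark, which is the basis for the efficiency claims below.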
The company has secured £2.2m in pre-seed financing led by Twin Path Ventures, with participation from Cambridge Enterprise Ventures, Parkwalk Advisors, Fund F, Vento Ventures and angel investors from Ventures Together.
Founded by Cambridge NLP researcher Nigel Collier, enterprise executive Rebekka Mikkola, and former Amazon scientist Marco Basaldella, the company says its approach could reduce evaluation costs by up to 95% while providing more granular insights: its adaptive tests deliver near-identical rankings to full evaluations while requiring just 8.5% of the questions.
Trismik will now launch its product for AI builders, initially providing classical and adaptive evaluation across datasets covering factuality, alignment, safety, reasoning and domain-specific knowledge. Early access will be available towards the end of 2025 through the company’s website.