While AI labs race to build more powerful models, a fundamental problem threatens progress: we’re no longer able to meaningfully measure what these systems can actually do. Traditional benchmarks have become saturated, creating a challenge for businesses that want to measure model capability, adapt as models change, and communicate results to stakeholders.
A team from the University of Cambridge has launched Trismik, which applies psychometric methods developed for measuring human intelligence to AI evaluation. Its platform uses Item Response Theory and Computerised Adaptive Testing to adjust question difficulty in real time, mapping model capabilities precisely.
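To make the idea concrete, here is a minimal, illustrative sketch of how Computerised Adaptive Testing works with a two-parameter IRT model: each round selects the unseen question that is most informative at the current ability estimate, queries the model under test, and re-estimates ability. The item bank, parameters, and grid-based estimator below are assumptions for illustration, not Trismik’s implementation.

```python
# Illustrative sketch of Computerised Adaptive Testing with a 2PL IRT model.
# All item parameters and the grid-based ability update are hypothetical.
import numpy as np

def p_correct(theta, a, b):
    """2PL probability that a model of ability theta answers an item correctly."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def fisher_information(theta, a, b):
    """Item information at ability theta; higher means a more diagnostic item."""
    p = p_correct(theta, a, b)
    return a**2 * p * (1.0 - p)

def run_adaptive_test(item_bank, answer_fn, n_items=20):
    """Each round: pick the most informative unseen item, record the response,
    then re-estimate ability by maximising the likelihood over a theta grid."""
    theta_grid = np.linspace(-4, 4, 161)
    log_lik = np.zeros_like(theta_grid)
    theta_hat = 0.0
    administered = set()

    for _ in range(n_items):
        # Select the unseen item with maximum information at the current estimate.
        candidates = [i for i in range(len(item_bank)) if i not in administered]
        idx = max(candidates,
                  key=lambda i: fisher_information(theta_hat, *item_bank[i]))
        administered.add(idx)

        a, b = item_bank[idx]
        correct = answer_fn(idx)  # query the model under test on this item

        # Update the likelihood over the grid and take the maximising theta.
        p = p_correct(theta_grid, a, b)
        log_lik += np.log(p if correct else 1.0 - p)
        theta_hat = theta_grid[np.argmax(log_lik)]

    return theta_hat

# Example: a synthetic item bank (discrimination a, difficulty b) and a
# simulated model whose true ability is 1.0.
rng = np.random.default_rng(0)
bank = [(rng.uniform(0.8, 2.0), rng.uniform(-3, 3)) for _ in range(200)]
estimate = run_adaptive_test(
    bank, lambda i: rng.random() < p_correct(1.0, *bank[i]))
print(f"estimated ability: {estimate:.2f}")
```

Because each question is chosen to be maximally informative about the current estimate, the test converges on a stable ability score with far fewer items than a fixed benchmark, which is the basis for the efficiency claims below.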
The company has secured £2.2m in pre-seed financing led by Twin Path Ventures, with participation from Cambridge Enterprise Ventures, Parkwalk Advisors, Fund F, Vento Ventures and angel investors from Ventures Together.
Founded by Cambridge NLP researcher Nigel Collier, enterprise executive Rebekka Mikkola, and former Amazon scientist Marco Basaldella, the company says its approach could reduce evaluation costs by up to 95% while providing more granular insights: its adaptive tests deliver near-identical rankings to full evaluations while requiring just 8.5% of the questions.
Trismik will now launch its product for AI builders, initially providing classical and adaptive evaluation across datasets covering factuality, alignment, safety, reasoning and domain-specific knowledge. Early access will be available towards the end of 2025 through the company’s website.