Project

Tessl - Supporting the Future of AI-Native Software Development

Project Overview

We partnered with Tessl, a pioneering AI-native development platform founded by Guy Podjarny, to support their early-stage infrastructure, evaluation tooling, and digital presence. Tessl is on a mission to reimagine software creation by moving from a traditional code-centric approach to a "spec-centric" model in which developers define what they want to build and AI handles the implementation. We worked alongside their internal teams to validate their core product, build essential benchmarking tools, and launch their public-facing website.

The Challenge

Validating a Novel Paradigm: Building a spec-driven assistant from the ground up requires rigorous, real-world testing. Tessl needed experienced engineers to "dog-food" the early product to identify edge cases and UX friction.
Evaluating Agent Reliability: Shifting from writing code to writing specs requires robust, quantitative testing. Tessl needed a systematic way to evaluate how well different AI agents interpreted specs and generated functional code.
Establishing a Digital Presence: Emerging from stealth and announcing a major funding round required a highly polished, performant website to communicate their vision to developers and investors.
Researching Agent Behavior: To build an effective platform, the team needed deep empirical research into how coding agents behave, succeed, and fail within large codebases.

Our Approach

Embedded "Dog-Fooding": We acted as early adopters and technical partners, actively using Tessl’s early spec-driven assistant to build out projects. This allowed us to provide direct, actionable feedback on the developer experience.
Building the Evaluation Foundation: We collaborated with Tessl’s engineering team to design and build custom evaluation infrastructure so they could accurately measure agent success rates as models evolved.
Data-Driven Research: We assisted Tessl’s internal team with targeted core research initiatives, analyzing AI agent patterns to inform the platform's development direction.
Web Development: We took ownership of building Tessl’s corporate website, ensuring it was fast, responsive, and ready to handle the traffic of a high-profile startup launch.

The Solution

Specbench & Eval Tools: We helped build "Specbench" and a suite of custom evaluation tools. This infrastructure allows Tessl to benchmark AI agents against structured tasks, detect regressions, and check that code generation quality stays consistent.
Product Refinement: Through our intensive dog-fooding phase, we provided the Tessl team with critical early validation, helping them refine the usability and accuracy of their spec-driven assistant before wider release.
Corporate Website: We built and launched tessl.io, a modern, high-performing website that supported their public announcement, waitlist sign-ups, and ongoing content strategy.
Research Contributions: We delivered targeted data analysis and research support, contributing to Tessl’s foundational understanding of AI-native development patterns.

The Results

Successful Launch Support: The website and foundational tooling we helped build gave Tessl a polished, technically sound public presence when they emerged from stealth.
Quantifiable Agent Performance: Specbench and the evaluation tools provided Tessl with the critical infrastructure needed to measure, test, and optimize their AI agents programmatically.
Valuable Early Validation: Our dog-fooding and research efforts provided the Tessl engineering team with the empirical data and user feedback necessary to iterate quickly and build a more reliable developer tool.

Technologies & Tools Used

TypeScript & Python for developing Specbench and custom evaluation tooling
Next.js / React for building a fast, scalable, and modern corporate website
LLMs & AI Agents for rigorous dog-fooding, benchmarking, and core research
GitHub Actions & CI/CD for integrating evaluation tools into development workflows
AWS for cloud services and deployment