Purpose of this session
It was TADSummit 2022 in Aveiro where Andreas Granig presented on Programmable Testing for Programmable Telcos; Sipfront won a couple of customers from that TADSummit. Then in January 2026 they announced Sipfront had raised a €1.8M seed round, well done! And last month they announced the addition of Filipe as an advisor; yeah, he’s one of the good guys!
This is a chance to catch up with Andreas, understand the progress made since we last talked, and share perspectives on Voice AI.
Sipfront’s Evolution
Andreas highlights Sipfront’s evolution in reaction to market changes: from API-led SIP transport to voice AI, a much more complex environment, yet the core of SIP monitoring and testing remains.
The funding will help Sipfront address the enterprise market. Initially they focused on telcos, because of the team’s history. However, they found they were also addressing enterprise concerns. And given their customer focus, I think this is a natural fit, building on a segment they know well and can differentiate in.
Andreas raises an important point on the inflexibility of telcos around voice. They’ve been doing it for decades, with a strategic vendor that IMO keeps them locked in, so they are set in their ways. Showing ways to improve monitoring and testing through automation simply does not fly.
Enterprises, by contrast, are more customer driven: anything that delivers a better experience or improves how the customer perceives their service has value. Scale is also an important criterion; for a carrier one failure gets lost in the noise, while for an enterprise that failure matters far more, by orders of magnitude.
Why Enterprises have become an important segment
It’s also a busy time in enterprise communications, with migrations from traditional solutions to cloud-based offers, the second bankruptcy and restructuring of Avaya, etc. Sipfront helps enterprises through their migrations: checking contactability, proving quality of service for home office or nearshore workers, and let’s not forget the rise of WebRTC amongst all of this. And finally, VoiceAI sits on top of this complex and evolving stack.
The VoiceAI market is bifurcated across traditional voice providers adding AI to their offers, and AI companies adding voice to their traditionally messaging-focused offers. Sipfront serves both. Only last year we witnessed OpenAI Realtime; that’s when VoiceAI really began to accelerate. We covered that shift in a session, almost one year ago, with Rob Pickering and Lyle Pratt. I’m impressed with how on the money they were. Their ‘super-power’ is that they build with VoiceAI for their customers and themselves, so they bring hard-won experience, compared to many.
The demo phase of VoiceAI was relatively short-lived. The focus moved onto ROI, where a talking FAQ (Frequently Asked Questions) does not deliver; the bot needs to perform a task / solve a problem. Andreas highlights the “happy path”: building a voiceAI bot that works for a demo, but does not work reliably across all the edge cases / potential frauds. Like someone sneezing in the background, or asking for a pasta recipe when it’s not appropriate. We’ve seen on LinkedIn stories of bots racking up massive infrastructure bills, and being subjected to fraud, for example, calling premium rate lines. I remember BT Labs’ initial experiments with telecom APIs over twenty years ago, and all the fraud that happened, which was a known risk for the old telecom folks who’ve seen what crooks will do to telecom networks.
We’re at an architectural crossroads: the traditional approaches bring it back to text, while the more modern work on voice natively. “Traditionally” it’s STT to LLM to TTS, with the benefit of being modular, but delay is an issue, and emotion can be missed. The latter approach comes from OpenAI Realtime or Gemini Live, with lower latency, but it’s currently an expensive black box, with guardrails not built for your specific situation.
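A minimal sketch of the cascaded approach, assuming hypothetical stage functions (`transcribe`, `generate_reply`, and `synthesize` are placeholders, not any real API), to show where both the modularity benefit and the serial latency cost come from:

```python
import time

# Placeholder stages of a cascaded STT -> LLM -> TTS pipeline.
# Real deployments would call an STT engine, an LLM endpoint,
# and a TTS engine here; each stage is independently swappable.
def transcribe(audio: bytes) -> str:
    return "caller said something"

def generate_reply(text: str) -> str:
    return "bot reply to: " + text

def synthesize(text: str) -> bytes:
    return text.encode("utf-8")

def handle_turn(audio: bytes) -> bytes:
    """One conversational turn: the modular benefit is that stages
    swap independently, but their latencies add up serially."""
    t0 = time.monotonic()
    text = transcribe(audio)      # stage 1: speech-to-text
    reply = generate_reply(text)  # stage 2: language model
    voice = synthesize(reply)     # stage 3: text-to-speech
    elapsed_ms = (time.monotonic() - t0) * 1000
    print(f"turn handled in {elapsed_ms:.1f} ms")
    return voice

handle_turn(b"\x00" * 160)
```

The speech-to-speech alternative collapses the three stages into one model, which is where the latency win and the black-box trade-off both come from.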
We’ve seen hype around VoiceAI bots, and Andreas shares how people approach them when it’s a little late, once decisions have been made. He references the hype about a new bot that responds in 200 ms, while in practice it’s between 800 and 1200 ms, perhaps as high as 1500 ms. This is why trusted advisors who have practical experience are essential. The CEO of OpenAI has a history of not being that truthful with his board; we live in a time when BS is acceptable, just look at the history of Musk’s claims.
NLP to Speech to Speech
Andreas highlights an interesting situation where voice bots filter the unemotional calls away from the real call center agents, so the people are working on escalation issues all day with higher emotion. I’ve heard tales of real agent calls being monitored to limit the emotion agents are subjected to, to avoid burnout: mixing in more of the regular calls, and limiting escalation calls.
Generally it’s NLP (Natural Language Processing) that is widely deployed today, not LLMs. This is much more deterministic, think voice-controlled IVR. Modern solutions mix and match NLP and LLM. NLP is used where determinism is key, e.g. announcing the call will be recorded must happen on all calls. I like the model Andreas shows of where VoiceAI breaks, see below.
It all begins with SIP performance in the real world: what happens when calls are put on hold or transferred? The impact of jitter, packet loss, or round trip time on the voiceAI bot. Audio quality is the standard MOS, Mean Opinion Score, which can now be done using AI, rather than people.
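As a rough illustration of how those network impairments map onto MOS, here is a simplified E-model-style sketch (the standard R-to-MOS mapping from ITU-T G.107, but the impairment coefficients and jitter-buffer penalty below are illustrative assumptions, not the full G.107 computation):

```python
def r_factor(latency_ms: float, jitter_ms: float, packet_loss_pct: float) -> float:
    """Very simplified E-model: start from the default R of 93.2 and
    subtract delay and loss impairments. Coefficients are illustrative."""
    r = 93.2
    # Delay impairment: one-way delay above ~160 ms starts to hurt.
    effective_delay = latency_ms + 2 * jitter_ms  # crude jitter-buffer penalty (assumption)
    if effective_delay > 160:
        r -= 0.1 * (effective_delay - 160)
    # Loss impairment: roughly 2.5 R points per percent loss (illustrative).
    r -= 2.5 * packet_loss_pct
    return max(0.0, min(100.0, r))

def mos_from_r(r: float) -> float:
    """Standard E-model mapping from R-factor to MOS."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + 7e-6 * r * (r - 60) * (100 - r)

# A clean call vs. a lossy, high-latency call.
clean = mos_from_r(r_factor(latency_ms=40, jitter_ms=5, packet_loss_pct=0.0))
bad = mos_from_r(r_factor(latency_ms=300, jitter_ms=30, packet_loss_pct=3.0))
```

The point is that the same call can score very differently once hold, transfer, or a congested path adds delay and loss, which is why MOS has to be measured on real calls rather than assumed.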
Audio turn performance is how quickly the bot responds; previously Andreas made reference to a 200 ms claim versus an 800-1500 ms actual. This also includes barge-in, and the realities of how people behave in not clearly answering the question, rather rambling, which fills up the LLM context window.
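A sketch of how turn latency could be measured from call events, assuming we log timestamps for when the caller stops speaking and when bot audio starts (the event names here are made up for illustration):

```python
def turn_latencies_ms(events):
    """Pair each 'user_speech_end' with the next 'bot_audio_start'
    and return the gaps in milliseconds. Bot audio arriving before
    the caller finished would instead indicate barge-in."""
    latencies = []
    pending_end = None
    for name, ts_ms in events:
        if name == "user_speech_end":
            pending_end = ts_ms
        elif name == "bot_audio_start" and pending_end is not None:
            latencies.append(ts_ms - pending_end)
            pending_end = None
    return latencies

# Two turns: one snappy, one well above a 200 ms marketing claim.
events = [
    ("user_speech_end", 1000), ("bot_audio_start", 1850),
    ("user_speech_end", 5000), ("bot_audio_start", 6400),
]
print(turn_latencies_ms(events))  # [850, 1400]
```

Measuring this across many real calls, rather than one demo call, is what separates the claimed figure from the practical one.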
Audio stability includes the impact of sneezing, or shouting in Italian on a call in another language. Audio transcription is not word error rate, but rather using different regional accents and voices (old, young, male, female) and checking how the bot performs. Content correctness covers hallucinations. Cost protections cover time wasters and fraud; as a voice bot can be charged at 10-50c per minute, costs can rack up quickly.
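A sketch of one form a cost protection could take: a per-call budget guard that cuts the call once accumulated cost hits a hard cap (the rate and budget numbers are illustrative, not anyone's actual pricing):

```python
def should_terminate(call_seconds: float, rate_cents_per_min: float,
                     budget_cents: float) -> bool:
    """Cut a call off once its accumulated cost reaches the budget,
    a crude defence against time wasters and premium-rate fraud."""
    cost_cents = (call_seconds / 60) * rate_cents_per_min
    return cost_cents >= budget_cents

# At 50c/min, a 100-cent budget allows two minutes of talk time.
should_terminate(90, rate_cents_per_min=50, budget_cents=100)   # False
should_terminate(121, rate_cents_per_min=50, budget_cents=100)  # True
```

Real protections would also rate-limit repeat callers and block outbound dialing to premium-rate ranges, but the per-call cap is the simplest backstop.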
CCW Berlin Survey Results
Andreas then shared the results of a survey performed at Contact Center World (CCW) Berlin. The ROI for VoiceAI is not in a talking FAQ; rather the voiceAI bot must do something. Here the role of MCP (Model Context Protocol) in tool calling is key. We have VCONIC TADHack this weekend exploring exactly this topic. Andreas shares what he does when he realizes he’s talking with a bot: he asks for a pasta recipe. But in practice he finds the bot is limited in what it is allowed to do. Firewalling the bot sounds like an opportunity. In the survey, less than half of the people attending CCW Berlin have a bot in production.
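To make “the bot must do something” concrete, tool calling exposes a narrow, allow-listed set of actions to the bot. A minimal sketch of allow-listed tool dispatch (the tool name and handler are hypothetical, and this is a conceptual illustration, not the actual MCP wire protocol):

```python
# Allow-list of tools the bot may invoke; anything else, pasta
# recipes included, is refused. Tool names/handlers are hypothetical.
def check_order_status(order_id: str) -> str:
    return f"order {order_id}: shipped"

TOOLS = {"check_order_status": check_order_status}

def dispatch(tool_name: str, **kwargs) -> str:
    """Route a model-requested tool call through the allow-list."""
    if tool_name not in TOOLS:
        return "refused: tool not permitted"
    return TOOLS[tool_name](**kwargs)

dispatch("check_order_status", order_id="42")   # "order 42: shipped"
dispatch("get_pasta_recipe")                    # "refused: tool not permitted"
```

This is also where the “firewalling the bot” opportunity sits: the dispatch layer is a natural enforcement point for what the bot is allowed to do.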
Survey performed by Sipfront at CCW Berlin in February 2026: Are you running a VoiceAI bot?
Survey performed by Sipfront at CCW Berlin in February 2026: Why are you not running a bot?
Technical issues cover the uncertainty in what the bot will do in production; there have been many horror stories of chatbots on the loose. Given this survey was performed in Germany, data protection concern is high, but not as high as I’d expect. That may be related to the type of people who attend CCW Berlin. Customer voices are being sent to US-based cloud providers, and it only takes a 10-second recording of your voice to create a clone.
Interestingly, customer acceptance remains an issue. This I find is region dependent: in the US, people are more likely to be indifferent to whether it’s a bot, while in Germany that fact appears to matter.
Andreas shares what he’s been up to, which is customer-led. See the figure above on where VoiceAI breaks. Andreas highlights his work on assurance reports, comparing different voiceAI bots for customers across different scenarios. We then move to a broader discussion on Jambonz (open source) versus the many VoiceAI bot providers, and how Sipfront enables the different approaches to be compared.
Thank you Andreas for sharing your journey, and I’m proud to have helped you along the way.
Andreas and Filipe at CCW Berlin