Programmable communications / telecoms puts the enterprise in control, and we’ve seen rapid innovation over the past year. At TADSummit 2022 Karel pointed out that the transformer model has changed the game in LLMs, and he was proven correct through this year. BUT listening to the marketing hype is dangerous. The no-BS policy helps you avoid mistakes: tuning is critical, training data can just ‘pop out’ of the model, privacy and security are critical issues, and so is latency.
2FA and RCS are also changing, and passkeys are rapidly being adopted. TADSummit presents the frank reality of where these technologies are going, and what it means to enterprises and telcos.
- Smartnumbers perspective: Tackling Contact Centre fraud, Abhinav Anand, Chief Product Officer at Smartnumbers. Video, Slides
- RCS is here, so why is it not taking off? Stuart Mitchell, Global Product & Business Development, Rich Messaging, Sinch. Video, Slides
- 2FA is (almost) dead, what’s next? Guillaume Bourcy, Founder Oofty. Video, Slides
- Passkeys, FIDO2, WebAuthn… What does it all mean? Dan Jenkins, Founder at Nimble Ape Ltd, Director at CommCon Events Ltd, Director at Everycast Labs Ltd. Video, Slides
- LLMs on the telephone: useful tool, or hallucinating danger to humanity? Rob Pickering, Software: Real Time Communications and Machine Learning. Video, Slides
- Artificial Intelligence in Telephony Systems and Solutions. Enhancing Communication Through AI. Borja Sixto, cofounder (Associé gérant) Ulex Innovative Systems & Karel Bourgois, Founder Voxist and President Le Voice Lab. Video, Slides
- From Communications to Conversations – What’s Changed And How It Might Matter, Paul Sweeney, Chief Strategy Officer & Co-Founder, Webio. Video, Slides
- How a legacy 80s technology is delivering ChatGPT to developing countries – AI over USSD. Ken Herron, Chief Growth Officer, UIB & Celeo Arias, CEO & Founder PAiC. Video, Slides
- PANEL SESSION: AI and Video applications, Video
Smartnumbers perspective: Tackling Contact Centre fraud, Abhinav Anand
Contact center fraud costs the UK economy 7B GBP according to the UK Government. Cyber crime is estimated at $7T worldwide in 2022, and identity fraud at $50B worldwide in 2022. Globally it’s a massive problem. A consortium-based approach is essential as the bad actors are often operating abroad and targeting multiple countries. The consortium approach Abhinav is leading can easily expand around the world.
52% of identified fraudsters have attacked multiple contact centres, with approximately 26 calls before fraud is executed. Also, 59% of fraudulent calls are from anonymous numbers, and 3% of fraudulent calls are from spoofed numbers. Using a consortium and network data can significantly help in flagging suspect behavior.
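As a rough illustration of why pooled data helps, here is a minimal Python sketch of consortium-style flagging. It is not Smartnumbers’ method: the `Call` record, the member names, and the threshold of two contact centres are all assumptions made for the example, and the phone numbers are fictional.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Call:
    caller: Optional[str]  # presented number; None when withheld (anonymous)
    contact_centre: str    # hypothetical consortium member identifier


def flag_suspect_callers(calls, min_centres=2):
    """Return caller numbers seen at `min_centres` or more contact centres."""
    centres_by_caller = defaultdict(set)
    for call in calls:
        if call.caller is not None:
            centres_by_caller[call.caller].add(call.contact_centre)
    return {
        caller
        for caller, centres in centres_by_caller.items()
        if len(centres) >= min_centres
    }


def anonymous_share(calls):
    """Fraction of calls with no presented number (0.0 for an empty list)."""
    if not calls:
        return 0.0
    return sum(1 for call in calls if call.caller is None) / len(calls)


# Fictional call records pooled from two hypothetical members.
calls = [
    Call("+441632960001", "bank_a"),
    Call("+441632960001", "bank_b"),
    Call("+441632960002", "bank_a"),
    Call(None, "bank_b"),
]
suspects = flag_suspect_callers(calls)
print(suspects, anonymous_share(calls))
```

No single member would see the first number as unusual; it only stands out once the records are combined, which is the point of the consortium.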
The graph below shows the impact of fixed CLI blocking on spoofing. The impact was sustained; it’s beyond the usual whack-a-mole game. One of the data points Abhinav highlighted is that using AI can make catching fraud six times more likely. The focus is now on stronger detection, stronger collaboration, and more holistic investigations.
I recommend you download the Smartnumbers Contact Centre Security Report 2023. It’s a great, quantified resource on contact center fraud and methods to tackle it.
RCS is here, so why is it not taking off? Stuart Mitchell
Stuart provides a good review of RCS, well Google Messages, where Android users are now opted in by default. In my Google Messages I only have one regular RCS contact, my son. It’s still early days, and as soon as the conversation becomes a group chat, especially here in the US, the default is MMS. That’s one of the reasons MMS is growing so rapidly in the US: group SMS. It’s rather perverse, but telco standards have led to this as they are set in stone without evolving based on customer behavior.
RCS A2P uses the Google API specification. It includes custom enterprise branding; operator or Google verification, though in my experience Google is a much simpler experience; and Google spam protection, which generally works fine.
Sinch claim RCS campaigns drive 10% more sales compared to SMS click-through campaigns. On RCS costs:
- Basic text notifications – 1x A2P SMS cost
- Single call to action – 1.4x SMS cost
- Conversation chatbot service – 2x A2P SMS cost (that includes the conversation, not per message)
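To make the multipliers above concrete, here is a small Python sketch of the arithmetic. The SMS price of 0.01 per message and the five-message exchange are hypothetical figures chosen for illustration; neither comes from Stuart’s talk.

```python
# Multipliers relative to the price of one A2P SMS, taken from the list above.
RCS_MULTIPLIERS = {
    "basic_text": 1.0,             # per message
    "single_call_to_action": 1.4,  # per message
    "conversation": 2.0,           # per conversation, not per message
}


def rcs_cost(kind: str, sms_price: float, units: int = 1) -> float:
    """Cost of `units` RCS messages (or conversations) of the given kind."""
    return RCS_MULTIPLIERS[kind] * sms_price * units


def sms_cost(sms_price: float, messages: int) -> float:
    """Cost of the same exchange carried as individual A2P SMS messages."""
    return sms_price * messages


SMS_PRICE = 0.01  # hypothetical price per A2P SMS, for illustration only
MESSAGES = 5      # hypothetical number of messages in a chatbot exchange

conversation_over_rcs = rcs_cost("conversation", SMS_PRICE)
conversation_over_sms = sms_cost(SMS_PRICE, MESSAGES)
print(f"RCS conversation:       {conversation_over_rcs:.3f}")
print(f"Same exchange over SMS: {conversation_over_sms:.3f}")
```

Because the conversational rate is charged once per conversation, it breaks even with SMS at two messages and is cheaper for any longer exchange, whatever the underlying SMS price.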
Google offer all the cloud infrastructure (across RCS and SMS) and provide billing reports. It’s a deal with the devil, but the savings are immediate, and it’s a similar motivation for the aggregators.
Like WhatsApp, Google Messages is just another channel for the aggregator with broad direct carrier interconnect. In the limit Google can squeeze out the aggregators, keep telcos happy by removing legacy infrastructure costs, and pay them to keep quiet. Meanwhile Google harvests all the intelligence from SMS to ensure the advertiser’s message is delivered over one of its many channels, e.g. YouTube. Text a friend about dinner plans and find your banner ads are for a local taco place.
How Apple plays in this emerging ecosystem is open to speculation. However, the advertising cash here is immense. Initially Apple can focus on customer privacy, and let Google’s messaging plans play out. And if the business model and customer acceptability work, then introduce messaging-informed advertising across its properties.
Will Apple’s iMessage adopt RCS? What’s the point? There are loads of messaging apps that work across Android and iOS, and customers can use one of them. Apple people seem to enjoy having an exclusive club that segregates Android users to green bubbles. Google and Apple are in no rush here; they’ll move when it makes sense for their customers. The SMS ecosystem does appear to be in a ‘boiling frog’ situation where they are in the pot, but the flame has not yet been turned on.
2FA is (almost) dead, what’s next? Guillaume Bourcy
The resilience of SMS for OTP (One Time Passcode) has surprised many people in the industry. In 2017, the National Institute of Standards and Technology of the US Department of Commerce said SMS for 2FA was a deprecated solution. However, ease, ubiquity and habit kept SMS the dominant solution, at about 40% of the global authentication market. Things are now changing.
Passkeys are on a rapid rise, and will be covered in Dan Jenkins’ presentation. The mess of A2P SMS in the US with steep price rises, and international fraud on 2FA SMS, have driven large web brands initially back to email, and now to passkeys.
The phone number has always had the benefit of existing across the online and real world. But the options for identity verification across sign-up and sign-in are expanding, with progressive trust approaches. Some examples of alternatives to the phone number include:
- Scan of an official document
- Social sign on
- Progressive profiling
- Decentralized identity
- Biometrics, facial or fingerprint
- Single sign on
- Header enrichment (silent authentication)
- FIDO2 / Passkeys
- Complementary contextual data – IP addresses, location, equipment ID, time of day, behavioural, etc.
Currently the telco approaches appear in retreat as the pricing, processes, and fraud have encouraged larger web brands to seek alternatives. Traditional risk / identity aggregators in some verticals have expanded their offers as well. But the ease and ubiquity of telecoms will ensure it remains part of the broader emerging solutions.
Passkeys, FIDO2, WebAuthn… What does it all mean? Dan Jenkins
Dan provided a great review of how we have come to this point in username / password evolution. We’ve all known passwords suck, and some people have simple password methods that take a root password and systematically adapt it for other websites. 2FA enables a simple additional layer of authentication. However, SMS’s ease and ubiquity was also the cause of its downfall.
There are alternatives such as:
- TOTP (time-based one-time password)
- WebAuthn / FIDO (security keys)
- App based Push Notifications
- Voice based codes
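Of the alternatives above, TOTP is the easiest to show in a few lines. The following is a minimal Python sketch of RFC 6238 (built on HOTP from RFC 4226) using only the standard library. It is an illustration, not code from Dan’s talk: the one-step drift allowance in `verify` is an assumption made for the example, and a production system should use a maintained library and rate-limit attempts.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA1."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation: low nibble picks 4 bytes
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at=None, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over a time counter."""
    now = time.time() if at is None else at
    return hotp(secret, int(now // step), digits)


def verify(secret: bytes, submitted: str, at=None, step: int = 30,
           digits: int = 6) -> bool:
    """Accept the code for the current time step or the previous one."""
    now = time.time() if at is None else at
    counter = int(now // step)
    candidates = (
        hotp(secret, c, digits) for c in (counter, counter - 1) if c >= 0
    )
    return any(hmac.compare_digest(code, submitted) for code in candidates)


# Shared secret from the RFC test vectors; a real one is random per user.
SECRET = b"12345678901234567890"
print(totp(SECRET))
```

The secret is shared once at enrolment, usually via a QR code, after which codes are generated on the device, so there is no SMS to intercept or pay for.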
Dan hits the nail on the head in stating passkeys are simply more secure than a password. It’s for non-technology people, who hate 2FA and write their passwords on post-its and stick them around the computer monitor or laptop. Enterprise security is only as strong as its weakest link, and passkeys make that weakest link much stronger.
Dan highlights recent moves by GitHub and WhatsApp on passkeys. We’ve also seen Microsoft, for its enterprise customers, and Amazon making moves. In Dan’s closing, telecoms, which has ridden the 2FA horse for many years, needs to find how it lives in a passkey-dominated world.
LLMs on the telephone: useful tool, or hallucinating danger to humanity? Rob Pickering
Rob has been posting interesting results on LLMs, and his thinking has been evolving on their usefulness and how they can be implemented. This was another example of a no-BS presentation by an implementer, not a marketer.
Most interactions on apps or at kiosks only require human interaction if the user can’t or won’t use an app, or for exception handling: when the algorithm is broken, only a human has enough agency to fix it.
Rob had a brainstorm one morning and built a couple of LLM agents. From that he set up some prompts for the 2 agents to negotiate the price of some doughnuts. And left the agents to it.
What Rob recommends is gatekeeper logic. See diagram below.
Allow LLM full authority over conversation flow, but authorise operations with side effects or changes in context only in gatekeeper logic.
We are back to writing code, but this is the easy code. Action what the LLM says the user wants.
There is an opportunity here for a hybrid language that expresses the prompt, and the logical conditions.
Using LLMs in automated phone conversations can be service enhancing.
It needs careful design to harness the language recognition strengths of these models whilst containing their non-deterministic properties.
Solving these and other implementation issues within a best practice open source stack seems like a good idea.
I think of it as fuzzy logic: the LLM translates the messy human communications side into roughly correct intents. The logic progresses the workflow based on those intents, with all the agency allowed for that workflow.
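The gatekeeper idea can be sketched in a few lines of Python. This is my own illustration under stated assumptions, not Rob’s implementation: `propose_intent` is a keyword-matching stand-in for a real model call, and the intent names, the `CallContext` fields, and the refund limit of 50 are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CallContext:
    caller_verified: bool = False
    audit: list = field(default_factory=list)  # (intent, allowed) decisions


def propose_intent(utterance: str) -> dict:
    """Stand-in for the LLM: turns messy speech into a proposed intent."""
    text = utterance.lower()
    if "refund" in text:
        # Fixed amount for the demo; a real model would extract it.
        return {"intent": "refund", "amount": 500.0}
    if "balance" in text:
        return {"intent": "read_balance"}
    return {"intent": "chat"}


# Gatekeeper rules are plain, deterministic code. The LLM never runs these
# operations itself; it can only ask for them.
RULES = {
    "chat": lambda ctx, req: True,
    "read_balance": lambda ctx, req: ctx.caller_verified,
    "refund": lambda ctx, req: (
        ctx.caller_verified and 0 < req.get("amount", 0) <= 50
    ),
}


def authorise(ctx: CallContext, request: dict) -> bool:
    """Allow a side effect only if a rule exists for it and the rule passes."""
    intent = request.get("intent")
    rule = RULES.get(intent)
    allowed = rule is not None and bool(rule(ctx, request))
    ctx.audit.append((intent, allowed))
    return allowed


def handle_turn(ctx: CallContext, utterance: str) -> bool:
    """One conversational turn: the LLM proposes, the gatekeeper decides."""
    return authorise(ctx, propose_intent(utterance))


ctx = CallContext()
print(handle_turn(ctx, "What's my balance?"))  # caller not yet verified
ctx.caller_verified = True
print(handle_turn(ctx, "What's my balance?"))
print(handle_turn(ctx, "I want a refund"))     # amount exceeds the limit
```

Unknown intents are denied by default, so a hallucinated operation has no path to a side effect, and the audit list records every decision for later review.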
Artificial Intelligence in Telephony Systems and Solutions. Enhancing Communication Through AI. Borja Sixto & Karel Bourgois
Karel signposted the change in LLMs last year at TADSummit with the transformer model, and this year showed the state of the art in implementation with Borja.
Even within the media, LLMs today can:
- Eliminate noise
- Eliminate music, wind and specific background audio
- Enhance the speech part of an audio for better understanding
- Compress audio at higher rates than traditional codecs
LLMs can skip the STT (Speech to Text) part and go straight to intents. In the demo Rob showed, we saw how the mistakes in STT were ignored, and even misunderstandings on the cost units used in the negotiation were resolved.
The key takeaways were:
- AI is now fully integrated in telephony systems
- AI adds real value and improves the quality and user experience
- AI can be complex to integrate (tune, realtime, costs)
The last point reflects the craftsman stage of the technology’s development. Even the technology behind LLMs is rapidly evolving, with low footprint open source models moving fast. Technology expertise and experience are essential, hence why no-BS events like TADSummit are essential.
The problem of latency (waiting for the model to respond) is improving, though careful design choices are required. A common theme through several of the presentations on LLMs is design and optimization. Optimization is a moving target, so implementations are going to evolve rapidly over the next 2-3 years. This is not a mature segment.
From Communications to Conversations – What’s Changed And How It Might Matter, Paul Sweeney
Paul has enabled TADSummit to track the evolution of conversational AI through the lens of Webio. This has kept the focus practice-based. Paul kicked off on the Amazon Alexa experiment: it simply lacked integration, so most questions resulted in non-answers. A timer remains my most popular application. Conversational AI is directionally correct, but there’s much work to do, and the technology is also evolving.
LLMs are the fuzzy logic between the complex real world and the state machine within the workflows. Implementing custom models gets complex fast, as Paul shows in the diagram below.
Paul shares Webio’s learning so far:
- Information is matched; interactions enabled; transactions made, all orchestrated in the one conversation flow. See Rob’s architecture of a gatekeeper. This is how value gets released. But you need it all.
- The LLMs will be open source, customized, trained. That takes time and effort. Not one LLM, but multiple LMs being trained. That requires LLMOps.
- Model accuracy, intent fit, outcomes all have to be assured and controlled for. Custom data, values, numbers etc.
- All this has to be performant, low latency, and super low cost. Scale a suboptimal architecture and see what happens to your AWS bill.
The journey continues, with specific / constrained problems adequately resolved, but we are far from the vision of conversational AI. As mentioned previously, this is the craftsman stage of development, but it’s moving fast thanks to open source.
How a legacy 80s technology is delivering ChatGPT to developing countries – AI over USSD. Ken Herron & Celeo Arias
This was an interesting example of mixing high (ChatGPT) and low (USSD) technologies to solve a problem in Nigeria on excessive delays at hospitals. The National Hospital Abuja provides the distribution channel for the USSD number, solving discovery, which is often the hardest part of a USSD service.
PANEL SESSION: AI and Video applications
Chair: Arin Sime, Founder/CEO WebRTC.ventures
- Paula Osés, AI Engineer from Noumena
- Lorenzo Miniero, Chairman Meetcho, Author of Janus WebRTC Server
- Romain Vailleux, Apizee DevRel & Partnership Manager
- Paul Sweeney, Chief Strategy Officer & Co-Founder, Webio
- Pieter Luitjens, Co-Founder and CTO at Private AI
I’m not going to summarize the session. There’s so much good stuff covered that I can only recommend you spend the time to listen to the brain trust of AI in programmable communications.
I think the closing statement from Pieter Luitjens, of Private AI, is crucial. Training on the specific problem is critical, especially in getting redaction to work on PII (Personally Identifiable Information). Linking back to the keynotes, vCons are going to be essential here for appropriate training. And as Pieter points out, whatever the model is trained on can ‘pop out’. That ties back to the first point made by Paul Sweeney on this panel: privacy and security.