I’ve known Paul Golding for many years, well over a decade, all the way back to the early days of helping telcos understand these things called “developers”. Paul has been working in AI since the beginning; I recommend following him, as his posts are insightful and an excellent way to track the latest in AI.
For example, just before we hopped onto the podcast, Paul had posted on LLM evaluation tools and their importance in confirming the LLM does only what you intend. It’s a hard problem, much harder than software testing.
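To make that concrete, here’s a minimal Python sketch of what an LLM evaluation check can look like. `ask_llm` is a hypothetical stand-in for a real model call, not any particular API. The point is that, unlike a unit test, there is no single correct output to assert against, so you score responses against a rubric instead.

```python
# A minimal sketch of why LLM evaluation differs from software testing.
# `ask_llm` is a hypothetical placeholder; swap in a real model client.

def ask_llm(prompt: str) -> str:
    # Placeholder: replace with a real LLM API call.
    return "Paris is the capital of France."

def evaluate(prompt: str, must_contain: list[str], must_not_contain: list[str]) -> bool:
    """Rubric-based check: the answer must mention required facts and
    avoid forbidden content. Real eval suites add graded scoring, many
    paraphrased prompts, and often a second model acting as judge."""
    answer = ask_llm(prompt).lower()
    ok = all(term in answer for term in must_contain)
    return ok and not any(term in answer for term in must_not_contain)

if __name__ == "__main__":
    passed = evaluate(
        "What is the capital of France?",
        must_contain=["paris"],
        must_not_contain=["lyon"],
    )
    print("eval passed:", passed)
```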
Through the podcast you’ll learn of the shortcomings of LLMs and some of the additional steps and technologies required to get close to the hype we’re currently experiencing. This is a long podcast, but it’s deep and you’ll learn a lot. We really have taken only the first step of a thousand-mile journey. That realization, that AI is not a quick win, kicked off a stock market correction this week. Remember when governments were talking about the strategic importance of 6G? That one has gone quiet. However, AI is different from 6G: it is important. Casey Newton wrote a nice post setting the market correction in context: AI takes time.
Paul reviews the noise; we’ve been here before, many times, with blockchain, big data, machine learning, dot com, the wireless web, gamification, AR/VR, metaverses, etc. All of them delivered to varying degrees, but it takes time: we have to experiment and learn, and that learning time varies.
Paul posits that the coincidence of foundation models like OpenAI’s working in some form, together with a flood of research papers claiming massive performance improvements, created a productivity myth: for example, that coders can be 400% more efficient with a coding co-pilot, or that salespeople can double their quota with an assistant. The market seems on the fence about co-pilots at the moment, with GitHub’s Copilot receiving the most positive of the mixed reviews.
Clearly the hype has proven to be just that, and we’re currently in a phase of testing and experimentation. The more training the LLM receives, the more general its answers become. It’s an inference engine that has seen lots of answers, so its output becomes more general to cover everything it has seen. We see that in ChatGPT: the suggested text is bland and so generic as to be almost meaningless.
While ChatGPT delivers the experience of an engaged discussion, it’s not delivering insight; it’s at risk of being too general. That’s where, in combination with machine learning, e.g. XGBoost (a gradient-boosted decision tree library started in 2014), insight with orchestration becomes possible.
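As a rough illustration of the machine-learning half of that combination, here’s a minimal XGBoost sketch. It assumes the `xgboost` and `scikit-learn` packages are installed; the dataset is just an illustrative stand-in for whatever tabular data you actually have.

```python
# A minimal sketch of the ML side: XGBoost learning structure from
# tabular data of the kind a general-purpose LLM would gloss over.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = XGBClassifier(n_estimators=200, max_depth=4, eval_metric="logloss")
model.fit(X_train, y_train)

print("accuracy:", model.score(X_test, y_test))
# Feature importances are one source of data-grounded "insight": specific
# signals an LLM front end could then explain in natural language.
print("top feature importance:", model.feature_importances_.max())
```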
This is interesting: LLMs are a neat front end, but their output is often generic and bland. In combination with machine learning / data science, the identification and application of rare insights could be possible.
The missing discipline across all of this is evaluation: models to ensure the LLM/ML combination does what is required.
So the three steps are: use weak supervision to build many recipes / models; assemble them into a strong learner using machine learning; and evaluate the combination.
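Here’s a toy sketch of that three-step pipeline. This is an illustration of the general technique, not Paul’s actual method: the labeling functions, the majority vote, and the logistic-regression learner are all assumptions made for the sake of the example.

```python
# Toy pipeline: (1) weak supervision via noisy labeling functions,
# (2) a strong learner trained on their combined votes,
# (3) evaluation of the combination against held-out ground truth.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
y_true = (X[:, 0] + X[:, 1] > 0).astype(int)  # hidden ground truth

# Step 1: weak supervision. Each "recipe" is a cheap, noisy heuristic.
labeling_functions = [
    lambda x: int(x[0] > 0),           # crude rule on feature 0
    lambda x: int(x[1] > 0),           # crude rule on feature 1
    lambda x: int(x[0] + x[1] > 0.5),  # stricter combined rule
]
votes = np.array([[lf(x) for lf in labeling_functions] for x in X])
y_weak = (votes.mean(axis=1) > 0.5).astype(int)  # majority vote

# Step 2: assemble a strong learner trained only on the weak labels.
X_tr, X_te, yw_tr, yw_te, yt_tr, yt_te = train_test_split(
    X, y_weak, y_true, random_state=0
)
model = LogisticRegression().fit(X_tr, yw_tr)

# Step 3: evaluate the combination against real held-out labels.
print("accuracy vs ground truth:", model.score(X_te, yt_te))
```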
Johnny wants to do a review with Paul of some of the companies with AI stories doing the rounds on Wall Street. We’ll have Paul back soon for that.