Getting started with data visualization. Seeing patterns in telecom data by David Curran, Machine Learning Engineer. If you have questions, you can email Dave directly or ask in the comments section of this weblog.
I first came across Dave through two different routes. I was introduced to him by Paul Sweeney as a possible speaker for TADSummit Asia 2019, given his work in China on chatbots. I’d also unwittingly seen his profile on Reddit, as I enjoy the data visualization plots in r/dataisbeautiful. It took me a while to connect the two Daves.
I like to bring interesting speakers to TADSummit who have something to contribute that I think is relevant to everyone in the TADS community, yet is not squarely in programmable communications. And Dave’s data plotting skills are worth copying.
This presentation covers:
- How to plot data
- Where to get data
- Where to get code / visualizations – Joyplots
- How to get feedback
- Imitate, assimilate, innovate
- Grammar of graphs
- Network and geographic visualizations for Telecoms
This is a great primer on how to plot all the data we have access to in programmable communications like a hardcore data scientist. Dave provides guidance on where to find data (not just telecoms stuff), and the tools to play with the data, like Joyplots. He shares his journey in plotting music data, e.g. the change in bpm (beats per minute) and loudness of music over the past century, and how he used a number of communities to get advice and support on its presentation.
The quote from Clark Terry (Jazz Trumpeter) of ‘imitate, assimilate, innovate’ is true universally, and for plotting it really does ring true. So copy Dave and build on this work 🙂 He wraps up with some of his experiences in telecom data, where I think we’re only just scratching the surface of the insights possible. The Twilio / Segment deal will look stone-age soon, paying so much for collecting data was so 2010s 😉 It’s the plotting that matters!
Slideshare is having difficulties today, so here’s a direct link to Dave’s slides.
Thanks for a great introduction to data visualization, which I think will grow in importance in programmable communications. In step #2 of your 7-step guide, you mention plotting a simple graph to get a handle on the data, its gaps, and its mismatches. What are some of the ways you’ve used to remove, fill in, or resolve such problems in the data to avoid nasty glitches in an otherwise beautiful curve?
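For readers wondering what "filling in" a gap can look like in practice, one common option is linear interpolation between the nearest known readings. The snippet below is only a minimal sketch of that idea, not Dave's method: the `fill_gaps` helper and the sample call counts are hypothetical, and gaps at the very start or end of the series are deliberately left empty because there is nothing to interpolate between.

```python
from typing import List, Optional


def fill_gaps(values: List[Optional[float]]) -> List[Optional[float]]:
    """Linearly interpolate None entries that sit between two known values.

    Leading and trailing gaps are left as None.
    """
    filled = list(values)
    last_known = None  # index of the most recent non-missing reading
    for i, value in enumerate(values):
        if value is None:
            continue
        if last_known is not None and i - last_known > 1:
            start = values[last_known]
            # Even step between the two known readings on either side of the gap.
            step = (value - start) / (i - last_known)
            for j in range(last_known + 1, i):
                filled[j] = start + step * (j - last_known)
        last_known = i
    return filled


# Hypothetical hourly call counts with missing readings (None).
calls = [120.0, None, None, 150.0, 160.0, None]
print(fill_gaps(calls))
```

Interpolation keeps the curve smooth, but it also hides the fact that data was missing, so it is worth plotting the raw series first and deciding per gap whether filling or dropping is the more honest choice.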