Video and Slides
Keynote Outline: Mindful Connections
Sami Mäkeläinen, Head of Strategic Foresight at Telstra. Technologist | Humanist | Pragmatist
- Are we breeding complexity or simplicity?
- Position: Indiscriminately gluing everything together with IoT and APIs, especially when in search of efficiency, can breed (and has bred) systems that are fragile, unsafe, and inherently dangerous.
- Are we coupling things that are best left uncoupled?
- Position: some things are better left uncoupled, or coupled with slack in them. This does not come naturally to systems engineers, so some thoughtful consideration needs to go into the process.
- We need to connect the right things and connect them right.
- Start with pre-mortems, and more holistic systems analysis.
You can ask Sami questions in the comments section of this weblog, or contact him directly; his details are at the end of his presentation.
Sami sets out an excellent review of the systemic failures being created by the increasing coupling and interactions of communication systems. I highly recommend reviewing the AI Incident Database (AIID); it makes for interesting reading.
Sami reviews some of the thought leadership behind the need for a framework for mindful connections, along with some interesting case studies. Perrow’s Matrix, which places systems by their interactions (linear to complex) and their coupling (loose to tight), shows how complexity is increasing across all systems; even something as linear and loosely coupled as a dam has become more complex.
The ‘warning’ signs Sami highlights ring so true to my experience. We’re all busy, a battery alarm occasionally pings, but everything is working, so it’s ignored along with the countless other network alarms that generally mean nothing, such as framing errors. That is, until the power goes out, the network nodes that were battery backed-up no longer are, and a cascade of errors propagates.
Sami describes some of the tools to address these issues such as pre-mortems, building slack into the system (Amazon Dynamo database is a good example), and openly sharing failures. He leaves us with 3 actionable steps:
- Not every connection is a good connection – what could go wrong?
- Avoid creating brittle systems through a pure focus on efficiency; embrace purposeful inefficiency.
- Help change the culture to be more like safety-critical industries. People’s privacy, confidentiality, safety, and well-being are critically important to your business. Facebook’s annual ‘oops, we messed up with your data again’ is no longer tenable.
Thank you Sami for an excellent keynote 🙂
6 thoughts on “Keynote, Mindful Connections, Sami Mäkeläinen”
Hi Sami, I love this keynote, thank you. Here are my questions; you can answer each one in a separate response to make it more readable.
1) Do you think the move to Web 3 / decentralized systems (e.g. blockchain, Matrix.org, IPFS, Ethereum) will reduce coupling and interaction complexity compared to the Web 2 approach of centralized systems such as Facebook and Google? Or will it, net-net, add complexity so we end up with the same failure risk?
2) The incident database is excellent, and highlights the complex interaction between multiple systems including humans. How do you think we should make this happen in programmable comms? I’ve lived through many failures in campaigns and systems, but the learning is never shared. For example, on warning signs, think of the number of ignored battery error warnings, until the battery is finally required and the system fails in an emergency…
3) I’m not aware of any pre-mortems in products and services I’ve been involved in. Often there is no time; often it’s building on what has worked in the past. Is this one of those culture issues, in being allowed to make the time?
4) As we examine how programmable communications is evolving, we’ve generally built on top of the PSTN, a loosely coupled linear system with inefficiency (multiple routes in/out of a country) and an expectation of failure built in. But that is changing through decentralization, internet-based communications, and real-time APIs on complex centralized services. Your recommendations look like they’re going to become much more important.
5) What’s your view on using tools like Chaos Monkey / chaos engineering?
6) The connected toy is a great example of the impact on privacy, confidentiality, and child safety. But innovation requires giving things a go. How do we strike a balance?
Thanks Alan! Now, onto the questions:
1) These technologies do not necessarily result in reduced coupling and complexity; in many cases, they may make things worse. If you think of Blockchain, many implementations including Bitcoin are actually very centralized (the decentralization is mostly a mirage) and they inevitably add complexity to the system. If you think of Ethereum and smart contracts, they almost by design eliminate slack and make systems more tightly coupled.
Even the inefficiency of Bitcoin is not the good kind of inefficiency – in fact, it’s an example of inefficiency done wrong.
Now, there are systems grouped under the “Web 3” moniker that can be genuinely useful – more decentralized systems like Matrix.org and Solid can end up making things better. Time will tell if, where, and importantly, how and how widely they are adopted.
2) I think this is one of those thorny situations where the whole, or almost whole, industry needs to come together to enact these changes – the most important of which is arguably the cultural change. Effective regulation can help in getting it started, but it’s not sufficient. It has now been five years since Bruce Schneier called for effective regulation of IoT because the cost of things going wrong in that space is too high (see https://www.schneier.com/blog/archives/2016/11/regulation_of_t.html). Have things improved since? Not a lot, I would argue. There have been some timid attempts to ensure IoT is more secure, but not nearly enough.
I used examples from aviation and, granted, developing what is known as a Just Culture was perhaps easier in aviation. There was a very clear incentive to come together and ensure flying becomes safer – flying being safe was a prerequisite for people accepting flying as a means of transport and growing the industry. Over the history of aviation, there are countless cases where high-profile crashes have ended up in effective changes that have made the whole industry even safer. I’m just hoping that with other systems, we won’t need to get to the point of actual mass fatalities happening before everyone agrees that this is something that warrants industry-wide co-operation and collaboration to improve.
3) Yes, it probably is – a malignant culture issue related to short-termism because spending that time to think about the potential pitfalls ahead of time is likely to pay back in multiples later on in the project. The famous comic of the caveman offering round wheels comes to mind; https://tenmilesquare.com/when-is-the-right-time-to-innovate-your-business/ is just one of the countless articles inspired by that.
4) Yes; from a systems-level perspective, decentralization strategies are important. Whether we’re talking about an organization’s response to a crisis or communication protocols facing a disruption, decentralized decision-making has been proven to be more resilient and to adapt more quickly to abnormal situations.
There is a bit of a pitfall when it comes to APIs, however – way too many APIs have a single point of failure somewhere; in the worst case, a single switch failure can throw things off. So even when we have APIs as building blocks to more resilient systems, we need to make sure the APIs, including their communication endpoints, are actually resilient.
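To make the single-point-of-failure concern concrete, here is a minimal Python sketch of client-side failover across redundant endpoints. Everything in it is hypothetical and for illustration only: the endpoint URLs, the `call_with_failover` helper, and the `fake_send` transport stand in for whatever a real API client would use. It is a sketch under those assumptions, not a recommended implementation.

```python
from typing import Callable, Sequence


class AllEndpointsFailed(Exception):
    """Raised when every configured endpoint has failed."""


def call_with_failover(endpoints: Sequence[str], send: Callable[[str], str]) -> str:
    """Try each endpoint in order and return the first successful response."""
    errors = []
    for endpoint in endpoints:
        try:
            return send(endpoint)
        except ConnectionError as exc:
            # Record the failure and move on to the next endpoint.
            errors.append(f"{endpoint}: {exc}")
    raise AllEndpointsFailed("; ".join(errors))


# Hypothetical transport: the primary endpoint sits behind a failed switch.
def fake_send(endpoint: str) -> str:
    if endpoint == "https://primary.example.invalid":
        raise ConnectionError("switch down")
    return f"ok via {endpoint}"


result = call_with_failover(
    ["https://primary.example.invalid", "https://secondary.example.invalid"],
    fake_send,
)
```

Note that a second endpoint only helps if it does not share the failed component; two URLs behind the same switch are still one point of failure.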
5) I think chaos engineering is a great approach to building resilience and confidence in the systems, but care must be taken to ensure it doesn’t build overconfidence. The large variety of perturbation models can make up a great test battery, if you will, but they can never guarantee the system will work under all abnormal conditions. We also have to remember that no matter the amount of chaos we throw at the systems, in most cases we simply cannot test ALL possible situations.
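As a rough illustration of the idea (and of its limits), the following Python sketch injects random faults into a dependency and counts how a caller copes. It is not how Chaos Monkey works; the names `with_injected_faults`, `resilient_fetch`, and `run_experiment` are hypothetical, and the only perturbation modelled is a connection error.

```python
import random
from typing import Callable, Dict


def with_injected_faults(func: Callable[[], str], failure_rate: float,
                         rng: random.Random) -> Callable[[], str]:
    """Wrap func so that a fraction of calls raise an injected ConnectionError."""
    def wrapper() -> str:
        if rng.random() < failure_rate:
            raise ConnectionError("injected fault")
        return func()
    return wrapper


def fetch_status() -> str:
    """Stand-in for a call to a real dependency."""
    return "healthy"


def resilient_fetch(fetch: Callable[[], str], retries: int = 3,
                    fallback: str = "unknown") -> str:
    """Retry a few times, then degrade to a fallback value instead of crashing."""
    for _ in range(retries):
        try:
            return fetch()
        except ConnectionError:
            continue
    return fallback


def run_experiment(trials: int, failure_rate: float, seed: int) -> Dict[str, int]:
    """Run the caller against the faulty dependency and count the outcomes."""
    rng = random.Random(seed)  # seeded so an experiment can be repeated
    flaky = with_injected_faults(fetch_status, failure_rate, rng)
    results = [resilient_fetch(flaky) for _ in range(trials)]
    return {"healthy": results.count("healthy"), "unknown": results.count("unknown")}


outcome = run_experiment(trials=200, failure_rate=0.5, seed=1)
```

A clean run here says only that the caller survives this one kind of fault at this one rate, which is exactly why such tests build confidence without offering any guarantee.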
At the other end of the scale are formal methods where we CAN guarantee the operation of a usually simple system. This is something we should, I think, try to use more of – tools like ANSYS SCADE are widely used in aerospace industries, and friendlier formal specification languages like TLA+ now exist as well. There is no reason the use of these tools should remain confined to the aerospace industries.
6) Innovation does require giving things a go, but it does not require just doing anything you can think of in a haphazard manner without thinking. The prospect of regulation is often balked at, with people claiming that it “will slow down innovation” – and you know what, they might be right; sometimes regulation and rules and guidelines and principles indeed slow down innovation. What we need to realize, however, is that this does not need to be a bad thing; if by slowing down innovation we increase the quality of the innovation we generate, it can be a very good thing.
But we don’t need to look for regulation to improve on how we operate. While there is a whole plethora of tools and procedures that can help us, it really all boils down to thinking before acting – taking a moment before a decision to step back and take a good hard look at what you’re doing, why you’re doing it and what the unintended consequences might be.