The first time I realised we had a problem was when Soeren stepped into my room and I could see more white than usual in his eyes. It turned out that during a scheduled replication task, a common task that we do almost every day, our team had observed a serious performance degradation affecting a small number of our organisations. Further analysis revealed that the root cause was the simultaneous failure of multiple hardware components. One of the nightmares that keep me from getting a good night’s sleep had happened.

Our technology team immediately stepped up to the mark and took care of the problem. The way they reacted was very impressive. I wish I could say the same for the management of the company.

You see, one of the biggest issues for any cloud company in any market is ensuring the reliability of its services. With 3dnetmedical.com, which doctors use for clinical decisions that affect human lives, this is a thousand times more important. For this reason, we had made massive efforts to ensure that our service would always be up (such as establishing multiple layers of resilience across devices, servers and data centres), yet we had a problem. Customers quickly began to grumble that the service was not reliable.

When this happened, we actually had an uptime rate of more than 99 per cent, and our service was much better and much more reliable than any traditional Medical Image Management and PACS software. But any disruption was understandably maddening to our customers. It took us a few days to restore normal operations for the small number of customers affected by this issue, and during that period we were losing their faith.

For me personally, as the CEO of a cloud organisation, it was an incredibly challenging time. It felt like sailing in uncharted seas in heavy weather. In my mind, I had no doubts about our foundation, our technology model, its ability to scale as much as it needed to, or its ability to deliver our crazy ideas to the market. Yet with some customers we were hitting a roadblock.

When this happened, I felt that our public response was not our primary concern. I was thinking like an engineer, not like a manager. I thought we needed to focus 100% on fixing the issue and remain as low-profile as possible until the problem was solved. Once everything was fixed, I thought, we could respond with a proper explanation and share the good news. This seemed like the safe response, yet I was feeling increasingly uncomfortable about it.

I cannot tell you enough how wrong I was. What I did was based on a very antiquated assumption, and this is not how modern companies need to operate, especially cloud and SaaS companies.

My silence had been a terrible, potentially disastrous strategy. And it wasn’t just the decision not to talk that had been a shocking error; it was that I had not talked immediately. The problem was exacerbated by the very nature of what we do: because we host everything, people could not call their own techies to complain and learn what was happening.

Customers were getting very annoyed. The emails were coming in.

In the end, it was an email I received from one of our customers affected by the issue (who is also a very dear friend of mine) that shook me out of my lethargic, self-indulgent trance (thank you Phil, I owe you yet another one). Phil wrote:

“We have to accept that an occasional system failure is an unfortunate fact of life despite all our best endeavours and regrettably a system recovery may be the only option available. However for me, this recent issue has been compounded by the lack of notification that a problem had occurred and its consequences. It was only through the repeated logging of faults that the story slowly came out in small pieces. In general your support guys are excellent, very responsive, patient and honest with their observations but it’s events like these that need the next layer of management to co-ordinate directly with the likes of me to give an open and accurate evaluation of the situation so that we can manage things on our side.”

I had to find a way to communicate quickly and candidly – even if going public with this issue felt like a defeat at the time, a bold move and a leap of faith were what was needed. I convinced the team that we should allow the public – and the competition – to see exactly how our system was functioning every day. It meant that we would be sharing embarrassing details every time the system slowed or stopped working. One of our engineers was very sceptical about it. Why would a company make itself vulnerable in that way? I will tell you why. For one very simple but very important reason:

Only by embracing transparency can we build Trust in the market we serve.

I realised that complete transparency was what we needed if we were to restore the trust of the small number of affected users. Moreover, I considered that embracing transparency would encourage good behaviour from our guys, as it added a new level of accountability and responsibility. So, in the middle of the crisis, I decided to do two things at once: send an email to all our affected users explaining what was happening, and open up our internal system for everyone to see. We call it the trust site:

http://trust.3dnetmedical.com

The site offers real-time information on system performance, up-to-the-minute notices of planned maintenance, historical information on transactions, reports on current and recent phishing and malware attempts, and information on our security technologies and best practices. Instead of hiding our problems, we started educating customers and prospects about where they could find the information they needed. Let me tell you, it felt so liberating not to have to act defensively. That night I got a very good night’s sleep.

There is no question: we would not be around today with 14,000 users if we were not always bettering the technology and improving its features, speed and reliability. At the same time, I don’t think the unprecedented growth we’re experiencing will be sustainable if we do not embrace transparency. I think the difficult decision to launch the trust site differentiates us even more. As the CEO of the company, I hope that Transparency and Trust will be a strong part of this company’s identity and DNA in the years to come.