20201120 - FlowConFR - Nine Ways To Fail At Cloud Native, Holly Cummins

by Thierry de Pauw on

#flowcon

Nine Ways To Fail At Cloud Native, Holly Cummins, @holly_cummins

Consultant at IBM Garage

These are my scary stories

Twitter thread: https://twitter.com/ComSaraDufour/status/1329850370684751872?s=20

First problem: what is even Cloud Native?

One explanation of Cloud Native: SOA/ESB -> Microservices -> Cloud Native

??? lots of small services, the platform is smart and the services are dumb ???

The emphasis on micro-services for Cloud Native does not feel right

Cloud Native Computing Foundation (CNCF): microservices, containers and dynamically orchestrated

Still does not feel right to me

If you ask 10 people what cloud native is, they will all know what cloud native is, but they will all come up with a different definition. @holly_cummins

Some people will say:
- born on the cloud
- microservices
- kubernetes
- devops
- it has been built in the past 5 years, it is modern and nice
- synonym for 'cloud'
- idempotent: the problem with 'idempotent' is that everyone says 'What?'. It means you can rerun things and end up with the same result

What is Cloud Native not? it is not a synonym for 'microservices' @holly_cummins

If cloud native has to be synonym for anything, it would be for 'idempotent' @holly_cummins
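To make the term concrete: an operation is idempotent when rerunning it leaves the system in the same state as running it once. A minimal sketch, assuming a plain dictionary stands in for real infrastructure; the names `ensure_service` and `append_service` are made up for illustration and do not come from the talk.

```python
def ensure_service(state: dict, name: str, replicas: int) -> dict:
    """Idempotent: declares the desired end state, so reruns change nothing."""
    state[name] = replicas
    return state


def append_service(log: list, name: str) -> list:
    """Not idempotent: every rerun adds another entry."""
    log.append(name)
    return log


state = {}
ensure_service(state, "checkout", 3)
ensure_service(state, "checkout", 3)  # rerun: state is unchanged

log = []
append_service(log, "checkout")
append_service(log, "checkout")  # rerun: a duplicate entry appears
```

The first style is what makes it safe to rerun a deployment after a partial failure.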

  • @ComSaraDufour: So WHAT does "cloud native" actually mean? Everyone has their definition.

" If #cloud native has to be a synonym for anything, it should be 'idempotent'*. " #flowcon

The Cloud Native Computing Foundation (CNCF) mentions immutable infrastructure! And microservices exemplify this behaviour

why cloud native? we want to build great products faster
make high-impact changes, frequently, ...

Bottom line is: something that allows you to "make high-impact changes frequently & predictably with minimal toil, to build great products faster". #ProductManagement https://twitter.com/ComSaraDufour/status/1329852127926247426?s=20

Fail: The muddy goal?

what problem are we trying to solve?
"everyone else is doing it?"

Why cloud?
- It used to be a cost driver. It was cheaper because of economy of scale.
- Elasticity. You don't have to pay for things you don't use.
- Speed.
- Exotic capabilities. Use expensive infrastructure like quantum computers, ...

  • @ComSaraDufour: Holly sees 4 major benefits from a #cloud native architecture: —money —elasticity —speed to market —exotic capabilities

But very often, a #transformation is social-driven: "everyone's doing it so we should too". Many orgs know they want to be CN but don't know what pb to solve. https://twitter.com/ComSaraDufour/status/1329853372221022208?s=20

First cloud-movers got electrocuted.

-> 2011: 12 factors: how to write a cloud application so you don't get electrocuted

-> 2010: the dawn of cloud native

Are we all going to agree on the goal?

Fail: Microservices Envy

Microservices are not the goal, they are the means. @holly_cummins

we're going too slowly. we need to get rid of COBOL and make microservices!
... but our release board only meets twice a year
-- a bank

=> you will not actually go faster until you fix your release board

Containers are a good base. But number of containers !!???
It's not a competition to see how many you can have

=> distributed monolith
but without compile-time checking ... or guaranteed function execution
=> there is a cost to distribution

reasons not to do microservices
- small team
- not planning to release independently
- don't want complexity of service mesh - or worse yet, rolling your own
- domain model doesn't split nicely
when we change one microservice, we need to change another one
=> cloud native spaghetti
distributed does not mean decoupled

"each of our microservices has duplicated the same object model ... with twenty classes and seventy fields"
why not a common library? we don't want a common library because we don't want coupling, which is fair
but the problem was: the domain model was common over all the microservices

Microservices need Consumer-Driven Contract tests!!!
remember a failed space program: imperial units (base) vs metric units (satellite)
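The idea behind a consumer-driven contract can be sketched roughly as follows: the consumer writes down the fields and units it relies on, and the provider's response is checked against that in an automated test. This is only an illustration with hypothetical names (`CONSUMER_CONTRACT`, `contract_violations`) and illustrative units; real projects would more likely use a dedicated tool such as Pact.

```python
# Hypothetical contract written by the consumer: the fields it reads
# and the types it expects. The unit below is an illustrative assumption.
CONSUMER_CONTRACT = {
    "impulse": float,
    "unit": str,
}
EXPECTED_UNIT = "newton-seconds"


def contract_violations(response: dict) -> list:
    """Return human-readable contract violations (an empty list means OK)."""
    problems = []
    for field, expected_type in CONSUMER_CONTRACT.items():
        if field not in response:
            problems.append(f"missing field: {field}")
        elif not isinstance(response[field], expected_type):
            problems.append(f"wrong type for {field}")
    if response.get("unit") != EXPECTED_UNIT:
        problems.append(f"unexpected unit: {response.get('unit')}")
    return problems


good = contract_violations({"impulse": 12.5, "unit": "newton-seconds"})
bad = contract_violations({"impulse": 12.5, "unit": "pound-force-seconds"})
```

Here `good` is empty, while `bad` reports the unit mismatch before anything is deployed.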

Fail: the not-actually continuous CI/CD

CI/CD is not something you buy, it is a verb - not a tool - it is something you DO! @holly_cummins

"I'll merge my branch in our CI environment ... "

CI/CD ... CI/CD ... CI/CD ...
we release every six months ...
CI/CD ...

that is not continuous ...

But ...

How often should you integrate?
there is a spectrum
- actually continuous ... but stupid
- every character
- every commit (several times an hour)
- every few commits (several times a day)
- once a day => trunk-based development
- once a week
- once a month
- once every six months

How often should you release?
there is also a spectrum
- every push (many times a day) -> need a good handle on feature flags
- every user story
- every epic
- once a sprint
- once a quarter
- once every two years: old school

How often should you test in staging?
there is not really a spectrum on that -> continuously

We can't release? But why? Why can't you deploy more often?

we can't release this microservice ...
we deploy all our microservices at the same time.

The point of Cloud Native is ... speed

What's the point of an architecture that is expensive and that enables you to go faster if you don't go faster?
=> you don't get feedback

Feedback is good engineering
Feedback is good business

  • @ComSaraDufour: The "not-actually-continuous" CI/CD...

CI/CD is a verb —not a tool—, it's something you DO!

Feedback is good #engineering and good #business. #cicd #tech #flowcon #ProductManagement https://twitter.com/ComSaraDufour/status/1329857162005327885?s=20

If it is too scary to deploy: defer wiring, use feature toggles, use A/B tests
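A feature toggle can be as small as a lookup that guards the new code path, so unfinished work can be deployed without being released. A minimal sketch, assuming a hypothetical in-process `FLAGS` dictionary; a real setup would probably read flags from configuration or a flag service.

```python
# Hypothetical flag store; the flag name is made up for illustration.
FLAGS = {"new_checkout": False}


def is_enabled(flag: str) -> bool:
    """Unknown flags default to off, so unfinished code stays dark."""
    return FLAGS.get(flag, False)


def checkout(cart_total: float) -> str:
    # The new code is deployed, but only runs once the toggle is flipped.
    if is_enabled("new_checkout"):
        return f"new flow: {cart_total:.2f}"
    return f"old flow: {cart_total:.2f}"


before = checkout(10)
FLAGS["new_checkout"] = True  # flip the toggle, no redeploy needed
after = checkout(10)
```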

Fail: lack of automation

our tests are not automated, we don't know if our code works

oh yes, that build has been broken for a few weeks ...

systems are going to behave in unexpected ways

with microservices you need to have these automated contract tests

Fail: Governance

Companies put too much Governance on cloud

Provisioning tool does not work because ...
84-step pre-approval process before provisioning

we're going to change cloud provider to fix our procurement process

If the developers are the only ones changing, it is not going to work
It has a cost. Developers will leave.

Fail: the mystery money-pit

The cloud makes it so easy to provision hardware.
That doesn't mean the hardware is free or useful -> it has a cost

Hey boss, I created a Kubernetes cluster ... I forgot about it ... 2 months later I realised it cost 1000 USD per month

2017 survey: 25% of 16,000 servers doing no useful work
-> financial impact and climate impact
and that is why there is governance
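Forgotten resources like that cluster tend to surface only when someone looks at cost next to utilisation. A rough sketch of such a report, assuming hypothetical inventory data and an arbitrary 5% CPU threshold for "idle"; only the 1000 USD figure comes from the story above.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    monthly_cost_usd: float
    avg_cpu_percent: float  # average utilisation over the reporting period


def idle_resources(resources, cpu_threshold=5.0):
    """Return resources whose average CPU is below the assumed idle threshold."""
    return [r for r in resources if r.avg_cpu_percent < cpu_threshold]


# Hypothetical inventory; names and utilisation numbers are invented.
inventory = [
    Resource("forgotten-k8s-cluster", 1000.0, 0.5),
    Resource("production-api", 800.0, 45.0),
]
idle = idle_resources(inventory)
wasted_per_month = sum(r.monthly_cost_usd for r in idle)
```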

But ...

There is surely nothing quite so useless as doing with great efficiency what should not be done at all.
-- Peter Drucker

we have no idea how much we're spending on cloud
=> cloud for managing cloud costs
=> FinOps

Fail: Ops in the cloud is harder

=> SRE
make releases deeply boring so that we can do them often

But our use case is special ...

in most cases, failures are recoverable
- space: not recoverable
- most other cases: recoverable

hand-offs are bad, automation is good => it allows you to release often

Ways to succeed at Cloud Native

You have to have that clear goal of what you want to achieve

Optimise for feedback

Look at the whole system -> Systems Thinking.
If you automate something, also change the surrounding processes that still assume the previously manual step is expensive and error-prone.

The objective is: Optimise the system as a whole