10/2 Velocity London Day 1

by Gene Kim on

#velocityconf

Theo Schlossnagle

  • @souders: #VelocityConf tutorials starting up. A lot of familiar faces and even more new ones. Super excited!
  • Indeed. @allspaw just introduced Theo. RT @souders: #VelocityConf tutorials up. Lots familiar faces & even more new ones.
  • .@postwait: "Direct observation of failure leads to quicker rectification; (easier when you visually see the failure)
  • .@postwait: "I.e., you cannot correct what you cannot measure"
  • .@postwait: "Debugging failures require
  • .@postwait: "Debugging single threaded apps: easy (segfault == complete state); kernel panic at least gives u usable core"
  • .@postwait: "Windows gives you a perfectly usable post-mortem core file" "Raise hand with pride"
  • .@postwait: "The beauty of these is easy; one core file, one system"
  • .@postwait: "Multi-threaded applications: Challenging; multiple core files, stack traces, but don't know when"
  • (Holy cow, @unixdaemon! Long time no see! Drinks sometime this week?)
  • @TheOpsMgr: RT @velocityconf: Welcome to Velocity EU everyone! Tutorials start in 30 min. All #velocityconf sessions today are in the East Wing, on ...
  • @TheOpsMgr: you're here, too? If so, drinks this week? Long time no haiku! :)
  • .@postwait: "Distributed applications: here there be dragons; (defies easy explanation of current state at crash time)"
  • .@postwait: "Breakpoints are no longer useful in distributed systems: most architectures too ugly to debug"
  • .@postwait: "Rare exceptions of debuggable apps include Cassandra, Riak..."
  • .@postwait: "Most modern useful debugging tool: printf()"
  • @arnaudlimbourg: @postwait at #velocityconf http://t.co/zDJBC4GY
  • @cvonwallenstein: #VelocityConf @postwait dropping some #Trogdor wisdom on distributed apps http://t.co/Mw5X5MeF
  • @RealGeneKim: Come join DevOps Cookbook authors tonight 7p tonight! #velocityconf @damonedwards @patrickdebois @botchagalupe @allspaw http://t.co/JKyhnGDn
  • @botchagalupe: Nagios active checks is passive in the sense that it is not in the direct path of the application @postwait
  • @botchagalupe: Monitoring is all about the passive observation of telemetry data @postwait
  • @arnaudlimbourg: Room is packed for @postwait #velocityconf http://t.co/QdeteJpq
  • @botchagalupe: @postwait 1) repeatability 2) control groups 3) external verification (scientific process)
  • @botchagalupe: @RealGeneKim I can hear your fingers power tweet 4 rows behind you
  • Haha. Life is good, buddy! RT @botchagalupe: @RealGeneKim I can hear your fingers power tweet 4 rows behind you
  • @ninjasys: Monitoring doesn't pinpoint a specific line of code but it does show an overview of the problem
  • Power Tip: On Mac, but miss the Windows "maximize window" button? Try Divvy. I bind it to a key. No more manually moving/stretching windows
  • .@postwait: "Don't know how many times I log onto prod box, someone asks, 'what's that?'. I answ, 'same thing as last yr"
  • @ninjasys: "I'm going to login to a production system, development systems are boring" .... Shit just got real @postwait
  • @synodinos: "Waterfall thumbnail too long to fit in the screen -> hint you've got too many connections" #velocityconf http://t.co/9iece9S0
  • Haha. .@postwait: "mpstat: God intended that all the columns line up. If they don't, that's your problem"
  • .@postwait: "Monitoring: statsd, resmon; metrics; folsom, metrics.js, metrics-net"
  • .@postwait: "You need most urgently telemetry on your own internal apps; outside world has no helpful tribal knowledge"
  • Haha. @postwait: "You can log onto #postgres channel, desc ur symptom in four words & someone probably knows what's wrong"
  • .@postwait: "Easy problems: everything is broken/crashed; Hard problem: 1MM transactions worked, but 2 are broken"
  • .@postwait: "more tools: Reconnoiter
  • .@postwait: "
    https://pbs.twimg.com/media/A4MEKl8CMAAqMut.jpg
  • @RealGeneKim: .@postwait: "See the pattern? What went wrong?" #velocityconf http://t.co/o08RBLWO
  • .@postwait talking about how avg() kills you by obscuring interesting things
  • .@postwait: "Every gaming site meas user load time; U hv shitty page load times? Put you on svr w/other crappy connections
  • .@postwait: "Why? So your crappy connection doesn't ruin other people's gaming experience"
  • .@postwait: "Most architectures have redundancy; why? crap breaks; but new paradigm emerging..."
  • .@postwait: "Many orgs have dev in production, with some escalated privs; so they can bounce BGP sessions. Prob regrettable"
  • @kief: @postwait We are shifting to where we develop & debug in prod, because prod is too big to be able to reproduce perf issues
  • .@postwait: "With the right design, u can turn that redundancy into a debugging environment" (ideally self-svc Dev)
  • Nice. Most miss step #4. @postwait: "Methodology: 10 web svrs: I fix 1, verify 1 is fixed, verify that 9 are still broken"
  • .@postwait: "Web svrs tend to be homogeneous, share nothing/little, indep; allows easy fixes"
  • @botchagalupe: premature optimization is the root of having a product that works right ( iLoviT ) @postwait #velocityconf #knuth
  • .@postwait: "Others more diff to fix: not homogeneous & equal; dbs, batch processes for billing (w/o charging twice), orchestration middleware, msg queues
  • .@postwait: "EC2 allows shittier code than ever, extending Dev thinking further than ever. Bad for billing systems"
  • .@postwait: "In old days, Production could beat crap out of Dev" (shaking finger, saying "how dare you!" :)
  • @pingdom: We used to not write shitty code because there was one computer, you only had one shot to get it right said @postwait at
  • @botchagalupe: We have denigrated our world with shitty code because our dev environments are so strong @postwait
  • .@postwait: "For every 500 page load error, I can prb see it 15 other places: on tcpdump on svr/client, load balancer, etc.
  • (Life is bad when you're poring through tcpdump, wireshark (until you realize u have to open different window)
  • .@postwait: "The one command you don't want to use during outage: man(8). The time for that was before outage"
  • .@postwait: "Learn your tools before the problem happens; spend 1h on tcpdump and strace per week" (oh dear. the kata)
  • (A dawning moment: the dojo for IT ops prob includes weekly training drills of gdb, tcpdump, strace.)
  • .@postwait: while looking at prod running stats: "Looks about right; that's probably about 20 reqs/second. That's normal."
  • @kief: @postwait We're like a nascar team, not local mechanic. Can't take time to read mans during a race. Use tcpdump every week
  • .@postwait: "captured: 482 bytes of payload: respond:
  • .@postwait: Looking at tcpdump output: "Looking at output, I see that my webserver is slow:
  • Hmm, yep. I think that awk script looks about right. Except line 3. May be missing a % sign somewhere...
  • @unixdaemon: @RealGeneKim some of the katas also make good discussion points for tech interviews
  • .@postwait: "Why is SSL always faster? Negotiation done at server"
  • Wait. Is @postwait really running as root on a production server? Wow. What could go wrong? :)
  • @pingdom: “If the waterfall chart doesn’t fit on the screen, that’s a hint that there are a lot of resources” @jmarantz
  • @pingdom: Sharding static resources across multiple domains can reduce load speed due to dns latency and client bandwidth saturation
  • .@postwait: "strace/truss; gstack/pstack; gcore+gdb/dbx/mdb" (these are the same primitive tools we used 20 yrs ago. yikes)
  • @stack72: getting a demo of tcpsnoop as a tool for observing webservers. this is a great way of observing webserver health #VelocityConf
  • .@postwait: "Looking at gstack/pstack makes me want to cry; God finally came down and gave us mpstat lined up & dtrace"
  • .@unixdaemon: dude, this is bringing back bad memories. I was doing filesystem QA in 1986 for Sun hfs. Same horrible tools
  • Crowd breathes small sigh of relief as @postwait switches back to laptop shell, instead of production box. :)
  • .@postwait: looking at production box: "Sendmail? What? That shouldn't be running. Nothing should be sending messages..."
  • @unixdaemon: This is the part of the talk where dtrace is shown and all the Linux admins mutter 'systemtap' and look a little embarrassed
  • @unixdaemon: "The Linux Programming Interface" book by No Starch Press is excellent for those who want to know more about modern Linux kernels
  • .@postwait: "Linux is only OS w/o dtrace; Solaris, MacOS, etc. all have it. If u want to crash systems, try systemtap
  • @cvonwallenstein: Generating histograms with dtrace #velocityconf cc @postwait http://t.co/8BGpSHMS
  • @bertcraven: RT @postwait: Monitoring and observability slides from #velocityconf are now available: http://t.co/IAgYJ9w2
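
A side note on the "avg() kills you" and "1MM transactions worked, but 2 are broken" points above. The following is only a rough sketch, not anything shown in the talk: the latency numbers are made up, and `percentile` and `histogram` are hypothetical helpers (nearest-rank percentile is one choice among several). It suggests how a mean, and even a p99, can look healthy while a histogram or the max exposes the two broken requests.

```python
from statistics import mean


def percentile(samples, p):
    """Nearest-rank percentile for an integer p in 1..100."""
    ordered = sorted(samples)
    rank = max(1, -(-len(ordered) * p // 100))  # ceil(n * p / 100), at least 1
    return ordered[rank - 1]


def histogram(samples, edges):
    """Count samples per bucket; bucket i covers [edges[i], edges[i + 1])."""
    counts = [0] * (len(edges) - 1)
    for s in samples:
        for i in range(len(counts)):
            if edges[i] <= s < edges[i + 1]:
                counts[i] += 1
                break
    return counts


# Made-up data: 998 requests at 20 ms, 2 requests at 5000 ms.
latencies_ms = [20] * 998 + [5000] * 2

avg = mean(latencies_ms)
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
worst = max(latencies_ms)
buckets = histogram(latencies_ms, [0, 50, 500, 10000])

print(f"avg={avg:.2f}ms p50={p50}ms p99={p99}ms max={worst}ms buckets={buckets}")
```

With this data the average moves by only about 10 ms and p50/p99 do not move at all; the last histogram bucket is the only place the two slow requests are clearly visible.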

John Allspaw

http://www.hypershop.com/HyperJuice-External-Battery-for-MacBook-iPad-60Wh-p/mbp-060.htm

  • @beezly: #velocityconf dtrace for linux at https://t.co/Nwq9Nvcn - just built it and ran a dscript. It worked :D Unsure of how feature complete it is
  • Sensei .@allspaw up talking "Escalating Scenarios: a deep dive into outage pitfalls"
  • .@allspaw: "Who works on distributed team, communicating primarily with email or chat"
  • .@allspaw: "This is not about troubleshooting; this is criteria, situational awareness, HROs, decision making, comm, etc"
  • .@allspaw: "And boatloads of psychology; may be familiar to you, but may be missing the language & framework"
  • .@allspaw: "6yo daughter just learned to ride bike w/o training wheels: Sadie. can't forget it
  • .@allspaw: "Next time you're in complex, cascading failure situation, hopefully you'll remember 3-4 things from this preso"
  • .@allspaw: "Great Amazon EBS failure of 2011: the 80 hour outage. What's left out: how teams orged, escalated, handled it"
  • @pingdom: .@allspaw getting warmed up at #velocityconf http://t.co/Mi8m2Gbb
  • .@allspaw: "Headline: 'Google Down: Leads to Fear Of Apocalypse'"
  • .@allspaw: "Responses to outages: often more gateways to chg, more approvals, more complexity, more alerts, more controls"
  • .@allspaw: "Where do we learn? military, surgical teams/nuclear power gen, airline safety; we're failure pron junkies"
  • .@allspaw: "Read Normal Accidents: http://www.amazon.co.uk/Normal-Accidents-Technologies-Princeton-Paperbacks/dp/0691004129
  • .@allspaw: "Talking worst airliner accidents: Tenerife, US Airways landing in Hudson River, Kegworth 1989"
  • .@allspaw: "Dr. Richard Cook, 2012 Velocity Conf keynote: http://m.youtube.com/watch?feature=plpp&v=2S0k12uZR14
  • .@allspaw: "Book: Managing the Unexpected: Weick/Sutcliffe: studied US Navy high reliability operations"
  • .@allspaw: "Skill based (simple, routine); rule based (knowable, but familiar); knowledge based (WTF is going on?)"
  • .@allspaw: "Situational awareness: "knowing what's going on so you can figure out what to do"; Boyd OODA loop"
  • .@allspaw: "In our world, this is
  • @botchagalupe: The Self-Designing High-Reliability Organization: Aircraft Carrier Flight Operations at Sea ( http://t.co/w7ymqPuV ) @allspaw
  • @botchagalupe: Managing the Unexpected: Resilient Performance in an Age of Uncertainty http://t.co/0d1poRyi @allspaw
  • .@allspaw: "Mica Endsley on situational awareness: 3 levels: perception, comprehension, projection of future"
  • Astonishing photo from @allspaw: fighter pilot reading manual while flying.
    https://pbs.twimg.com/media/A4McGH6CYAAbCTm.jpg
  • @botchagalupe: mica endsley situational awareness ( http://t.co/JI7GYXp5 ) @allspaw
  • @stack72: i want this sticker RT @pingdom: "We have charts and graphs to back us up" @allspaw #velocityconf http://t.co/KOVJFiv5
  • .@allspaw: "For pilots, exact altitude is usually not super critical, except on approach; for ops, equiv storage at 5% full
  • .@allspaw: "Level 2 Comprehension relies upon correct mental models; which gets us to Level 3: project future states"
  • .@allspaw: "Ahead of the curve: military pilots: 'if u know where u are now, u're in trouble; u were there 5 miles ago'"
  • .@allspaw: "Clues u're losing sit awareness: ambiguity, fixation (scar tissues), confusion ("i've never seen before")"
  • .@allspaw: "...lack of info, failure to maintain, failed checkpoints, a bad gut feeling that things aren't going well"
  • .@allspaw: "We don't build systems that fall over like dominoes. that wd be too easy" (every domino blows up at same time)
  • .@allspaw: "Requisite memory trap: Our wetware can hold 7 items in head at time +/- 2; but in 20-30s, we'll forget it"
  • .@allspaw: "Workload, anxiety, fatigue, other stressors: at Etsy, now tracking over 300K production metrics"
  • .@allspaw: "Misplaced salience (need for red/yellow/green);
  • .@allspaw: "complexity creep (einstein: things should be simple as can be, but not simpler)"
  • .@allspaw: "out of the loop syndrome: (whoa. came from Boyd OODA framework!): exacerbates lack of sit awareness"
  • .@allspaw is throwing out brilliance here. I can tell b/c I've stopped capitalizing/using Shift key. :)
  • .@allspaw: "I've kicked out my boss of IRC channels, b/c it stops us from getting our work done. My team did it to me, too"
  • .@allspaw: "Teams: applied to problem space to divide/conquer, division of labor: or to swarm problem"
  • .@allspaw: "Teams not good for shotgun debugging: 'make random chgs to determine whether bug can be perturbed out"
  • .@allspaw: "Joint activity: interpredictability, common ground, directability"
  • .@allspaw: "Interpredictability: ability to make us predictable; 'u want me to do x; ok; it'll take me 6 min; took me 3m"
  • .@allspaw: "Common ground: shared mutual understanding; this is why cross training is so important"
  • .@allspaw: "Example: "I'm going to look at webserver"; "No need, John, we've already got that covered"
  • .@allspaw: "Directability: "we've scattered, but after getting new info, we can round up"
  • .@allspaw: "Improvisation: every IT ops team should take improv class" (I've wanted to do this for years)
  • .@allspaw: "If you haven't seen Apollo 13 movie, read the post-mortems, now online" (best movie about ops ever IMHO)
  • .@allspaw: "Improvisation: Charles Mingus: "u can't improvise on nothing; u have to improvise on something"; tempo, chords"
  • @botchagalupe: The post mortems of Apollo 13 and US Airways Flight 1549 .. improvisation @jazz @allspaw
  • .@allspaw: "Communications: explicitness (like military comm brevity), assertiveness, timing (talk way less during crisis)"
  • @stack72: "improvisation is not just making shit up, it has structure" @allspaw #VelocityConf
  • .@allspaw: "Timing: high reliability orgs talk less during crisis to increase signal/noise ratio"
  • @botchagalupe: Whatever @allspaw makes at #etsy ain't enough... I can't imagine there is anyone better at what he does anywhere
  • .@allspaw: "Assertiveness: 'Shut up. I know what's going on. Don't tell me about the nagios alerts."
  • .@allspaw: "Tenerife failure contributor; co-pilot deferred to very senior pilot: co-located, so 70% of signals were there"
  • .@allspaw: "On IRC, no one can see that you're thinking or face-palming"
  • @kief: @allspaw discusses the differences between passive, assertive, and aggressive
  • .@allspaw: Fascinating Exercise: Improv: 18.3 sec to construct adlib sentence: each person says one word to construct sentence
  • @unixdaemon: Improv and audience collaboration - it's all gone very american

**** TODO

  • After tweeting, cursor shows up in middle of last tweet; annoying
  • RT @kief: @postwait New paradigm: debug in prod, because prod is too big to be able to reproduce perf issues
  • Or after RT, cursor disappears
  • RT @botchagalupe: premature optimization is the root of having a product that works right ( iLoviT ) @postwait #velocityconf #knuth
  • RT should make new line
  • RT @kief: @postwait We're a nascar team, not local mechanic. Can't read manuals during a race. Use tcpdump every week
  • +1 Great point. RT @unixdaemon: @RealGeneKim some of the katas also make good discussion points for tech interviews
  • RT @unixdaemon: This is when dtrace is shown & all the Linux admins mutter 'systemtap' and look a little embarrassed
  • RT @beezly: #velocityconf dtrace for linux at https://t.co/Nwq9Nvcn: built & ran. It worked :D Unsure of how feature complete it is
  • @vibhores: @allspaw problem solving is non-linear but iterative. Agree, in fact its hyper-iterative in fire-fighting & outage scenarios
  • .@allspaw: Jeez. On IRC, second trial was 1m 18s. Despite the person being right next to you.
  • .@allspaw: "Underscores how difficult to have teams work effectively remotely. Advanced exercise: 'end sentence sadly'"
  • .@allspaw: "Kegworth 1989: smoke poured in cabin, captain disengaged autopilot, asked what engine, pilot responded 'left, no right'
  • .@allspaw: "Kegworth: pilot shut down only operational engine, based on incorrect mental model; power hierarchy failure"
  • Front row seat, watching @allspaw presenting:
    https://pbs.twimg.com/media/A4Ml-4jCMAEN-6g.jpg
  • Glimpse into correcting/reinforcing during Etsy outage: @allspaw:
    https://pbs.twimg.com/media/A4Mmx2HCUAA6uwP.jpg
  • Famous sticker: "We have charts & graphs to back us up" on @allspaw laptop
    https://pbs.twimg.com/media/A4MmOWOCcAEucrd.jpg
  • @botchagalupe: Crash that changed power hierarchy in airline crashes http://t.co/f2t6T7ag
  • .@allspaw: "Latency of typing creates probs: it's learned behavior; distributed teams have inherent disadvantages & adv"
  • @stack72: big realisation at #VelocityConf we need a WAR room
  • .@allspaw: "Decision making: Book: Naturalistic Decision Making (NDM) by Gary Klein" (during high stress, shifting goals)
  • .@allspaw: "Step 1: what is the problem? and Step 2: what shall I do?"
  • .@allspaw: "Police officer: 1000 things happening, aware of only 100, and can only influence 10 actions"
  • .@allspaw: "Recognition primed decisions; NDM says bullshit; people do what is intuitive"
  • @kief: @allspaw says etsy mitigates disadvs of distributed ops team with lots of practice
  • Wow. Again, the kata. RT @kief: @allspaw says etsy mitigates disadvs of distributed ops team with lots of practice
  • @cmsj: Just heard about something from #puppetconf at #velocityconf - ChatOps. Great idea to drive infrastructure through chat bots.
  • +1 It was a great talk. RT @cmsj: Just heard abt something from #puppetconf on ChatOps. Grt idea to drive infrastructure through chat bots.
  • .@allspaw: "Often, the benefit of the runbook came from the writing, not from the reading; great for novices"
  • @TheOpsMgr: RT @vibhores: @allspaw power hierarchy sucks in fire-fighting scenarios. True, It's not who, it's what that is important
  • .@allspaw: "The PRE-MORTEM exer: "3 months from now, this initiative fails; why did it go wrong." Overcomes cognitive bias
  • @cmsj: RT @botchagalupe: #chatops at @github ( http://t.co/JSykk6YC )
  • .@allspaw: "Alerts: meant to boost SA, alarm overload, high false alarm rates, routinely disable alerts"
  • .@allspaw: "At conference for systems safety researchers, fire alarm went off, & only 3 people left. At Safety Conference!"
  • IMHO, this was among best talks at #puppetconf last wk. RT @cmsj/@botchagalupe: #chatops at @github ( http://t.co/u3dH9VfJ )
  • .@allspaw: "In typical operating room, on average, 4.7 alarms during surgery; 75% are false alarms" (correct #?)
  • .@allspaw: "Problem of false alerts:
  • @kief: @allspaw Once you lose trust in your alerts, it's very hard to get it back #monitoring
  • .@allspaw: "Cincinnati riverbank w/plateau on approach path: alarms go off during approach, resulting in pilots ignoring alerts"
  • .@allspaw: "Pilots often slow clearing alarm, b/c too busy looking out window to confirm that it's false alarm"
  • .@allspaw: Etsy has karma chat bot, to shame people who don't silence alert during maintenance operations
  • .@allspaw: "Good hygiene requires continually squashing false alerts"
  • .@allspaw: "Mixed modalities: at Etsy, schemanator
  • .@allspaw: "Lisanne Bainbridge: Ironies Of Automation" 1980s:
  • .@allspaw: "Automation: moves humans from manual operator to supervisor; extends/augments humans; doesn't remove human error
  • .@allspaw: "Automation Inherently brittle" (encodes all our assumptions we had when we built it)
  • .@allspaw: "Law of Stretched Systems"
  • .@allspaw: "Eisenhower: In preparing for battle, planning is useless; training is critical"
    https://pbs.twimg.com/media/A4MtVzuCYAA_PVh.jpg
  • @kief: @allspaw on the limitations of automation. Key stuff.
  • +100. Awesome. I always learn tons every time I hear him talk. RT @kief: @allspaw on the limitations of automation. Key stuff.
  • @stack72: . @allspaw is pointing out the situational awareness of @postwait's talk from earlier #VelocityConf
  • @FlorianOtel: RT @botchagalupe: premortem @allspaw #velocityconf < Very useful for hypothesis generation. My fav reference: http://t.co/L5TwlkrS Chap 4

Mark Burgess, CFEngine

  • @CubataKolectiv: I'm seeing Beyond Desired-State Configuration Management at #velocityconf EU. http://t.co/eFn9hoGo
  • (Accidentally got separated from #velocityconf tribe, having lunch with the Strata folks. Learned about Haskell, though..)
  • .@comsysto: Inspiring talk from John from #etsy Lots of interesting information about human psychology during outages #velocityconf http://t.co/r60u3Jv0
  • .@markburgess_osl: talking Kuhnian structure of scientific revolution
  • @kief: @allspaw says @markburgess_osl is the Django Reinhardt of system administration
  • @aneilsingh: @pingdom @postwait #VelocityConf #keepingithonest My experience is most developers feel code THEY did not write is shitty and start over.
  • @markburgess_osl: "Imagine 20 years ago, w/o cell phones: all possible b/c we can configure systems"
  • .@markburgess_osl: "The Big Bang was the first, ultimate expression of configuration management. 2nd one was 1980s w/rdist
  • .@markburgess_osl: "1st gen config mgmt was pkg based, 2nd gen was model based, 3rd gen is knowledge based"
  • @mrembetsy: "history begins with the big bang, you know the dinosaurs the iphone" @markburgess_osl
  • @phrawzty: Smart infrastructure is persistent, decentralised, and understandable. -- Mark Burgess at
  • @kief: @markburgess_osl 3 waves of infra: 1) infra as process 2) infra as code 3) infra as documentation
  • .@markburgess_osl: "Web focused CM (myopically) on build/provision: maybe even maintain"
  • .@markburgess_osl: "How many sysadmins to chg a lightbulb? A lot. You must tear down entire building to build from scratch"
  • @mysqldbahelp: #velocityconf 16% of those that have a bad experience tweet about it
  • (Wow. User experience more important than ever) RT @mysqldbahelp: #velocityconf 16% of those that have a bad experience tweet about it
  • .@markburgess_osl: "Configurations eventually evolve into a service; requires tending; examples include GPS, parks, etc"
  • @scoobiedoobie: .@markburgess_osl it'll be fine - the way we look at systems until we get alerted by the monitoring system
  • Haha. @markburgess_osl is the Ferber of systems work: wants to save us from being enslaved by machines
  • .@markburgess_osl: "Humans are the most important part of the system services: the consumers & custodians"
  • .@markburgess_osl: "Working for machines dehumanizes us; think of people Egyptians enslaved to build pyramids"
  • @scoobiedoobie: .@markburgess_osl what humans do is the number 1 influence on our IT systems
  • (creates demands, outages, work, etc.) RT @scoobiedoobie: .@markburgess_osl what humans do is the #1 influence on IT systems
  • .@markburgess_osl: "The Science of Stability used to be called Science of Correctness; think DMTF, etc."
  • .@markburgess_osl: "No notion of cause/effect, despite millions of tbls" (Fasc listening to "my 1990s flawed assumptions")
  • .@markburgess_osl: "3 waves: 1) agriculture (do by hand); 2) industrial (amplify w/machines), 3) knowledge (learn & design)
  • @botchagalupe: The Third Wave (Toffler) http://t.co/ceIaPKm2 @markburgess_osl
  • @mbrit: RT @stack72: real devices are the only real way of getting a feel of user experience #VelocityConf
  • .@markburgess_osl: "Alvin Toffler: 1970: as tech becomes more sophisticated, the cost of introducing variation declines"
  • .@markburgess_osl: "Emerging CM req: it must reduce the cost of maintaining all the configs resulting from consumer demand"
  • Profound: @markburgess_osl: "Variation adds complexity adds cost... but also value"
  • @kief: @markburgess_osl - the cost is not in how many servers, but in how hard it is for us to understand the infra we have created
  • @botchagalupe: Don't think data, think model @markburgess_osl #velocityconf (call them theories if you will)
  • @unixdaemon: @patrickdebois @markburgess_osl gets under your skin and makes you think and question
  • Clever. @markburgess_osl: "CM 3.0: infrastructure as process" (vs infrastructure as code)
  • Haha. @markburgess_osl: "I despair when I see new network shell utilities each wk, encouraging ppl to amplify their errors"
  • .@markburgess_osl: "Using rdist attacks your machines; you can't force machines into compliance."
  • @cmsj: “Image based deployment is basically an attack against your own machines” (paraphrased) — @markburgess_osl
  • .@markburgess_osl: "Like Mt Everest climb, iterative approach difficult to capture in models"
  • .@markburgess_osl: "2nd gen CM: infrastructure as code: like DNA describing organisms"
  • .@markburgess_osl: "Prob: centrally tethered systems don't handle complexity well; may become new bottleneck for chgs"
  • .@markburgess_osl: "2nd gen CM makes end nodes responsible for configuring their own state"
  • @botchagalupe: Separate intent from action. In biology you separate dna from morphology @markburgess_osl #velocityconf #IAC
  • .@markburgess_osl: "When shopping at IKEA, you 1st look at catalog, browsing end states; then u buy, then assemble (recipes)
  • .@markburgess_osl is comparing #chef recipes as IKEA flat packs (cc @xthestreams)
  • .@markburgess_osl: "Convergence property is stronger than idempotent (running ops gets you to desired, correct state)"
  • @botchagalupe: @unixdaemon Difference is obligation = show be how it looks, cooperation = predict how it looks ( http://t.co/EifFnFcN )
  • .@markburgess_osl: "How to draw line between infras & apps? Smaller > easier to get details right; Larger > look easier"
  • .@markburgess_osl: "Ideal: you want to get to smallest details possible, so you never have to take systems out of service"
  • .@markburgess_osl: "Convergence requires thinking like engineers: akin to components on circuit board"
  • .@markburgess_osl: "Convergence requires promise theory: repeatable, distributed, understandable; equilibrium chemistry"
  • Whoa. @markburgess_osl: "New CFEngine language uses promise theory: water => H/H/O; bonds and valencies"
  • .@markburgess_osl asserts CSS is also promise language, where browser promises to keep those promises during user rendering
  • .@markburgess_osl asserts promise language is also maximal Shannon info encoding, data compression
  • (I'm not sure if I've just been tricked by @markburgess_osl, but I think I've just heard him say something astonishing)
  • .@markburgess_osl: "Next advance: Infrastructure as documentation: value creators doing it for themselves"
  • .@markburgess_osl: "Big Q: how do we get from rocket science to commercial aviation: 747 is cont deployment cc @jezhumble
  • .@markburgess_osl: "Rocket science: deployment interval isn't rehearsed enough to create confidence" cc @jezhumble
  • .@markburgess_osl: "How do we get to 747 Continual deployment model? It's #devops, yes?" Correct value stream model
  • .@markburgess_osl: "Can't get there w/desired state CM; can't rebuild with every change; not even Google can do that"
  • .@markburgess_osl: "Configuration must be dynamical process, following all steps of value chain (DevOps)"
  • @scoobiedoobie: Nailed it! RT @RealGeneKim: .@markburgess_osl: "How do we get to 747 Continual deployment model? It's #devops, yes?"
  • .@markburgess_osl: "Automation re-humanizes Dev/Ops"
  • .@markburgess_osl: "Toffler: "dehumanization is not replacing humans by machines; but in making humans act like machines"
  • .@markburgess_osl: "CM 3.0 will require us to become teachers again, as opposed to working in solitude"
  • Profound. @markburgess_osl: "Get over it; we live in world w/o determinism; however, we can achieve equilibrium & stability"
  • @kief: A lot of what @markburgess_osl is saying makes me think of optimizing compilers. With IaC we are still writing in assembly
  • RT @kief: A lot of what @markburgess_osl makes me think of optimizing compilers. With IaC we are still writing in assembly
  • Gosh. I'd love to hear what @adrianco thinks of @markburgess_osl's latest thinking. I suspect he'd love it.
  • .@markburgess_osl: "The game of life doesn't req that we don't fail; it's that we don't fail in a bad [catastrophic] way"
  • @phrawzty: The aim isn't to not fail, but to fail in a controlled way. -- @markburgess_osl at
  • RT @phrawzty: The aim isn't to not fail, but to fail in a controlled way. -- @markburgess_osl a

Stephen Nelson-Smith: Workflow Patterns & Anti-Patterns (@lordcope)

  • Up next: Stephen Nelson-Smith: Workflow Patterns & Anti-Patterns (@lordcope)
  • .@lordcope: "I'm hugely passionate about our profession, society at large"
  • .@lordcope: "Pomodoro session: no one can focus for 90m; so repeat until done { set timer for 25m, take break}"
  • .@lordcope: "Dreyfus Model of Skill Acquisition: by AI researchers; asked 'how do AI ppl get more eff from novice to expert?'"
  • .@lordcope: "Pragmatic Thinking & Learning: book chapter"
  • .@lordcope: "5 lvls of capability: novice, adv beginner, competent, proficient, expert": avg shifted left, now adv beginner
  • .@lordcope: "Novice wants recipes, best practices, quick wins" "Adv beginner wants guidelines, safe env to make mistakes"
  • .@lordcope: "Competent want goals, freedom to execute; Proficients want maxim, war stories, metaphors"
  • .@lordcope: "Experts want philosophies, discussions and arguments w/other experts (!)"
  • .@lordcope: "What is work? Workflow not even a real word until 1960s; Def: 'action involving effort directed to definite end'"
  • .@lordcope: "Ohno: you haven't worked if you've created a defect"
  • .@lordcope: "value: anything that gets stakeholder closer to their goals;
  • .@lordcope: "5 types of value: financial, manufactured, human, natural, umm, XXX" (missed one)
  • .@lordcope: "Web Ops helps the biz achieve its aims by solving problems, building and improving systems"
  • .@lordcope: "Taylor started movement of organizing work to optimize
  • .@lordcope: "Information processing started with invention of typewriter and copiers" (& artillery tables for WWII, yes?)
  • .@lordcope: "Drucker: coined term 'knowledge work'. But unlike mechanics, difference is cultural."
  • .@lordcope: "Rightshifting: term coined by Bob Marshall, based in UK: desire to shift effectiveness dist curve to right"
  • @jmperezperez: Couldn't make it to Velocity Conference London 2012? There will be online streaming on http://t.co/pAJ0zFO1 #VelocityConf
  • RT @jmperezperez: Couldn't make it to Velocity Conference London 2012? Online streaming on http://t.co/pAJ0zFO1 #VelocityConf
  • .@lordcope: "Effectiveness: how good is org at achieving goals; Efficiency: how efficient to get there?"
  • .@lordcope gesturing on his goal of 'rightshifting':
    https://pbs.twimg.com/media/A4NidWJCUAEHvwE.jpg
  • .@lordcope: as waste goes down, productivity goes up
  • Haha. @lordcope: "5 levels: adhoc (imagine startup made up of really stupid people), analytical, synergistic; chaordal"
  • .@lordcope: "analytical (let's introduce process, milestones, etc); synergistic (systems thinking, The Goal, shortening feedback); chaordal (synergistic notice the leaders)"
  • .@lordcope: "Chaordal: the combining of chaos and order"
  • (This is part of day when carpal tunnel starts setting in.)
  • .@lordcope: "My blog: http://agilesysadmin.net/"
  • .@lordcope: "Christopher Alexander on patterns seminal to Martin Fowler, part of the Gang of Four"
  • @botchagalupe: tonight devops meetup at #velocityconf http://t.co/GIllPuDD
  • Come join us! Tonight 7pm. RT @botchagalupe: tonight devops meetup at #velocityconf http://t.co/GIllPuDD
  • @botchagalupe: patterns http://t.co/B0oIHlNY @lordcope
  • @botchagalupe: Design Patterns in Ruby - http://t.co/1QQc3YuQ @lordcope
  • @eastdakota: RT @lognormal: A little late in the day, but we're excited to announce at #VelocityConf that we've been acquired by SOASTA: http://t.co ...
  • .@lordcope: "Anatomy of a Pattern (Rising & Manns, Fearless Change): includes Forces (who will rise up to resist)"
  • .@lordcope: "Talking David Allen, of Getting Things Done fame;
  • @phrawzty: A problem is the delta between what you expected to happen and what actually happened. - @LordCope
  • RT @phrawzty: A problem is the delta between what you expected to happen and what actually happened. - @LordCope
  • .@lordcope: "Tom Limoncelli: must capture commitments in a trusted system (e.g., ticketing)"
  • @botchagalupe: Here's a list of patterns to start u off .. ( http://t.co/7NYWSig3 ) @lordcope #velocityconf also see ( http://t.co/PL9qH13j )
  • .@lordcope: ticketing systems are where good ideas go to die
  • .@lordcope: "Ticketing make it difficult to visualize your work" (imagine 10 org-years of work, trapped in ticketing system)
  • .@lordcope: "Pragmatic: write what you're working on today on sticky note & put on wall" (nice. creates doing/done pile)
  • .@lordcope: "Each day the task isn't done, put a dot on it" (accumulates WIP & cycle time)
  • .@lordcope: "Anti-pattern: looks like pattern, but creates more problems than it solves" (term often used incorrectly)
  • .@lordcope: "Little's Law: correlation betw queue size & cycle time" (SHIT! I've been looking for that citation for years!)
  • .@lordcope: I think formula is % busy / % idle. I.e., 50/50 = 1 cycle time unit; 90/10 = 9 units of time. 9x worse
  • .@lordcope: "Limiting WIP sets up tension; 'you can only do 3 things at once'"
  • @scoobiedoobie: . @LordCope littles law: cycle time = work in progress / throughput
  • @mpasternacki: happy to share my #velocityconf notes. Just email me at genek-realgenekim.me
  • @botchagalupe: WIP creates tension.. tension creates pull @lordcope
  • app just crashed and lost my notes from a better part of @LordCope's talk. Hope it's being recorded.
  • RT @botchagalupe: WIP creates tension.. tension creates pull @lordcope
  • .@lordcope: "How many times does Toyota pull andon card per day?" "Hundreds" (!!!) (Brilliant.) cc @jezhumble
  • .@lordcope: "Why? Impeding flow is biggest TPS sin"; @jezhumble sez, "Keeping deployment pipeln running is highest priority"
  • .@lordcope: "Separate swim lanes w/WIP limits for specialists is an anti-pattern: creates blockages" h/t @dominicad.
  • I agree w/@lordcope. @dominicad kanban for Dev & IT Ops training is superb. http://www.ddegrandis.com/
  • .@lordcope: "Ops hear last!"
    https://pbs.twimg.com/media/A4NupmkCYAIkHKQ.jpg
  • .@lordcope: "Citing David J. Anderson"
  • .@lordcope: "Feedback time vs. effectiveness: adhoc: random; 3-4 months; 2-4 weeks; daily" (continuous delivery)
  • @phrawzty: . @LordCope The faster you get feedback, the faster you can deliver value.
  • PS: Grt mtg you! RT @phrawzty: . @LordCope The faster you get feedback, the faster you can deliver value.
  • .@lordcope: "Goal: get feedback at end of every code commit (code & environments)" ala @jezhumble
  • .@lordcope: "A3 Thinking and Planning"
  • .@lordcope: "Progression: Agile retrospective > Lean kaizen events > systemicized" (ala Toyota Kata by Mike Rother)
  • .@lordcope: "Problem: how do u do planned wk w/interrupts? Dedicate person as Batman (or goalie) to handle all interruptions"
  • .@lordcope: "On Call 'Bonus': compensate on-call people
  • @scoobiedoobie: .@LordCope Have a dedicated person on the team to manage the interruptions/alerts etc - BATMAN - enables project work
  • RT @scoobiedoobie: .@LordCope Have a dedicated person on team to manage interruptions/alerts etc - BATMAN - enables proj work
  • .@lordcope: "Martin Fowler: if it hurts, do it more"
  • .@lordcope: "Embrace remote teams & asynchronous working" (like Github, 37 Signals: all use remote ops)
  • .@lordcope: "On wikis: to prevent wiki pollution: document How & Why, but not What. (b/c What lives in code)"
  • @RealGeneKim: “@phrawzty: @RealGeneKim @LordCope distributed ops teams like #Mozilla, too. :-)”
  • .@lordcope: "Anti-pattern: batch and queue" (opposite of single piece flow)
  • .@lordcope: "Pattern: Servant Leadership" birth of org chart: train crash disaster; who to blame? thus org chart for lines of responsibility
  • @cmsj: Sweet, just got recognised by @nasrat for Terminator. I AM LITERALLY FAMOUS! ;) #strataconf
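
A quick worked sketch of the two formulas tweeted during @lordcope's talk above: Little's Law as @scoobiedoobie stated it (cycle time = work in progress / throughput), and the "% busy / % idle" ratio from my own note. The function names and the sample numbers (12 tickets, 3 per day) are my assumptions for illustration, not anything shown on the slides.

```python
def cycle_time(wip, throughput):
    """Little's Law: average cycle time = average WIP / average throughput.

    wip        -- average number of items in progress (hypothetical units: tickets)
    throughput -- average items finished per unit of time (e.g. tickets per day)
    """
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return wip / throughput


def busy_idle_ratio(percent_busy):
    """The '% busy / % idle' ratio from the notes, as relative wait units."""
    if not 0 <= percent_busy < 100:
        raise ValueError("percent_busy must be in [0, 100)")
    return percent_busy / (100 - percent_busy)


if __name__ == "__main__":
    # Assumed example: 12 tickets in progress, team finishes 3 per day.
    print(cycle_time(12, 3))    # average days a ticket spends in progress
    # The two cases from the notes: 50/50 and 90/10.
    print(busy_idle_ratio(50))
    print(busy_idle_ratio(90))
```

Halving WIP at the same throughput halves the average cycle time, which is the argument for WIP limits; the busy/idle ratio shows why a team at 90% utilization waits roughly nine times longer than one at 50%.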