6/28 DevOpsDays

by Gene Kim on

#devopsdays

  • It begins. Holy cow. What a fun morning already.
  • .@patrickdebois: "Google couldn't scale, so we're here at CloudCenter" (haha)
  • Thx, all! RT @BMC_DevOps: #DevOpsDays real-time streaming: http://t.co/3avd0F0Y #DevOps PLS RETWEET to interested (Austin feed had 750+!)
  • Thx Chris Little (@BMC_DevOps)! Video here is amazing. 4 ppl, mult cameras. DevOpsDaysAustin archive was incredible (@wickett)
  • @igb: RT @planetg33k: Cute. #salesforce guys talked briefly was wearing 'zero downtime' tshirt. #badtiming #devopsdays - he was really cool about it. #humility
  • Earlier this morning: @adrianco holding shiny, new, 1d old Nexus 7 running Netflix. Awesome.
    https://p.twimg.com/AwfgVInCQAAZqIu.jpg
  • .@moonpolysoft: "Twitter was saved by one person. Shouldn't be like that. Era of heroes needs to end."
  • (Lesser known fact: I actually write all these tweets 3 days ahead of time, b/c I have access to the slides ahead of time.)
  • RT @mrhinkle/@botchagalupe The Devops Mafia - http://t.co/kWJKjWly <Two of my favorite ops guys (cc @patrickdebois)
  • (Same team who supp big BMC CEO keynotes. Wow/$$$/Grateful) RT @BMC_DevOps: http://t.co/VQtK2VXg: BMC tv control center at
  • RT @ddelmoli: RT @jzb: "Your post-mortem shouldn't be PR" - Cliff Moon
  • .@moonpolysoft: Whoa. He's talking intrusion detection. Am I in right conference? Oh, he's making fun of them. Right place. :)
  • @ernestmueller: @moonpolysoft at #devopsdays: "Let's be real engineers"
  • @nickethier: "We should be building bridges not putting toys together"


  • @joshcorman: check out my last two weeks about engineers, toys, etc. cc @wickett
  • @adrianco: RT @RealGeneKim: RT @ddelmoli: RT @jzb: "Your post-mortem shouldn't be PR" - Cliff Moon
  • RT @ernestmueller: RT @wickett: thank you @opnet for sponsoring #devopsdays < pls RT
  • @dominicad: Next up - "Misadventures in DevOps" #devopsdays @cwebber
  • Learned 2d ago abt some backoffice probs in academia: eg, onboarding 40K new student accts in PeopleSoft each yr. Wow.
  • .@cwebber: "If Ops guys can't even talk to each other, no prayer of being able to talk to Dev?" (DBA vs. network engrs :)
  • .@cwebber: "I've pulled cables out of floors that are older than I am"
  • RT @BMC_DevOps: #devopsdays if someone is saying we need to work together and u reply, we r already doing that. Then you're wrong

  • Awesome salesforce.com dude: "We appreciate being here, esp after today's outage. Sorry. Bad day for 'zero downtime' tshirt"


  • RT @johnpaulhayes: RT @adrianco: Part of the audience at #devopsdays ready for our panel http://t.co/2Y50E0qF
  • RT @solarce: Killer panel of @adrianco, @markimbriaco, and @alexhonor, moderated by @botchagalupe starts now!
  • .@adrianco: "We want Dev as productive as possible; they're the bottleneck to scaling; we don't enforce arch. we hv patterns"
  • .@markimbriaco: "Ops is radically outnumbered by Dev. They can run whatever sw you want, but if not supp, then u hv pager"
  • .@markimbriaco: "No dev pushback when framed that way"
  • .@alexhonor: "Prob often is defining 'done.' In enterprise, 'done' oft mean Dev commits code. Done has to mean prod ready"
  • .@markimbriaco: "Lots of little apps/component is nice; we can reason about them; but someone has to own it forever. Problem!"
  • .@markimbriaco: "Must make sure Dev has enough cycles to maintain; components are like products w/dedicated team"
  • RT @royrapoport: #devopsdays theme: forget control. Make it as easy as possible for people to do the right thing.
  • .@adrianco: "All components have an email on them; if no owner, then Simian Army deletes the component!" (holy crap)
  • .@adrianco: "Do painful things more often; We clean out Test env this way regularly." (Always stunned by Adrian)
  • .@adrianco: "Day to day ops life will prevent rearchitecture/refactor: so we hv dedicated teams chg everything at once"
  • Agreed. Fantastic stuff! RT @lolcatstevens: GREAT roundtable discussion on the #devopsdays stream, right now: http://t.co/x6GT33QH
  • RT @jzb: "Do painful things frequently and make them less painful."
  • RT @solarce/@geekle: "All of our services have emails assigned to them and if they don't the monkeys delete them." - @adrianco
  • .@adrianco: "Conformity monkey kills anything not conforming to good patterns" (@botchagalupe likes this more than chaosmonkey)
  • .@adrianco's concepts are truly compliance integrated into daily operation to stamp out variance. Inspiring.
  • RT @wickett: conformity monkey kills ur stuff if it isn't right
  • as opposed to compliance sucking all air in the room, sucking will to live out of everyone they touch, of course..
  • .@markimbriaco: "No pushback from culture of compliance monkey: dev sees it as making rollouts smoother"
  • .@adrianco: "It helps when your CEO is a developer" [Netflix CEO did Purify, amazingly. Loved it in 1992]
  • RT @AtlDevTools: OH: #DevOpsDays: "It's not about avoiding pain, it's about working thru it. If something is painful, do it often!" #devopsdays
  • .@alexhonor: "Self service creates a narrow interface, sacrifices freedom, but allows people to focus on more important things"
  • @royrapoport: #devopsdays I still remember when Purify saved my butt by analyzing an obscure memory leak in my C code. Funny I'd end up working for Reed.
  • .@adrianco: "We gave AWS budget to Dev; Products being run by Dev, shuffled staff as necessary; IT Ops does Exchange, etc."
  • .@markimbriaco asks @adrianco about Ubuntu transition for AMI, and "isn't that Ops?"
  • .@botchagalupe: "Q: bake vs fry: bake is prebuilt image; fry is chef/puppet to build from scratch; talk thru your choices"
  • .@alexhonor: "bake vs fry is one of most polarizing issues; someone usually wants to control all the base imgs for Dev"
  • .@alexhonor: "Dev have very valid issues, b/c prod ctrl of base imgs can invalidate all of QA testing done w/different libs"
  • .@alexhonor: "Answer seems to be creating build process and process for updating the builds. It's about controlling change"
  • .@markimbriaco: "At Heroku, we did config mgmt globally; allowed us to chg things w/o deploying everything"
  • .@markimbriaco: "OTOH, at LivingSocial, we bake." @adrianco: "lots of outages due to blades having diff firmware..."
  • .@adrianco: "Caused so much pain that we wanted every bit in production to be same: firmware, hw, builds, etc."
  • .@adrianco: "We eventually had to esc to mgmt, who made the call. We do stateful tier diff than stateless tier; local data"
  • Wow. Great panel: @adrianco, @markimbriaco, @alexhonor, @botchagalupe: could hv listened to them for 2h. Great job, guys!
  • RT @mariusducea: RT @wickett: "Management is dragging engineering into the future at Netflix" @adrianco
  • @mariusducea: Aha... There is one ops guy at Netflix! He maintains the base ami ;)
  • Pondering blog post insp by @mtnygard/Jermiah/@adrianco/@royrapoport: Why Netflix/Facebook keep denying they love ctrl/rigor?
  • Theory: @mtnygard: "there are no approval forms, but this is a documented process, followed ruthlessly"


  • "Continuous delivery is airport without air traffic control; that's Etsy's process; all self-service made poss by micro-chgs"
  • RT @daguy666: Watching @noahsussman talk about testing at
  • .@noahsussman: "This style of delivery very different than Agile" (rather, it's decoupled from Agile); "has huge impact on QA"
  • .@noahsussman: "QA really does become everyone's job, more about exploration and production analysis"
  • .@noahsussman: "It's made possible b/c small chgs, lots of monitoring to support A/B testing, trust in Dev to do the right thing"
  • .@noahsussman: "Focus on resilience, not 'quality.' Readable code, reasonable test coverage, sane arch, good debug tools"
  • .@noahsussman: "An engineering culture that values refactoring" (!)
  • .@noahsussman: "What does testing look like? It's manual. Real-time monitoring is the new QA test"
  • .@noahsussman: "When we saw our prod dashboard w/250K metrics, we stared, almost cried. 5y in the making"
    https://p.twimg.com/Awf7KFzCAAAwobt.jpg
  • Funny. Cool. RT @lozzd: The Puppet guy at #devopsdays just introduced himself with "ohai!" #irony
  • .@noahsussman: "Exploratory testing gets to core of QA soul: 'I like to break stuff'; very different than Dev, thus important"
  • RT @seemaj: Things that can make geeks cry, a dashboard full of graphs!
  • RT @akucharski: "Customer experience is a much better term than quality, as in assurance because it's quantifiable"
  • RT @eriksowa: RT @allspaw: At #devopsdays @noahsussman opens minds: "resilience, not quality"
  • @RealGeneKim: #devopsdays: RT @adrianco: @sec_prof @RealGeneKim see our Distributed Ops post at http://t.co/lVt7QI3r a few weeks ago
  • .@noahsussman: "Keep feedback loops short" (Yes! System dynamics adage: shorten and amplify feedback loops)
  • .@noahsussman: Showing 60% improvement to defect time to resolve:
    https://p.twimg.com/Awf9J32CIAAz48w.jpg
  • .@noahsussman: Hey! How to integrate QA/infosec into cont delivery is next topic up in #devopscookbook. Wld lv to talk!
  • BTW, infosec oft has very similar temperament/psychographics as QA: they love to break things
  • .@noahsussman: "You can't rollback b/c you can't rollback time" (again, mirroring @markbrittain edict: "only fix forward")
  • RT @akucharski: RT @dominicad: "Risk looks really diff in cont Delivery. Include QA into team. " @noahsussman
  • .@noahsussman: "You can write crappy code until you get funded, and then rewrite it. But monitoring must be in on Day 1" Nice
  • OH: woman: "Amazing! The long line for the bathrooms is for the men!"
  • RT @myleshocking: RT @dominicad: "Continuous Delivery works when the changes are small." #smallbatchsize
  • Ignite talks! @dominicad up: RT @pvdissel: #devopsdays Mountain View, live: http://t.co/C4nYRIUZ Ignite talks now
  • vimeo.com/channels/336
  • @imprecise_matt: "A dev can unit and func test a widget, but sometimes you need a tester to q the defn of a prob" -- Noah Sussman (paraphrase)
  • RT @wickett: Dominica mentions Lean Conference and learning to let go of things that don't work out #LSSC12
  • .@dominicad: Whoa. Showing "abandoned work" designed into kanban work to provide discussion vs. frustrated/betrayed
  • RT @wickett: There is a sunk cost fallacy that we often fall into but at pivot points we have to let go of our experiments
  • RT @wickett: Putting an 'abandoned work' on your kanban board helps the team know that some experiments end and that is ok

  • (Watching Ignite ppl in line, remembering giving Ignite talk last yr: one of most terrifying/exhilarating things I've done)


  • +100 RT @wickett: Can I just say that the ignites are awesome at #devopsdays?

  • @RealGeneKim: #devopsdays @adrianco: @mtnygard @RealGeneKim @royrapoport patterns supported by tooling vs. process enforced by rules

  • PS: learned Clojure code from @mtnygard last nt after long day at #velocityconf: reqd him to say everything 3 times. Intense

  • Next up: "what about devops in the enterprise? 5:1 devices/admin ratio"
  • "Why? Long tail of complexity due to so many unique needs not available by COTS"
  • .@cote: "only thing worse than monitoring is retrofitting monitoring into something"
  • 20 slides, 15 seconds each
  • Oops. That wasn't @cote. That was Jos. Sorry!

20 slides: 5 min

noah's talk
i'm working on devops cookbook project w/patrick, john, and mike orzen
how do we really integrate qa & infosec into ci/cd value stream
maintain fast flow
but prevent issues that impact ux, reliability, infosec

  • Node.js with zombie: runs v8 javascript engine: replay sqli attacks, confirm not effective against vm;
    • production monitoring
  • Selenium: zed attack proxy: OWASP
    • integrated into running test/prod
  • hiphop compiled: static code analysis
    • part of continuous deployment
    • take too long: 5-10m
    • scans all code
    • Everyone uses the same base servlet: shared code
    • Dave Flynn: noticed Dropbox failure
    • run analysis only on new (but tool doesn't scan delta)
    • exception: build stops there: red build, you knew who changed build, developer fixes it
      • low medium high
    • Google: bug find, triage it: filter it out:
    • manual signoff
    • how subjective is it really? We use the tool month over month: try to track delta between those
  • list of PHP functions: grep whole code base
    • no PHP: production scrubbing (PHP Apache logs)
  • sonatype: dl jars from third party repository
    • result: central service: maven central
  • Firefox pattern: monitor network traffic
  • no more pen testing PDFs
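
A rough sketch of the static-analysis gate described in the notes above (the tool scans all code, but only findings in newly changed files should turn the build red, with a low/medium/high scale). The `Finding` shape, the function names, and the default threshold are assumptions for illustration, not any particular tool's output format:

```python
from dataclasses import dataclass
from typing import Iterable, List, Set

# Severity scale from the notes; the numeric ranking is an assumption.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2}


@dataclass(frozen=True)
class Finding:
    """One static-analysis result (hypothetical shape, not a real tool's output)."""
    path: str
    severity: str
    message: str


def blocking_findings(findings: Iterable[Finding],
                      changed_paths: Set[str],
                      threshold: str = "high") -> List[Finding]:
    """Keep only findings in changed files at or above the threshold."""
    limit = SEVERITY_RANK[threshold]
    return [f for f in findings
            if f.path in changed_paths and SEVERITY_RANK[f.severity] >= limit]


def build_is_red(findings: Iterable[Finding],
                 changed_paths: Set[str],
                 threshold: str = "high") -> bool:
    """The build goes red if any blocking finding exists in the changed files."""
    return bool(blocking_findings(findings, changed_paths, threshold))
```

Filtering the full scan down to the commit's changed paths is one way around the "tool doesn't scan delta" problem: the scan still covers everything, but only the person who changed the code gets the red build.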

    • demand that on front end
  • QA embedded by development teams: front end has dedicated QA, because it's for stability, externally visible:

    • QA team now DevOps team: because they promote
    • weekly promotes (2 weeks -> 1 week -> trying to push batches)
    • black box tests
    • A/B testing makes UI testing hard: so many versions
  • QA UI are product testers

    • black box testers aligned with developers: color (UI expert)
  • How much pen testing internal/external?

  • Get security training: security summit (Adrian): how much are we a target (Jason Chan)

    • Hyperguard: what's it telling us
  • A/B tests: lots of experiments

    • testing includes business outcomes (who watched more movies): did we implement the 5 variants right
    • se
  • How do you handle mistakes

    • rollback: you turn off the feature
    • turned on feature for everyone: took site down: flipped switch, which turned it off
    • red/black deploys: check on 50%, can flip traffic to old ones
    • go back to the last known good cluster: may not fix it
  • Cloud security person is the auditors' educator

    • have to educate external compliance
  • How do we really gain assurance in place

    • Source control on wiki
    • write only log: Jira: ServiceNow: traceability and complete population of changes, and manager level: perforce
    • no tweaking in production: it will disappear
  • Automation is the documentation: you have 100% reliance on the test suite

    • WealthFront:
    • "Noah: Tests aren't important: monitoring is important": seems morally wrong
    • Lee Thompson:
    • log pushes
      • re
    • Wii: 6 week turnaround: don't focus on the problems, increase capability
  • Performance testing: how much can you do inline, deadlocks, endurance testing

    • what is most inefficient of all the stuff out there: put perf team on there
    • Minimal set of gates
  • Aha moment: "with product development, testing is critical because you can't monitor it. For service development, monitoring is critical, because testing doesn't catch everything. Relying on monitoring controls acknowledges that failing is okay"
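
To make the "monitoring as the control" idea concrete, here is a minimal sketch of the red/black deploy noted above (check the new cluster on 50% of traffic, flip back to the old one if it looks bad). The `Cluster` class, the `choose_weights` function, and the 1% error threshold are all hypothetical; nobody in the session gave actual numbers:

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Cluster:
    """Request and error counts observed for one cluster during the 50% check."""
    name: str
    requests: int = 0
    errors: int = 0

    @property
    def error_rate(self) -> float:
        # Treat a cluster with no traffic as healthy rather than dividing by zero.
        return self.errors / self.requests if self.requests else 0.0


def choose_weights(old: Cluster, new: Cluster,
                   max_error_rate: float = 0.01) -> Dict[str, float]:
    """Return the traffic split to apply after the 50% check."""
    if new.error_rate > max_error_rate:
        # Monitoring says the new cluster is unhealthy: flip all traffic back.
        return {old.name: 1.0, new.name: 0.0}
    # New cluster looks fine: send it everything, keep the old one as fallback.
    return {old.name: 0.0, new.name: 1.0}
```

As the notes say, going back to the last known good cluster may not fix the problem, so this is a first response rather than a guarantee.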

  • @alanbecker:

Conformity monkey: you do get emails first before the process is killed
Usually caught in test
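
A minimal sketch of that warn-first behavior, combined with the "no owner email means it gets deleted" rule from the panel. This is not Netflix's actual Conformity Monkey code; the `Component` fields and the two-pass policy are assumptions for illustration:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Component:
    name: str
    owner_email: Optional[str] = None  # every component should carry an owner
    conforms: bool = True              # result of some pattern check (assumed)
    warned: bool = False               # hypothetical flag: owner already emailed


def conformity_pass(components: List[Component]) -> Tuple[List[str], List[str]]:
    """Run one pass; return (names to email this pass, names to kill this pass)."""
    to_email: List[str] = []
    to_kill: List[str] = []
    for c in components:
        if not c.owner_email:
            # Nobody to warn, so an ownerless component is removed outright.
            to_kill.append(c.name)
        elif not c.conforms:
            if c.warned:
                # Owner was emailed on an earlier pass and nothing changed.
                to_kill.append(c.name)
            else:
                c.warned = True
                to_email.append(c.name)
    return to_email, to_kill
```

Running the pass regularly in the test environment first fits the "usually caught in test" note: owners get the email long before anything is killed in production.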

Nick @ Atlassian: how we don't fuck our customers
Dave Connors: Constant Contact: in-house integration security guardrails
Carl Quinn, Engr Tools Team Netflix: automation
Adrian Cockcroft: impossible to do in data center: can't do it in the data center
Dominica DeGrandis: handoff and specialization
Elon Becker: we have chance to do this right (when every commit goes to production): do we do it before commit, or both
Dan Nemic, Silverpop
Michael: Arch online game company (Sony got hacked heavily)

  • TODO: DevOpsDays: email me so I can attribute
  • Gene to write up report, publish on Google Doc
  • Email list to all attendees
  • Holy cow. Open group on QA/Infosec in CI/CD blew my mind. Incredible brainpower. Loved it!!!

    https://p.twimg.com/AwhBYUPCEAApCSU.jpg

  • I'll write up notes & incredible list of awesome QA/Infosec patterns for CI/CD within next week. Amazing stuff!

do we talk about the process from commit to deploy

.@adrianco: Security is a DOS attack on functionality
Code never lasts more than a week:
@ladyleet: Whoa! Lunch snafu! Sorry, all. :) Ice cream sundaes and beer in the afternoon to make up for the hiccup. :) #devopsdays #devopsday