2016/09/13: PagerDuty Summit 2016 (part2)

by Gene Kim on

#pdsummit16

Agenda: https://www.pagerduty.com/pagerduty-summit-2016/

Post-Mortems at Google, Chaos Engineering at Twilio

  • @paxxman: .@pagerduty Modern Day Post-Mortem with Google #PDSummit16 Breakout Session https://t.co/O9GsAZo8nU
  • @mdasif: RT @pagerduty: Swing by the lower level: @brucemwong & @1mentat from @twilio on "The Journey of Chaos Engineering Begins with a Single St…
  • Google's Andre Kelly talking modern day post-mortems #PDSummit16 https://t.co/bsipS1aotq
  • #pdsummit16 post mortems at Google! https://t.co/usXyQLiMla
  • I take responsibility for my actions. However - I do not take responsibility for my inactions. #PDSummit16 https://t.co/2xy6Nrrhgz
  • Settling in to watch @AndreKellyTech from @google's #PDSummit16 talk "Where is the Modern Day Post-Mortem" https://t.co/F9T9WSjYOw

  • OH: @AndreKellyTech: "I can't imagine that we'd have an incident and there's nothing to learn from it." (Profound)

Mark Imbriaco talk (@markimbriaco)

  • @markimbriaco: describing @HostedGraphite (cool service) Monitorama ppt: "To reduce burnout, Ops has 7 days oncall, Thu-Thu, then Fri off."
  • @markimbriaco: "Make it safe to learn, from both successes and failures; retrospectives become habit, not opp to assign blame"
  • @knowledgebird: RT @Service_Ninja: @RealGeneKim emphasizing importance of sharing knowledge across Org to drive #DevOps innovation! #PDSummit16 https://t.c
  • RT @claud_hop: #PDSummit16 @markimbriaco "Don't make me think at 3am" https://t.co/ij0NzoIHhA
  • @markimbriaco: "adaptable organizations win"

Lindsay Holmwood: @auxesis: "What we learned from 7 years of monitoring data"

  • @auxesis: "Collect: collectd, statsd; Storage: graphite, openTSDB, Kairos; Aggregation: Riemann (the amazing tech that magically arrived from the future)"
  • @auxesis: "2013: alert fatigue, #monitoringsucks meme, Monitorama"
  • @auxesis: "if your org had one metric to alert on, which one would it be?"
  • @auxesis: "Strip charts: the PHP hammer of monitoring" (HA!)
  • RT @ajdomie: @thingles to quote @auxesis "Pie charts...ughhhh.... We can do better" #PDSummit16
  • @auxesis: "PhantomJS, something-something... if u want to go old-school, you can use mechanize" (?!? I just built a great mechanize script!!!)
  • @auxesis: "cool new tools: opsweekly, pigeonhole,
  • @auxesis: h/t @adrianco: "monitoring systems need to be more available/reliable than systems being monitored"

Dropbox talk (Andrew Fong)

  • RT @JLxSF: What's 00:43:23? = 99.9% of downtime. @dropbox talks tactics and strategies to go from 43mins of downtime to 4mins #PDSummit16 #SF #itops #devops
  • @seoul_tiger: @pagerduty Andrew Fong @Dropbox now speaking @ #PDSummit16 https://t.co/qubiayZZo1
  • RT @mkalra: Get to 99.99 availability by involving a cross-section of engineering from @andrewfong #PDSummit16
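
The "43mins to 4mins" figures in the tweets above line up with the monthly downtime budgets for 99.9% and 99.99% availability. Here is a minimal sketch of that arithmetic, assuming a 30-day month (my assumption, not something stated in the talk); the function name `downtime_budget` is hypothetical and used only for illustration.

```python
from datetime import timedelta


def downtime_budget(availability: float,
                    period: timedelta = timedelta(days=30)) -> timedelta:
    """Return the maximum downtime allowed within `period`.

    `availability` is a fraction between 0 and 1 (0.999 means "three nines").
    The 30-day default period is an assumption made for this sketch.
    """
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    # Allowed downtime is the unavailable fraction of the period,
    # rounded to whole seconds.
    allowed_seconds = round(period.total_seconds() * (1.0 - availability))
    return timedelta(seconds=allowed_seconds)


if __name__ == "__main__":
    for target in (0.999, 0.9999):
        print(f"{target:.2%} availability -> {downtime_budget(target)} per 30 days")
```

Under that assumption, three nines leaves roughly 43 minutes of downtime per month and four nines leaves roughly 4 minutes. The exact seconds depend on the month length you pick, so the result will not necessarily match the 00:43:23 quoted in the tweet.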