by Gene Kim on
@bradjohnsonsv: #LSPE meetup has record attendance! About to see @mikebrittain from #Etsy, then @dbartow on "Actionable Metrics". Woot!
.@mikebrittain: "If you're pushing a change, everyone should know about it"
.@mikebrittain: "All our config flags live in a big PHP file. Allows us to enable/disable features rapidly. Much faster than a full deploy"
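To make the config-flag idea concrete, here is a minimal sketch in Python (Etsy's actual flags live in a PHP file that the talk does not show). The flag names, the `FLAGS` layout, and the `is_enabled` helper are all hypothetical, chosen only for illustration.

```python
# Hypothetical flag names and layout; the talk only says that all flags
# live in one big config file and can be flipped without a full deploy.
FLAGS = {
    "new_checkout": {"enabled": True},
    "search_autosuggest": {"enabled": False},
}


def is_enabled(name, flags=FLAGS):
    """Return True only if the flag exists and is switched on."""
    return bool(flags.get(name, {}).get("enabled", False))


def render_checkout():
    # Both code paths ship in the same deploy; the flag picks one at runtime.
    if is_enabled("new_checkout"):
        return "new checkout flow"
    return "old checkout flow"
```

Turning a feature off is then a one-line change to the flag file rather than a code rollback, and an unknown flag defaults to off.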
.@mikebrittain: "Failure is inevitable, a learning opportunity, and DETECTABLE"
Everyone can see our operational metrics: people love looking at graphs
Devs have graphs up, IRC channel on a separate monitor
.@mikebrittain: "We detect failures quickly; we optimize not to prevent, but to quickly detect/recover"
.@mikebrittain: "Q: sounds like a lot of work; who does it? A: Engineers; they build the app: shared resp for logging/graphing/trending/alerting"
.@mikebrittain: "Metrics are part of every feature & so are config flags; we never roll something out where we can't see succ/fail"
.@mikebrittain: "Apache/PHP are boring to many people. We like boring. They work well."
.@mikebrittain: "we have a field for etsyabtest to facilitate easy split A/B testing in campaigns, test forms, etc." (Awesome)
.@mikebrittain: "Our StatsD library: in one line, devs have no excuse not to log; love to see whether features are working; flushes to Graphite"
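The "one line" claim is easier to picture with a sketch. This is not Etsy's library, just a minimal Python illustration of the StatsD counter wire format (`<name>:<value>|c` sent over UDP). The address, the metric name, and the helper names are assumptions; the StatsD daemon on the receiving end is what aggregates and flushes to Graphite.

```python
import socket

# Assumed address: a StatsD daemon on localhost, on StatsD's default UDP port.
STATSD_ADDR = ("127.0.0.1", 8125)


def format_counter(name, value=1):
    """Build a StatsD counter datagram: <name>:<value>|c"""
    return f"{name}:{value}|c".encode("ascii")


def increment(name, value=1, addr=STATSD_ADDR):
    """Fire-and-forget UDP send, so instrumented code never blocks on metrics."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.sendto(format_counter(name, value), addr)
    except OSError:
        pass  # metrics must never break the feature being measured


# The "one line" a developer adds inside a feature (hypothetical metric name):
increment("checkout.purchase.success")
```

Because UDP is connectionless, the call costs almost nothing and fails silently if no daemon is listening, which is part of why there is "no excuse not to log".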
.@mikebrittain: "Awesome. Time to recover was 8m" (check out the graph) https://p.twimg.com/AwcTiRjCIAAbI1l.jpg
.@mikebrittain: "250K+ metrics tracked at Etsy"
.@mikebrittain: "We don't have rollback; we like to fix forward"
@bradjohnsonsv: ah, they look for outliers, use "confidence bands", to detect issues in the 250k metrics. whew @mikebrittain #ETSY #LSPE
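The talk does not say how the "confidence bands" are computed. As a rough sketch only, one simple approach is a rolling band of mean plus or minus k standard deviations over the preceding points, flagging anything that lands outside it. The window size, `k`, and function names below are assumptions.

```python
from statistics import mean, stdev


def confidence_band(history, k=3.0):
    """Return (lower, upper) as mean +/- k sample standard deviations."""
    m = mean(history)
    s = stdev(history)
    return m - k * s, m + k * s


def find_outliers(series, window=5, k=3.0):
    """Return (index, value) pairs that fall outside the band computed
    from the `window` points immediately preceding each value."""
    outliers = []
    for i in range(window, len(series)):
        lower, upper = confidence_band(series[i - window:i], k)
        if not (lower <= series[i] <= upper):
            outliers.append((i, series[i]))
    return outliers


# A flat metric with one spike at index 6:
print(find_outliers([10, 11, 10, 9, 10, 10, 50, 10]))  # [(6, 50)]
```

With hundreds of thousands of metrics nobody can eyeball every graph, so an automated band check like this is what narrows attention to the few series that are misbehaving.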
@mikebrittain: Previous (more robust) versions of Metrics-Driven Engineering slides at http://t.co/YnGHldXG #lspe /cc @wickett
Actionable Metrics at Full Production Scale
Dan Bartow @perfdan
Netflix, Roy Rappaport
Here it is. RT @bradjohnsonsv: Lots of pictures being taken of @perfdan's "The List" of Actionable Metrics at #LSPE meetup.
.@royrapaport: "Been at Netflix 1094 days; job: make things better (security monkey, python platform, central alert gateway)"
.@royrapaport: "Netflix culture: freedom and responsibility; hire very smart people and let them loose"
.@royrapaport: "Netflix open sourced Asgard, their cloud orchestration: note that there's no authentication or login"
...when @royrapaport told me this last night, I almost fell over. Laughing, shocked, aghast.
.@royrapaport: "Distributed Operations; Get out of the way of Developers"
I can hear infosec peeps fainting on absence of authentication in Netflix open source Asgard. :)
.@royrapaport: "Systems: flexible, scalable, self service"
@marcog: Netflix in the range of 10-100m metrics. Woah!