We’ve reached the last couple of sessions at Velocity 2008. Read me! Love me!

We hear about Capacity Planning with John Allspaw of Flickr. He says: no benchmarks! Use real production data. (How? We had to develop a program called WebReplay to do this because no one had anything. We’re open-sourcing it soon, stay tuned.)
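If you want the flavor of this before WebReplay shows up, here’s a toy version in Python – read your production access log and fire the GETs at a test box. The target host name and the Apache combined log format are my assumptions, not anything from the talk:

```python
"""Toy log replayer -- NOT WebReplay, just the idea: drive a test box
with real production traffic instead of a synthetic benchmark."""
import sys
import urllib.request

TARGET = "http://test-web01"  # hypothetical test host

def replay(logfile):
    for line in open(logfile):
        # Apache combined log: ... "GET /photos/123 HTTP/1.0" ...
        try:
            request = line.split('"')[1]      # 'GET /photos/123 HTTP/1.0'
            method, path, _ = request.split()
        except (IndexError, ValueError):
            continue                          # skip malformed lines
        if method != "GET":
            continue                          # replaying writes is dangerous
        try:
            resp = urllib.request.urlopen(TARGET + path, timeout=5)
            print(resp.status, path)
        except Exception as e:
            print("ERR", path, e)

if __name__ == "__main__":
    replay(sys.argv[1])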

Use “safety factors” (from traditional engineering) – aka a reserve, overhead, etc. (See the watermark sketch a few lines down for how that math plays out.)

They use squid a bunch. At NI we’ve been looking at Oracle’s WebCache – mainly because it supports ESI (Edge Side Includes) and we’re thinking that may be a good way to go. There’s a half-assed ESI plugin for squid, but we hear it doesn’t work; apparently Zope paid for ESI support in squid 3.0, but as best we can tell there’s been no traction on that in four years. But I’d be happy not to spend the money.

Anyway, you should do forecasting. Naive forecasting assumes growth is linear, which it never is. But you can take output from your monitoring (ganglia, whatever) and fityk can give you a curve fit.
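For the curve-fit-averse, here’s roughly what that looks like in Python with numpy instead of fityk – fit peak daily load (pulled from ganglia or wherever) and extrapolate to your ceiling. All the numbers are made up:

```python
"""Rough capacity forecast: fit a curve to peak daily load and see when
it crosses your ceiling. fityk does this interactively; numpy.polyfit
is the quick-and-dirty version."""
import numpy as np

days = np.array([0, 7, 14, 21, 28, 35])              # sample dates
peak_rps = np.array([410, 445, 495, 560, 640, 735])  # made-up peak req/s
CEILING = 1200.0                                     # measured max for the tier

coeffs = np.polyfit(days, peak_rps, 2)  # quadratic, since it's never linear
trend = np.poly1d(coeffs)

day = int(days[-1])
while trend(day) < CEILING and day < 365:
    day += 1
print("Projected to hit ceiling around day", day, "->", int(trend(day)), "req/s")
```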

They use lots of nagios for monitoring – about 10 checks per host.

Determine a ceiling and high/low water marks, and alert when you go outside the water marks.

Then have a simple capacity dashboard for everything – how close to the ceiling you are.
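Here’s a minimal sketch tying the safety factor, the ceiling, and the water marks together – all the numbers are hypothetical:

```python
"""Minimal watermark check: the ceiling is raw capacity discounted by a
traditional-engineering safety factor, and you alert outside the marks."""

RAW_MAX = 1600.0        # req/s where the tier actually falls over
SAFETY_FACTOR = 1.33    # the reserve/overhead
CEILING = RAW_MAX / SAFETY_FACTOR   # ~1200 req/s of "usable" capacity
HIGH_WATER = 0.80       # alert above 80% of ceiling
LOW_WATER = 0.20        # alert below 20% too -- something's probably broken

def check(current_rps):
    util = current_rps / CEILING
    if util > HIGH_WATER:
        return "ALERT: %.0f%% of ceiling -- deploy more capacity" % (util * 100)
    if util < LOW_WATER:
        return "ALERT: %.0f%% of ceiling -- traffic fell off a cliff?" % (util * 100)
    return "OK: %.0f%% of ceiling" % (util * 100)

print(check(1050))   # above high water -> alert
```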

Horizontal scaling is all well and good, but sometimes you should do some vertical scaling by upgrading – he calls it “diagonal.” By upgrading their image processing servers, they saw the same CPU usage but got 3x the work out of them. (We saw the same when we upgraded our Java app servers from Sun V440s to Dell 2850s a year or two ago – a 50% performance improvement.) In their case, they also got faster processing time, less power usage, and less rack space.

Memcached. You turn it on, and the DBs go idle! Yay. But then your Web servers heat up as they become the bottleneck. So beware the wandering bottleneck.
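The pattern that idles the DBs is plain cache-aside; here’s a minimal sketch assuming the python-memcached client, with a stand-in for the real DB query:

```python
"""Classic cache-aside with memcached -- this is what idles the DBs.
Assumes the python-memcached client; the DB lookup is a stub."""
import memcache

mc = memcache.Client(["127.0.0.1:11211"])

def get_user_from_db(user_id):
    # stand-in for the real (expensive) DB query
    return {"id": user_id, "name": "user%d" % user_id}

def get_user(user_id):
    key = "user:%d" % user_id
    user = mc.get(key)                      # cheap hit path
    if user is None:
        user = get_user_from_db(user_id)    # expensive miss path
        mc.set(key, user, time=300)         # cache for 5 minutes
    return user
```

Note that the serialize/deserialize work on every hit is exactly the kind of thing that heats up the Web tier once the DB stops being the bottleneck.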

Stupid Capacity Tricks! Before Puppet and Capistrano there was dsh (distributed shell). Ooo, I want it.
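dsh does this for real; the shape of it in Python is just “run one command on every host, in parallel, over ssh” (host list is hypothetical):

```python
"""The dsh idea in a few lines: one command across a host list in
parallel over ssh. dsh itself does this better; this is just the shape."""
import subprocess
from concurrent.futures import ThreadPoolExecutor

HOSTS = ["web01", "web02", "web03"]   # hypothetical host list

def run(host, cmd):
    r = subprocess.run(["ssh", "-o", "BatchMode=yes", host, cmd],
                       capture_output=True, text=True, timeout=30)
    return host, r.returncode, r.stdout.strip()

with ThreadPoolExecutor(max_workers=10) as pool:
    for host, rc, out in pool.map(lambda h: run(h, "uptime"), HOSTS):
        print(host, rc, out)
```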

Shut Shit Off – they have software switches to disable various features when needed. (We have a lot of those switches at NI, but they’re not documented and they’re under the control of business units, not ops – sad.) Their programmers are good about this: they put flags in config files, in order of importance, to turn things on and off, and the flags are read on the fly.
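A minimal sketch of that kind of switch – a flat flags file, most important flags first, re-read on every check so ops can flip things mid-incident. The file format and flag names are invented:

```python
"""Feature switches read on the fly: re-reading the file on every check
is the point -- flip a line, and the site changes behavior immediately."""

FLAGS_FILE = "/etc/myapp/flags.conf"   # hypothetical; lines like "search=on"

def flag(name, default="on"):
    try:
        for line in open(FLAGS_FILE):
            line = line.strip()
            if line and not line.startswith("#"):
                key, _, value = line.partition("=")
                if key.strip() == name:
                    return value.strip() == "on"
    except IOError:
        pass                    # no flags file: fall back to defaults
    return default == "on"

if flag("related_photos"):
    print("render the expensive feature")
else:
    print("feature is shut off")
```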

Host an outage page NOT in your datacenter, and use it – users appreciate knowing what’s up.

Bake dynamic pages into static ones. Some Yahoo! properties have a big red button to bake/unbake at will. Bye-bye, DDoS attacks.
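A sketch of the “bake” half of that button: snapshot a set of dynamic URLs into flat files the web server can serve with zero backend work. The paths and host are hypothetical, and flipping the server over to the baked docroot is the other half:

```python
"""Bake dynamic pages to static files: fetch each page once from the app
and write it to a docroot the web server can serve directly."""
import os
import urllib.request

APP = "http://localhost:8080"          # the dynamic app (hypothetical)
BAKED_DIR = "/var/www/baked"           # docroot the web server flips to
PAGES = ["/", "/popular", "/about"]    # what to freeze

def bake():
    os.makedirs(BAKED_DIR, exist_ok=True)
    for path in PAGES:
        html = urllib.request.urlopen(APP + path, timeout=10).read()
        name = path.strip("/") or "index"
        with open(os.path.join(BAKED_DIR, name + ".html"), "wb") as f:
            f.write(html)
    # pointing the web server at BAKED_DIR is the "big red button" part

bake()
```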

And at the end, a plaintive “We’re Hiring…” Like everyone else here. Man, I need some good Web ops people – I have two open spots. We’re hiring too!!!

Question: you do lots of mini code pushes (20/day) – how the heck do you manage that and keep the site up? He says culture is the biggest thing: they have devs who think like ops and don’t do dumb things. They’re ganglia-addicted, and they’re the ones hitting the big red buttons. The technical parts – one-button deploys, verbose logging of changes – matter, but less.

He uses more dirty words than I do. Boss.

Artur Bergman of Wikia speaks again, on Squid vs Varnish.

PHP is a pig and wikitext is hard to parse, so they need caching. A hit is 8 ms and a miss is 200 ms, and they have a 75% hit rate – so the average request runs about 0.75×8 + 0.25×200 ≈ 56 ms. You have to get the cache hits up by making more of the site cacheable. Ooo, he says they’re playing around with ESI!

They decided to force caching for anonymous users. They’ve only gone up to 30 seconds, but no complaints. They ignore If-Modified-Since and purge. Be careful with Vary: Accept-Encoding, because there’s an annoying browser bug involving misplaced commas.
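Forcing short caching for anonymous users might look like this as WSGI middleware – no session cookie means the response gets a 30-second public Cache-Control so the cache tier can absorb the anonymous horde. The cookie name is my assumption:

```python
"""Short forced caching for anonymous users, as WSGI middleware."""

class AnonCacheMiddleware(object):
    def __init__(self, app, max_age=30):
        self.app = app
        self.max_age = max_age

    def __call__(self, environ, start_response):
        # "session_id" cookie name is hypothetical
        anonymous = "session_id" not in environ.get("HTTP_COOKIE", "")

        def sr(status, headers, exc_info=None):
            headers = [(k, v) for k, v in headers
                       if k.lower() != "cache-control"]
            if anonymous:
                headers.append(("Cache-Control",
                                "public, max-age=%d" % self.max_age))
            else:
                headers.append(("Cache-Control", "private, no-cache"))
            return start_response(status, headers, exc_info)

        return self.app(environ, sr)

# usage: application = AnonCacheMiddleware(application)
```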

MediaWiki lives and dies by squid and puts cache control in the code, which is bad because developers are stupid.

Squid – the slide actually says “Me hates it” and “Still a piece of shit.” Awesome.

Varnish – he loves varnish. He nearly cried when he read the source code (it’s C). But it’s a little unstable. He got it up to 65k hits per second (with squid doing 2,800).

Varnish has some “novel” techniques. Its configuration language, VCL, gets compiled down to C at runtime – so you can put assembly in if you want. Lawdy. (Side note: they monitor with LVS.) It segfaults from time to time under load and they’re helping fix it. In a month or two he’ll have it crackin’!

And the last one – Puppet, by Luke Kanies, Puppet developer.

Automation tools are old and bad, especially because they’re SSH-based. (Agree!) And also because there aren’t many people who cross the chasm between sysadmin and developer. They decided they had to solve the problem and create something a billion times better than anything (where “anything” is cfengine). Either you can manage many machines with little effort or you can’t, and you want to be able to. So this required abstraction. He’s using the analogy of C scaring the bejeezus out of assembly programmers – a good analogy.

It’s sad you have to do it, but he goes into why a more powerful tool should not scare people and put them all out of work. Developers seem to have gotten over this, but not sysadmins. It’s stupid, especially because “we’re understaffed” is the #1 thing I hear out of all of ours.

So they implemented Puppet with the metaphor of resources and resource providers, hiding all the file/command/UNIX admin stuff. (Well, kinda.) It’s easily extensible.

The Web 2.0 crowd has made “microformats”; your infrastructure can use that idea too. Catch up with the times – if you’re proud of doing something developers have been doing for ten years (like moving to version control in Subversion), then you’re behind. (Use git!) Anyway, you have to use polymorphism (overloading) to make a system like Puppet understand ssh on system 1 vs. system 2 vs. system 3.
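The resource/provider split he’s describing looks roughly like this – one logical Service resource, with per-platform providers that know the local incantation. This is the idea, not Puppet’s actual code:

```python
"""Resource/provider polymorphism, sketched: the same logical resource
resolves to platform-specific commands, like Puppet's providers do."""
import platform
import subprocess

class ServiceProvider(object):
    def restart_cmd(self, name):
        raise NotImplementedError

class RedHatProvider(ServiceProvider):
    def restart_cmd(self, name):
        return ["service", name, "restart"]

class DebianProvider(ServiceProvider):
    def restart_cmd(self, name):
        return ["invoke-rc.d", name, "restart"]

class SolarisProvider(ServiceProvider):
    def restart_cmd(self, name):
        return ["svcadm", "restart", name]

def pick_provider():
    # crude platform sniffing stands in for Puppet's facts
    if platform.system() == "SunOS":
        return SolarisProvider()
    return RedHatProvider()   # real logic would inspect the distro

class Service(object):
    def __init__(self, name, provider=None):
        self.name = name
        self.provider = provider or pick_provider()
    def restart(self):
        subprocess.call(self.provider.restart_cmd(self.name))

Service("sshd").restart()   # same resource declaration on every platform
```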

Also, have one solution per problem. Not multiple. And most of the problems you face are NOT unique to you or your organization – so using a common tool like this can benefit from the network effect.

And the third big principle (were there only two before?) is completeness. Everything that matters in your config should be in the config, not some minimal set. Relationships (dependencies) are important. You can, for example, have a service subscribe to a file and restart when the file changes.
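The subscribe relationship in miniature – a config file is checksummed, and a content change triggers a service restart on the next run. Paths and commands are hypothetical:

```python
"""A service 'subscribed' to a file: restart when the checksum changes."""
import hashlib
import subprocess

class WatchedFile(object):
    def __init__(self, path):
        self.path = path
        self.last = None

    def changed(self):
        digest = hashlib.md5(open(self.path, "rb").read()).hexdigest()
        dirty = self.last is not None and digest != self.last
        self.last = digest
        return dirty

httpd_conf = WatchedFile("/etc/httpd/conf/httpd.conf")

def converge():
    if httpd_conf.changed():
        # the subscribing service restarts when its file's content moves
        subprocess.call(["service", "httpd", "restart"])
```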

Puppet is mainly used as a central config management tool. Each host gets a resource catalog: machines get put into classes, and the classes get lists of resources.

Puppet clients retrieve their resource catalog, determine ordering, check each resource, fix what’s out of sync, and repeat every 30 minutes. “Like cfengine but sexier!” The completeness approach means clean management through the whole lifecycle – a freshly kickstarted box doesn’t end up different. You just kickstart enough to run Puppet and use it to do everything, so all boxes are kept 100% up to date without artifacts.
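The agent loop, sketched: fetch the catalog, order resources by their dependencies, fix whatever’s out of sync, sleep 30 minutes, repeat. Everything here is illustrative, not Puppet internals:

```python
"""A config-management convergence loop in miniature."""
import time

class Resource(object):
    def __init__(self, name, requires=()):
        self.name = name
        self.requires = list(requires)
    def in_sync(self):
        return True            # real check: file content, package version...
    def fix(self):
        print("fixing", self.name)

def ordered(catalog):
    # simple topological sort over 'requires' edges
    done, out = set(), []
    def visit(r):
        if r.name in done:
            return
        done.add(r.name)
        for dep in r.requires:
            visit(dep)
        out.append(r)
    for r in catalog:
        visit(r)
    return out

def agent(fetch_catalog, interval=1800):
    while True:
        for resource in ordered(fetch_catalog()):
            if not resource.in_sync():
                resource.fix()
        time.sleep(interval)   # every 30 minutes, like cfengine but sexier
```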

And it has reporting underway too! They’re planning to charge for that to make some mooonay! Google, Stanford, Sony, Rackspace all use Puppet.

Why Puppet vs. Capistrano? Cap is SSH in Ruby – not something for your whole infrastructure.

Why Puppet vs. cfengine? A more open dev community, and it’s just better.

What about Puppet slowness? It scales like HTTPS.

Puppet is XMLRPC but moving to REST. It uses certs and SSL, not keypairs. It’s written in Ruby; he’s had to learn to be a developer in the process. It’s also an API to your systems. It supports VMs well and can get into the guts of the VMs, unlike pure VM provisioning tools. Buy me! It’s open source, but he sells support/training/addons. Discovery is to come! There’s Nagios integration of some sort. Vertebra, like Capistrano, is an ad hoc change tool – Puppet isn’t (though you can use ralsh for that).

That’s the last session – wrapup later once I power up my laptop and get some booze in me!