Physical Security FAIL :-(
Notice anything wrong with this picture?
I was walking by one of the Iron Mountain Secure Shredding bins at work one day several months ago and noticed that the lock wasn't actually locked. Being the security-conscious individual that I am, I tried to latch the lock again, but it was so rusted that it wouldn't close no matter how hard I tried. I couldn't just leave it like that, so I called the number on the bin's label and got an automated message saying that they weren't taking local calls anymore, along with a different number to try. I called that number and the woman who answered asked me for my company ID number, which I didn't know. She informed me that without that ID number I couldn't submit a support request. I told her that this bin contained sensitive personal and financial information and that the issue couldn't wait for some random company ID to be found. Fortunately, she gave in and created the support ticket for me, saying that I should hear back from someone within four hours.
One week later, on Friday, Iron Mountain finally called me back and said that they would come to replace the lock the following Monday before 5 PM. When the lock still hadn't been replaced on Monday evening, I called Iron Mountain back. Their records showed that a new lock had been delivered, but they had no idea where, and the signature was illegible. I work on a three-building campus with 14 floors between them and almost 3,000 people. If they couldn't tell me where the lock was, there was no way for me to track it down. They said that they would investigate and call me back.
After not hearing back from them again for a couple of days, I called them back. The woman I spoke with had no real update on the investigation. She said that she would send another message "downstairs" and escalate to her supervisor. At this point it had been almost three weeks with sensitive documents sitting in a bin with a malfunctioning lock. The next day they called me back and said they were never able to track down who the new lock was left with, so they would bring us a new one at no charge. Finally, after a total of 24 days with an unlocked Secure Shredding bin, Iron Mountain was able to replace the lock. Iron Mountain......FAIL.
Velocity 2009 – Hadoop Operations: Managing Big Data Clusters
Hadoop Operations: Managing Big Data Clusters (see link on that page for preso) was given by Jeff Hammerbacher of Cloudera.
Other good references -
book: "Hadoop: The Definitive Guide"
preso: hadoop cluster management from USENIX 2009
Hadoop is an Apache project inspired by Google's infrastructure; it's software for programming warehouse-scale computers.
It has recently been split into three main subprojects - HDFS, MapReduce, and Hadoop Common - and sports an ecosystem of various smaller subprojects (hive, etc.).
Usually a hadoop cluster is a mess of stock 1 RU servers with 4x1TB SATA disks in them. "I like my servers like I like my women - cheap and dirty," Jeff did not say.
HDFS:
- Pools servers into a single hierarchical namespace
- It's designed for large files, written once/read many times
- It does checksumming, replication, compression
- Access is from Java, C, command line, etc. Not usually mounted at the OS level.
MapReduce:
- Is a fault tolerant data layer and API for parallel data processing
- Has a key/value pair model
- Access is via Java, C++, streaming (for scripts), SQL (Hive), etc
- Pushes work out to the data
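To make the key/value pair model concrete, here is a minimal single-process sketch in Python of the classic word count. It is not Hadoop's API; the names map_words, reduce_counts and run_job are made up for illustration, and a real job would run the map and reduce steps on many nodes next to the data.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: Iterable[int]) -> Tuple[str, int]:
    """Reduce step: collapse all values seen for one key into a total."""
    return word, sum(counts)


def run_job(
    lines: Iterable[str],
    mapper: Callable[[str], Iterable[Tuple[str, int]]],
    reducer: Callable[[str, Iterable[int]], Tuple[str, int]],
) -> Dict[str, int]:
    """Run map, group values by key (the 'shuffle'), then reduce each group."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


if __name__ == "__main__":
    print(run_job(["a b a", "b c"], map_words, reduce_counts))
```

The grouping in the middle is the part the framework does for you; you only write the mapper and the reducer.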
Subprojects:
- Avro (serialization)
- HBase (like Google BigTable)
- Hive (SQL interface)
- Pig (language for dataflow programming)
- zookeeper (coordination for distrib. systems)
Facebook used scribe (log aggregation tool) to pull a big wad of info into hadoop, published it out to mysql for user dash, to oracle rac for internal...
Yahoo! uses it too.
Sample projects hadoop would be good for - log/message warehouse, database archival store, search team projects (autocomplete), targeted web crawls...
As boxes you can use unused desktops, retired db servers, amazon ec2...
Tools they use to make hadoop include subversion/jira/ant/ivy/junit/hudson/javadoc/forrest
It uses an Apache 2.0 license
Good configs for hadoop:
- use 7200 rpm sata, ecc ram, 1U servers
- use linux, ext3 or maybe xfs filesystem, with noatime
- JBOD disk config, no raid
- java6_14+
To manage it -
unix utes: sar, iostat, iftop, vmstat, nfsstat, strace, dmesg, friends
java utes: jps, jstack, jconsole
Get the rpm! www.cloudera.com/hadoop
config: my.cloudera.com
modes - standalone, pseudo-distrib, distrib
"It's nice to use dsh, cfengine/puppet/bcfg2/chef for config management across a cluster; maybe use scribe for centralized logging"
I love hearing what tools people are using, that's mainly how I find out about new ones!
Common hadoop problems:
- "It's almost always DNS" - use hostnames
- open ports
- distrib ssh keys (expect)
- write permissions
- make sure you're using all the disks
- don't share NFS mounts for large clusters
- set JAVA_HOME to new jvm (stick to sun's)
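Since "it's almost always DNS," a quick sanity check is to confirm that every node's hostname resolves to an address and that the address resolves back to the same name. The following is a rough sketch using only Python's standard socket module; the function name check_hosts and the injectable resolve/reverse parameters are my own additions so it can be tried without a real cluster.

```python
import socket
from typing import Callable, Dict, Iterable


def check_hosts(
    hostnames: Iterable[str],
    resolve: Callable[[str], str] = socket.gethostbyname,
    reverse: Callable[[str], tuple] = socket.gethostbyaddr,
) -> Dict[str, str]:
    """Return a status string per hostname: ok, mismatch, or lookup failed."""
    results: Dict[str, str] = {}
    for name in hostnames:
        try:
            address = resolve(name)
            # gethostbyaddr returns (hostname, aliases, addresses)
            reverse_name = reverse(address)[0]
        except OSError as exc:  # socket.gaierror and socket.herror are OSErrors
            results[name] = f"lookup failed: {exc}"
            continue
        if reverse_name.lower() != name.lower():
            results[name] = f"mismatch: {address} -> {reverse_name}"
        else:
            results[name] = "ok"
    return results


if __name__ == "__main__":
    # Hypothetical node name; substitute the hostnames from your own cluster.
    for host, status in check_hosts(["localhost"]).items():
        print(host, status)
```

Run it from each node, not just from your desk, since the nodes are the ones that need to find each other.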
HDFS In Depth
1. NameNode (master)
VERSION file shows data structs, filesystem image (in memory) and edit log (persisted) - if they change, painful upgrade
2. Secondary NameNode (aka checkpoint node) - checkpoints the FS image and then truncates edit log, usually run on a sep node
New backup node in .21 removes need for NFS mount write for HA
3. DataNode (workers)
stores data in local fs
stores data in blk_<id> files, round robins through dirs
heartbeat to namenode
raw socket to serve to client
4. Client (Java HDFS lib)
other stuff (libhdfs) more unstable
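The "round robins through dirs" behavior of the DataNode is easy to picture with a toy example. This is only a sketch of the idea in Python, not Hadoop's actual code; the class name RoundRobinBlockStore and the /data1, /data2 directories are hypothetical.

```python
import itertools
import os
from typing import Sequence


class RoundRobinBlockStore:
    """Hand out blk_<id> file paths, cycling through the data directories."""

    def __init__(self, data_dirs: Sequence[str]) -> None:
        if not data_dirs:
            raise ValueError("at least one data directory is required")
        self._dirs = itertools.cycle(list(data_dirs))

    def path_for_block(self, block_id: int) -> str:
        """Pick the next directory in turn and name the file blk_<id>."""
        return os.path.join(next(self._dirs), f"blk_{block_id}")


if __name__ == "__main__":
    store = RoundRobinBlockStore(["/data1", "/data2"])
    for block_id in (1, 2, 3):
        print(store.path_for_block(block_id))
```

This is also why JBOD works fine here: each disk is just another directory in the rotation.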
hdfs operator utilities
- safe mode - when it starts up
- fsck - hadoop version
- dfsadmin
- block scanner - runs every 3 wks, has web interface
- balancer - examines ratio of used to total capacity across the cluster
- har (like tar) archive - bunch up smaller files
- distcp - parallel copy utility (uses mapreduce) for big loads
- quotas
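The balancer's "ratio of used to total capacity" check can be sketched roughly as below. This is my own simplified Python illustration, not the balancer's real logic: the function name find_unbalanced, the input format, and the 10 percent default threshold are all assumptions for the example.

```python
from typing import Dict, List, Tuple


def find_unbalanced(
    nodes: Dict[str, Tuple[float, float]],
    threshold_pct: float = 10.0,
) -> Tuple[float, List[str], List[str]]:
    """Compare each node's used/capacity ratio with the cluster-wide ratio.

    nodes maps a node name to (used, capacity) in the same unit.
    Returns (cluster utilization %, over-utilized nodes, under-utilized nodes).
    """
    total_used = sum(used for used, _ in nodes.values())
    total_capacity = sum(capacity for _, capacity in nodes.values())
    cluster_pct = 100.0 * total_used / total_capacity

    over: List[str] = []
    under: List[str] = []
    for name, (used, capacity) in sorted(nodes.items()):
        node_pct = 100.0 * used / capacity
        if node_pct > cluster_pct + threshold_pct:
            over.append(name)
        elif node_pct < cluster_pct - threshold_pct:
            under.append(name)
    return cluster_pct, over, under


if __name__ == "__main__":
    # Hypothetical DataNodes with (used GB, capacity GB).
    sample = {"dn1": (900, 1000), "dn2": (500, 1000), "dn3": (100, 1000)}
    print(find_unbalanced(sample))
```

Blocks would then be moved from the "over" nodes to the "under" nodes until everything sits within the threshold.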
has users, groups, permissions - including x; files can't be executed, but x is used for dirs
hadoop has some access trust issues - used through gateway cluster or in trusted env
audit logs - turn on in log4j.properties
has loads of Web UIs - on namenode go to /metrics, /logLevel, /stacks
non-hdfs access - HDFS proxy to http, or thriftfs
has trash (.Trash in home dir) - turn it on
includes benchmarks - testdfsio, nnbench
Common HDFS problems
- disk capacity, esp due to log file sizes - crank up reserved space
- slow but not dead disks, and NICs flapping down to a slow mode
- checkpointing and backing up metadata - monitor that it happens hourly
- losing write pipeline for long lived writes - redo every hour is recommended
- upgrades
- many small files
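For the "monitor that it happens hourly" advice on checkpointing, one simple approach is to alert when the checkpointed filesystem image file hasn't been modified recently. Here is a minimal Python sketch of that idea; the function name checkpoint_is_stale is made up, the one hour default comes from the note above, and the example path is purely hypothetical since the real location depends on your configuration.

```python
import os
import time
from typing import Optional


def checkpoint_is_stale(
    image_path: str,
    max_age_seconds: float = 3600.0,
    now: Optional[float] = None,
) -> bool:
    """Return True if the file at image_path was last modified too long ago."""
    current = time.time() if now is None else now
    age_seconds = current - os.path.getmtime(image_path)
    return age_seconds > max_age_seconds


if __name__ == "__main__":
    # Hypothetical path; point this at wherever your checkpoint image lives.
    path = "/path/to/checkpoint/fsimage"
    if os.path.exists(path) and checkpoint_is_stale(path):
        print("WARNING: checkpoint image is more than an hour old")
```

In practice you would probably allow some slack over an hour and wire the result into nagios or whatever you already monitor with.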
MapReduce
use Fair Share or Capacity scheduler
distributed cache
jobcontrol for ordering
Monitoring - They use ganglia, jconsole, nagios and canary jobs for functionality
Question - how much admin resource would you need for hadoop? Answer - Facebook ops team had 20% of 2 guys hadooping, estimate you can use 1 person/100 nodes
He also notes that this preso and maybe more are on slideshare under "jhammerb."
I thought this presentation was very complete and bad ass, and I may have some use cases that hadoop would be good for coming up!
Anatomy of an Attack: From Incident to Expedient Resolution
For the first session of the morning on the last day of the TRISC 2009 Conference, I decided to attend the "Anatomy of an Attack: From Incident to Expedient Resolution" talk by Chris Smithee, a Systems Engineer at Lancope. He talked about the different types of attacks that you see on your network and how flow data can be used to monitor and eliminate some of these types of threats. My notes from the session are below:
Consider Your Hotel Network Hostile
As I'm preparing to take my trip to New York for the OWASP AppSec Conference, I came across a timely article on the risks involved with using a hotel network. The Center for Hospitality Research at Cornell University surveyed 147 hotels and then conducted on-site vulnerability testing at 50 of those hotels. Approximately 20% of those hotels still run basic ethernet hub-type networks and almost 93% offer wireless. Only six of the 39 hotels that had WiFi networks were using encryption (see my blog on why are people still using WEP for why this is necessary). What does this mean for you, Joe User? It means that both your personal and company information is at risk any time you connect to those networks. The next time you're surfing the web, start paying attention to all of the non-SSL links (http:// versus https://) that you visit. Then, think about the information that you are passing along to those sites. Are you signing in with a user name and password? Entering credit card information? Whatever it is, you'd better make sure that it's something that you wouldn't feel bad about if it wound up on a billboard in Times Square, because that's about how risky your transmission could be.
Before you get too concerned, there are a few things you can do to try to prevent this. First, DO NOT visit any links where you transmit information unencrypted. This is just asking for trouble. Since many man-in-the-middle type attacks can still be used to exploit this, my second suggestion is to use some sort of VPN tunnel. Whether it's a corporate VPN or just a freebie software VPN to your network back home, this allows you to encrypt all traffic over the untrusted hotel network. Make this your standard operating procedure anytime you connect to an untrusted network (not just a hotel) and you should keep your data much safer. Lastly, please be sure to have current firewall and anti-virus software on the computer you are using to connect to the untrusted network. The last thing you want is to get infected by some worm or virus just by plugging in to the network.
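If you want to see how often you actually hit unencrypted links, the http:// versus https:// check is simple to script. This is just a small Python sketch using the standard urllib.parse module; the function name insecure_urls and the example.com addresses are placeholders of mine.

```python
from typing import Iterable, List
from urllib.parse import urlparse


def insecure_urls(urls: Iterable[str]) -> List[str]:
    """Return the URLs whose scheme is anything other than https."""
    return [url for url in urls if urlparse(url).scheme.lower() != "https"]


if __name__ == "__main__":
    visited = [
        "http://example.com/login",
        "https://example.com/login",
    ]
    for url in insecure_urls(visited):
        print("not encrypted:", url)
```

An https:// link is only a starting point, of course; it says nothing about the network in between, which is why the VPN advice above still applies.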
One other thing that I think deserves mentioning here is that if you don't absolutely have to use the internet on an untrusted network, then don't do it. Obviously, there are times when you need access to do work, pay bills, etc., but if you can save those tasks until you reach a more familiar (and hopefully safer) network, that is far and away the best way to keep yourself and your data safe.
