The 2014 Verizon Data Breach Investigation Report (DBIR) is out and it paints quite the gloomy picture of the world we live in today where cyber security is concerned. With over 63,000 security incidents and 1,367 confirmed data breaches, the question is no longer if you get popped, but rather, when. According to the report, data export is second only to credit card theft on the list of threat actions as a result of a breach. And with the time to compromise typically measured in days and time to discovery measured in weeks or months, Houston, we have a problem.
I've written in the past about all of the cool tricks we've been doing to find malware and other security issues by performing NetFlow analysis using the 21CT LYNXeon tool and this time I've found another trick around data loss detection that I thought was worth writing about. Before I get into the trick, let's quickly recap NetFlow for those who aren't familiar with it.
Think of NetFlow as the cliff notes of all of the network traffic that your systems handle on a daily basis. Instead of seeing WHAT data was transmitted (a task for deep packet inspection/DPI), we see the summary of HOW the data was transmitted. Things like source and destination IP, source and destination port, protocol, and bytes sent and received. Because many network devices are capable of giving you this information for free, it only makes sense to capture it and start using it for security analytics.
So, now we have our NetFlow and we know that we're going to be breached eventually, the real question becomes how to detect it quickly and remediate before a significant data loss occurs. Our LYNXeon tool allows us to create patterns of what to look for within NetFlow and other data sources. So, to help detect for data loss, I've designed the following analytic:
What this analytic does is it searches our NetFlow for any time an internal IP address is talking to an external IP address. Then, it adds up the bytes sent for each of these unique sets of connections (same source, destination, and port) and presents me with a top 25 list. Something like this:
So, now we have a list of the top 25 source and destination pairs that are sending data outside of our organization. There are also some interesting ports in this list like 12547, 22 (SSH), 443 (HTTPS), and 29234. A system with 38.48 GB worth of data sent to a remote server seems like a bad sign and something that should be investigated. You get the idea. It's just a matter of analyzing the data and separating out what is typical vs what isn't and then digging deeper into those.
My advice is to run this report on an automated schedule at least daily so that you can quickly detect when data loss has begun in order to squash it at the source. You could probably argue that an attacker might take a low and slow approach to remain undetected by my report, and you'd probably be right, but I'd also argue that if this were the case, then I've hopefully slowed them enough to catch them another way within a reasonable timespan. Remember, security is all about defense in depth and with the many significant issues that are highlighted by the Verizon DBIR, we could use all of the defense we can muster.
Last year I gave a talk at a number of different conferences called "The Magic of Symbiotic Security: Creating an Ecosystem of Security Systems" in which I spoke about how if we can break our security tools out of their silos, then they become far more useful. Lately, I've been doing a lot of work at my company in identifying systems infected by malware and getting rid of the infections because, as you are hopefully aware, the presence of malware on your systems is equivalent to hackers on your network. Malware can give the controller backdoor access to the system, allows them to scan the network for other devices to compromise, gives them a platform to launch additional attacks from, and enables them to exfiltrate data out of the network. I have a few different tools which I'll highlight later that do some really cool things on their own, but when you combine their functionality together, you open up a whole new world of possibilities.
The first tool that I wanted to talk about is for malware analysis. In our case this is FireEye, but this could just as easily be Damballa, Bit9, or any other technology that will allow you to identify IP addresses of hosts infected by malware, servers hosting malware objects, and command and control servers. Alone, this tool identifies a single client-to-server relationship, but it does provide a pattern that we can use as a template to find similar issues in our environment where perhaps we do not have coverage with this device. Now that we have identified the patterns that we are looking for, we need to find a way to discover additional instances of those patterns. This brings me to our second tool.
The second tool is for NetFlow analysis. In case you are unfamiliar with NetFlow, it is a feature of most network devices that creates summary information about the network activity that is running through them. It includes the source and destination IP addresses, source and destination ports, protocols, and bytes transferred. Specifically, we need a NetFlow analysis tool that is capable of showing us connections between our internal systems and systems on the Internet. In our case, we use a product called LYNXeon to do this. Alone, LYNXeon does a good job of allowing us to visualize connections from one system to another, but finding the systems related to malware issues can often be a needle in a haystack because of the NetFlow limitations mentioned above. So while our malware connections (downloads and command-and-control) are buried in the NetFlow data, we really have no way to identify them in the NetFlow tool silo.
Now comes the fun part. One of the cool things about the FireEye system is that it provides us with the ability to export data and one of the cool things about the LYNXeon system is that it provides us with the ability to import data and tag it. So what we do is, in FireEye, we export the list of all systems that we have detected as having been infected by malware. We also export the list of all of the command and control servers and malware hosting servers that we have seen connections to. Next, we go into LYNXeon and tell it to import these two lists of IP addresses and tag them with a custom tag that we created called "FireEye". We have now successfully combined these two tools and the payoff is huge.
Success #1: Detecting the Spread of Malware on Your Network
Our FireEye system works by executing downloads inside of a virtual machine and analyzing the affect they have on the system. Because the virtual machine doesn't always match the target system, in many cases we are only able to tell that it was malware and not that the malware actually infected the system. Using LYNXeon, however, we can create special queries that will show us all connectivity from the potentially infected system after the time of the malware download. Did the system immediately make connections to other foreign systems on the Internet? Did it start scanning our internal network looking for other hosts to compromise? All this and more is possible now that we have identified a potentially infected system on our network. Here is a pattern file which I created in LYNXeon to do this:
And here is the pattern diagram which this query accomplishes:
Success #2: Finding Other Infected Systems
FireEye appliances aren't free and with offices in over 40 countries around the world getting full coverage can get expensive. But, if we can use a handful of appliances to get an idea of where our systems are talking to when compromised, then we have data which we can turn around and use in places where we do not have those appliances. Because we are sending NetFlow data from our devices around the world into LYNXeon, we can search for any connections to these common malware servers. No more needle in a haystack. The data is all there, we just needed to know how to look for it. Here is a pattern file which I created in LYNXeon to do this:
And here is the pattern diagram which this query accomplishes:
Success #3: Discovering Other Types of Attacks
Often times our adversaries aren't just trying one type of attack and giving up when it fails. They are trying every trick in their arsenal and trying to gain and maintain a foothold on your network with whatever method they can. Once we've identified an attacker's IP address, we can now use our NetFlow data to see all other traffic coming from that IP address. Often times, expanding these types of relationships can shed light on other activities they are performing on your network. Perhaps they are performing reconnaissance on your servers? Maybe they are trying to DOS one of your systems? The fact is that once they've been uncovered as a bad guy on your network, you should be weary of all activities performed by them. Maybe even ban their IP address altogether. Here is a pattern file which I created in LYNXeon to do this:
And here is the pattern diagram which this query accomplishes:
So there you have it. By combining our malware analysis using FireEye and our NetFlow analysis using LYNXeon, we have created a hybrid system capable of far more than either of these tools by themselves. This is the magic of symbiotic security in action. Our tools becomes infinitely more powerful when we are able to share the data between them. Hopefully you will take that into consideration the next time you are looking at purchasing a security tool.
I had a meeting yesterday with a vendor who sells a SaaS solution for binary application vulnerability testing. They tell a very interesting story of a world where dynamic testing ("black box") takes place alongside static testing ("white box") to give you a full picture of your application security posture. They even combine the results with some e-Learning aspects so that developers can research the vulnerabilities in the same place they go to find them. In concept, this sounds fantastic, but I quickly turned into a skeptic and as I dug deeper into the details I'm not sure I like what I found.
I wanted to make sure I fully understood what was going on under the hood here so I started asking questions about the static testing and how it works. They've got a nice looking portal where you name your application, give it a version, assign it to a group of developers, and point it to your compiled code (WAR, EAR, JAR, etc). Once you upload your binaries, their system basically runs a disassembler on it to get it into assembly code. It's then at this level that they start looking for vulnerabilities. They said that this process takes about 3 days initially and then maybe 2 days after the first time because they are able to re-use some data about your application. Once complete, they say they are able to provide you a report detailing your vulnerabilities and how to fix them.
The thing that immediately struck me as worth noting here was the 2-3 day turnaround. This means that our developers would need to wait a fairly substantial amount of time before getting any feedback on the vulnerability status of their code. In a world full of Agile development, 2-3 days is a lifetime. Compare that to static source code testing where you get actionable results at compile time. The edge here definitely goes to source code testing as I believe most people would prefer the near-instant gratification.
The next thing worth noting was that they are taking binary files and disassembling them in order to find vulnerabilities. This lends itself to one major issue which is how can you determine with any accuracy the line number of a particular vulnerability written in let's say Java from assembly code generated by disassembling the binaries. By default, it's simply not possible. This vendor claimed that they can by adding in some debug strings at compile time, but even then I'd contend that you're not going to get much. I'm guessing they have some heuristics that are able to tell what function generated a set of assembly code, but I'm extremely skeptical that they can do anything with variable names, custom code functions, etc. I've seen some source code scanners, on the other hand, that not only tell you what line of code is affected, but are able to give you an entire list of parameters that have been consequently affected by that vulnerability. The edge here definitely goes to source code testing.
The main benefit that I can see with binary testing vs source code testing is that we can test code that we didn't write. Things like APIs, third-party applications, open source, etc are all things that we now have visibility into. The only problem here is that while we now can see the vulnerabilities in this software, they are unfortunately all things that we can't directly influence change in, unless we want to send our developers off to work on somebody else's software. I'd argue that scanning for vulnerabilities in that type of code is their responsibility, not ours. Granted, it'd be nice to have validation that there aren't vulnerabilities there that we're exposing ourselves to by uptaking it, but in all honesty are we really going to take the time to scan somebody else's work? Probably not. The edge here goes to binary testing with the caveat being that it's in something that I frankly don't care as much about.
This isn't the complete list of pros and cons by any means. It's just me voicing in writing some concerns that I had about the technology while talking to this particular vendor. In my opinion, the benefits of doing source code testing far outweigh any benefits that we could get from testing compiled binary files. What do you think about the benefits of one versus the other? I'd certainly love for someone to try to change my mind here and show me where the real value lies in binary testing.
This presentation was by Rohit Sethi, the Project Leader for the Secure Pattern Analysis Project at OWASP and he works at Security Compass, a security analysis and training company. My notes from the session are below:
- Before anyone starts building complex systems, they need to design.
- We create threat models on completed designs.
- What about during design?
- Book: "Core J2EE Patterns Best Practices and Design Strategies"
- If you use J2EE development, chances are you're using patterns documented here
- Core J2EE patterns are used extensively
- Patterns are used in JSF, Velocity, Struts, Tapestry, Spring, and Proprietary Frameworks
Example: Project: Analyze Patterns
Use to Implement:
- Synchronization Tokens as Anti-CSRF Mechanism
- Page-level authorizations
- XSLT and Xpath vulnerabilities
- XML Denial of Service
- Disclosure of information in SOAP faults
- Publishing WSDL files
- Unhandled commands
- Unauthorized commands
- Analyze patterns for security pitfalls to avoid
- Determine how patterns can implement security controls
- Provide advice portable to most frameworks
A security pattern is not the same as a security analysis of a pattern.
- Designing new web application frameworks (make the next generation of frameworks secure by default)
- Designing new apps that use the patterns
- Source code review of existing apps
- Runtime assessment of existing apps
- Integrate with threat modeling of new or existing apps
You can help:
- Tell developers
- Improve the analysis
- Add code review and examples to the existing pattern book
- Look at other pattern books to see if there are other patterns that we should analyze
- New web application framework idea + Design-time security analysis = Secure-by-default web application framework
This presentation is by Jacob West in the Security Research Group and Taylor McKinsley in Product Marketing from Fortify software. I'd like to note that Fortify is a developer of a source code analysis tool and so this presentation may have a bias towards source code analysis tools.
56% of organizations fail PCI section 6. Poorly coded web applications leading to SQL injection vulnerabilities is one of hte top five reasons for a PCI audit failure. Section 6 is becoming a bigger problem: #9 in 2006 reason for failure, #2 in 2007.
PCI Section 6 has to do with guidelines to "Develop and maintain secure systems and applications". Section 6.6 reads "Ensure that all web-facing applications are protected against known attacks by either of the following methods: Having all custom application code reviwed for common vulnerabilities by an organization that specializes in web application secure" or by using a web application firewall. Further clarifications say that automated tools are acceptable, web application penetration testing is allowed, and vulnerability assessments can be performed by an internal team.
Comparing Apples, Oranges, and Watermelons
- Setup: Source code analysis (+2) is good because it works on existing hardware, but must live where your source code lives. Penetration testing (+3) is good because you only need one to assess everything and works on existing hardware, but needs to talk to a running program. Application firewall (+1)is good because it lives on the network, but you must model program behavior.
- Optimization: Source code analysis (+2) is good because you can specify generic antipatterns in code, but you must understand vulnerability in detail. Penetration testing (+2) is good because tests are attacks, but you must successfully attack your application. Application firewalls (+1) are good because they share configuration across programs, but must differentiate good from bad.
- Performance: Source code analysis (+3) is good because it simulates all application states and is non-production, but scales with build time and not the number of tests. Penetration testing (+2) is good because you get incremental results and is non-production, but you must exercise each application state. Application firewall (+1) is good because it's a stand-alone device and scales with $$$, but impacts production performance and scales with $$$.
- Human resources: Source code analysis (+1) is good because it enables security in development and reports a root cause, but makes auditors better and does not replace them. Penetration testing (+2) is good because it is highly automatable, but reports symptoms and not the root cause. Application firewall (+2) is good because once it's configured it functions largely unattended, but requires extensive and ongoing configuration.
- Security know-how: Source code analysis (+3) is good because it gives code-level details to an auditor, but you must understand security-relevant behavior of APIs. Penetration testing (+1) is good because it automates hacks, but a hacker is required to measure success and optimize. Application firewall (+2) is good because it identifies common attacks out of the box and is a community effort, but a hacker is required to measure success and customize.
- Development expertise: Source code analysis (+1) is good because it focuses attention on relevant code, but you must understand code-level program behavior. Penetration testing (+2) is good because basic attacks ignore internals, but advanced attacks require internal knowledge. Application firewalls (+2) are good because they live on the network, but you must understand the program to tell good from bad.
- False positives: Source code analysis (+1) is good because it gives auditors details to verify issues, but reports impossible application states. Penetration testing (+2) is good because results come with reproduction steps, but it is difficult to oracle some bugs. Application firewalls (+1) are good because it is attacks instead of vulnerabilities, but there is an evolving definition of valid behavior.
- False negatives: Source code analysis (+3) is good because it simulates all program states and models the full program, but it must be told what to look for. Penetration testing (+1) is good because it is good at finding what hackers find, but is difficult to oracle some bugs and has missed coverage. Application firewalls (+1) are good because it uses attacks instead of vulnerabilities, but there is an evolving attack landscape.
- Technology support: Source code analysis (+2) is good because parsing is separable from the analysis and is interface-neutral, but it must adapt to new program paradigms. Penetration testing (+2) is good because it is independent from program paradigms, but is tied to protocols and is limited to network interfaces. Application firewalls (+2) are good because they are independent from program paradigms, but are tied to protocols and are limited to network interfaces.
Working Towards a Solution
- Assessment: Proving the problem or meeting the regulatory requirement. Recurring cost that does not "fix" anything
- Remediation: Fixing security issues found during assessments. Lowering business risk at a single point in time.
- Prevention: Get security right hte first time. Minimizing business risk systematically.
Do your own comparison and fill out the scorecard yourself (presenters ratings are noted in parentheses above).
Taylor did interviews with three companies to get their experiences deploying each (source code analysis, penetration testing, and application firewall) and had them evaluate based on the nine criteria both before and after buying. Not going to list each company's results in the blog, but it was just a basic table with each criteria and a number rating for both before purchase and after deployment. To sum it up, Source Code Analysis was a 14 rating before purchase and a 17 rating after deployment. Penetration testing was a 21 rating before purchase and a 21 rating after deployment. Application firewalls were a 21 rating before purchase and a 16 rating after deployment. It seems like the first organization had a large amount of developers and that factored into their decision to purchase a source code analysis tool. The second organization had a far fewer number of developers and was more of an IT shop and chose the penetration testing tool. The last organization was a smaller shop in general (still fairly large) and went with the WAF because they wanted something they could just put in place and manage.
Analysis: All three solutions required more effort than expected. All three solutions produce reasonably accurate results. Varying levels of expertise needed.
How do you demonstrate that your application is protected against known attacks?
- Verification that the application was analyzed
- A report showing no critical security issues identified
- Document showing how the tool fits into your architecture
How do you show that the user is appropriately trained?
- Document explaining prior experience or an informal interview
How do you show that you have configured the tool appropriately?
- Document explaining how the tool was configured and what new rules had to be added.
Summary: PCI section 6 is evolving to become increasingly precise. Compare technologies in your environment along nine criteria. Demonstrating compliance is an art, not a science.