Ryan Kazanciyan, Chief Security Architect, Tanium ( @ryankaz42 )
Co-Author, Incident Response & Computer Forensics (book website: https://ir3e.com/ )
This talk was delivered 04 March 2016 at the RSA Conference in San Francisco.
I'm providing a brief reaction/summary, and then my notes. The notes are fairly free-form, so apologies if they are only semi-comprehensible.
I’ve always been skeptical of the threat intel (really mostly threat data) trend. It’s not a bad idea, but it seems like really just a new analogue to signature-based detection; it can only help detect something that someone else has already detected someplace else.
Ryan shares my concerns with the use of threat data, and gives some other reasons why its use is problematic. Not only is the data by definition incapable of detecting truly NEW threats, but it is also inconsistent and often of dubious use even for its intended purpose(s). He gives some good ways to make better use of such data, as well as some methods of scouring your own systems for high-value threat data.
Five years ago, the compiling and sharing of indicators of compromise (IOC) seemed like it would “save the world” from attacks.
Today, this has still not become a reality, for several reasons:
- Brittle indicators with a short shelf life
- Poor quality of data in IOC feeds
- Hard to build effective home-grown IOCs
- Tools for ingesting and detecting IOCs are inconsistent in quality
- IOCs are applied to a limited scope of data
Threat Intelligence is not equivalent to threat data
- Intelligence includes context and analysis
- However, good threat data is required for useful intelligence
- For this talk, we’re just talking about threat data, not intelligence
IOCs are Brittle:
- IP addresses and malware file hashes are most common
- URLs/hostnames are next most common
- File names are another common type
- 80% of malware types last less than a week; 95% last less than a month
- C2 IPs and domains have a short lifespan
- Shared hosting means malicious sites often share an IP with compromised hosts (leads to false positives)
- Even paid feeds are not necessarily high quality
- Informal look at IOCs from paid, subscriber-only feeds
- File IOCs that include both hash and filename (filename easily changed, will lead to false negatives)
- File hashes included for files that are unique to a specific host
- Legitimate software libraries included as malware hashes because they were leveraged in some piece of malware
- Hard to avoid being too specific (leading to false negatives) or too general (leading to false positives)
- High-effort IOCs work for a specific investigation, but not for generic use across enterprises
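Two of the feed-quality problems noted above (a hash bundled with a filename, and legitimate libraries listed as malware) can be screened for before a feed is loaded. This is only a minimal sketch: the `FileIndicator` record and the `KNOWN_GOOD_HASHES` allowlist are hypothetical names invented for illustration, not part of the talk or of any feed format, and the hash values are placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

# Hypothetical allowlist: hashes of legitimate software you already trust.
# The value below is a placeholder, not a real hash.
KNOWN_GOOD_HASHES: Set[str] = {"aaaa1111"}


@dataclass
class FileIndicator:
    """Hypothetical, simplified file indicator (not a real feed schema)."""
    file_hash: Optional[str] = None
    filename: Optional[str] = None


def qc_flags(ioc: FileIndicator, known_good: Set[str] = KNOWN_GOOD_HASHES) -> List[str]:
    """Return human-readable warnings for one file indicator."""
    flags = []
    # If a tool requires BOTH terms to match, renaming the file defeats the indicator.
    if ioc.file_hash and ioc.filename:
        flags.append("hash and filename combined: a renamed file may be missed")
    # A hash of legitimate software will fire on clean hosts.
    if ioc.file_hash and ioc.file_hash.lower() in known_good:
        flags.append("hash is on the known-good list: likely false positives")
    return flags


feed = [
    FileIndicator(file_hash="AAAA1111", filename="helper.dll"),
    FileIndicator(file_hash="bbbb2222"),
]
for ioc in feed:
    print(ioc, qc_flags(ioc))
```

A check like this does not make a feed good; it only surfaces entries worth a second look before they generate alerts.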
IOC Detection Tools are Inconsistent:
- Tools differ in which observables they support from the standards (OpenIOC, CybOX, STIX, YARA, etc.).
- Logic structures in IOCs are not always implemented in the same way.
- Data normalization is a problem.
- The standards have some issues. E.g., OpenIOC was not created intentionally as a standard, but is merely the XML format created for Mandiant’s MIR tool. This has led to some serious issues.
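To make the point about logic structures concrete, here is a small sketch of how the same indicator can give opposite answers in two tools. The nested-dict indicator format, the sample values, and both function names are assumptions made up for this example; this is not the OpenIOC schema or any vendor's implementation.

```python
# Hypothetical indicator format: nested dicts with "op"/"children" for logic
# nodes and "type"/"value" for terms. This is NOT the OpenIOC schema.
INDICATOR = {
    "op": "AND",
    "children": [
        {"type": "filename", "value": "svch0st.exe"},
        {"op": "OR", "children": [
            {"type": "md5", "value": "0123456789abcdef0123456789abcdef"},
            {"type": "ip", "value": "203.0.113.7"},
        ]},
    ],
}


def evaluate(node, observed):
    """Honor the nested AND/OR logic exactly as written."""
    if "op" in node:
        results = [evaluate(child, observed) for child in node["children"]]
        if node["op"] == "AND":
            return all(results)
        if node["op"] == "OR":
            return any(results)
        raise ValueError("unknown operator: " + str(node["op"]))
    return (node["type"], node["value"]) in observed


def evaluate_flat_or(node, observed):
    """Mimic a tool that ignores the logic and alerts on any single term."""
    if "op" in node:
        return any(evaluate_flat_or(child, observed) for child in node["children"])
    return (node["type"], node["value"]) in observed


# Only the IP was seen on this host (203.0.113.0/24 is a documentation range).
observed = {("ip", "203.0.113.7")}
print(evaluate(INDICATOR, observed))          # False: the filename term is missing
print(evaluate_flat_or(INDICATOR, observed))  # True: any single term matches
```

The same IOC is a miss in one tool and an alert in the other, which is the kind of inconsistency described above.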
Broadening the Scope of Endpoint Indicator Usage:
- Most common host data in SIEMs:
- HIPS logs
- Event log data, usually for only a subset (e.g. servers)
- Things like file hashes will therefore simply never be seen in the SIEM.
- Matching on forensic telemetry data
- Matching on live endpoints:
- Gives access to everything in memory, files on disk, and event logs.
- Can be high-impact and hard to scale.
- The Goal:
- Mixture of the above methods to maximize the value of brittle IOCs.
- Increase cadence of analysis as tools & resources permit.
- Taking shortcuts in coverage (“I only need to check my most important systems”) will leave gaps and lead to failure.
- Malicious code and actions rarely take place on the actual target servers.
Shrinking the Detection Gap:
- The most relevant threat intelligence is what comes from within your own environment.
- Over time, the effectiveness of looking for known IOCs has decreased.
- Looking for attacker methodology and outlier files/behaviors has correspondingly become more effective.
- The reality is that automation using known IOCs/threat data is good at finding the easiest things to find.
- Preventative controls need to fill a large part of that gap, as does internal analysis.
Looking inward to hunt:
- Derive intelligence from what is “normal”
- Build repeatable analysis tasks — a repeatable process
- More is not always better — start small with high-value indicators
- “What easily observable outlier conditions do intruder actions create?”
Example of Duqu 2.0 report:
- All the various samples created a scheduled task to run an msiexec.exe command.
- However, the provided IOCs consisted of a long list of file hashes and C2 IPs.
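The behavior in the first bullet can be turned into a simple methodology-based check. The sketch below assumes scheduled tasks have already been exported into dicts with `host`, `name`, and `command` keys; that record layout, the function name, and the sample rows are all made up for illustration and are not taken from the Duqu 2.0 report. Legitimate software can also schedule msiexec.exe, so a hit is a lead to review, not a verdict.

```python
def runs_msiexec(task):
    """True if the task's command line invokes msiexec.exe (case-insensitive)."""
    return "msiexec.exe" in task.get("command", "").lower()


# Made-up task records for illustration only.
tasks = [
    {"host": "ws-014", "name": "DefragWeekly",
     "command": r"C:\Windows\System32\defrag.exe C:"},
    {"host": "ws-022", "name": "Updater",
     "command": r"C:\Windows\System32\MSIEXEC.EXE /i C:\Temp\pkg.msi /q"},
]

leads = [t for t in tasks if runs_msiexec(t)]
for t in leads:
    print(t["host"], t["name"], t["command"])
```

Unlike a list of hashes and C2 IPs, a check like this keeps working when the attacker recompiles the malware or moves infrastructure.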
Example of analysis of scheduled tasks:
- Create a list of which accounts are used to run scheduled tasks
- Create a list of which actions/programs are being run using scheduled tasks
- Look through these lists for outliers
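The steps above amount to counting how many hosts share each value and then reviewing the rare ones. A minimal sketch, assuming task data has already been collected into dicts with `host`, `account`, and `action` keys; the record layout, the `rare_values` helper, and the sample data are hypothetical.

```python
def rare_values(records, field, max_hosts=1):
    """Return values of `field` that appear on at most `max_hosts` distinct hosts."""
    hosts_by_value = {}
    for rec in records:
        hosts_by_value.setdefault(rec[field], set()).add(rec["host"])
    return sorted(v for v, hosts in hosts_by_value.items() if len(hosts) <= max_hosts)


# Made-up scheduled-task inventory for illustration only.
tasks = [
    {"host": "srv-01", "account": "SYSTEM", "action": "defrag.exe"},
    {"host": "srv-02", "account": "SYSTEM", "action": "defrag.exe"},
    {"host": "srv-03", "account": "SYSTEM", "action": "defrag.exe"},
    {"host": "srv-03", "account": r"CORP\jsmith", "action": r"C:\Users\Public\upd.exe"},
]

print(rare_values(tasks, "account"))  # accounts seen on only one host
print(rare_values(tasks, "action"))   # actions seen on only one host
```

In a real environment the threshold would need tuning, since plenty of legitimate tasks are also unique to one host; the point is to shrink the list a person has to read.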
The “Hunting in the Dark” talk gives more detailed examples: https://speakerdeck.com/ryankaz
Things to check out:
- MISP http://www.misp-project.org - open source project for ingesting and sharing feeds
- Facebook Threat Exchange https://threatexchange.fb.com
- Cybox 3.0
Questions for your threat feed vendor:
- Where is the data coming from? (Actual IR engagements, auto-generated sandbox data, or firewall and other device/system data?)
- What is the breakdown of observable types? (IPs vs URLs vs file hashes, etc)
- What is the QC process? (if there is one!)
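The observable-type breakdown is also something you can measure yourself once you have a sample of the feed. A rough sketch, assuming each indicator has been parsed into a dict with a `type` key; that layout, the type labels, and the sample values are assumptions for illustration, not any vendor's format.

```python
from collections import Counter


def type_breakdown(indicators):
    """Return the percentage of indicators per observable type, most common first."""
    counts = Counter(i["type"] for i in indicators)
    total = sum(counts.values())
    return {t: round(100 * n / total, 1) for t, n in counts.most_common()}


# Made-up sample feed: 5 IPs, 3 file hashes, 2 URLs (placeholder values).
feed = (
    [{"type": "ip", "value": "203.0.113.%d" % n} for n in range(5)]
    + [{"type": "hash", "value": "placeholder-%d" % n} for n in range(3)]
    + [{"type": "url", "value": "http://example.com/%d" % n} for n in range(2)]
)

print(type_breakdown(feed))  # {'ip': 50.0, 'hash': 30.0, 'url': 20.0}
```

A feed dominated by IPs and hashes is dominated by the most brittle indicator types discussed earlier, which is useful to know before paying for it.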