Security Metrics

Intro

In a prior article, I discussed the cycle of inaction. With respect to cybersecurity, that cycle leads managers to put too much faith in protection, which breeds complacency. To improve the situation, management needs a better understanding of how effective its cyber defenses actually are. In short, managers need effective security metrics.

There have been attempts at coming up with security metrics, but the results are usually not very useful. Consider, for example, this less-than-ideal security metric.

Percentage of systems with current anti-virus software.

Why is this a poor security metric? Because it is actually an operations metric. If 100% of systems have anti-virus, does that mean all systems are immune from malware? The answer is NO, because a zero-day virus could still get through. Full coverage means that, operationally, we are doing a good job; it says nothing about security outcomes.

What if the answer is 0%? Does that mean we are in big trouble? Again, the answer is NO, because the systems might be lab systems on a closed network where no new software is ever introduced. Or the systems might run an operating system like Plan 9, for which there is currently no known malware.

The problem with the metric above is that it does not directly tell us anything about the state of cybersecurity. So how can we improve on it?

The percentage of systems with antivirus should be tracked as an operational issue, of course. For cybersecurity, however, the real measure is how effective the existing controls actually are.

The role of incident response in security metrics

Consider a single desktop in an organization that gets malware. The key question is how that happened, because the answer tells you which part of your cyber defense failed. For example, suppose an employee receives an email with an attachment that contains a virus. The employee opens the attachment because it looks like a resume, and reviewing resumes from strangers is the employee’s job; they work in the Human Resources department.

If the “resume” the employee opened really did contain a virus, and that virus infected systems in the company, that would be a problem. Once the virus has been detected, an incident response will tell us which security control failed; perhaps, in this case, the content filter along with the desktop antivirus. That is security metric data: how effective are our existing controls? Track this data.

While tracking this data and performing incident response, it is very important to build an environment where users will report these events. Chastising users for opening an email attachment focuses too much on prevention and can hamper our ability to collect the information needed for incident response. So I urge security teams to encourage reporting and to avoid statements like “you should not have opened that attachment.”

Over time, with incident response and user reporting, you will build up information on the effectiveness of the security controls in use in the organization. You will see whether the current network architecture can be secured with existing tools, or whether it needs to be tuned.
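
To make this concrete, here is a minimal sketch of what such tracking might look like. The record fields and sample incidents are purely illustrative; the point is that each completed incident response adds one data point about a failed control.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Incident:
    """One investigated malware event and the control that let it through."""
    date: str
    description: str
    failed_control: str  # e.g. "content filter", "desktop antivirus"
    user_reported: bool  # surfaced by a user report rather than a tool

# Illustrative log, appended to as each incident response concludes.
incidents = [
    Incident("2017-03-01", "malicious resume attachment", "content filter", True),
    Incident("2017-03-09", "malicious resume attachment", "desktop antivirus", True),
    Incident("2017-04-02", "drive-by download", "web proxy", False),
]

# The metric: which controls are failing, and how often.
for control, count in Counter(i.failed_control for i in incidents).most_common():
    print(f"{control}: {count} failure(s)")

# A reporting-culture metric: the fraction of incidents surfaced by users.
print(f"user-reported: {sum(i.user_reported for i in incidents) / len(incidents):.0%}")
```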

Security assessments as a measure of cybersecurity

Many organizations run periodic security assessments, such as a pentest or vulnerability assessment. These are effective tools to help understand the current state of cybersecurity, if they are used correctly.

Consider the vulnerability scan, for example. Scans from tools such as OpenVAS, Nessus, or Nexpose may contain false positives or too little data. The scanners offer many options, but if those options are not set correctly, the scan may look at too little or too much. For the purpose of this discussion, let’s assume your scanner is properly configured and is not sitting behind a firewall that blocks all of its requests.

Once a vulnerability scan has completed, its output should be cleansed of false positives. (A reminder: a false positive is a finding reported by the scanner that is not a real issue.) With the false positives removed, you have an accurate report of the number and severity of known vulnerabilities in a system.

Let’s assume we have scanned a webserver and find that it is susceptible to the old Heartbleed vulnerability. The web server administrator either already knew about the issue or is surprised to learn about it. That distinction is worth tracking. So, consider this metric.

How many security issues identified during a vulnerability scan were not already known to the application team?

This number should be zero in an organization that understands how to deploy applications securely. And it is a much more realistic metric. Applications are rarely deployed with zero defects and zero vulnerabilities, because getting there is too hard. Instead, applications are generally deployed with known vulnerabilities that can be monitored and controlled, and whose risk the application owner can accept.
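
As a sketch of how this metric might be computed, assume the scan findings and the application team’s risk register are both available as simple records; the CVE identifiers and field names here are just placeholders.

```python
# Hypothetical scanner output, after the scan completes.
findings = [
    {"id": "CVE-2014-0160", "severity": "high", "false_positive": False},
    {"id": "CVE-2016-2183", "severity": "medium", "false_positive": True},
    {"id": "CVE-2015-4000", "severity": "medium", "false_positive": False},
]

# Issues the application team already tracks and whose risk is accepted.
risk_register = {"CVE-2015-4000"}

# Step 1: cleanse the report of false positives.
real_findings = [f for f in findings if not f["false_positive"]]

# Step 2: the metric, findings the team did NOT already know about.
surprises = [f for f in real_findings if f["id"] not in risk_register]

print(f"{len(real_findings)} real finding(s), {len(surprises)} surprise(s)")
for f in surprises:
    print(f"  {f['id']} ({f['severity']})")
```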

Security awareness

Organizations generally have security controls in place, such as firewalls and virus scanners. Some have gone a little further and put a proxy in place to screen http traffic (but not https, sadly). As organizations deploy these tools, they make life more difficult for attackers.

Attackers then respond by changing their tactics. Instead of going for a direct attack, they try to trick a user, a process typically known as social engineering. One popular type of social engineering is the phishing email, which tries to trick a user into giving their username and password to an attacker.

In response, organizations perform phishing email tests: controlled, safe phishing emails are sent to users, and their responses are tracked. What are good metrics to track with these phishing tests? I suggest that the metrics we care most about are:

  • How many users have given away valid user credentials?
  • How many users have done this multiple times?
  • How has security awareness training reduced the number who give away credentials?
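
A minimal sketch of how these three numbers might be pulled from phishing test results follows; the campaign names and user lists are invented for illustration.

```python
from collections import Counter

# Hypothetical results: for each test campaign, the set of users who
# submitted valid credentials to the controlled phishing page.
campaigns = {
    "2017-Q1": {"alice", "bob", "carol", "dave"},
    "2017-Q2": {"bob", "erin"},  # after awareness training
}

# Metric 1: users who gave away credentials in the latest campaign.
print(f"gave away credentials: {len(campaigns['2017-Q2'])}")

# Metric 2: users who have done this in more than one campaign.
hits = Counter(u for users in campaigns.values() for u in users)
repeats = sorted(u for u, n in hits.items() if n > 1)
print(f"repeat offenders: {repeats}")

# Metric 3: change after awareness training.
before, after = len(campaigns["2017-Q1"]), len(campaigns["2017-Q2"])
print(f"change after training: {(after - before) / before:+.0%}")
```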

Conclusion

Cybersecurity metrics should track the organization’s current efforts to protect its information, detect cybersecurity issues, and respond to cyber threats. With the right metrics, the cybersecurity program can focus on driving them in the right direction. For example, the number of systems infected by malware is no longer our primary concern; the question is which existing control failed and allowed the malware in. It isn’t how many users fall for phishing; it is whether our awareness program produces a meaningful change in the number susceptible to phishing attacks.

Take the time to build the right cybersecurity metrics. Once they are in place, the organization will be positioned to respond naturally to changes in the cybersecurity landscape.

WannaCry ransomware

A new ransomware outbreak has captured a lot of press over the past couple of days, so much so that the US Department of Homeland Security has put out a statement on it. Preventing exposure to this type of malware would be great, but it is difficult to do. Antivirus, content filtering, firewalls, proxies, and other cybersecurity devices are good, but they only work against threats they know about. Now that the WannaCry malware is known, cyber defenses can be tuned to block it. However, a new version of the malware, with just a small change, might not be recognized as a threat until it is too late.

This again points out that protection against this type of threat cannot come from cyber defense tools alone. Ransomware is an effective attack only if you have no other way to recover your files. That might sound obvious, but now is the best time to ensure that your backups are up to date and stored offline. If your cyber defenses fail against the next ransomware and you get hit, at least you can recover at your own pace, without paying.

Here is what I have shared with many of the people I have been working with on this malware. I hope it helps you.

---
Recently, a new cyber attack has been spreading across the Internet. The attack has been named “wannacry” or “wannacrypt”. It encrypts files and demands a ransom in order for users to recover them. Typically, the ransom is approximately $300 per system.

This attack uses a flaw in the Windows operating system to spread from one system to another. Microsoft has recently released an update to fix this problem, and we are working hard to ensure that systems are protected against this attack. However, patching alone won’t be enough to protect us; we will need your help.

Please make sure that you do the following to maximize your cyber safety.

  • Ensure that your most important files are backed up, and make sure the backup is then removed from your computer. For example, copy your important files to a USB disk drive and remove the drive once the copy has completed.
  • Do not open email attachments unless you are absolutely sure you know who sent you the email and what the attachment is. If you have any doubts, call the person that sent you the email to confirm the message.
  • Do not download any freeware or “too good to be true” utilities from the Internet. These tools may be infected with malicious code.
  • Do not visit suspicious websites, because the ads that run on them may deliver malicious code.

And, if you see anything suspicious, or notice that files have been encrypted on your system, please contact your local IT support team at once.

Stay safe…

Don’t Let the Press be your Intrusion Detection System

The highly publicized breaches of the past year continue to show that organizations are still wrestling with how to get a handle on their cybersecurity[1]. Breaches put the confidentiality and integrity of your information at risk, as we recently saw with the hack of the Democratic National Committee’s email[2]. A denial of service attack impacts availability, as we saw with the attacks against the DNS provider Dyn[3]. In cases like these, the organizations were not aware of the extent of the issue until they read about it in the press.

So, why are organizations usually the last to know?

  1. Protecting confidentiality requires surgically reducing access to information. The information needs to be available and modifiable, just not to everyone. To do this takes an understanding of the workflow. Just opening the data up to all is a fast way to get a system deployed.
  2. Management lacks clear metrics on the state of cybersecurity in their organization. Few know any real information about how effective their current protection is. For example, are all the virus scanners in the organization up to date? Can people bypass the proxies? Currently, management is given useless data like the number of attacks blocked at the firewall, the number of spam messages stopped, or the number of viruses caught by the virus scanner. (Why do I call these useless? I’ll follow that up in my next post, and tell you what you should be looking for. But suffice it to say: you have the data, you just aren’t looking at it correctly.)
  3. The perimeter defense just isn’t working. Many organizations have firewalls, web proxies and virus scanners that protect laptops at work. However, those same laptops are then used at home, where they are not behind the web proxy or firewall.
  4. There are very few really good cybersecurity professionals out there, which probably contributes to #2.
  5. The bad guys are relentless.

Because management is not seeing the right picture, most are unaware that their cybersecurity defenses are inadequate. They do not yet see a need to invest in monitoring the technologies they have already invested in. That leads to no monitoring, which reinforces the strategy of not investing in cybersecurity.

The Cycle of Inaction

[Figure: The Cycle of Inaction]

Good management means investing efficiently, and investing in something that is not needed is inefficient. Lacking effective information, the perception becomes that there isn’t a problem. This feeds what I call the Cycle of Inaction: believing that the investment in protection is enough and, absent any other information, must be working. That belief leads to complacency just when metrics are actually needed, a complacency that is sometimes broken by a press article.

This cycle of inaction can lead to spectacular failures. Of note over the past couple of years, we have the hack of the NSA toolkit, the recent release of the CIA cyber toolkit, the hack of Yahoo!’s passwords, the hack of Target, the hack of …

We know of these events because the press is the default Intrusion Detection System (IDS) for many organizations. That IDS, however, is not easy to control, and it definitely reports what we call “trailing metrics,” or metrics about a problem AFTER it has happened.

What Is Your Cybersecurity Maturity?

I’ve found that the cybersecurity issue that the industry is confronting is very similar to the quality issues that the industry tackled in the 1970s and 1980s. To address and improve quality, the ultimate solution was to install a mature process within an organization. A mature process is defined as a process that is repeatable, with quality-based decisions made using meaningful metrics.

I offer that many organizations sit at maturity level 1, if the Capability Maturity Model (CMM) is used. Getting to CMM level 2 (of 5) appears to be some way off for cybersecurity. If the struggle is just to reach CMM 2, perhaps it makes sense to subdivide maturity level 1 into sub-levels, as in the list below.

Level 1.1: The organization learns about cybersecurity failures via the press. The message is uncontrolled, and the incident still needs to be addressed. (Not first to tell the story; cannot investigate privately; cannot prevent a large incident.)
Level 1.2: The organization learns about cybersecurity failures privately, via a third party such as law enforcement or a business partner. The message can be controlled, as can the response to the incident. (First to tell the story; can investigate privately.)
Level 1.3: The organization learns about cybersecurity failures internally. This allows the organization to control the message of the incident as well as the response. (First to tell the story; can investigate privately.)
Level 1.4: The organization notices indicators that an incident is about to happen and can take steps to mitigate it before it occurs. (First to tell the story; can investigate privately; can prevent a large incident.)

To increase your cybersecurity maturity, you need to improve your ability to monitor the cybersecurity of your digital assets, by analyzing the outputs of the technologies you have invested in.

Consider: your organization currently has firewalls to protect against bad things coming in from the outside. You have web proxies and even content filters screening what comes in as well. And you have anti-virus scanners on your desktops.

With all of those layers of defense, it seems reasonable to conclude that no virus should ever reach the desktop. Measure that. Any time a computer’s virus scanner detects a virus, perform a root cause investigation to determine which security control failed. For example, if a desktop has recently been infected with ransomware, a forensic analysis should be performed to determine how the virus got onto the system. At the highest level, the cause will be one of these two things:

  1. The user violated a security practice, such as plugging in a USB drive.
  2. An existing cybersecurity technology failed. Did it not work? Was it improperly deployed?

Collect these metrics on the root causes, and soon you will have a clearer picture of the effectiveness of the controls.
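
Here is a sketch of that collection step, assuming each investigation produces one record tagged with one of the two top-level causes above; the details are invented for illustration.

```python
from collections import Counter

# Illustrative root-cause records from completed forensic investigations.
investigations = [
    {"cause": "user practice violated", "detail": "unapproved USB drive"},
    {"cause": "technology failed", "detail": "web proxy missed known-bad URL"},
    {"cause": "technology failed", "detail": "content filter passed attachment"},
    {"cause": "technology failed", "detail": "antivirus signatures out of date"},
]

# Tally the two top-level causes, then drill into the specific failures.
print(Counter(i["cause"] for i in investigations))
print(Counter(i["detail"] for i in investigations).most_common(1))
```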

Next topic: suggestions for effective metrics, to help you increase your “sense” of cybersecurity within your organization.

References:

[1] Let’s define cybersecurity as the protection of the confidentiality and integrity of information, along with ensuring that the information is available, when needed, to whoever needs it.


[2] Krebs, B. (2017, January). The Download on the DNC Hack. Retrieved from https://krebsonsecurity.com/2017/01/the-download-on-the-dnc-hack/


[3] Newman, L. H. (2016, December). The Botnet That Broke the Internet Isn’t Going Away. Retrieved from https://www.wired.com/2016/12/botnet-broke-internet-isnt-going-away/

Speed matters. How to make a forensic image as quickly as possible.

The typical method for creating a forensic image is to connect the source disk to a write-blocker, connect the write-blocker to a computer, and make the image. This process needs updating to keep up with the capacity and speed of the newest disk drives. By making the process as efficient as possible, forensic imaging times can be substantially reduced.

When making a forensic image of a disk drive, every available byte must be copied from the source disk, and nothing may be written to it. As disk drive capacities have increased, so has the time required to make a forensic image. For example, a 20GB disk drive takes approximately 8 minutes to image at best. A 200GB drive takes approximately 50 minutes at best, while a 1TB drive takes approximately 2.5 hours.

We can calculate how fast a disk drive can be imaged by dividing the total capacity of the disk by its maximum sustained transfer rate (MSTR). The MSTR is the manufacturer’s figure for how fast data can be read off the drive in a very large transfer; it tells us how fast data comes off the disk. (Note that the maximum burst transfer rate is of no use to us, since it only describes how quickly data comes out of the disk cache, and it applies only to a small amount of data.)

Let’s look at a 1.5TB Western Digital Caviar Green disk drive as an example. The data for this drive is available here. This disk drive has a capacity of 1,500,301 MB and a maximum sustained transfer rate of 110 MB/s. Thus, it would take 227.3 minutes (almost 4 hours) to forensically copy the entire contents of the drive. (A transfer rate of 110 MB/s is 6.6 GB/minute.) To achieve this speed, every part of the forensic imaging process must be able to handle data at 6.6 GB/minute or greater.

Using a USB 2.0 write-blocker slows this transfer rate down dramatically, as USB 2.0 has a maximum effective data transfer rate of approximately 34 MB/s. Imaging the same 1.5TB drive through a USB 2.0 write-blocker would require 735.4 minutes (over 12 hours).
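
The arithmetic is simple enough to script. Here is a quick sketch that reproduces the numbers above:

```python
def imaging_minutes(capacity_mb: float, rate_mb_per_s: float) -> float:
    """Best-case imaging time: capacity divided by sustained transfer rate."""
    return capacity_mb / rate_mb_per_s / 60

# 1.5TB Caviar Green: 1,500,301 MB at a 110 MB/s sustained rate.
print(f"{imaging_minutes(1_500_301, 110):.1f} minutes")  # 227.3

# The same drive behind a USB 2.0 write-blocker (~34 MB/s effective).
print(f"{imaging_minutes(1_500_301, 34):.1f} minutes")   # 735.4
```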

Other factors that can alter the efficiency of the disk imaging process include:

  • The buffer size of a data transfer.
  • The filesystem to which the image is being written.
  • Whether compression is used when making the forensic image.

All of the above factors need to be tuned to ensure that forensic images are made as quickly and efficiently as possible.

I have recently published a paper in the Journal of Forensic Sciences entitled Characteristic of Forensic Imaging, which discusses the impact of different factors on the efficiency of forensic imaging. I am also preparing a web page with simple scripts that let you evaluate the efficiency of your own forensic imaging setup.
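
In the meantime, here is a minimal sketch of the kind of measurement such a script might perform: timing sequential reads from a source device at several buffer sizes. The device path is an assumption; point it at your own test disk, and note that operating-system caching can skew repeated runs.

```python
import time

def read_throughput_mb_s(path: str, buffer_size: int, total_bytes: int) -> float:
    """Time a sequential read of total_bytes using the given buffer size."""
    bytes_read = 0
    start = time.monotonic()
    with open(path, "rb", buffering=0) as src:  # unbuffered, raw reads
        while bytes_read < total_bytes:
            chunk = src.read(buffer_size)
            if not chunk:
                break
            bytes_read += len(chunk)
    return bytes_read / (time.monotonic() - start) / 1_000_000

# Hypothetical device path; reading it will likely require root privileges.
for size in (4096, 65536, 1048576, 4194304):
    rate = read_throughput_mb_s("/dev/sdb", size, 256 * 1024 * 1024)
    print(f"buffer {size:>7} bytes: {rate:6.1f} MB/s")
```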

the latest on credit card fraud

Recently I worked with a retail company that had been contacted by its bank; let’s call the company Acme (the name has been changed to protect its identity). During an investigation of credit card fraud, the bank discovered that many of the fraudulent transactions appeared to have one location in common: Acme.

The analysis works like this. Assume that Joe Smith and Mary Jones used their credit cards at Acme on March 1st. Then, on March 20th, both Joe’s and Mary’s cards were involved in fraudulent transactions. Once a credit card is involved in a fraudulent transaction, the bank looks to see whether the transaction is part of a larger fraud, checking the historical transactions of Joe and Mary for a business they have in common. The theory is simple: if Joe and Mary both visited a company with a security breach, the historical analysis will surface it.
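
In miniature, this is a counting exercise. Here is a sketch with invented card histories: rank merchants by how many of the compromised cards transacted there.

```python
from collections import Counter

# Invented purchase histories for cards that later saw fraudulent charges.
histories = {
    "joe":  {"Acme", "Gas-N-Go", "Coffee Corner"},
    "mary": {"Acme", "Grocery Mart", "Coffee Corner"},
    "pat":  {"Acme", "Book Nook"},
}

# Rank merchants by how many compromised cards were used there; a merchant
# common to all (or most) of them is the likely point of compromise.
counts = Counter(m for merchants in histories.values() for m in merchants)
for merchant, n in counts.most_common(3):
    print(f"{merchant}: {n} of {len(histories)} compromised cards")
```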

This type of fraud analysis is useful for detecting when many credit cards have been compromised at one business. If the bank can identify the location where the card numbers were compromised, it can prevent future fraud from that compromise by cancelling all credit cards used at that business and re-issuing new ones.

Back to Acme. Based on its fraud analysis, the bank had strong reason to believe that Acme was somehow leaking credit card numbers; in fact, it suspected that over 70 fraudulent transactions traced back to a problem with Acme. Our review showed that Acme’s network was Payment Card Industry (PCI) compliant and that the credit card numbers were protected inside it. So the card numbers were not leaking out because of a network hacker.

This left only two options: either an employee (or employees) was stealing card numbers with a skimmer, or Acme’s card processor had been hacked. Because only certain transactions at Acme were reported as compromised, the skimmer possibility was much more likely.

While there has been a lot of work on securing credit card data over the network, the physical credit card is still vulnerable to the skimming attack.

To protect yourself, do not let your credit card out of your sight when you use it. When it is out of your sight, the person who took your credit card may also have taken a copy of it.

Why won’t my call go through? Denial of service in the cell phone network.

Recently, some of the major cellular carriers have released “Network Extenders”, also known as femtocells. A network extender is a device a subscriber purchases to extend the reach of the cell phone network. (In effect, the subscriber pays for the privilege of increasing the carrier’s coverage. What a deal!)

The network extender is conceptually similar to a Wi-Fi access point: both connect to the Internet over a wire, and both provide wireless service. The Wi-Fi device provides Internet service; the femtocell provides cellular service.

The femtocell basically appears as a new cell tower to any cell phone within its range, and it will process calls for any cell phone that successfully registers with it while it is connected to the Internet. Effectively, the femtocell is just a new gateway into the cellular network.

The cell phone owner cannot choose whether to connect to the femtocell or to a regular cell tower; that decision is made between the phone and the “cell tower”. This was not a problem when only the cellular carriers were putting up cell towers. However, the network extender now lets individuals deploy them.

Recently, I encountered a denial of service issue with a cell phone that I tracked back to a femtocell. The phone had registered with the femtocell to connect to the wireless network, but the femtocell then lost its connectivity to the Internet. (Remember, the femtocell is a gateway that uses the Internet to reach the cellular network.)

Since the femtocell still had power, its wireless side remained active. Any cell phone registered with it therefore believed it was still connected to the cellular network, even though the femtocell had no way to reach that network because its Internet link was down. It appears that current cell phones cannot determine whether the tower they are connected to is actually active.

Thus, the cell phone could not make or receive calls or text messages, and the user had no way to tell it to switch to a working cell tower. The only way to get the phone working again was to move to a different area, outside the femtocell’s range. And the phone reported 3 or 4 bars of signal during the entire outage.

Until the carriers improve the algorithm a cell phone uses to confirm it has an active tower, about the only thing a subscriber can do is keep a Voice over IP (VoIP) application as a backup to the standard phone, and this only helps if the VoIP application can place calls over Wi-Fi. Failing that, use email, which should still work over Wi-Fi when the cell tower is not functioning.

SCADA and security

A recent article by Hal Hodson of Information Age reports that the FBI has publicly stated that hackers have successfully targeted SCADA systems in three unnamed US communities. The attacks were said to have the potential to shut down electricity at a nearby mall and to dump sewage. Just weeks earlier, the Illinois Statewide Terrorism and Intelligence Center claimed that a water pump failure was caused by a hacker attacking the pump control system, the failure resulting from the attacker repeatedly turning the pump on and off. (The Illinois attack has since been refuted by the FBI, so it must not be one of the three sites reported above, right?)

So, what exactly is SCADA? Supervisory Control and Data Acquisition. SCADA systems control production and distribution processes, such as the generation of electricity or the delivery of water to communities; basically, they support the infrastructure we all rely on. The failure of a SCADA system can therefore affect a large number of people.

For a display of the potential damage an attack on a SCADA network can cause, look back to Stuxnet. This malware reportedly targeted very specific Siemens-based SCADA systems. (The attack was so specific that there was speculation the malware’s purpose was to damage Iran’s nuclear facilities.) While details are hard to come by, it appears the Stuxnet attack resulted in damage to centrifuges. (Centrifuges are used to separate different isotopes of uranium.)

Stuxnet caused incorrect data to be reported, which led the control systems to effectively “mis-operate” the equipment, and that mis-operation resulted in damage. Stuxnet also showed how difficult it is to keep malware out of SCADA systems. In theory, Stuxnet should never have been able to infect the systems controlling the centrifuges; in practice it did, because the malware was introduced somehow, either through an Internet connection or carried in on a USB device. This highlights the risk of taking SCADA systems, which are already network-capable, and making them accessible via the Internet.

So, you would think a malware infection like Stuxnet could not happen again. Not so fast: Iran has reported that it is now dealing with another virus, Duqu, which is targeting its civil defense systems.

Well, what can we learn from all of this? Certainly, virus scanners are less effective now, especially against a determined adversary. It is therefore truly important that SCADA systems be shielded from the introduction of malware, whether via the Internet or through a USB device.

As consumers, we all have an interest in the security of the SCADA systems that manage our power, our water, and even our prisons.