Machine Learning: Is It Really the Next Stage in Security?
Most people understand that computers do what they are told, but machine learning is a complex branch of artificial intelligence (AI) in which a computer "learns" without being explicitly told what to do. Researchers have applied machine learning to security analysis to detect intruders and protect against ongoing attacks. Because hackers are always changing their code to avoid detection, AI can help detect new zero-day malware automatically, without waiting for a fix to be coded for every new malicious application.
A Brief Summary of Machine Learning and AI
Machine learning is still in its infancy. For a long time it was mostly the stuff of sci-fi movies and theoretical debates, but as technology has evolved and computing power has increased, AI has become a major area of research.
In typical programming, a computer only does exactly what it's told to do. How do you tell a computer what to do? You do it through code. Your code produces dynamic output based on user input, but the instructions are always static and the results are always the same if the input is the same.
Machine learning algorithms still use code to tell a computer what to do, but the data collected allows the machine to make determinations based on different values. This means that even when the input is the same, the output can be very different, because the machine's analysis of the data changes over time.
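The difference between static instructions and learned behavior can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the article: the failed-login example, the names `static_rule` and `LearnedRule`, and the "three standard deviations above the mean" cutoff are all hypothetical choices, not a description of any real product.

```python
from statistics import mean, pstdev


def static_rule(failed_logins: int) -> bool:
    """Traditional code: a hard-coded limit, so the same input always gives the same answer."""
    return failed_logins > 5


class LearnedRule:
    """Hypothetical learned rule: the cutoff is derived from the data observed so far."""

    def __init__(self) -> None:
        self.history: list[float] = []

    def observe(self, value: float) -> None:
        # Every observation changes what the rule considers "normal".
        self.history.append(value)

    def is_suspicious(self, value: float) -> bool:
        if len(self.history) < 2:
            return False  # not enough data to judge yet
        average = mean(self.history)
        # Assumed cutoff: three standard deviations above the mean,
        # with a floor of 1.0 so a very flat history is not over-sensitive.
        spread = max(pstdev(self.history), 1.0)
        return value > average + 3 * spread
```

With a history of quiet days, eight failed logins would be flagged by both rules. After the learned rule has observed a run of much busier days, the same input of eight is no longer flagged by it, while the static rule still gives the same answer as before.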
You see machine learning every day, especially in common applications. You might not realize that what you see is a part of AI, because it's built to be invisible to the user. Take Facebook, for instance. When you log in and see your feed, the posts and popular topics shown to you are selected using AI and machine learning. The algorithms identify your likes and common interests and determine what would interest you most on any given day.
Several interest groups have joined together to form MLSec, an open-source project based on machine learning and information security. Just as Facebook learns from your input and applies it to increase your interest in products, machine learning can be applied to security.
Take antivirus software, for instance. There are millions of viruses in the wild, and your antivirus program must be able to identify malware before it is able to run on your computer. Imagine millions of ones and zeros analyzed and identified as malware, while malware writers continue to tweak their code to avoid detection. As a matter of fact, a malware writer's main goal is to write code that won't be detected by antivirus software.
With machine learning, anti-malware software could "learn" to detect suspicious traffic and malware even if the signature isn't yet known in the wild. The machine learning algorithms can read data gathered from user interaction, common malware signatures, and certain footprints left by malware, and then make a "learned" decision to either allow a program to pass through and run or block it as suspicious. This type of security is already a part of some intrusion detection and intrusion prevention systems.
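As a rough sketch of how such a "learned" decision might work, the Python below trains a small naive Bayes classifier on labelled behavior footprints and then judges a combination it has never seen. Everything here is an assumption made for illustration: the class name `BehaviorClassifier`, the behavior strings, and the choice of naive Bayes itself. Real anti-malware products use far richer features and models.

```python
import math
from collections import Counter

LABELS = ("malicious", "benign")


class BehaviorClassifier:
    """Hypothetical naive Bayes classifier over sets of observed program behaviors."""

    def __init__(self) -> None:
        self.feature_counts = {label: Counter() for label in LABELS}
        self.sample_counts = Counter()
        self.vocabulary: set[str] = set()

    def train(self, behaviors: set[str], label: str) -> None:
        # Record which behaviors appeared in a sample known to be malicious or benign.
        self.sample_counts[label] += 1
        self.feature_counts[label].update(set(behaviors))
        self.vocabulary.update(behaviors)

    def _log_score(self, behaviors: set[str], label: str) -> float:
        total = sum(self.sample_counts.values())
        n = self.sample_counts[label]
        # Add-one smoothing keeps unseen behaviors from producing log(0).
        score = math.log((n + 1) / (total + 2))
        for feature in self.vocabulary:
            p = (self.feature_counts[label][feature] + 1) / (n + 2)
            score += math.log(p if feature in behaviors else 1 - p)
        return score

    def classify(self, behaviors: set[str]) -> str:
        behaviors = set(behaviors)
        return max(LABELS, key=lambda label: self._log_score(behaviors, label))


def build_demo_classifier() -> BehaviorClassifier:
    """Train on a handful of made-up samples (illustrative data only)."""
    clf = BehaviorClassifier()
    clf.train({"modifies_startup_entry", "disables_antivirus", "encrypts_user_files"}, "malicious")
    clf.train({"disables_antivirus", "contacts_unknown_host"}, "malicious")
    clf.train({"encrypts_user_files", "contacts_unknown_host"}, "malicious")
    clf.train({"reads_config_file"}, "benign")
    clf.train({"reads_config_file", "opens_window"}, "benign")
    clf.train({"opens_window", "contacts_update_server"}, "benign")
    return clf
```

A sample that disables antivirus, encrypts user files, and opens a window matches none of the training samples exactly, yet the classifier labels it malicious because its footprints resemble what it has learned. No fixed signature is involved.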
The details of machine learning are complex, mainly because of the algorithms and statistical analysis needed to produce accurate results and to reduce false positives and false negatives. One issue with machine learning in security analysis is the high likelihood of false positives. A false positive is an alert or a block raised by the analysis application against traffic or activity that is actually legitimate.
False positives in security can be extremely problematic, because the traffic blocked could be critical or a part of a legitimate application. Imagine your customers store sensitive data or send an emergency message only to have this data blocked due to a false positive.
False negatives are also an issue. A false negative leaves the network vulnerable when the anti-malware software fails to catch genuinely malicious software and allows it to pass through the network. This, too, can have devastating effects.
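Both kinds of error can be measured once you know which items were really malicious and which ones the detector flagged. The short Python sketch below counts the four possible outcomes and derives the false positive and false negative rates. The function names are hypothetical, and any data passed in would be your own labelled results.

```python
def confusion_counts(actual: list[bool], flagged: list[bool]) -> dict[str, int]:
    """Count outcomes. actual[i] is True if item i really was malicious;
    flagged[i] is True if the detector blocked or alerted on it."""
    counts = {"true_pos": 0, "false_pos": 0, "true_neg": 0, "false_neg": 0}
    for is_malicious, was_flagged in zip(actual, flagged):
        if was_flagged and is_malicious:
            counts["true_pos"] += 1
        elif was_flagged and not is_malicious:
            counts["false_pos"] += 1   # legitimate traffic blocked
        elif not was_flagged and is_malicious:
            counts["false_neg"] += 1   # malware allowed through
        else:
            counts["true_neg"] += 1
    return counts


def false_positive_rate(counts: dict[str, int]) -> float:
    """Share of legitimate items that were wrongly flagged."""
    legitimate = counts["false_pos"] + counts["true_neg"]
    return counts["false_pos"] / legitimate if legitimate else 0.0


def false_negative_rate(counts: dict[str, int]) -> float:
    """Share of malicious items that were missed."""
    malicious = counts["false_neg"] + counts["true_pos"]
    return counts["false_neg"] / malicious if malicious else 0.0
```

The two rates usually pull against each other. Tuning a detector to block more aggressively tends to lower the false negative rate while raising the false positive rate, and the reverse is also true.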
Issues and Benefits of Machine Learning and Information Security
Almost any industry can take advantage of machine learning. Security is a complex topic by itself, which means that applying machine learning to a security application is more difficult than applying it in other industries.
Every year, malware writers evolve their code to find more efficient ways to avoid detection. Insider threats are also more prominent than in previous years, and machine learning has helped to stop insider threats whether they are intentional or unintentional. Insider threats occur when a disgruntled employee steals data, or when an employee falls victim to an attack and accidentally installs malware on the network.
With machine learning, your security application can take a snapshot of common file access. It uses the data as a benchmark to identify what is "normal" access. It continues to monitor files across the network and makes assumptions based on incoming data. When file access is suspicious, the security application sends an alert to the administrator or locks the file. This type of security and file access analysis is a good example of machine learning currently implemented in the information security industry.
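The snapshot-and-monitor approach described above can be sketched as follows. This is a deliberately simplified illustration, not how any particular product works: the class name `FileAccessMonitor`, the `(user, path)` event format, and the `tolerance` multiplier are all assumptions made for the example.

```python
from collections import defaultdict


class FileAccessMonitor:
    """Hypothetical monitor: learns "normal" file access per user, then flags deviations."""

    def __init__(self, tolerance: float = 3.0) -> None:
        self.known_paths: dict[str, set[str]] = defaultdict(set)
        self.daily_average: dict[str, float] = {}
        self.tolerance = tolerance  # assumed multiplier over the usual daily volume

    def learn_baseline(self, events_by_day: list[list[tuple[str, str]]]) -> None:
        """Take the snapshot: each inner list holds one day's (user, path) events."""
        if not events_by_day:
            return
        totals: dict[str, int] = defaultdict(int)
        for day in events_by_day:
            for user, path in day:
                self.known_paths[user].add(path)
                totals[user] += 1
        for user, total in totals.items():
            self.daily_average[user] = total / len(events_by_day)

    def review_day(self, events: list[tuple[str, str]]) -> list[tuple[str, str]]:
        """Compare one day's events with the baseline and return (user, reason) alerts."""
        alerts = []
        counts: dict[str, int] = defaultdict(int)
        for user, path in events:
            counts[user] += 1
            if path not in self.known_paths[user]:
                alerts.append((user, f"unfamiliar file: {path}"))
        for user, count in counts.items():
            expected = max(self.daily_average.get(user, 0.0), 1.0)
            if count > self.tolerance * expected:
                alerts.append((user, f"unusual volume: {count} accesses"))
        return alerts
```

In this sketch an alert is simply returned to the caller. A real application would decide at that point whether to notify the administrator or lock the file, and would keep updating the baseline as legitimate habits change.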
One issue with security and AI is that both topics are extremely complex. Information security is difficult on its own, and adding AI to the mix makes it even more difficult. The main reason the technology isn't more widespread is the limited availability of people who specialize in both AI and security. You need data scientists, developers who have worked with big data and machine learning, and then the right people to implement the solution.
You also need people who can monitor the solution to ensure that false positives and false negatives are kept to a minimum. These people need to make the final call on alerts sent by the security software. Reducing the number of false notifications depends on the expertise of the people installing the software, and these people are difficult to find.
The cyber security talent pool is small compared to other areas of IT such as administration and development. Cyber security is a deep field that is much more difficult to master, so many applicants focus on basic information technology instead. It's also an industry that doesn't have many junior-level employees, because new applicants need to get experience in basic network administration before moving on to cyber security.
Another issue tied to finding talent is its cost. Having a security analyst on hand 40 hours a week (plus overtime for on-call support) is expensive. It's an expense that most small businesses can't budget for.
Machine Learning and the Future of Security
Machine learning is the future of technology in all aspects of the industry. Most industries that use IT, such as healthcare, eCommerce, business, and sales, have research under way in the field. That research has opened several doors to building better software that can take big data and learn from it to help us humans make educated, efficient decisions. This technology can even make educated projections of what could happen in the future, such as which product will sell best or what the next malware signature could be.
Although it's not widespread just yet, machine learning in security will greatly help companies reduce the impact of risk and improve protection of customer data. It will be a long time before it's affordable, but as with most technology, it will become available to small businesses with time.