How to Build a Smart Intrusion Detection System With OpenCV and Python

Feb 11, 2022•14 min read

For small and midsize businesses, cyber security solutions are becoming a necessity. Due to COVID-19, remote work has become more prevalent over the past few months, expanding the threat landscape.

To monitor security events on the network, businesses need to implement intrusion detection systems (IDS). These systems use hardware and software to detect network threats and malicious activity so that they can be mitigated.

IDSs collect and analyze malicious activity information and send it to an IT team for analysis.

What is an intrusion detection system?

The main purpose of an intrusion detection system is to monitor network traffic for suspicious activity and raise an alert when such activity is detected. Many intrusion detection systems can also take action when they capture malicious activity, for example by blocking suspicious Internet Protocol (IP) addresses.

In contrast to an IDS, an intrusion prevention system (IPS) monitors network packets for potentially damaging traffic, with the primary goal of preventing threats rather than simply detecting them.

How does an intrusion detection system work?

An IDS monitors all the traffic between devices on a network. It essentially functions as a secondary firewall behind the primary one, identifying malicious packets based on two kinds of clues:

A signature of a known attack.
Abrupt changes in routine.

An intrusion detection system detects threats by analyzing patterns. Using this technique, IDSs can compare network packets with a database of cyberattack signatures.
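
The signature comparison described above can be illustrated with a minimal sketch. The signature database, its entry names, and the byte patterns below are hypothetical examples chosen for illustration. A real IDS uses far richer rule formats than plain substring matching.

```python
# Hypothetical signature database: name -> byte pattern seen in a known attack.
SIGNATURES = {
    "sql-injection": b"' OR '1'='1",
    "path-traversal": b"../../",
    "shellshock": b"() { :; };",
}


def match_signatures(payload: bytes, signatures: dict = SIGNATURES) -> list:
    """Return the names of all signatures whose pattern occurs in the payload."""
    return [name for name, pattern in signatures.items() if pattern in payload]


benign = b"GET /index.html HTTP/1.1"
suspicious = b"GET /login?user=admin' OR '1'='1 HTTP/1.1"

print(match_signatures(benign))      # no signature matches
print(match_signatures(suspicious))  # flags the SQL injection pattern
```

Substring matching keeps the idea visible: a packet is flagged only if it contains a pattern that is already in the database, which is why unknown attacks slip past this kind of check.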

By correlating patterns, an IDS can flag attacks such as:

Malware (worms, ransomware, trojans, viruses, bots, etc.).
Scanning attacks, which send packets to the network to find out which ports are open or closed, what type of traffic is accepted, and what software is installed.
Asymmetric routing, in which malicious packets enter and exit through different routes to bypass security controls.
Buffer overflow attacks, which try to replace legitimate content in memory with malicious executables.
Protocol-specific attacks that target ICMP, TCP, ARP, etc.
DDoS attacks, which overload networks with traffic.

When an anomaly is detected, the IDS flags it and raises an alarm. The alarm can take any form, from a note in the audit log to an urgent message to the IT administrator. The team then diagnoses the problem and identifies its root cause.

Intrusion detection system using Python

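For this project, we will use the Python programming language along with two libraries: OpenCV and NumPy. Unlike the network-based systems described above, this project detects physical intruders by looking for motion in a video feed.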

We will start by importing the libraries. If you haven't already installed these libraries you can install them using the pip command. Just open your command prompt, paste the below command, hit enter and wait for the libraries to be installed.

pip install opencv-python numpy

[Image: How to install OpenCV-Python]

If the command above finishes without an error, the libraries are installed successfully.

You can use either your laptop's camera or a video file for this project. The first step is to open the video file or, if you are using the laptop camera, start the video stream. Then we have to define the region that we want to protect.

For this, we will create a click_event function that lets you select a rectangular region in the frame. First, press the left mouse button at the starting point L1, then drag the pointer to the end point L2. By default, the whole frame is used, so you can skip this step by pressing any key to continue.

[Image: Region of interest to determine intrusion]

[Image: Code to set the region of interest]

As the snippet above shows, we use OpenCV mouse events to create the region of interest. A flag named "draw" tracks the state of the mouse pointer. When the flag is 1, the user is drawing the region of interest. Once they are done, the flag returns to 0, which allows the user to redraw the region. We store the coordinates of the region of interest in global variables so that we can use them later in the code.

Next, we start reading frames from the video, as shown in the snippet below. We take two consecutive frames and focus on the region of interest that we defined in the first step.

[Image: Code snippet to read frames of the video in the IDS]
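
A minimal sketch of this step is shown below. The frame_pairs helper and its roi parameter are hypothetical names, not taken from the original code. The function accepts any object with a read() method that behaves like cv2.VideoCapture.read().

```python
def frame_pairs(capture, roi=None):
    """Yield (previous, current) consecutive frames, cropped to roi if given.

    capture is anything with a read() method that returns (ok, frame), such
    as cv2.VideoCapture. roi is (x1, y1, x2, y2) in pixel coordinates.
    """
    ok, previous = capture.read()
    while ok:
        ok, current = capture.read()
        if not ok:
            break
        if roi is None:
            yield previous, current
        else:
            x1, y1, x2, y2 = roi
            yield previous[y1:y2, x1:x2], current[y1:y2, x1:x2]
        previous = current


# Example wiring (not run here, because it needs a camera or a video file):
# import cv2
# capture = cv2.VideoCapture(0)   # 0 = default camera, or pass a file path
# for previous, current in frame_pairs(capture, roi=(100, 50, 400, 300)):
#     ...                         # compare the two frames
# capture.release()
```

Writing the loop as a generator keeps the capture logic separate from the comparison logic, so the same code works for a webcam and for a recorded file.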

Processing color images is not very helpful for this task and makes it slower, so we convert the frames to grayscale. Note that OpenCV reads frames in BGR rather than RGB channel order. We then check whether anything changed between the two consecutive frames by using OpenCV's absdiff() function.

The absdiff() function returns the absolute per-pixel difference between the two frames. The next step is to turn this difference image into a mask that contains only white or black pixels, which we can do with image thresholding. White pixels mark changes between the frames, whereas black pixels mark areas where the two frames are similar. Because the changed regions can have sharp edges and gaps between them, they can be hard to detect in some cases, so we apply some image processing techniques to fix this.

We use Gaussian blur to smooth the image and dilation to fill the gaps between the white regions of the mask. After this processing, our masked image looks as below.

[Image: Gaussian blur technique to smooth the images]

After this, we calculate the area of each white segment in the image. To do so, we use OpenCV's findContours() function, which extracts the boundary points of each segment. You can view these contours with the drawContours() function. We then pass the points to the contourArea() function, which returns the area of each contour. We can ignore very small changes, as they may just be noise.

For that, we define a threshold area. If the contour area is less than this threshold (900 in our case), we ignore that contour; otherwise, we treat it as an intrusion. To show that we have detected an intrusion, we surround the contour with a green bounding box.

[Image: Code to set the threshold that determines the level of intrusion]

[Image: Intruder output]

We can even use a buzzer to alert the owner. Consider it an assignment to implement the buzzer alarm in this project. There are several libraries you can use for that, such as winsound and beepy.
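
As a starting point for that assignment, here is one possible sketch. It uses winsound, which ships with Python on Windows only, and falls back to the terminal bell character elsewhere. The sound_alarm name, the frequency, and the duration are arbitrary choices.

```python
import sys

try:
    import winsound  # standard library, available on Windows only
except ImportError:
    winsound = None


def sound_alarm(frequency_hz=2500, duration_ms=500):
    """Beep to signal an intrusion; return the name of the backend used."""
    if winsound is not None:
        winsound.Beep(frequency_hz, duration_ms)
        return "winsound"
    # Fallback: the ASCII bell character, which many terminals render as a beep.
    sys.stdout.write("\a")
    sys.stdout.flush()
    return "terminal-bell"
```

You would call sound_alarm() whenever a frame contains at least one contour above the area threshold.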

Types of intrusion detection systems

The deployment of intrusion detection systems varies according to the environment. Like many other cybersecurity solutions, a modern IDS can be either host-based or network-based.

1. Host-Based IDS (HIDS)

A host-based IDS is deployed on a specific endpoint to protect it against external and internal threats.

Features of a HIDS include:

Monitoring network traffic to and from a machine.
Watching running processes.
Inspecting the system logs.

The visibility of a host-based IDS is limited to the host machine, which reduces the context available for decision-making. However, it provides deep insight into the internals of the host.

2. Network-Based IDS (NIDS)

IDS solutions that monitor a network as a whole are called network-based solutions. They can view all packet information and make decisions based on the contents and metadata of each packet.

Therefore, these systems can detect widespread threats but do not have visibility into the insides of the endpoints they protect.

Due to different levels of visibility, implementing HIDS or NIDS in isolation does not fully protect an organization's systems. Integrated threat management solutions offer more comprehensive security since they combine many technologies into one package.

3. A signature-based intrusion detection system (SIDS)

This system cross-checks all packets passing through a network against a built-in database of attack signatures. This database consists of known malicious threats. In other words, this system works like antivirus software.

4. An anomaly-based intrusion detection system (AIDS)

This system analyzes network traffic by comparing it against a baseline established for the network in terms of bandwidth, protocols, ports, etc. In this type of system, the baseline is created using machine learning.

Upon detecting suspicious activity or policy violations, it alerts the IT team. Anomaly-based detection uses a broader model instead of specific signatures and attributes to overcome the limitations of signature-based detection.
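
To make the idea of a baseline concrete, the sketch below uses a simple statistical stand-in (mean and standard deviation) instead of a trained machine learning model. The traffic numbers, the function names, and the cutoff of three standard deviations are all hypothetical.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarise normal traffic as (mean, standard deviation)."""
    return mean(samples), stdev(samples)


def is_anomalous(value, baseline, max_z=3.0):
    """Flag a value more than max_z standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > max_z


# Hypothetical requests-per-minute counts observed during normal operation.
normal_traffic = [98, 102, 101, 97, 103, 99, 100, 104, 96, 100]
baseline = build_baseline(normal_traffic)

print(is_anomalous(105, baseline))  # within normal variation
print(is_anomalous(450, baseline))  # far outside the baseline
```

Unlike a signature check, this approach needs no prior knowledge of the attack. It only needs a reliable picture of what normal looks like, which is also why a poor baseline leads to false alarms.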

What is the purpose of an intrusion detection system?

In this age of information, there is no such thing as an impenetrable firewall or network. The attackers are continually creating new exploits and attacks to circumvent your defenses. Various malware or social engineering techniques are available and used by attackers to gain access to your network and data.

To maintain network security, you need an intrusion detection system (IDS) that monitors the network, detects malicious traffic, and responds to it. The primary responsibility of an intrusion detection system is to alert IT personnel about any possible attack or network intrusion. An IDS monitors both incoming and outgoing traffic, including data traversing between systems within a network.

IDSs monitor network traffic and trigger alerts when suspicious activity or known threats are detected. This allows IT personnel to investigate further and take action to stop attacks.

The benefits of intrusion detection systems

Intrusion detection systems (IDS) can be an integral part of a company's security plan. Here are some of the benefits of IDS you can take advantage of.

They can get tuned to specific content in network packets

Although firewalls can provide information about the ports and IP addresses used between two hosts, NIDSs can present data about the specific content within packets. For example, this can help uncover exploitation attacks or compromised endpoint devices that are part of a botnet.

They can look at data in the context of the protocol

As part of its protocol analysis, a NIDS examines the payloads of TCP and UDP. Because the sensors know how protocols should function, they can detect suspicious activity.

They can qualify and quantify attacks

An IDS analyzes the number and types of attacks. Using this information, you can implement new and more effective security controls or change your security systems. The same information can help identify configuration problems or bugs in network devices. Over time, these metrics can be used to assess risk.

They make it easier to keep up with regulations

IDSs help you meet security regulations as they provide visibility across your network. Depending on your requirements, logs from your IDS can be helpful in the documentation.

They can boost efficiency

With IDS sensors, you can identify which operating systems and services your network is using by inspecting the packet data. By automating this, you will save a lot of time. It is also possible to automate hardware inventories using an IDS, which further cuts labor expenses.

What are the challenges of intrusion detection?

IDS technology has been in use for a long time, so it is not surprising that it faces some challenges in the modern IT environment. Malicious attackers have developed evasion techniques to fool IDS technology into missing intrusions.

The following are some IDS evasion techniques:

Fragmentation

By fragmenting the attack payload into many packets, the attack remains undetected. It is difficult to bypass an IDS simply with small packets, but the attacker can make them reassemble in a complicated way to dodge detection.

A common method of implementing fragmentation is to pause between parts of the payload, hoping that the IDS will time out before it receives the entire payload. Packets can also be sent out of order to confuse the IDS but not the target host, or a fragment can overwrite data from a previous packet.
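
The following sketch shows why fragmentation defeats a naive per-packet check and why reassembly helps. The signature, the fragment layout, and the function names are hypothetical, and real reassembly must also deal with timeouts and overlapping fragments.

```python
SIGNATURE = b"/etc/passwd"  # hypothetical attack signature


def per_packet_match(fragments, signature=SIGNATURE):
    """Naive check: inspect each fragment in isolation."""
    return any(signature in data for _, data in fragments)


def reassembled_match(fragments, signature=SIGNATURE):
    """Order fragments by offset, rebuild the payload, then inspect it."""
    payload = b"".join(data for _, data in sorted(fragments))
    return signature in payload


# (offset, data) pairs sent out of order; the signature straddles two fragments.
fragments = [(12, b"passwd HTTP/1.1"), (0, b"GET /../etc/")]

print(per_packet_match(fragments))   # the naive check misses the attack
print(reassembled_match(fragments))  # reassembly reveals the signature
```

The cost of the second approach is that the IDS has to buffer fragments until the payload is complete, which is exactly what the timeout trick described above tries to exploit.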

Obscurity

This evasion technique manipulates protocols, for example by using different ports, to bypass detection. The intrusion will not be detected if the IDS does not handle these protocol violations in the same way as the target host does.

Low-bandwidth attacks

By coordinating an attack across several sources, an attacker can imitate benign traffic or noise, such as that produced by online scanners, and avoid IDS detection for a considerable period.

As a result, the IDS has difficulty correlating all the packets to determine whether they are harmless or malicious. Beyond these common evasion techniques, IDS technology faces other challenges.

First, IDSs often raise false alarms (false positives) or fail to raise an alarm for a real attack (false negatives). These are an IDS's biggest weaknesses. In addition to generating noise, false positives can reduce the efficacy of the IDS itself and of security operations centers (SOCs). With a high false positive rate, security teams can become fatigued and real threats can go unnoticed.
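
A short worked example shows how quickly false positives can swamp real alerts. The event counts below are invented for illustration, and the alert_metrics helper is a hypothetical name.

```python
def alert_metrics(true_pos, false_pos, true_neg, false_neg):
    """Compute common alert-quality rates from confusion-matrix counts."""
    return {
        "false_positive_rate": false_pos / (false_pos + true_neg),
        "false_negative_rate": false_neg / (false_neg + true_pos),
        "precision": true_pos / (true_pos + false_pos),
    }


# Hypothetical day of traffic: 10,000 events, 50 of them real attacks.
metrics = alert_metrics(true_pos=45, false_pos=199, true_neg=9751, false_neg=5)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these numbers the false positive rate is only 2 percent, yet fewer than one in five alerts corresponds to a real attack, because benign events vastly outnumber attacks. This is the arithmetic behind alert fatigue.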

What are the limitations of intrusion detection systems?

Businesses use the internet more and more, which increases the risk of IT intrusions. These intrusions can lead to security breaches that expose sensitive company information and cause the loss of proprietary information.

Most companies install intrusion detection software as a first line of defense. Intrusion detection software can improve network security, but it also has some limitations.

Source addresses

Intrusion detection software uses the network address in an IP packet to provide information about the packet as soon as it enters the network. This is helpful when the address is accurate. IP packets can, however, contain fake or confusing addresses.

If the IT technician team faces either of these scenarios, they will get caught chasing ghosts and will not be able to prevent network intrusions.

Encrypted packets

Intrusion detection software does not process encrypted packets. This means that an encrypted packet can allow an intrusion into the network that is not discovered until a much more significant intrusion takes place.

Encrypted packets planted in a network can be set to activate automatically at a certain time or date. Because intrusion detection software cannot inspect them, it cannot prevent the resulting release of a virus or other malicious code into the network.

Analytical module

When intrusion detection is performed, the analytical module can only analyze a limited amount of information from the source. A limitation like this results in a buffering of part of the source data.

While IT professionals can be alerted of abnormal behavior, they cannot identify the origin of the behavior. It is only possible to stop unauthorized access to the network if valid information is provided. IT professionals can take a defensive approach if more information is available.

False alarms

Network intrusion detection systems can detect unusual behavior on networks. The disadvantage is that the software can generate many false alarms when it cannot reliably tell abnormal network usage from normal usage. On networks with multiple users, the number of false alarms increases.

IT professionals must receive detailed training so they can identify false alarms and will not have to chase them. The expense of completing this training is another disadvantage of intrusion detection software that companies have to deal with.

How is machine learning used in intrusion detection systems?

Here are a few things you should know before getting started:

Machine learning is concerned with how computers can learn and improve without explicit programming. It focuses on developing programs that can learn from data.
To learn, a program observes or collects data, examines that data to find patterns, and makes predictions based on those patterns.
The primary objective is for computer programs to learn without human intervention or assistance and to adjust their actions accordingly.

The following categories can be used to classify machine learning algorithms:

Supervised machine learning algorithms

Using labeled examples, a supervised algorithm applies what it has learned to predict future events. From the training dataset, the algorithm produces an inferred function that predicts the output value. With sufficient training, the system can provide targets for any new input.

After training, a new set of examples is given to the model so that it can produce a correct outcome based on what it learned from the labeled data.
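
As a small illustration of learning from labeled examples, the sketch below trains a nearest-centroid classifier with NumPy. The feature values, the two labels, and the function names are invented for the example. Nearest-centroid is just one simple supervised method, picked here for brevity.

```python
import numpy as np


def fit_centroids(features, labels):
    """Learn one mean feature vector (centroid) per label."""
    return {
        str(label): features[labels == label].mean(axis=0)
        for label in np.unique(labels)
    }


def predict(centroids, sample):
    """Assign the label whose centroid is closest to the sample."""
    return min(centroids, key=lambda label: np.linalg.norm(sample - centroids[label]))


# Hypothetical labeled connections: [packets per second, distinct ports contacted].
features = np.array(
    [[10, 1], [12, 2], [9, 1], [300, 40], [280, 55], [320, 48]], dtype=float
)
labels = np.array(["normal", "normal", "normal", "attack", "attack", "attack"])

centroids = fit_centroids(features, labels)
print(predict(centroids, np.array([11.0, 2.0])))
print(predict(centroids, np.array([290.0, 50.0])))
```

The labels are what make this supervised: the model never has to discover what an attack looks like, it only has to generalise from the examples it was given.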

Self-supervised machine learning

Self-supervised learning is a part of the machine learning process that involves training a model to learn one part of the input from another part of the dataset. It is also known as pretext learning or predictive learning.

This process transforms the unsupervised problem into a supervised one by using auto-generated labels. With so much unlabeled data available, setting the right learning objectives is essential to extract supervision from the data.

For example, weather forecasting is a complex process in which most of the available data is not labeled. In such cases, self-supervised learning can play a vital role in producing accurate and precise climate reports.

Unsupervised machine learning algorithms

In unsupervised learning, the training data carries no labels or classifications. Working from unlabeled data, the algorithm tries to infer a function that describes hidden structure in that data.

An unsorted set of information is grouped, without any prior training, according to patterns, similarities, and differences.
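
Grouping without labels can be sketched with a minimal k-means clustering (k = 2) in NumPy. The data points reuse invented traffic features, and the deterministic choice of starting centers is a simplification made so the example is reproducible.

```python
import numpy as np


def two_means(points, iterations=10):
    """Split points into two clusters with a minimal k-means (k = 2)."""
    # Deterministic start: the points with the smallest and largest norm.
    norms = np.linalg.norm(points, axis=1)
    centers = points[[norms.argmin(), norms.argmax()]].copy()
    for _ in range(iterations):
        # Distance from every point to each of the two centers.
        distances = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        assignment = distances.argmin(axis=1)
        for k in range(2):
            if np.any(assignment == k):
                centers[k] = points[assignment == k].mean(axis=0)
    return assignment, centers


# Hypothetical unlabeled connections: [packets per second, distinct ports contacted].
points = np.array(
    [[10, 1], [12, 2], [9, 1], [300, 40], [280, 55], [320, 48]], dtype=float
)

assignment, centers = two_means(points)
print(assignment.tolist())
```

The algorithm separates the quiet connections from the noisy ones without being told which is which. Deciding whether a cluster represents an attack still requires a human or a further rule.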

Semi-supervised machine learning algorithms

This method trains on a blend of a small amount of labeled data and a larger amount of unlabeled data. Semi-supervised learning falls between unsupervised learning and supervised learning. Generally, semi-supervised techniques are used when you lack enough labeled data to produce a robust model, or when you do not have the means and resources to label additional data.

What is an intrusion prevention system (IPS)?

IPS configurations complement IDS configurations by inspecting incoming traffic for malicious requests and weeding them out. Security solutions like web application firewalls and traffic filtering are typically used as part of an IPS configuration.

An IPS prevents attacks by dropping malicious packets, blocking offending IP addresses, and warning security personnel of potential threats.

Such a system recognizes attacks using a pre-existing database of signatures and can also recognize attacks based on traffic and behavioral anomalies.

Although they are effective at blocking known attack vectors, some IPS systems have limitations. Because they over-rely on predefined rules, it is common for them to produce false positives.
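
The three IPS actions mentioned in this section (dropping malicious packets, blocking offending IP addresses, and warning security personnel) can be illustrated with a toy filter. The SimpleIPS class, the signatures, and the two-strike rule are all hypothetical.

```python
SIGNATURES = [b"' OR '1'='1", b"../../"]  # hypothetical attack patterns


class SimpleIPS:
    """Toy inline filter: drop malicious packets and block repeat offenders."""

    def __init__(self, max_strikes=2):
        self.max_strikes = max_strikes
        self.strikes = {}     # source IP -> number of malicious packets seen
        self.blocked = set()  # source IPs that are no longer accepted
        self.alerts = []      # messages for security personnel

    def handle(self, source_ip, payload):
        """Return 'forward' or 'drop' for one packet."""
        if source_ip in self.blocked:
            return "drop"
        if any(signature in payload for signature in SIGNATURES):
            self.strikes[source_ip] = self.strikes.get(source_ip, 0) + 1
            self.alerts.append(f"malicious packet from {source_ip}")
            if self.strikes[source_ip] >= self.max_strikes:
                self.blocked.add(source_ip)
            return "drop"
        return "forward"


ips = SimpleIPS()
print(ips.handle("203.0.113.7", b"GET /../../etc/passwd"))  # dropped, strike 1
print(ips.handle("203.0.113.7", b"GET /../../etc/shadow"))  # dropped, now blocked
print(ips.handle("203.0.113.7", b"GET /index.html"))        # dropped: IP is blocked
print(ips.handle("198.51.100.4", b"GET /index.html"))       # forwarded
```

The third call shows the trade-off of rule-based blocking: once an address is blocked, even its harmless requests are dropped, which is one way false positives arise.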

Conclusion

Intrusion detection systems can help businesses to some extent, but a combination of firewalls, IDSs, and IPSs is necessary for more comprehensive cyber security protection. As long as their signature databases are kept up to date, intrusion detection and prevention systems can be effective solutions.

Administrators are responsible for configuring and monitoring IPS according to enterprise requirements. In recent years, businesses have increasingly turned to managed IDS, IPS, and IDPS services.

Python and OpenCV are popular tools for building camera-based intrusion detection like the project in this article. They are powerful, intuitive, and work well together. This is one reason for the increasing demand for Python developers who can work on projects that search for security anomalies or possible intrusions.

Author
Turing Staff