Misguiding Deep Neural Networks: Generalized Pixel Attack


This article was published as a part of the Data Science Blogathon.


Before we go into the details, let us have a quick recap of the deep neural network.

Artificial Neural Network (ANN)

A neural network is a method that simulates the activity of the human brain and tries to mimic its decision-making capability. Superficially, it can be thought of as a network of an input layer, one or more hidden layers, and an output layer.

Each layer performs the specific task assigned to it and passes its result on to the next layer for further processing. This phenomenon is known as a “feature hierarchy,” and it comes in quite handy when dealing with unlabeled or unstructured data.

A typical Artificial Neural Network (Image by Ahmed Gad from Pixabay)

Convolutional Neural Network (CNN)

Convolutional Neural Networks (CNNs), or ConvNets for short, are architectures that are mostly applied to images. CNNs are suited to problem domains whose inputs exhibit spatial dependence. Another unique perspective of CNNs is that they extract increasingly abstract features as the input propagates from shallow to deeper layers.

A typical Convolutional Neural Network (Ref: Kang, Xu & Song, Bin & Sun, Fengyao. (2019). A Deep Similarity Metric Method Based on Incomplete Data for Traffic Anomaly Detection in IoT. Applied Sciences. 9. 135. 10.3390/app9010135.)

Misguiding Deep Neural Networks by Adversarial Examples

In 2013, researchers at Google and NYU (Szegedy et al.) affirmed that ConvNets can easily be fooled if the input is perturbed slightly. For example, a trained model recognizes a “Panda” with a confidence of about 58%, while the same model classifies the slightly perturbed image as a “Gibbon” with a much higher confidence of 99%. This is obviously an illusion for the network, which has been fooled by the inserted noise.

In 2016, a group of researchers at Google Brain, including Ian J. Goodfellow, showed that printed images, when captured through a camera and perturbed slightly, also resulted in misclassification.

The umbrella term for all these scenarios is the Adversarial example.

The Threat

Analysis of the vicinity of natural images shows that few-pixel perturbations can be regarded as cutting the input space with low-dimensional slices.

A straightforward way of limiting perceptibility is to restrict the number of modified pixels to as few as possible.

Mathematically, the problem can be posed as follows.

Let ‘f’ be the target image classifier, which receives n-dimensional inputs

x = (x1, x2, …, xn),

let t be the original class, and let ft(x) be the probability that x belongs to class t. The perturbation is an additive vector e(x) = (e1, …, en), and the limit on the maximum modification is L. For a chosen target class adv, the attacker solves:

maximize fadv(x + e(x))

subject to ‖e(x)‖0 <= L
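As a minimal sketch of the constraint above, the following illustrative Python snippet applies a perturbation e(x) to a toy image and checks that the number of modified pixels stays within the limit L (the image, helper names, and pixel values are all made up for demonstration):

```python
# Hypothetical sketch: apply a pixel perturbation e(x) to an image and
# verify the L0 constraint (number of modified pixels <= L).
def apply_perturbation(image, perturbations):
    """image: list of rows of (r, g, b) tuples; perturbations: list of (x, y, rgb)."""
    out = [row[:] for row in image]
    for x, y, rgb in perturbations:
        out[y][x] = rgb
    return out

def modified_pixel_count(original, perturbed):
    return sum(1 for row_a, row_b in zip(original, perturbed)
                 for a, b in zip(row_a, row_b) if a != b)

image = [[(0, 0, 0)] * 4 for _ in range(4)]             # a 4x4 black image
adv = apply_perturbation(image, [(1, 2, (255, 0, 0))])  # flip one pixel to red
L = 1
assert modified_pixel_count(image, adv) <= L            # constraint holds
```

A one-pixel attack is the extreme case L = 1: the search is then over which single pixel to change and to what color.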

The Attack: Targeted v/s Untargeted

An untargeted attack causes a model to misclassify an image to another class except for the original one. In contrast, a targeted attack causes a model to classify an image as a given target class. We want to perturb an image to maximize the probability of a class of our choosing.

The Defense

The aim is to increase the efficiency of the Differential Evolution algorithm so that the perturbation success rate improves, and to compare the performance of targeted and untargeted attacks.

Differential Evolution

Differential Evolution is a population-based optimization algorithm for solving complex multi-modal optimization problems.

Moreover, it has mechanisms in the population-selection phase that maintain diversity, so that in practice it is expected to find higher-quality solutions more efficiently than gradient-based methods, or even other kinds of evolutionary algorithms. During each iteration, a new set of candidate solutions (children) is generated from the current population (parents).

Why Differential Evolution?

There are three significant reasons to choose Differential Evolution, viz.,

Higher probability of finding global optima,

Requires less information from the target system (no gradients needed), and

Simplicity, since the method is independent of the classifier under attack.
In the context of a one-pixel attack, our input will be a flat vector of pixels, that is,

x = (x1, x2, …, xn)

First, we generate a random population of n perturbations,

P = (P1, P2, …, Pn)

Further, on each iteration, we calculate n new mutant children using the formula

Pi(g+1) = Pr1(g) + F · (Pr2(g) − Pr3(g))

such that

r1 ≠ r2 ≠ r3,

where r1, r2, r3 are random indices into the current population, g is the generation index, and F is the scale factor (typically 0.5).
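The DE/rand/1 scheme described above can be sketched as follows. This is a minimal illustrative implementation, not the paper's code: the toy `fitness` function stands in for querying the target classifier, and the population size, dimensions, and target vector are all assumptions:

```python
import random

# Minimal DE/rand/1 sketch. Each candidate encodes one perturbation as
# (x, y, r, g, b); the toy fitness stands in for the classifier query.
F = 0.5          # differential weight (scale factor)
POP_SIZE = 10
DIMS = 5         # (x, y, r, g, b)

def fitness(candidate):
    # toy objective: squared distance from an arbitrary target vector
    target = [2, 3, 255, 0, 0]
    return sum((c - t) ** 2 for c, t in zip(candidate, target))

random.seed(0)
population = [[random.uniform(0, 255) for _ in range(DIMS)] for _ in range(POP_SIZE)]
initial_best = min(fitness(p) for p in population)

for generation in range(50):
    for i in range(POP_SIZE):
        # pick three distinct parents r1 != r2 != r3, none equal to i
        r1, r2, r3 = random.sample([j for j in range(POP_SIZE) if j != i], 3)
        mutant = [population[r1][d] + F * (population[r2][d] - population[r3][d])
                  for d in range(DIMS)]
        # greedy selection: the child replaces its parent only if it is fitter
        if fitness(mutant) < fitness(population[i]):
            population[i] = mutant

best = min(population, key=fitness)
assert fitness(best) <= initial_best   # greedy selection never regresses
```

Because selection is greedy, the best fitness in the population can only improve (or stay the same) from one generation to the next.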
The standard DE algorithm has three primary candidates for improvement: the crossover, the selection, and the mutation operator.

The selection operator has remained unchanged from the original publication by Storn and Price through to state-of-the-art variants of DE, making it less likely that improvements here could significantly enhance performance.

Crossover has a large effect on the search and is of particular importance when optimizing non-separable functions.

Mutation: Can be changed

The mutation operator has been among the most studied parts of the algorithm, and the numerous variations published over the years provide all the required background knowledge. Changing it is just a matter of changing some variables or adding some terms to a linear equation. All this makes mutation the best step to improve.

The question then becomes which mutation operator to improve.

The DE/rand/1 operator seems to be the best choice because it has been studied extensively, and comparison between implementations becomes effortless. Moreover, it would be interesting to see how generating additional training data with dead pixels affects such attacks.

Whatever we put in (a one-pixel change, some error, noise, fuzz, or anything else), neural networks can produce false recognitions, in an entirely a priori unpredictable way.

References

Kurakin, Alexey, Ian Goodfellow, and Samy Bengio. “Adversarial examples in the physical world.” arXiv preprint arXiv:1607.02533 (2016).

Storn, Rainer, and Kenneth Price. “Differential evolution–a simple and efficient heuristic for global optimization over continuous spaces.” Journal of global optimization 11.4 (1997): 341-359.

Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., & Fergus, R. (2013). Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199.



Introductory Note On Deep Learning

Introduction to Deep Learning

Artificial Intelligence, deep learning, machine learning — whatever you’re doing if you don’t understand it — learn it. Because otherwise you’re going to be a dinosaur within 3 years.

-Mark Cuban

This article was published as a part of the Data Science Blogathon

This statement from Mark Cuban might sound drastic, but its message is spot on! We are in the middle of a revolution, a revolution caused by big data and a ton of computational power.

For a minute, think how a person would have felt in the early 20th century without understanding electricity. You would have been used to doing things in a particular manner for ages, and all of a sudden, things around you started changing. Tasks that once required many people could now be done by one person with electricity. We are going through a similar journey with machine learning and deep learning today.

So, if you haven’t explored or understood the power of deep learning – you should start it today. I have written this article to help you understand common terms used in deep learning.

Who should read this article?

If you are someone who wants to learn or understand deep learning, this article is meant for you. In this article, I will explain various terms used commonly in deep learning.

If you are wondering why I am writing this article, it is because I want you to start your deep learning journey without hassle or intimidation. When I first began reading about deep learning, there were several terms I had heard about, but trying to understand them was intimidating. Several words recur whenever we start reading about any deep learning application.

In this article, I have created something like a deep learning dictionary for you which you can refer to whenever you need the basic definition of the most common terms used. I hope after this article these terms wouldn’t haunt you anymore.

Terms related to Deep Learning

To help you understand various terms, I have broken them into 4 different groups. If you are looking for a specific term, you can skip to that section. If you are new to the domain, I would recommend that you go through them in the order I have written them.

Basics of Neural Networks

Common Activation Functions

Convolutional Neural Networks

Recurrent Neural Networks

Basics of Neural Networks

1) Neuron- Just like a neuron forms the basic element of our brain, a neuron forms the basic structure of a neural network. Just think of what we do when we get new information. When we get the information, we process it and then we generate an output. Similarly, in the case of a neural network, a neuron receives an input, processes it and generates an output that is either sent to other neurons for further processing or is the final output.

2) Weights – When input enters the neuron, it is multiplied by a weight. For example, if a neuron has two inputs, then each input will have an associated weight assigned to it. We initialize the weights randomly, and these weights are updated during the model training process. After training, the neural network assigns a higher weight to the input it considers more important as compared to the ones which are considered less important. A weight of zero denotes that the particular feature is insignificant.

Let’s assume the input to be a, and the weight associated to be W1. Then after passing through the node the input becomes a*W1

3) Bias – In addition to the weights, another linear component is applied to the input, called the bias. It is added to the result of multiplying the weight by the input, and it basically changes the range of the weighted input. After adding the bias, the result would look like a*W1+bias. This is the final linear component of the input transformation.

4) Activation Function – Once the linear component is applied to the input, a non-linear function is applied to it. This is done by applying the activation function to the linear combination. The activation function translates the input signals to output signals. The output after application of the activation function would look something like f(a*W1+b) where f() is the activation function.

In the below diagram we have “n” inputs given as X1 to Xn and corresponding weights Wk1 to Wkn. We have a bias given as bk. The weights are first multiplied by their corresponding inputs and are then added together along with the bias. Let us call this u.


The activation function is applied to u i.e. f(u) and we receive the final output from the neuron as yk = f(u)
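The neuron described above (weighted sum plus bias, then an activation) can be sketched in a few lines of Python. The inputs, weights, and bias below are made-up values for illustration:

```python
import math

# Sketch of a single neuron: u = sum(w_k * x_k) + b, output y = f(u),
# with f the sigmoid activation. Inputs and weights are illustrative.
def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

def neuron(inputs, weights, bias):
    u = sum(w * x for w, x in zip(weights, inputs)) + bias  # linear component
    return sigmoid(u)                                       # non-linear output

y = neuron([1.0, 2.0], [0.5, -0.25], 0.1)   # u = 0.5 - 0.5 + 0.1 = 0.1
assert 0.0 < y < 1.0                        # sigmoid output lies in (0, 1)
```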

Commonly applied Activation Functions

The most commonly applied activation functions are – Sigmoid, ReLU and softmax

a) Sigmoid – One of the most common activation functions used is Sigmoid. It is defined as:

sigmoid(x) = 1/(1 + e^(-x))

Source: Wikipedia

The sigmoid transformation generates a smoother range of values between 0 and 1. We might need to observe the changes in the output with slight changes in the input values; smooth curves allow us to do that and are hence preferred over step functions.

b) ReLU(Rectified Linear Units) – Instead of sigmoids, the recent networks prefer using ReLu activation functions for the hidden layers. The function is defined as:

f(x) = max(x,0).

source: cs231n

The major benefit of using ReLU is that it has a constant derivative value for all inputs greater than 0. The constant derivative value helps the network to train faster.

c) Softmax – Softmax activation functions are normally used in the output layer for classification problems. It is similar to the sigmoid function, with the only difference being that the outputs are normalized to sum up to 1. The sigmoid function would work in case we have a binary output, however in case we have a multiclass classification problem, softmax makes it really easy to assign values to each class which can be easily interpreted as probabilities.

It’s very easy to see it this way – Suppose you’re trying to identify a 6 which might also look a bit like 8. The function would assign values to each number as below. We can easily see that the highest probability is assigned to 6, with the next highest assigned to 8 and so on…
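The three activations just described can be sketched in plain Python; the example scores passed to softmax are illustrative:

```python
import math

# The three common activations: sigmoid, ReLU, and softmax.
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    return max(x, 0.0)

def softmax(scores):
    exps = [math.exp(s) for s in scores]   # exponentiate each raw score
    total = sum(exps)
    return [e / total for e in exps]       # normalize so the outputs sum to 1

probs = softmax([2.0, 1.0, 0.1])           # e.g. raw scores for three classes
assert abs(sum(probs) - 1.0) < 1e-9        # softmax outputs sum to 1
assert probs[0] == max(probs)              # highest score -> highest probability
assert relu(-3.0) == 0.0 and relu(2.5) == 2.5
```

The normalization step is exactly what distinguishes softmax from applying a sigmoid to each class score independently.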

5) Neural Network – Neural Networks form the backbone of deep learning. The goal of a neural network is to find an approximation of an unknown function. It is formed by interconnected neurons. These neurons have weights, and bias which is updated during the network training depending upon the error. The activation function puts a nonlinear transformation to the linear combination which then generates the output. The combinations of the activated neurons give the output.

A neural network is best defined by “Liping Yang” as –

“Neural networks are made up of numerous interconnected conceptualized artificial neurons, which pass data between themselves, and which have associated weights which are tuned based upon the network’s “experience.” Neurons have activation thresholds which, if met by a combination of their associated weights and data passed to them, are fired; combinations of fired neurons result in “learning”.

6) Input / Output / Hidden Layer – Simply as the name suggests, the input layer is the one that receives the input and is essentially the first layer of the network. The output layer is the one that generates the output and is the final layer of the network. The processing layers are the hidden layers within the network. These hidden layers are the ones that perform specific tasks on the incoming data and pass on the output generated by them to the next layer. The input and output layers are visible to us, while the intermediate layers are hidden.

Source: cs231n

7) MLP (Multi-Layer perceptron) – A single neuron would not be able to perform highly complex tasks. Therefore, we use stacks of neurons to generate the desired outputs. In the simplest network, we would have an input layer, a hidden layer and an output layer. Each layer has multiple neurons and all the neurons in each layer are connected to all the neurons in the next layer. These networks can also be called fully connected networks.

8) Forward Propagation – Forward Propagation refers to the movement of the input through the hidden layers to the output layers. In forward propagation, the information travels in a single direction FORWARD. The input layer supplies the input to the hidden layers and then the output is generated. There is no backward movement.

9) Cost Function – When we build a network, the network tries to predict the output as close as possible to the actual value. We measure this accuracy of the network using the cost/loss function. The cost or loss function tries to penalize the network when it makes errors.

Our objective while running the network is to increase our prediction accuracy and to reduce the error, hence minimizing the cost function. The most optimized output is the one with the least value of the cost or loss function.

If I define the cost function to be the mean squared error, it can be written as –

C = (1/m) ∑ (y − a)², where m is the number of training inputs, a is the predicted value, and y is the actual value for that particular example.

The learning process revolves around minimizing the cost.

10) Gradient Descent – Gradient descent is an optimization algorithm for minimizing the cost. To think of it intuitively, while climbing down a hill you should take small steps and walk down instead of just jumping down at once. Therefore, what we do is, if we start from a point x, we move down a little i.e. delta h, and update our position to x-delta h and we keep doing the same till we reach the bottom. Consider bottom to be the minimum cost point.


Mathematically, to find the local minimum of a function one takes steps proportional to the negative of the gradient of the function.

You can go through this article for a detailed understanding of gradient descent.
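The hill-climbing intuition above can be sketched on a toy one-dimensional cost, C(x) = (x − 3)², whose minimum is at x = 3 (the starting point and learning rate are arbitrary choices for illustration):

```python
# Gradient descent sketch on the toy cost C(x) = (x - 3)^2.
def grad(x):
    return 2 * (x - 3)       # dC/dx

x = 0.0                      # starting point
learning_rate = 0.1
for _ in range(100):
    x = x - learning_rate * grad(x)   # small step opposite the gradient

assert abs(x - 3.0) < 1e-3   # we end up very close to the minimum
```

Each update multiplies the distance to the minimum by (1 − 2 · learning_rate), which is why a learning rate that is too large can overshoot instead of converging.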

11) Learning Rate – The learning rate controls the amount by which the cost function is minimized in each iteration. In simple terms, the rate at which we descend towards the minima of the cost function is the learning rate. We should choose the learning rate very carefully: it should neither be so large that the optimal solution is missed, nor so low that it takes forever for the network to converge.


12) Backpropagation – When we define a neural network, we assign random weights and bias values to our nodes. Once we have received the output for a single iteration, we can calculate the error of the network. This error is then fed back to the network along with the gradient of the cost function to update the weights of the network. These weights are then updated so that the error in subsequent iterations is reduced. This updating of weights using the gradient of the cost function is known as back-propagation.

In back-propagation, the movement through the network is backwards: the error, along with the gradient, flows back from the output layer through the hidden layers, and the weights are updated.

13) Batches – While training a neural network, instead of sending the entire input in one go, we randomly divide the input into several chunks of equal size. Training on batches makes the model more generalized as compared to a model built when the entire data set is fed to the network in one go.

14) Epochs – An epoch is defined as a single training iteration of all batches in both forward and backpropagation. This means 1 epoch is a single forward and backwards pass of the entire input data.

The number of epochs you would use to train your network can be chosen by you. It is highly likely that a higher number of epochs would show higher network accuracy; however, it would also take longer for the network to converge. Also, you must take care that if the number of epochs is too high, the network might overfit.
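The batch and epoch definitions above can be sketched as a small loop; the dataset, batch size, and epoch count are illustrative:

```python
import random

# Sketch: split a dataset into equal-size batches at random and iterate epochs.
def make_batches(data, batch_size):
    data = data[:]            # copy so the original order is untouched
    random.shuffle(data)      # random division into chunks
    return [data[i:i + batch_size] for i in range(0, len(data), batch_size)]

data = list(range(100))       # 100 toy training samples
epochs = 3
seen_per_epoch = []
for _ in range(epochs):       # one epoch = one full pass over all batches
    batches = make_batches(data, batch_size=20)
    seen_per_epoch.append(sum(len(b) for b in batches))

assert all(s == len(data) for s in seen_per_epoch)  # each epoch covers all data
assert len(batches) == 5                            # 100 samples / batch size 20
```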

15) Dropout – Dropout is a regularization technique that prevents over-fitting of the network. As the name suggests, during training a certain number of neurons in the hidden layer is randomly dropped. This means that the training happens on several architectures of the neural network on different combinations of neurons. You can think of drop out as an ensemble technique, where the output of multiple networks is then used to produce the final output.

16) Batch Normalization – As a concept, batch normalization can be considered as a dam set at specific checkpoints in a river. This is done to ensure that the distribution of the data is the same as what the next layer expects to receive. When we are training the neural network, the weights are changed after each step of gradient descent. This changes the shape of the data being sent to the next layer.

But the next layer was expecting a distribution similar to what it had previously seen. So we explicitly normalize the data before sending it to the next layer.

Convolutional Neural Networks in Deep Learning

17) Filters – A filter in a CNN is like a weight matrix with which we multiply a part of the input image to generate a convoluted output. Let’s assume we have an image of size 28*28. We randomly assign a filter of size 3*3, which is then multiplied with different 3*3 sections of the image to form what is known as a convoluted output. The filter size is generally smaller than the original image size. The filter values are updated like weight values during backpropagation for cost minimization.

Consider the below image. Here filter is a 3*3 matrix which is multiplied with each 3*3 section of the image to form the convolved feature.
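The sliding-filter operation just described can be sketched in plain Python. This minimal example convolves a made-up 4*4 image with a 3*3 filter over the valid positions only (no padding):

```python
# Slide a 3x3 filter over a small image to form the convolved feature.
def convolve2d(image, kernel):
    k = len(kernel)
    h, w = len(image), len(image[0])
    out = []
    for i in range(h - k + 1):          # every valid vertical position
        row = []
        for j in range(w - k + 1):      # every valid horizontal position
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(k) for b in range(k)))
        out.append(row)
    return out

image = [[1, 0, 0, 1],
         [0, 1, 0, 0],
         [0, 0, 1, 0],
         [1, 0, 0, 1]]
kernel = [[1, 0, 0],
          [0, 1, 0],
          [0, 0, 1]]                     # responds to diagonal patterns
feature = convolve2d(image, kernel)
assert len(feature) == 2 and len(feature[0]) == 2   # (4-3+1) x (4-3+1) output
assert feature[0][0] == 3                           # full diagonal match: 1+1+1
```

In a real CNN the kernel values would be learned during backpropagation rather than fixed by hand.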

18) CNN (Convolutional neural network) – Convolutional neural networks are basically applied to image data. Suppose we have an input of size (28*28*3), If we use a normal neural network, there would be 2352(28*28*3) parameters. And as the size of the image increases the number of parameters becomes very large. We “convolve” the images to reduce the number of parameters (as shown above in filter definition). As we slide the filter over the width and height of the input volume we will produce a 2-dimensional activation map that gives the output of that filter at every position. We will stack these activation maps along the depth dimension and produce the output volume.

You can see the below diagram for a clearer picture.

Source: cs231n

19) Pooling – It is common to periodically introduce pooling layers in between the convolution layers. This is basically done to reduce the number of parameters and prevent over-fitting. The most common type of pooling is a pooling layer of filter size (2,2) using the MAX operation. What it would do is take the maximum of each 2*2 block of the input.

Source: cs231n

You can also pool using other operations like Average pooling, but max-pooling has shown to work better in practice.
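A (2,2) max-pool, as described above, can be sketched as follows; the feature map values are illustrative:

```python
# Max pooling: take the maximum of each non-overlapping 2x2 block.
def max_pool(feature, size=2):
    h, w = len(feature), len(feature[0])
    return [[max(feature[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, w, size)]
            for i in range(0, h, size)]

feature = [[1, 3, 2, 4],
           [5, 6, 7, 8],
           [3, 2, 1, 0],
           [1, 2, 3, 4]]
pooled = max_pool(feature)
assert pooled == [[6, 8], [3, 4]]   # maximum of each 2x2 block
```

Note that pooling halves each spatial dimension here, which is how it reduces the parameter count of later layers.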

20) Padding – Padding refers to adding an extra layer of zeros across the images so that the output image has the same size as the input. This is known as the same padding.

After the application of filters,  the convolved layer in the case of the same padding has a size equal to the actual image.

Valid padding refers to keeping the image as such and having all the pixels of the image which are actual or “valid”. In this case, after the application of filters, the size of the length and the width of the output keeps getting reduced at each convolutional layer.

21) Data Augmentation – Data Augmentation refers to the addition of new data derived from the given data, which might prove to be beneficial for prediction. For example, it might be easier to view the cat in a dark image if you brighten it, or for instance, a 9 in the digit recognition might be slightly tilted or rotated. In this case, the rotation would solve the problem and increase the accuracy of our model. By rotating or brightening we’re improving the quality of our data. This is known as Data augmentation.

Recurrent Neural Network in Deep Learning

Source: cs231n

23) RNN (Recurrent Neural Network) – Recurrent neural networks are used especially for sequential data, where the previous output is used to predict the next one. In this case, the networks have loops within them. The loops within the hidden neurons give them the capability to store information about the previous inputs for some time in order to predict the output. The output of the hidden layer is fed back into the hidden layer for t time steps. The unfolded neuron looks like the above diagram. The output of the recurrent neuron goes to the next layer only after completing all the time steps. The output sent is more generalized, and the previous information is retained for a longer period.

The error is then backpropagated according to the unfolded network to update the weights. This is known as backpropagation through time(BPTT).
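The recurrent step described above can be sketched with a single hidden unit; the weights, input sequence, and tanh activation are illustrative choices:

```python
import math

# Minimal recurrent step: the hidden state h_t depends on the current input
# x_t and the previous hidden state h_{t-1}. Weights are illustrative.
def rnn_step(x_t, h_prev, w_x=0.5, w_h=0.8, b=0.0):
    return math.tanh(w_x * x_t + w_h * h_prev + b)

h = 0.0                            # initial hidden state
for x_t in [1.0, 0.5, -0.3]:       # a short input sequence
    h = rnn_step(x_t, h)           # the state carries information across steps

assert -1.0 < h < 1.0              # tanh keeps the state bounded
```

Unfolding this loop over the three time steps gives exactly the chain through which BPTT would propagate the error.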

24) Vanishing Gradient Problem – Vanishing gradient problem arises in cases where the gradient of the activation function is very small. During backpropagation when the weights are multiplied with these low gradients, they tend to become very small and “vanish” as they go further deep in the network. This makes the neural network forget the long-range dependency. This generally becomes a problem in cases of recurrent neural networks where long term dependencies are very important for the network to remember.

This can be solved by using activation functions like ReLu which do not have small gradients.

25) Exploding Gradient Problem – This is the exact opposite of the vanishing gradient problem, where the gradient of the activation function is too large. During backpropagation, it makes the weight of a particular node very high with respect to the others rendering them insignificant. This can be easily solved by clipping the gradient so that it doesn’t exceed a certain value.
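Gradient clipping, the fix mentioned above, can be sketched as rescaling the gradient vector whenever its norm exceeds a threshold (the threshold and gradient values are illustrative):

```python
import math

# Gradient clipping: rescale the gradient if its norm exceeds max_norm,
# so no single update step can explode.
def clip_gradient(grad, max_norm=1.0):
    norm = math.sqrt(sum(g * g for g in grad))
    if norm > max_norm:
        return [g * max_norm / norm for g in grad]
    return grad

clipped = clip_gradient([30.0, 40.0], max_norm=5.0)   # norm 50 -> rescaled to 5
assert clipped == [3.0, 4.0]
assert clip_gradient([0.3, 0.4]) == [0.3, 0.4]        # small gradients untouched
```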

End Notes

The media shown in this article is not owned by Analytics Vidhya and are used at the Author’s discretion.

New Robotic Hands Let Deep-Sea Divers Feel

A new remote-controlled robotic hand will allow deep-sea divers to handle and feel objects underwater almost as easily as they can in air. This could transform deep-water operations, from marine biology to pipeline repair.

Atmospheric Dive Suits (ADS) are required for dives below three hundred feet. The ADS is a human-shaped hard shell enclosing the diver so they can breathe air at normal pressure. ADS have become more sophisticated in some ways over the last century, but they still have primitive lobster-like claws called prehensors rather than hands.

The tremendous pressure at great depths makes gloves impractical, but prehensors are clumsy and awkward. One operator compares them to using chopsticks. Considerable practice is needed to achieve any expertise, and even then, there is only one possible movement or ‘degree of freedom,’ opening or closing the claw. The lack of contact makes it hard to pick up irregularly-shaped objects, and prehensors cannot hold ordinary tools like drills and hammers. Everything is a lot of work, and some jobs are impossible.

Until now. Vishwa Robotics of Cambridge, Mass., has developed a human-like deep-sea robotic hand called the Vishwa Extensor for the US Navy. CEO Bhargav Gajjar says the Extensor is so called because it extends human manipulation in extreme environments. Unlike prehensors, it is highly intuitive to use, operated by a glove-like controller with force feedback so the user can grip as firmly as needed.

“Hydrostatic pressure is the devil of it all.”

Gajjar points out that the Extensor is not an Iron Man-type exoskeleton, because it is remote from the operator’s hand; nor is it a prosthetic, though it has something in common with both. It is a dextrous, remotely operated robotic hand. The Extensor resembles a human hand on the outside, although inside is a complex arrangement of actuators and robotic mechanisms adapted to the pressures of the deep.

“Hydrostatic pressure is the devil of it all. It crumples every single surface,” says Gajjar. “How to get around it and still have a working human-shaped grasper is one of the main challenges.”

The Extensor currently has two fingers and a thumb. Gajjar says the last three fingers on human hands tend to move together, and extra fingers do little to improve performance. The fingers themselves are far more flexible than prehensors, each finger and thumb having four degrees of freedom, aided by three degrees in the wrist, providing the dexterity to grasp and operate hand tools.

The operator can use a wrench or pick up a nut and attach it to a bolt, fiendishly difficult with prehensors. Challenging tasks, such as working the hatch on a submarine during an underwater rescue, become straightforward. The Extensor can even work the trigger on a drill or other powered device, impossible with a prehensor due to its lack of multiple fingers and an opposing thumb.

As well as fitting them to diving suits, the Navy plans to attach giant pairs of robotic arms and Vishwa Extensors to unmanned submarines to reach even greater depths. The operator stays on board ship, touching and manipulating objects deep underwater via a video and haptic (touch) link. This will aid Navy missions such as mine clearance, crash retrievals and salvage.

Underwater vehicles with human-like fingers

The Vishwa Extensor will get its first underwater tests in a test tank at Vishwa Robotics over the next few months, preparatory to deep-water testing. Later tests are planned in the deep-ocean simulation chambers of the Navy’s Experimental Diving Unit.

The technology will also benefit commercial divers involved in construction, inspection, and salvage. Adding Vishwa Extensors to existing ADS will make routine jobs like underwater welding or operating chipping hammers quicker and safer. In marine archaeology and biology, the delicate handling of the Vishwa Extensor is a big step forward; at the moment, it is impossible to pick up fragile sea creatures from the ocean floor.

Gajjar says that the underlying technology could prove equally valuable out of the water. Any detailed work on the ISS requires astronauts to undertake laborious and potentially risky EVA (Extra-Vehicular Activity); a remote-controlled spacesuit could be ready more quickly and work longer. Disaster responses involving radiation or chemical spills could also benefit from effective remote handling. But the hands will debut underwater.

“The problem of teleoperation in the Last Frontier is finally solved,” says Gajjar. “Swimming human avatars, underwater vehicles with human-like fingers and the tactility of the human hands, will finally reach the extreme depths of our planet and make amazing discoveries.”

Vishwa Extensor

Vishwa Robotics CEO Bhargav Gajjar with the Vishwa Extensor

Link Exchange Networks Hurt Google Site Indexing


If your site participates in link trading or link exchange network schemes with thousands of other sites which result in untargeted and spammy links being served throughout your web site – BEWARE, Google has a keen eye on this search spamming practice.

This does not mean that buying or trading links is bad. Buying and soliciting highly targeted links or exchanging links with sites that either :

3) Are in the same geographic location as your business

…will bring targeted and responsive traffic to your site which has a high chance of converting into a sale or interested visitor.

Such practices are the backbone of the Internet and are not being targeted by the new Google updates.

However, if you participate in a link trading, link selling or link marketing scheme which serves links on your sites, or the sites you trade links with for such spam-slappin’ themed sites as poker, gambling, ringtones, seo contests, mortgages, payday loans and other questionably anchor texted links which have absolutely nothing to do with your site’s content – STOP!

It’s time to halt such link spamming practices and direct your linking budget (whether it be measured in time, favors or dollars) into more targeted link building practices like relevant blogs, trusted directories, niche sites and local business listings programs.

One linking rule of thumb is that if the link would send you a paying customer and people actually find the links on a prominent part of the web page (like above the fold or embedded within main content) – it is of quality.

Here is what Matt Cutts has to say about Google indexing ‘penalties’ on link farms and link swapping network campaigns as he reviews complaints from webmasters who are seeing their sites dropping from the Google BigDaddy update:

This is… a real estate site, this time about an Eastern European country. I see 387 pages indexed currently. Aha, checking out the bottom of the page,

I see this: Poor quality links (Matt shows image)

Linking to a free ringtones site, an SEO contest, and an Omega 3 fish oil site? I think I’ve found your problem. I’d think about the quality of your links if you’d prefer to have more pages crawled. As these indexing changes have rolled out, we’ve been improving how we handle reciprocal link exchanges and link buying/selling.

– Moving right along, here’s one from May 4th. It’s another real estate site. The owner says that they used to have 10K pages indexed and now they have 80.

This time, I’m seeing links to mortgages sites, credit card sites, and exercise equipment. I think this is covered by the same guidance as above; if you were getting crawled more before and you’re trading a bunch of reciprocal links, don’t be surprised if the new crawler has different crawl priorities and doesn’t crawl as much.

How To Correctly Identify And Manage Your External Attack Surface

More organizations are moving and restructuring their technology ecosystems to facilitate seamless communication with services that are not hosted on their local network. All publicly accessible assets that customers and employees have access to when interacting with a business online, whether owned and controlled by a company or a third party, are part of the organization’s online ecosystem. This represents the external attack surface of your organization.

Organizations that prioritize cyber vulnerability and attack surface visibility understand that the external attack surface needs to be managed as much as the internal attack surface. External attack surface management has become an industry standard and a necessity for a strong cyber security posture.

Building Blocks of Your External Attack Surface

An organization’s external attack surface typically consists of all Internet-connected applications and services accessible over the Internet and is significantly different from all internally-connected applications and tools.

Organizations have many reasons for deploying Internet-connected applications. These applications and services may be a prerequisite for interacting with customers and partners, or a requirement for employees working from remote or office locations. Examples of this are remote desktops and virtual private networks.

Examples of Internet-facing applications include web applications, APIs, SSH servers, VPN gateways, cloud services, Internet-facing firewalls, and other remote access services intentionally or accidentally placed on Internet-facing servers. Internet-connected assets can be on-premises, in the cloud, or on any combination of hosted, managed, or virtualized infrastructure.


Introducing External Attack Surface Management

Simply put, external attack surface management (EASM) refers to the processes, technologies, and professional services used to identify these external-facing corporate assets and systems that may be vulnerable to cyber-attacks.

EASM solutions are typically used to automate the discovery of all downstream services your business is exposed to. In many cases, these can be third-party partners. Because they are potentially vulnerable to attack, they can pose real and significant risks to your organization.

External Attack Surface Management Best Practices

Protecting their external attack surface gives organizations control over their cybersecurity posture. To prevent network vulnerabilities from being exploited by malicious actors, you can follow these best practices:

Regularly scan your external facing applications and system services for security vulnerabilities. Automated EASM tools will allow security teams to analyze real-time reports and immediately address security issues that are discovered.
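At its simplest, automated discovery boils down to probing which services on a host answer to connection attempts. As a minimal illustration (not a full EASM tool), the hypothetical `open_ports` helper below attempts TCP connections against a list of ports using only the Python standard library:

```python
import socket

def open_ports(host, ports, timeout=0.5):
    """Return the subset of `ports` that accept a TCP connection on `host`."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 when the connection succeeds
            if sock.connect_ex((host, port)) == 0:
                found.append(port)
    return found

if __name__ == "__main__":
    # Probe a few well-known ports locally; output depends on what is running
    print(open_ports("127.0.0.1", [22, 80, 443]))
```

Real EASM products add asset discovery, fingerprinting, and vulnerability correlation on top of this kind of probe, and run it continuously rather than on demand.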

Limit an attacker’s level of access in a compromised application by applying the principle of least privilege to service accounts. Services and APIs can be implemented easily; developers do, however, need to take responsibility for the secure configuration of these services.

Regularly update your applications and machine software to the latest versions to prevent intruders. Security patches and updates for development platforms and libraries are made available to developers frequently. A responsible organization will always ensure that its tools and plugins are up to date. Not doing this might put both the organization and its partners and clients in danger.

Your online presence is dynamic and constantly changing. Partners and vendors change servers and update links, but organizations have no way of knowing when those changes will occur. By implementing an automated solution, these external links can be routinely investigated. Your external attack surface will overlap with that of your partners, and online tools will go a long way in securing your organization.

In Conclusion

Because of the potential damage a cyberattack can pose, many organizations are incorporating EASM into their enterprise risk management efforts. As a result, rather than addressing the issues on an ad hoc basis, security teams are taking a more proactive approach to strategically managing known and unknown risks, vulnerabilities, and exposed assets.

Visualize Deep Learning Models Using Visualkeras


Introduction to Neural Networks

Artificial neural networks are computing systems inspired by the biological neural networks in the human brain. A simple ANN is a collection of connected units, or nodes, called artificial neurons, which are modelled loosely on the neurons of a biological brain. An ANN uses the brain’s processing as a basis for algorithms that can model complex patterns and prediction problems. An ANN with more than three layers (an input layer, an output layer, and multiple hidden layers) can be called a ‘deep neural network’; the “deep” in deep learning refers to this number of hidden layers. Deep neural networks can have many hidden layers, whereas traditional neural networks usually have only two or three.

Neural Network Architecture

The main components of a neural network are:

Input – The inputs are the feature values of the model. In simple words, the input is the set of attributes fed into the model for learning purposes.

Weights – Weights scale each input. Their primary purpose in a neural network is to emphasize the attributes that contribute the most to the learning process, which is achieved by multiplying the input values by the weight matrix. The magnitude and sign of each weight tell us the importance and directionality of the corresponding input.

Transfer function – The transfer function differs from the other components in that it takes multiple inputs: it combines several inputs into one output value so that the activation function can be applied.

Activation Function – An activation function transforms the number from the transfer function into a value that represents the input. Most of the time, the activation function is non-linear; without it, the output would be a linear combination of the input values, with no ability to incorporate non-linearity into the network. Two common activation functions are ReLU and sigmoid.

Bias – The bias shifts the value produced by the transfer function before the activation function is applied.
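Putting these components together, a single artificial neuron computes activation(weights · inputs + bias). A minimal sketch in plain Python, using illustrative hand-picked values and the sigmoid activation:

```python
import math

def sigmoid(z):
    # Squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    # Transfer function: weighted sum of the inputs plus the bias
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Activation function: introduces non-linearity
    return sigmoid(z)

# Example: three inputs with hand-picked weights and a small bias
output = neuron([0.5, 0.3, 0.2], [0.4, 0.7, -0.2], bias=0.1)
```

A full network simply stacks many such neurons into layers and learns the weights and biases from data instead of picking them by hand.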

An artificial neural network comprises three kinds of layers – input, output and one or more hidden layers. Each layer consists of several neurons stacked in a row, and a multi-layer network is formed by arranging many such layers next to each other. The structure of a neural network looks like the image shown below.

Image Source: Author

Visualizing a Neural Network using Keras Library

Now that we have discussed some basics of deep learning and neural networks, we know that deep learning models are complex, and the way they make decisions is also hard to understand. It can be interesting to visualize how a neural network connects various neurons. It can be a great way to visualize the model architecture and share it with your audience while presenting.

The Keras library allows for visualization of the neural networks using the plot_model command.

Creating a Neural Network Model

Before we begin this tutorial, a basic understanding of how to create a neural network is expected; it is essential for appreciating the utility of the tool discussed in this article.

Let us start by creating a basic Artificial Neural Network (ANN) using Keras and its functions.

Feel free to create a different neural network since we are only visualizing the final model and hence, it might be interesting to explore the capabilities of the visual Keras library (discussed later in this article) with a different model.

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

We can display the generated image using the following command.

from keras.utils.vis_utils import plot_model

plot_model(model, to_file='model_plot.png', show_shapes=True, show_layer_names=True)

From the above image, we can clearly visualize the model structure and how different layers connect with each other through a number of neurons. Next, let us build a CNN and visualize it using the Keras library.

from keras.models import Sequential
from keras.layers import Dense, Conv2D, MaxPooling2D, Flatten, Dropout

model = Sequential()
model.add(Conv2D(64, (4, 4), input_shape=(32, 32, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (4, 4), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))
model.add(Conv2D(128, (4, 4), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(128, (4, 4), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.35))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

We can visualize this model using the ‘plot_model’ command shown previously.

Here we can visualize the different layers of the neural network along with the number of filters, filter size, number of neurons, etc. The plot_model command from the Keras library is helpful for displaying both ANN and CNN structures due to its flowchart-style layout of the neural network. But if a CNN has many layers, this visualization style requires more space and becomes difficult to read. Another library, ‘Visualkeras’, can easily help us visualize these networks. In this tutorial, we will explore the Visualkeras library and develop visualizations using it.

Using Visualkeras to Visualize the Neural Network

Visualkeras is an open-source Python library that helps in the visualization of the Keras neural network architecture. It provides simple customization to meet a wide range of requirements. Visualkeras generates layered style architectures, which are ideal for CNNs (Convolutional Neural Networks), and graph style architectures, which are suitable for most models, including simple feed-forward networks. It’s one of the most helpful libraries for understanding how different layers are connected.

Let’s start by installing the Visualkeras library in the command prompt.

pip install visualkeras

Next, we will import all the libraries which are required to build a sequential model.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Input, Conv2D, Dense, Flatten, Dropout
from tensorflow.keras.layers import GlobalMaxPooling2D, MaxPooling2D
from tensorflow.keras import regularizers, optimizers

Now we will build a simple model with some convolutional and pooling layers.

model = Sequential()
model.add(Conv2D(64, (4, 4), input_shape=(32, 32, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, (4, 4), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))

Let’s look at the summary of the defined model using the model.summary() command.

Now to visualize the neural network model, we will import the Visualkeras library package as shown below.

import visualkeras

visualkeras.layered_view(model)

The visualization of the neural network model above shows us the two different layers, i.e. convolutional layers in yellow and pooling layers in pink.

Let’s increase the complexity and add some more layers with a few dropouts to see the effect of visualization.

model = Sequential()
model.add(Conv2D(64, (4, 4), input_shape=(32, 32, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (4, 4), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))
model.add(Conv2D(128, (4, 4), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(128, (4, 4), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.35))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

The summary of the defined model is shown below.

Now, we visualize the model with the newly added layers.

Now we will add the legend to the visualization. The legend property indicates the connection between colour and layer type. It is also possible to provide a custom “PIL.ImageFont” to use; otherwise, Visualkeras will use the default PIL font. Depending on your operating system, you may need to specify the full path to the preferred font. In the case of Google Colab, copy the required font to the truetype fonts folder; otherwise, you can use the default font.

visualkeras.layered_view(model, legend=True)  # without custom font

from PIL import ImageFont

font = ImageFont.truetype("arial.ttf", 12)
visualkeras.layered_view(model, legend=True, font=font)  # with selected font

Using the following code, we can see the neural network model in 2D space or in flat style.

visualkeras.layered_view(model, legend=True, font=font, draw_volume=False)

The spacing between the layers can be adjusted using the ‘spacing’ variable, as shown below.

visualkeras.layered_view(model, legend=True, font=font, draw_volume=False, spacing=30)

We can customize the colours of the layers using the following code.

from tensorflow.keras import layers
from collections import defaultdict

color_map = defaultdict(dict)

# customize the colours
color_map[layers.Conv2D]['fill'] = '#00f5d4'
color_map[layers.MaxPooling2D]['fill'] = '#8338ec'
color_map[layers.Dropout]['fill'] = '#03045e'
color_map[layers.Dense]['fill'] = '#fb5607'
color_map[layers.Flatten]['fill'] = '#ffbe0b'

visualkeras.layered_view(model, legend=True, font=font, color_map=color_map)

Using this library, we can display the layers of any neural network in a convenient way with a few lines of code. It is especially useful with AutoML tools, where the neural network is set up by the tool itself. With this library, we can easily find out details about the layers of the neural network and better understand the model through colourful visualizations.


Conclusion

Visualizing neural networks is helpful for understanding the connections between the layers. It is useful when we want to explain the structure of the built neural network for teaching or presentation purposes. A flowchart-type visualization of a neural network with multiple hidden layers can sometimes be tedious to read due to space constraints. In this article, we saw an easy method to visualize a neural network using the Visualkeras library. With minimal code, we can quickly display the structure/layout of the neural network and customize it using this library. I hope you enjoyed reading this article. The code is available on my GitHub repository. Try this library for your ANN/CNN architecture and visualize the neural network better.

Author Bio

She loves travelling, reading fiction, solving Sudoku puzzles, and participating in coding competitions in her leisure time.

You can follow her on LinkedIn, GitHub, Kaggle, Medium, Twitter.

The media shown in this article is not owned by Analytics Vidhya and are used at the Author’s discretion. 

