The Architecture of Neural Networks

#Machine Learning
#High Tech
July 29, 2017 4 min read

Much of the success of deep neural networks and Deep Learning lies in the careful design of the neural network architecture. The figure below plots top-1 one-crop accuracy against the number of operations needed for one forward pass for a number of popular neural network architectures.

[Figure: top-1 one-crop accuracy versus operations per forward pass for popular neural network architectures]

In 1994, LeNet 5, one of the first convolutional neural networks, was created, and it launched the exploration of Deep Learning. The LeNet 5 architecture was fundamental, especially its insight that image features are spread out across the entire image and that convolutions with learnable parameters are a good way to pick out similar features at multiple locations with few parameters.
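
To make the point about few parameters concrete, here is a minimal sketch in plain Python. It is not LeNet 5 itself: a single two-weight kernel is slid along a 1-D signal, so the same two numbers respond to the same pattern wherever it occurs. The names `conv1d` and `edge_kernel` and the sample signal are made up for illustration.

```python
def conv1d(signal, kernel):
    """Slide one shared kernel over the signal (cross-correlation, as is usual in deep learning)."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


# Hypothetical hand-set kernel that responds to a rising edge; in a real
# network these two weights would be learned from data.
edge_kernel = [-1.0, 1.0]
signal = [0.0, 0.0, 5.0, 5.0, 0.0, 5.0]

response = conv1d(signal, edge_kernel)

# The kernel has 2 parameters no matter how long the signal is. A fully
# connected layer producing the same 5 outputs from 6 inputs needs 6 * 5 weights.
conv_parameters = len(edge_kernel)
dense_parameters = len(signal) * len(response)
```

The two rising edges in the signal produce the same positive response at two different positions, even though only two weights are involved.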

From 1998 to 2010, neural networks were in the early stages of development. Few people noticed their increasing power, yet many researchers made steady progress in this area. More and more data became available thanks to the rise of cell-phone cameras and low-cost digital cameras. Computing power was also growing: CPUs became faster and GPUs became a general-purpose computing tool. These trends kept neural networks progressing, albeit slowly, and made the tasks they could accomplish more interesting.


In 2012, Alex Krizhevsky created a deeper and much wider version of LeNet, now known as AlexNet, which won the ImageNet competition by a wide margin. It scaled the insights of LeNet up to a significantly larger neural network that could learn much more complex objects and object hierarchies. Its innovations were the use of rectified linear units (ReLU) as non-linearities, overlapping max pooling, and NVIDIA GTX 580 GPUs to reduce training time.
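
As a rough illustration of two of these ingredients, the sketch below applies a rectified linear unit and then overlapping max pooling to a short 1-D list of values. It is a toy example in plain Python, not the original GPU implementation. The window of 3 and stride of 2 are the values commonly cited for this network and should be read as an assumption here, as should the function names.

```python
def relu(x):
    """Rectified linear unit: keep positive values, clamp negatives to zero."""
    return max(0.0, x)


def max_pool_1d(values, window=3, stride=2):
    """Max pooling over a 1-D list; windows overlap whenever stride < window."""
    return [
        max(values[start:start + window])
        for start in range(0, len(values) - window + 1, stride)
    ]


# Illustrative pre-activation values (made up for this example).
raw = [-2.0, 1.0, 3.0, -1.0, 0.5, 2.0, -4.0]

activations = [relu(v) for v in raw]
pooled = max_pool_1d(activations, window=3, stride=2)
```

Because the stride is smaller than the window, neighbouring windows share one element, which is what "overlapping" refers to.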

At that time, GPUs offered many more cores than CPUs and allowed roughly 10 times faster training, which in turn made it practical to use larger datasets and larger images.

The success of this network started a revolution. Convolutional neural networks became the workhorse of Deep Learning, which became the new name for “large neural networks that can accomplish useful things.”

Network-in-network (NiN) had a great yet simple idea: use 1×1 convolutions to give the features of a convolutional layer more combinational power.

The NiN architecture uses MLP layers after each convolution to better combine features before the next layer. You may think that 1×1 convolutions conflict with the original ideas of LeNet, but in fact they help combine convolutional features in a way that is not possible by simply stacking more convolutional layers. This is not the same as feeding raw pixels into the next layer. Here the 1×1 convolutions combine features across feature maps at each spatial location, so they use very few parameters, and those parameters are shared across all pixels.
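
A 1×1 convolution can be read as a small linear layer applied independently at every pixel, mixing the values of all feature maps at that location. The following plain-Python sketch shows this under simplifying assumptions (no bias, no non-linearity, tiny made-up inputs). The function name `conv1x1` and the weight values are hypothetical.

```python
def conv1x1(feature_maps, weights):
    """Apply a 1x1 convolution.

    feature_maps: list of C_in maps, each a list of rows (H x W).
    weights: C_out x C_in matrix, shared by every pixel.
    Returns C_out maps with the same H x W size.
    """
    c_in = len(feature_maps)
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    return [
        [
            [
                sum(w_row[c] * feature_maps[c][y][x] for c in range(c_in))
                for x in range(width)
            ]
            for y in range(height)
        ]
        for w_row in weights
    ]


# Two 2x2 input feature maps (illustrative values).
maps = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[10.0, 20.0], [30.0, 40.0]],
]

# Three output maps; each row mixes the two input maps.
weights = [
    [1.0, 0.0],    # copies map 0
    [0.0, 1.0],    # copies map 1
    [0.5, 0.25],   # blends both maps
]

mixed = conv1x1(maps, weights)

# The parameter count depends only on the channel counts, not on H or W.
num_parameters = len(weights) * len(weights[0])
```

The same six weights are reused at all four pixels, which is why the layer stays cheap in parameters even on large feature maps.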

The strength of the MLP can significantly increase the effectiveness of individual convolutional features by combining them into more complex groups. The same idea was later used by ResNet and Inception. NiN also used an average pooling layer as part of the last classifier, a practice that later became standard.

GoogLeNet and Inception

Christian Szegedy from Google set out to find a way to reduce the computational demand of deep neural networks, and in doing so created GoogLeNet, the first Inception architecture.

By the fall of 2014, when Christian Szegedy created GoogLeNet, deep learning models had become very useful for categorizing the content of images and video frames. Even the most ardent skeptics conceded that Deep Learning and neural nets were here to stay. Because these techniques were so useful to internet giants such as Google, those companies became interested in deploying the architectures efficiently and at scale on their servers.

Christian Szegedy’s goal was to reduce the computational demand of deep neural nets while retaining state-of-the-art performance, or at the very least to keep the computational cost at its current level. With this goal in mind, he and his team created the Inception module:

[Figure: the Inception module]

At first glance, this looks like a parallel combination of 1×1, 3×3 and 5×5 convolution filters. On closer examination, however, 1×1 convolution blocks are used to reduce the number of features before the expensive parallel blocks, something we refer to today as a “bottleneck.”

The “bottleneck” layer deserves a section of its own. Beyond that, GoogLeNet uses a stem without Inception modules as its first layers, and an average pooling plus softmax classifier similar to NiN, which needs a low number of operations compared to the classifiers of AlexNet and VGG.
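
The reason such a classifier is cheap can be seen in a small sketch. Here each feature map is averaged down to a single number and the results go straight into a softmax. This follows the NiN-style simplification of one feature map per class; the actual GoogLeNet classifier is not reproduced here, and the function names and input values are assumptions for illustration.

```python
import math


def global_average_pool(feature_maps):
    """Collapse each H x W feature map to its mean value."""
    return [
        sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Three 2x2 feature maps, one per hypothetical class.
final_maps = [
    [[1.0, 3.0], [1.0, 3.0]],
    [[0.0, 0.0], [0.0, 0.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]

pooled = global_average_pool(final_maps)
probabilities = softmax(pooled)
```

The pooling step has no learnable parameters at all, which is one reason this style of classifier needs far fewer operations than large fully connected layers.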

Bottleneck Layer

The bottleneck layer of Inception was inspired by NiN. It reduced the number of features, and therefore operations, at each layer, which kept the inference time low. The number of features was reduced by roughly a factor of 4 before the data was passed to the expensive convolution modules. This led to significant savings in computational cost and contributed to the success of this architecture.

Let’s take a closer look. Imagine you have 252 features coming in and 252 coming out, and the Inception layer performs only 3×3 convolutions. That is 252×252 × 3×3 convolutions that need to be performed, roughly 570,000 multiply-accumulate operations per output position. This may be more than the computational budget we have to run this layer in, say, 0.5 milliseconds on a Google server. Instead, we can reduce the number of features that must be convolved to 63, or 252/4. First, we perform 252→63 1×1 convolutions, then 63→63 3×3 convolutions on each branch, and finally a 1×1 convolution from 63→252 features to expand back again. In total, we get about 70,000 operations instead of the almost 600,000 we had before.
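
The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope count of multiply-accumulate operations per output position for a single branch, using the 252 and 63 feature counts from the text. The helper name `conv_macs` is made up for this example.

```python
def conv_macs(c_in, c_out, kernel):
    """Multiply-accumulates per output position for a kernel x kernel convolution."""
    return c_in * c_out * kernel * kernel


features = 252
reduced = features // 4  # 63, the bottleneck width used in the text

# Direct 3x3 convolution on all 252 features.
direct = conv_macs(features, features, 3)

# Bottleneck: 1x1 reduce, 3x3 on the reduced features, 1x1 expand.
reduce_step = conv_macs(features, reduced, 1)
middle_step = conv_macs(reduced, reduced, 3)
expand_step = conv_macs(reduced, features, 1)
bottleneck = reduce_step + middle_step + expand_step

print(direct, bottleneck)  # prints "571536 67473"
```

With these numbers the bottleneck version needs roughly 8 times fewer operations, which matches the "about 70,000 versus almost 600,000" comparison.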

Even though we perform fewer operations, we do not lose generality in this layer. Bottleneck layers perform remarkably well on the ImageNet dataset. The reason for this success is that the input features are correlated, so redundancy can be removed by combining them appropriately with the 1×1 convolutions. Then, after convolution with a smaller number of features, they can be expanded again into a meaningful combination for the next layer.

As for the future, we believe that designing neural network architectures is critically important to the progress of the Deep Learning field. One might ask why we have to invest so much time in crafting architectures by hand instead of letting the data tell us what to use and how to combine modules. Although that would be very helpful, it remains a work in progress. It is also important to note that so far we have talked only about architectures for computer vision. Neural network architectures have been developed in other areas as well, and it is interesting to study how they evolved for those other tasks.

The future is sure to bring many wonderful innovations and software development solutions in the world of neural network architectures. These innovations will also have practical applications, making all of our devices smarter and easier to use.
