Simulating simple functions using Neural Networks...in PyTorch

A beginner's introduction to neural networks, showing how to make a network learn to simulate a simple function in Python.

Neural networks are generally very good function approximation algorithms. In supervised learning, a dataset consists of inputs and outputs, and the supervised learning algorithm learns how to best map examples of inputs to examples of outputs.

In general, neural networks are conceptually simple; only the full architecture gets a bit complicated. A node, or perceptron, is the smallest entity inside a network. A neural network consists of layers of perceptrons, and every perceptron has connections to the perceptrons of the next layer, as shown in the diagram.

https://1.cms.s81c.com/sites/default/files/2021-01-06/ICLH_Diagram_Batch_01_03-DeepNeuralNetwork-WHITEBG.png

Every connection in the network, that is, every arrow, has a value associated with it which is called a weight. These weights are initialized randomly when the network is created and are learned during the training process.

A trained model is defined by the architecture of the neural network together with its corresponding learned weights.

Every node or perceptron holds a number, typically between 0 and 1. This number is multiplied by a weight and passed on to the next layer. Each perceptron in the next layer sums up all the value × weight products coming from the previous layer and applies an activation function, which decides whether (and how strongly) the new perceptron should hold a value.
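To make this concrete, here is a minimal sketch of how one perceptron in the next layer computes its value, using made-up numbers and the sigmoid as an example activation function:

import numpy as np

def sigmoid(z):
    # a common activation function; squashes any number into (0, 1)
    return 1 / (1 + np.exp(-z))

# values held by three perceptrons in the previous layer (made-up numbers)
previous_layer = np.array([0.2, 0.9, 0.5])

# one weight per incoming connection (also made-up)
weights = np.array([0.4, -0.6, 0.8])

# weighted sum of the incoming values, then the activation function
new_value = sigmoid(np.dot(previous_layer, weights))
print(new_value)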

The same process is repeated for the following layers until the final layer is reached. Let's say that our neural network is a classifier that classifies between 2 classes, dogs and cats. That translates into the network by saying that the final layer has 2 nodes/perceptrons. If the network thinks it's a dog, it will raise the value stored in the first perceptron, and if cat is the answer to the query, it raises the value in the other node.

For example, let's look at the flow of information in an animation from Grant Sanderson, where he solved the famous MNIST handwritten digit recognition problem using a deep neural network. This network is a classifier which has 10 classes, the digits 0-9.

The image is broken down into pixels and flattened to produce a single array of 28x28=784 pixels. Every pixel holds a value between 0 and 1, indicating how bright it is (we are not using color for these tasks; even if you have colored images, convert them to grayscale for faster learning and better results).
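As a sketch, the preprocessing for one such image could look like this (the image variable here is a stand-in for a real MNIST image, filled with random values):

import numpy as np

# a stand-in for a 28x28 grayscale image with brightness values 0-255
image = np.random.randint(0, 256, size=(28, 28))

# flatten the 2D grid into a single array of 784 pixels
flat = image.reshape(-1)

# scale the brightness values into the 0-1 range
flat = flat / 255.0
print(flat.shape)  # (784,)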

Training

Training is the process by which we set the weights in the neural network so that the output error is minimized. Training a neural network is essentially an error-minimization process, so the network gets better and better as training continues.

Every neural network is a mathematical equation, or a function, which takes in some data and outputs a prediction. Therefore, it is easy to construct an equation for the error by subtracting the actual output from the desired output. Once the equation of the error is known, the remaining task is to use calculus to find a minimum of that error, and then find the corresponding weights for that minimum error.

If you have ever taken calculus, this will come easily to you: we use differentiation of the error equation to move slightly in the direction of less error at each step, eventually reaching a minimum-error point.

This technique is called gradient descent. To understand the learning process, I would highly recommend watching the video by 3blue1brown on gradient descent.
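As a toy illustration of the idea (not the actual network training), here is gradient descent minimizing the simple error curve error(w) = (w - 3)**2, whose minimum sits at w = 3:

w = 0.0              # starting weight
learning_rate = 0.1

for step in range(50):
    gradient = 2 * (w - 3)            # derivative of (w - 3)**2 with respect to w
    w = w - learning_rate * gradient  # small step in the direction of less error

print(w)  # very close to 3.0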

PyTorch

PyTorch is a Python framework, developed by Facebook, to speed up artificial neural network research. Using PyTorch is pretty straightforward:

  1. Define a class which holds your neural network.
  2. Define training and testing data, either by creating it (like we will do in the demo) or by finding it online (Kaggle or PyTorch built-in datasets).
  3. Define a loss function; this tells us how far off we are from the desired output.
  4. Define an optimizer; this goes into the network and adjusts the weights after the gradients have been calculated.
  5. Assemble all of the above in a sequence in Python, and the model is trained for you.

Everything in PyTorch is a tensor. Tensors are like NumPy arrays but carry a little more structure, for example for storing gradient-related information. One can easily convert from a tensor to a NumPy array and vice versa using the functions available on tensors.
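For example, converting in both directions is a one-liner each way:

import numpy as np
import torch

arr = np.array([1.0, 2.0, 3.0])

t = torch.from_numpy(arr)  # NumPy array -> tensor (shares the same memory)
back = t.numpy()           # tensor -> NumPy array
print(t, back)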

Install the PyTorch framework for your platform before going ahead (for most setups, pip install torch works; see pytorch.org for platform-specific instructions).

A simple Function

We can define a simple function with one numerical input variable and one numerical output variable and use it as the basis for understanding how a neural network simulates a function.

A simple function to be simulated,

import math

import numpy as np


def simpleFunction(x):
    val = np.sin(x) + np.cos(x)
    # normalize so that the magnitude of the value never exceeds 1.0
    normalized = val / math.sqrt(2)
    return normalized

Let's create training data from this function using,

# 10,000 random inputs drawn uniformly from [0, 1)
X = np.random.rand(10**4)
y = simpleFunction(X)

A Neural Network

A simple neural network with two hidden layers, written as a PyTorch class:

import torch
from torch import nn, optim


class FuncSimulator(nn.Module):
    def __init__(self):
        super(FuncSimulator, self).__init__()
        # 1 input -> two hidden layers of 1024 units -> 1 output
        self.regressor = nn.Sequential(nn.Linear(1, 1024),
                                       nn.ReLU(inplace=True),
                                       nn.Linear(1024, 1024),
                                       nn.ReLU(inplace=True),
                                       nn.Linear(1024, 1))

    def forward(self, x):
        output = self.regressor(x)
        return output

As you can see, the first layer has dimensions (1, 1024), which translates to 1 input followed by 1024 connections to the next layer, and in the last layer we go from 1024 back to 1, since the output of the function is a single value between 0 and 1.
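A quick sanity check with a hypothetical dummy batch confirms the shapes: a batch of 4 single-number inputs comes out as 4 single-number predictions.

# a made-up batch of 4 inputs, each holding a single number
dummy = torch.rand(4, 1)
print(FuncSimulator()(dummy).shape)  # torch.Size([4, 1])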

Initialize the model, loss function and the optimizer

First, we look for a GPU; if one is not available, we fall back to the CPU and initialize our model on that device.

device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
model = FuncSimulator().to(device)


# learning rate for the optimizer (1e-3, Adam's common default,
# is an assumed value for this demo)
LEARNING_RATE = 1e-3

# Initialize an optimizer to alter the weights
optimizer = optim.Adam(model.parameters(), lr=LEARNING_RATE)

# Setting a loss function
criterion = nn.MSELoss(reduction="mean")

There is a plethora of loss functions and optimizers to choose from in the torch.nn and torch.optim modules.
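For instance, swapping in a different pair is a one-line change each; here is a sketch using L1 (mean absolute error) loss and plain stochastic gradient descent instead (not the choices used in this article, just examples):

# mean absolute error instead of mean squared error
criterion_alt = nn.L1Loss()

# classic stochastic gradient descent instead of Adam (lr is a made-up example)
optimizer_alt = optim.SGD(model.parameters(), lr=0.01)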

One last thing needs to be done before moving on to the training phase: formatting the training and test data, transforming it into tensors for PyTorch to use.

from torch.utils.data import TensorDataset, DataLoader
from sklearn.model_selection import train_test_split


# batch size for the data loaders (128 is an assumed value for this demo)
BATCH_SIZE = 128

X_train, X_val, y_train, y_val = map(torch.tensor, train_test_split(X, y, test_size=0.2))
train_dataloader = DataLoader(TensorDataset(X_train.unsqueeze(1), y_train.unsqueeze(1)), batch_size=BATCH_SIZE,
                              pin_memory=True, shuffle=True)
val_dataloader = DataLoader(TensorDataset(X_val.unsqueeze(1), y_val.unsqueeze(1)), batch_size=BATCH_SIZE,
                            pin_memory=True, shuffle=True)

Training Loop

The complete training process looks like this,

MAX_EPOCH = 15  # number of passes over the training data

train_loss_list = list()
val_loss_list = list()
for epoch in range(MAX_EPOCH):
    print("epoch %d / %d" % (epoch+1, MAX_EPOCH))
    # let's tell the model that it is time for training
    model.train()
    # training loop
    temp_loss_list = list()
    for X_batch, y_batch in train_dataloader:
        X_batch = X_batch.type(torch.float32).to(device)
        y_batch = y_batch.type(torch.float32).to(device)

        optimizer.zero_grad()

        score = model(X_batch)
        loss = criterion(input=score, target=y_batch)
        loss.backward()

        optimizer.step()

        temp_loss_list.append(loss.detach().cpu().numpy())

    # record the average training loss for this epoch
    train_loss_list.append(np.average(temp_loss_list))

Among all of this, the core component is this sequence:

optimizer.zero_grad()
score = model(X_batch)
loss = criterion(input=score, target=y_batch)
loss.backward()
optimizer.step()

Let me explain what we are doing here:

  1. optimizer.zero_grad(): Zero the gradients; PyTorch accumulates gradients by default, so we clear them in every iteration of the training loop.
  2. score = model(X_batch): Get a prediction out of the model.
  3. loss = criterion(input=score, target=y_batch): Calculate the loss with respect to the target (desired) output.
  4. loss.backward(): Calculate the gradients of all the weights in the network.
  5. optimizer.step(): Modify the weights in the network.

This loop runs for many epochs, and then we try to evaluate the result using the validation data.

# switch to evaluation mode and disable gradient tracking for validation
model.eval()
temp_loss_list = list()
with torch.no_grad():
    for X_batch, y_batch in val_dataloader:
        X_batch = X_batch.type(torch.float32).to(device)
        y_batch = y_batch.type(torch.float32).to(device)

        score = model(X_batch)
        loss = criterion(input=score, target=y_batch)

        temp_loss_list.append(loss.cpu().numpy())

val_loss_list.append(np.average(temp_loss_list))

print("\tval loss: %.5f" % val_loss_list[-1])

Running The Code

To execute the code, copy the code into a single file and name it pytorch.py.

$ python3 pytorch.py

Starting the training
epoch 1 / 15
epoch 2 / 15
epoch 3 / 15
epoch 4 / 15
epoch 5 / 15
epoch 6 / 15
epoch 7 / 15
epoch 8 / 15
epoch 9 / 15
epoch 10 / 15
epoch 11 / 15
epoch 12 / 15
epoch 13 / 15
epoch 14 / 15
epoch 15 / 15
Training Finished successfully

Testing Now
Value Loss: 0.00051
Testing finished

By now, we have successfully simulated our function using the Neural Network!!

We can save the network's weights with torch.save(model.state_dict(), "model.pt"). To load them back, recreate the model and call model.load_state_dict(torch.load("model.pt")) (torch.load alone returns only the saved state dictionary, not a model).

Now, to use the model inside your code or anywhere else to calculate simpleFunction(value), just pass the value in as a float32 tensor of shape (1, 1), as shown below. That's it!
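Putting it all together, here is a minimal save/load/predict sketch (the input 0.5 is just an example value):

# save only the learned weights
torch.save(model.state_dict(), "model.pt")

# to load, recreate the architecture and fill in the saved weights
model = FuncSimulator().to(device)
model.load_state_dict(torch.load("model.pt"))
model.eval()

# the model expects a float32 tensor of shape (N, 1)
with torch.no_grad():
    prediction = model(torch.tensor([[0.5]], dtype=torch.float32).to(device))
print(prediction.item())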

Conclusion

We have seen that simple functions can be simulated using neural networks, which are easy to train. A neural network has some components which need to be defined by the programmer, and the rest is handled by the PyTorch framework.

In the upcoming articles, we shall see how to use PyTorch pretrained models to solve some of our problems. Thank you for reading the article.
