Building a neural network from scratch

Mehmet Akif Cifci
3 min read · Feb 18, 2023


A neural network (NN) is a type of machine learning model inspired by the structure and function of the human brain. It is a network of interconnected processing nodes, or neurons, that work together to learn patterns and make predictions from input data. Neural networks are used in a wide range of applications, such as image and speech recognition, natural language processing, and even game playing.

A neural network is typically organized into layers of nodes: an input layer that receives the raw data, one or more hidden layers that process it, and an output layer that produces the model’s predictions. Each node in a layer receives input from the nodes in the previous layer, computes a weighted sum of those inputs, applies a nonlinear activation function, and passes the result to the nodes in the next layer.
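To make this concrete, here is a minimal sketch of a single node’s computation in NumPy. The weight vector w, bias b, and input x below are made-up illustration values, not taken from the network built later in this article:

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Made-up weights, bias, and inputs for one node with three inputs
w = np.array([0.5, -0.2, 0.1])
b = 0.3
x = np.array([1.0, 2.0, 3.0])

# The node forms a weighted sum of its inputs plus a bias,
# then applies the nonlinear activation function
output = sigmoid(np.dot(w, x) + b)
print(output)  # a single value between 0 and 1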

The weights and biases of the neural network are adjusted during the training process to minimize the difference between the predicted outputs and the actual outputs for a given set of input data. This process is often performed using an optimization algorithm such as stochastic gradient descent. Once the neural network is trained, it can make predictions on new, unseen data.
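The core idea behind gradient descent can be shown on a toy problem that has nothing to do with the network itself. The sketch below minimizes f(w) = (w - 3)**2, whose gradient is 2 * (w - 3); repeatedly stepping against the gradient drives w toward the minimum at 3:

# Toy gradient descent: minimize f(w) = (w - 3) ** 2
w = 0.0             # arbitrary starting point
learning_rate = 0.1
for step in range(50):
    gradient = 2 * (w - 3)         # derivative of (w - 3) ** 2
    w -= learning_rate * gradient  # step against the gradient
print(w)  # approximately 3.0, the minimizer

A neural network’s training loop applies the same update rule to every weight, with the gradients supplied by backpropagation.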

Building a neural network from scratch is an involved process, but it can be broken down into the following general steps:

  1. Data preparation: Collect and preprocess the data that will be used to train the neural network. This can include data cleaning, normalization, and feature extraction.
  2. Network architecture design: Choose the structure of the neural network, including the number of layers, the type of layers (e.g., dense, convolutional, recurrent), and the activation functions.
  3. Weight initialization: Randomly initialize the weights for each neuron in the network.
  4. Forward propagation: Feed the input data through the network to obtain a predicted output.
  5. Loss calculation: Calculate the difference between the predicted and actual output using a loss function.
  6. Backward propagation: Use the calculated loss to update the weights in the network, starting from the output layer and working backward through the layers.
  7. Optimization: Repeat steps 4–6 many times, adjusting the weights to minimize the loss function.
  8. Evaluation: Test the network's performance on a validation set and adjust the architecture or hyperparameters as needed.
  9. Prediction: Use the trained network to make predictions on new data (a short sketch of this step follows the code output below).
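The NumPy program below puts these steps together for a tiny two-layer network trained on the XOR problem: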
import numpy as np

# Define the sigmoid activation function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Define the network architecture
input_size = 2
hidden_size = 4
output_size = 1

# Initialize the weights randomly
W1 = np.random.randn(hidden_size, input_size)
W2 = np.random.randn(output_size, hidden_size)

# Define the input and output data (the XOR truth table)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
Y = np.array([[0], [1], [1], [0]])

# Train the network
for i in range(10000):
    # Forward propagation
    Z1 = np.dot(W1, X.T)
    A1 = sigmoid(Z1)
    Z2 = np.dot(W2, A1)
    Y_pred = sigmoid(Z2)

    # Calculate the loss (mean squared error)
    error = Y_pred - Y.T
    loss = np.mean(error ** 2)

    # Backward propagation
    dZ2 = error * Y_pred * (1 - Y_pred)
    dW2 = np.dot(dZ2, A1.T)
    dZ1 = np.dot(W2.T, dZ2) * A1 * (1 - A1)
    dW1 = np.dot(dZ1, X)

    # Update the weights (learning rate 0.1)
    W2 -= 0.1 * dW2
    W1 -= 0.1 * dW1

    # Print the loss every 1000 iterations
    if i % 1000 == 0:
        print(f"Loss after iteration {i}: {loss}")

# Print the final loss
print(f"Final loss: {loss}")

The output is:

Loss after iteration 0: 0.28316157405898473
Loss after iteration 1000: 0.24980740032014326
Loss after iteration 2000: 0.24923967573976497
Loss after iteration 3000: 0.2473247840390975
Loss after iteration 4000: 0.22116130400204662
Loss after iteration 5000: 0.11651340186677622
Loss after iteration 6000: 0.046334899241306175
Loss after iteration 7000: 0.024005281854347655
Loss after iteration 8000: 0.015303880978218788
Loss after iteration 9000: 0.010975356286640588
Final loss: 0.008458927128356802
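
As a sketch of the final prediction step, the trained weights can be reused to run the forward pass on new inputs. This snippet assumes it is executed after the training loop above, so that sigmoid, W1, and W2 are already defined:

# Reuse the trained weights to predict on new inputs (step 9)
X_new = np.array([[1, 0], [1, 1]])
A1_new = sigmoid(np.dot(W1, X_new.T))
predictions = sigmoid(np.dot(W2, A1_new))
print(predictions)  # values near 1 for (1, 0) and near 0 for (1, 1)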


Written by Mehmet Akif Cifci

Mehmet Akif Cifci is an associate professor of computer science in Austria.
