MuTorch: A lightweight Deep Learning framework with a PyTorch-like API

MuTorch is a deep learning framework built for educational purposes, to develop a deep understanding of how neural networks work. It is not intended to be efficient; it is deliberately simple. The goal is a framework that is easy to understand, modify, and learn from. It uses no external libraries and is built from scratch using only Python lists and operators.

Some of the nomenclature used in this framework is inspired by PyTorch. The key terms are defined below:

  • Node: A node is the most basic unit of the computational graph and represents either a value or an operation.

  • Neuron: A neuron is a collection of nodes that represents a single unit of the neural network. It uses a stack of nodes to compute Wx + b and then applies an activation function to the result.

  • Tensor: A tensor is a collection of nodes and can be multi-dimensional. It is used to represent the input and output of a neuron.

  • Layer: A layer performs computations on a tensor and returns a tensor.
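
To make the Node and Neuron definitions concrete, here is a minimal sketch of a scalar node that records how it was computed and can backpropagate gradients, plus a neuron built from such nodes. This is an illustration under assumptions, not MuTorch's actual implementation: the class layout, the attribute names (value, grad, parents, op), and the choice of tanh as the activation are all hypothetical.

```python
import math


class Node:
    """Scalar in a computational graph (hypothetical sketch, not MuTorch's class)."""

    def __init__(self, value, parents=(), op=""):
        self.value = value
        self.grad = 0.0
        self.parents = parents          # nodes this node was computed from
        self.op = op                    # label of the operation that produced it
        self._backward = lambda: None   # leaf nodes have nothing to propagate

    def __add__(self, other):
        other = other if isinstance(other, Node) else Node(other)
        out = Node(self.value + other.value, (self, other), "+")

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Node) else Node(other)
        out = Node(self.value * other.value, (self, other), "*")

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.value * out.grad
            other.grad += self.value * out.grad

        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.value)
        out = Node(t, (self,), "tanh")

        def _backward():
            # d tanh(a)/da = 1 - tanh(a)^2
            self.grad += (1.0 - t * t) * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule in reverse.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


class Neuron:
    """Computes tanh(w . x + b) from a stack of nodes (hypothetical sketch)."""

    def __init__(self, weights, bias):
        self.w = [Node(w) for w in weights]
        self.b = Node(bias)

    def __call__(self, x):
        total = self.b
        for wi, xi in zip(self.w, x):
            total = total + wi * xi
        return total.tanh()


neuron = Neuron(weights=[0.5, -1.0], bias=0.25)
out = neuron([2.0, 1.0])   # tanh(0.5 * 2.0 - 1.0 * 1.0 + 0.25) = tanh(0.25)
out.backward()             # fills in .grad on the weights and the bias
```

After the call to backward(), each weight's grad holds the derivative of the output with respect to that weight, which is the quantity an optimizer would use to update it.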

The framework is modular and can be extended with new layers, activation functions, and optimizers. Examples of how to use the framework at the node level, at the tensor level, or to build a full-fledged Sequential MLP are provided [here]. Loss function and optimizer implementations are also provided to build an end-to-end understanding of the neural network training process. The framework also provides a simple way to visualize the computational graph using Graphviz.
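
As a rough idea of what graph visualization involves, the sketch below turns a list of graph edges into Graphviz DOT text using only string formatting. The function name to_dot and the edge-list representation are assumptions made for illustration; MuTorch's own visualization helper may work differently.

```python
def to_dot(edges):
    """Render (source, target) label pairs as Graphviz DOT text."""
    lines = ["digraph G {", "  rankdir=LR;"]   # draw the graph left to right
    for src, dst in edges:
        lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


# Edges of the graph for w*x + b: inputs point at the operation that uses them.
edges = [("w", "w*x"), ("x", "w*x"), ("w*x", "w*x+b"), ("b", "w*x+b")]
dot_source = to_dot(edges)
print(dot_source)
```

If the printed text is saved to a file such as graph.dot, the Graphviz command-line tool can render it, for example with `dot -Tpng graph.dot -o graph.png`.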

Building a Neural Network

The framework can be used in a few ways:

Node-level: The framework can be used to build a single node and then use it to build a computational graph.

Tensor-level: An example of a tensor-level operation is shown below.

Sequential MLP: The framework can be used to build a complete neural network through sequential layers. An example is shown below.
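
To illustrate the sequential idea in isolation, the following sketch chains layers that each take a list of values and return a list of values. It uses plain Python floats rather than graph nodes, and the class names Linear, Tanh, and Sequential, along with their constructors, are assumptions chosen for this sketch rather than MuTorch's actual API.

```python
import math
import random


class Linear:
    """Fully connected layer on plain Python lists (hypothetical sketch)."""

    def __init__(self, n_in, n_out, rng):
        # One row of weights per output unit.
        self.weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
                        for _ in range(n_out)]
        self.biases = [0.0] * n_out

    def __call__(self, x):
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(self.weights, self.biases)]


class Tanh:
    """Element-wise tanh activation."""

    def __call__(self, x):
        return [math.tanh(v) for v in x]


class Sequential:
    """Applies each layer in order, feeding one layer's output to the next."""

    def __init__(self, *layers):
        self.layers = layers

    def __call__(self, x):
        for layer in self.layers:
            x = layer(x)
        return x


rng = random.Random(0)   # seeded so the example is repeatable
model = Sequential(Linear(2, 4, rng), Tanh(), Linear(4, 1, rng))
y = model([1.0, -2.0])   # a list holding the single output value
```

Because every layer maps a list to a list, new layer types can be added without changing Sequential.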


Optimizers

The framework provides a simple way to build optimizers and use them during neural network training. The optimizers provided with the framework include SGD and Adam, among others.
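
For intuition, a plain SGD optimizer can be sketched in a few lines. This is a minimal sketch assuming each parameter exposes value and grad attributes; the Param helper and the zero_grad/step method names follow PyTorch convention and are not taken from MuTorch's source.

```python
class Param:
    """Stand-in for a trainable parameter (hypothetical helper)."""

    def __init__(self, value):
        self.value = value
        self.grad = 0.0


class SGD:
    """Plain stochastic gradient descent over a flat list of parameters."""

    def __init__(self, params, lr=0.1):
        self.params = params
        self.lr = lr

    def zero_grad(self):
        # Clear gradients left over from the previous step.
        for p in self.params:
            p.grad = 0.0

    def step(self):
        # Move each parameter against its gradient.
        for p in self.params:
            p.value -= self.lr * p.grad


# Minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w = Param(0.0)
optimizer = SGD([w], lr=0.1)
for _ in range(100):
    optimizer.zero_grad()
    w.grad = 2.0 * (w.value - 3.0)
    optimizer.step()
```

With this learning rate the distance to the minimum shrinks by a factor of 0.8 per step, so w.value ends up very close to 3.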

Loss Functions

Loss functions can be built easily using the MuTorch framework. The losses implemented within the framework include MSE, L1, and SmoothL1, among others.
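
The three losses named above can be written on plain Python lists as follows. This sketch assumes mean reduction over the elements and the usual SmoothL1 definition with a threshold beta (quadratic below beta, linear above it); the function names and signatures are illustrative, not MuTorch's.

```python
def mse_loss(pred, target):
    """Mean of squared differences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def l1_loss(pred, target):
    """Mean of absolute differences."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def smooth_l1_loss(pred, target, beta=1.0):
    """Quadratic for small errors, linear for large ones."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        if d < beta:
            total += 0.5 * d * d / beta
        else:
            total += d - 0.5 * beta
    return total / len(pred)


pred, target = [0.5, 2.0], [0.0, 0.0]
print(mse_loss(pred, target))        # 2.125
print(l1_loss(pred, target))         # 1.25
print(smooth_l1_loss(pred, target))  # 0.8125
```

The large error of 2.0 dominates MSE, while SmoothL1 treats it linearly, which makes it less sensitive to outliers.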

Training a Neural Network

Putting it all together, the framework can be used to build a complete neural network and train it using a loss function and an optimizer. An example is shown below:

Ground truth: [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]

Model output before training: [0.2625455157756882, 0.06476329944788511, 0.09198898000403521, 0.13625100888110614, 0.3455240473051357, 0.04765093283946377]

#################### Starting model training...

Epoch 500/500 - Loss: 0.000075: 100% ||||||||||||||||||||||||| [00:07<00:00, 69.41it/s]

#################### Training complete!

Model output after training: [0.012926827557352217, 0.008991985611204682, 0.0027829626242188233, 0.9912839449884999, 0.9925483053096971, 0.9921878687511471]

Model weights can also be saved to a file and later loaded back with model.load(filename).

This is just a toy example, but it extends easily to a more realistic problem such as classification. The model can also be dissected further to visualize its decision boundary, as shown in the figure above.

For a better understanding of the framework, and for examples of how to train models on realistic problems, see the [Demo Notebook].

Feel free to take MuTorch for a spin. Happy Learning!