
The Perceptron

Published: November 20, 2024
Modified: July 19, 2025

Inspiration

What is a Perceptron?

The perceptron, invented by Frank Rosenblatt in 1958, is considered one of the foundational building blocks of neural networks. Its output is viewed as a decision from the neuron and is usually propagated as an input to other neurons in the network.

Figure: The Perceptron

Math Intuition

  • We can imagine this as a set of inputs that are combined as a weighted sum (see the sketch after this list).

\[ y = \operatorname{sign}\left(\sum_{k=1}^{n} W_k x_k + b\right) \tag{1}\]

  • Since the inputs are added with linear weighting, this effectively acts like a linear transformation of the input data.
    • A linear equation of this sort is the general equation of an n-dimensional plane.
    • If we imagine the input as representing the n-coordinates in a plane, then the multiplications scale/stretch/compress the plane, like a rubber sheet. (But do not fold it.)
    • If there were only 2 inputs, we could mentally picture this with a handkerchief.
  • More metaphorically, it seems like the neuron is consulting each of the inputs, asking for their opinion, and then making a decision by attaching different amounts of significance to each opinion.
  • The structure should remind you of Linear Regression!
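
To make Equation 1 concrete, here is a minimal sketch of the forward pass in R. The function name perceptron_output and all the numbers are illustrative assumptions, not part of any package:

# A minimal perceptron forward pass (an illustrative sketch)
perceptron_output <- function(x, w, b) {
  sign(sum(w * x) + b) # weighted sum of the inputs, shifted by the bias, then thresholded
}

# Example: two inputs with hand-picked weights
x <- c(0.5, -1.2)
w <- c(0.8, 0.3)
b <- 0.1
perceptron_output(x, w, b) # returns +1 or -1: the neuron's "decision"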

So how does it work? Consider the interactive diagram below:

  • The coordinate axes are shown as X, Y, and Z.

  • The grey and yellow points are the data we wish to classify into two categories, unsurprisingly "yellow" and "grey".

  • The Weight vector line represents the vector of all the weights in the Perceptron.

  • Now, as per the point-normal form of an n-dimensional plane, multiplying the input data with the weight vector is like taking a vector dot product (aka inner product)! And every point on the plane has a dot product of ZERO. See the purple vector, which is normal to the Weight vector: its dot product with the Weight vector is zero.

  • Data points that are off this "normal plane" in either direction (above or below) will have dot products that are positive or negative depending upon the direction!

  • Hence we can use the dot-product POLARITY to decide whether a point is above or below the plane defined by the Weight vector, which is exactly what the threshold-based activation does! (A small sketch of this follows the list.)

  • The bias \(b\) defines the POSITION of the plane, and the Weights define its direction. Together, they classify the points as per Equation 1.

  • Try moving the slider to get an intuition of how the plane moves with the bias. Clearly, the bias is very influential in deciding the POLARITY of the dot products. It works best when the plane aligns with the purple vector (\(\text{dot product} = 0\)).
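
Here is a short sketch of this polarity test in R; the weight vector, bias, and points below are made-up values for illustration:

# Classifying 3-D points by the polarity of the dot product (illustrative data)
set.seed(42)
w <- c(1, -2, 0.5) # weight vector: the normal to the separating plane
b <- -0.25         # bias: shifts the plane away from the origin
pts <- matrix(runif(15, min = -1, max = 1), ncol = 3) # five random 3-D points

scores <- pts %*% w + b # dot product of each point with w, plus the bias
labels <- ifelse(scores > 0, "yellow", "grey") # the POLARITY decides the class
data.frame(score = round(scores, 2), label = labels)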

Why “Linear”?

Why are (almost) all operations linear operations in a NN?

  • We said that the weighted sums are a linear operation, but why is this so?
  • We wish to set up analytic functions for the performance of the NN, and to differentiate them so that we can optimize them.
  • Non-linear blocks, such as threshold blocks / signum-function slicers, are not differentiable (the derivative is zero almost everywhere, and undefined at the jump), so we cannot set up such an analysis with them; compare the sketch below.
  • Note the title of reference 2 below: "It's just a linear model".
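
As a quick illustration of why this matters, compare the hard sign function with the sigmoid, a standard smooth substitute (this comparison is just a sketch):

# sign() is flat almost everywhere, so its gradient carries no information;
# the sigmoid is smooth, with derivative sigma(z) * (1 - sigma(z)) everywhere
sigmoid <- function(z) 1 / (1 + exp(-z))
z <- seq(-4, 4, by = 2)
rbind(
  sign         = sign(z),
  sigmoid      = round(sigmoid(z), 3),
  sigmoid_grad = round(sigmoid(z) * (1 - sigmoid(z)), 3)
)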

Why is there a Bias input?

  • We want the weighted sum of the inputs to mean something significant, before we accept it.
  • The bias is added to the weighted sum of the inputs (equivalently, a threshold is subtracted from it), and the bias input could also (notionally) have a weight.
  • The bias is like a threshold which the weighted sum has to exceed; if it does, the neuron is said to fire. (A one-line check of this follows.)
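
A one-line check of this threshold view in R (the numbers are arbitrary assumptions):

# The bias is just the negative of a firing threshold: both views agree
wx <- 0.7    # some weighted sum of the inputs
theta <- 0.5 # the firing threshold the sum has to exceed
b <- -theta  # the equivalent bias
sign(wx - theta) == sign(wx + b) # TRUE: the neuron "fires" either way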

So with all that vocabulary, we might want to watch this longish video by the great Dan Shiffman:

Perceptrons in Code

R

Let us try a simple single-layer NN in R. We will use the R package neuralnet.

Show the Code
# Load the packages
library(neuralnet) # neural networks
library(dplyr)     # for %>%, slice_sample(), anti_join()

# Use the iris dataset
# Create training and testing datasets
df_train <- iris %>% slice_sample(n = 100)
df_test <- iris %>% anti_join(df_train) # rows of iris not sampled into df_train
head(iris)
Show the Code
# Create a simple neural net: hidden = 0 gives a single-layer, perceptron-like model
nn <- neuralnet(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
  data = df_train,
  hidden = 0,
  # act.fct = "logistic", # sigmoid activation (not used here)
  linear.output = TRUE # TRUE: ignore the activation function on the output
)

# str(nn)

# Plot the network
plot(nn)

# Predictions
# Predict <- compute(nn, df_test)
# cat("Predicted values:\n")
# print(Predict$net.result)
#
# probability <- Predict$net.result
# pred <- ifelse(probability > 0.5, 1, 0)
# cat("Result in binary values:\n")
# pred %>% as_tibble()

p5.js

To Be Written Up.

References

  1. The Neural Network Zoo - The Asimov Institute. http://www.asimovinstitute.org/neural-network-zoo/
  2. It’s just a linear model: neural networks edition. https://lucy.shinyapps.io/neural-net-linear/
  3. Neural Network Playground. https://playground.tensorflow.org/
  4. Rohit Patel (20 Oct 2024). Understanding LLMs from Scratch Using Middle School Math: A self-contained, full explanation to inner workings of an LLM. https://towardsdatascience.com/understanding-llms-from-scratch-using-middle-school-math-e602d27ec876
  5. Machine Learning Tokyo: Interactive Tools for ML/DL, and Math. https://github.com/Machine-Learning-Tokyo/Interactive_Tool
  6. Anyone Can Learn AI Using This Blog. https://colab.research.google.com/drive/1g5fj7W6QMER4-03jtou7k1t7zMVE9TVt#scrollTo=V8Vq_6Q3zivl
  7. Neural Networks Visual with vcubingx
    • Part 1. https://youtu.be/UOvPeC8WOt8
    • Part 2. https://www.youtube.com/watch?v=-at7SLoVK_I
  8. Practical Deep Learning for Coders: An Online Free Course. https://course.fast.ai

Text Books

  1. Michael Nielsen. Neural Networks and Deep Learning, a free online book. http://neuralnetworksanddeeplearning.com/index.html
  2. Simone Scardapane (2024). Alice's Adventures in a Differentiable Wonderland. https://www.sscardapane.it/alice-book/

Using R for DL

  1. David Selby (9 January 2018). Tea and Stats Blog. Building a neural network from scratch in R. https://selbydavid.com/2018/01/09/neural-network/
  2. torch for R: An open source machine learning framework based on PyTorch. https://torch.mlverse.org
  3. Akshaj Verma. (2020-07-24). Building A Neural Net from Scratch Using R - Part 1 and Part 2. https://rviews.rstudio.com/2020/07/20/shallow-neural-net-from-scratch-using-r-part-1/ and https://rviews.rstudio.com/2020/07/24/building-a-neural-net-from-scratch-using-r-part-2/

Maths

  1. Parr and Howard (2018). The Matrix Calculus You Need for Deep Learning. https://arxiv.org/abs/1802.01528

R Package Citations

  Package     Version   Citation
  neuralnet   1.44.2    @neuralnet