πŸ”₯ Want to master the Perceptron algorithm?
In this tutorial, you’ll learn step by step how to implement the Perceptron algorithm in Python using both NumPy and PyTorch.

βœ… Perfect for beginners in machine learning and students who want to understand the core logic before diving deep into neural networks.

πŸš€ What you’ll learn in this video:

What is the Perceptron algorithm? πŸ€”

How Perceptron works (linearly separable data)

Code Perceptron from scratch using NumPy 🐍

Build the same model using PyTorch πŸš€

Visualize decision boundaries

Test and improve your model performance

πŸ‘‰ Whether you're studying for ML interviews or starting deep learning, this tutorial builds your foundation!

πŸ”₯ Subscribe to NanoTechBoost for more AI, machine learning, and coding tutorials!

#Perceptron #MachineLearning #PythonTutorial #PyTorch #NumPy #DeepLearning #NanoTechBoost #AIProgramming #LearnMachineLearning #MLTutorial #codingforbeginners
Transcript
00:00Hello everyone, assalamu alaikum, welcome back. In this video I am going to show you how we can
00:07implement a perceptron in Python using NumPy and PyTorch. I will be using Jupyter notebooks,
00:14because I think for simpler code examples it's actually quite nice to use Jupyter notebooks:
00:21I can execute one thing at a time, which makes things easier to explain.
00:26However, later on I think we will have to move on to Python script files, because when deep
00:34learning models become larger and larger, managing them in Jupyter notebooks can be a little bit
00:40tedious and also dangerous, because it's easy to lose the overview, and debugging is a little
00:47harder if you have separate cells and stuff like that. The danger is that you might execute
00:54things out of order. Also, you will want to import certain pieces from different files;
01:01you don't want to have everything in one notebook, because then it becomes really confusing and
01:06unmanageable. But we will get to these parts later on. So, back to the topic of perceptron implementation in
01:15NumPy and PyTorch: let me now walk through my NumPy notebook.
01:19Now, the PyTorch notebook is actually very, very similar, which is one of the cool things about PyTorch,
01:28because it is very similar to NumPy, except for some extra features that we will be using later
01:35on, and I will have a lecture on PyTorch where I will explain these differences to you. So for right now
01:42it doesn't make such a big difference whether we use the NumPy or the PyTorch notebook. I will also show you a
01:48step-by-step comparison after I explain the NumPy notebook, so you will see the actual
01:54differences between the two. But let's do one thing at a time, right? All right, let's include some headings
02:04first, and then we will get on with the code.
02:34So, I'm importing some libraries for those who have not used notebooks.
02:46This one here, this command is for showing plots in the notebook.
02:50It's technically not necessary anymore but sometimes on some computers, plots will not
02:56be shown in a notebook if you don't include this line.
02:59And it doesn't hurt to include that line.
03:02So I always do this.
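For reference, a setup cell along these lines is what is being described here; the exact imports used in the notebook aren't all shown, so treat this as a sketch:

```python
# Typical notebook setup cell (a sketch; the actual notebook may differ).
%matplotlib inline               # make plots render inside the notebook
import numpy as np               # array math
import matplotlib.pyplot as plt  # plotting
```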
04:25Okay, so I've typed some code.
04:48Now, let's just try to understand this and then we will move on.
04:52So here I'm just loading the dataset.
04:56So there is nothing really interesting happening here, but I will go through this step by step.
05:01So the dataset is like some toy data that I generated.
05:05Let me show you what that looks like.
05:10So here I have two feature columns.
05:12I didn't include any column header, but this is the first feature.
05:16This is the second feature value and this is the class label here.
05:20So there are zeros and ones and you can see the dataset is not shuffled.
05:25And actually it's helpful for learning if the dataset is shuffled.
05:28It will make the learning a little bit faster in Perceptron.
05:32So here I'm loading the data into NumPy.
05:35I could also use Pandas, but I thought it might be overkill because it's a relatively simple dataset.
05:42And then I'm assigning the features to X, which is a matrix, and then Y, which is a class label.
05:48Let me just show you what they look like.
05:52So typing this: this is X, which is a matrix, and this is y, which is the class label array.
05:59So here it's shuffled because I actually executed this whole bunch of code.
06:08So you can then already guess what's going on here.
06:11So here I'm loading the data and then just printing some summary information.
06:16It's always, I think, a good idea to do that to get an idea of things.
06:23So we have 50 labels from class 0 and 50 labels from class 1.
06:28We have 100 data points in total and 2 feature columns and also 100 labels.
06:35So for example, here we can see these and these numbers match.
06:44And that's what we expect.
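A sketch of the loading and summary step being described; the file name and delimiter are assumptions, since they aren't stated explicitly in the transcript:

```python
# Load the toy data: two feature columns plus one class-label column.
data = np.genfromtxt('perceptron_toydata.txt', delimiter='\t')  # hypothetical file name
X, y = data[:, :2], data[:, 2].astype(int)

print('Class label counts:', np.bincount(y))  # expected: [50 50]
print('X.shape:', X.shape)                    # expected: (100, 2)
print('y.shape:', y.shape)                    # expected: (100,)
```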
06:47Then here I'm shuffling the dataset so that they are not all in order.
06:52They are shuffled.
06:53And I have to shuffle X and Y together, right?
06:58Otherwise everything will be mixed up,
07:01and the features won't correspond to the class labels anymore.
07:05The way I do that is by creating a shuffle index.
07:08I can just show you this shuffle index here.
07:11It's simply the numbers from 0 to 99.
07:17So the 100 indices and then I'm actually shuffling these indices.
07:21So here I'm generating a random number generator.
07:26And then I'm shuffling these indices here.
07:30So you can look at these after shuffling.
07:32So after I execute that, you will see that they are now in random order.
07:39And then I'm using that to select the data points from X and Y.
07:45So X and Y will be shuffled based on this shuffle index.
07:51So that's how we shuffle.
07:53And then I will use the first 70 data points for training.
07:57And the last 30.
07:58So we have these 100 data points from 70 to 100.
08:02The last 30 data points will be for our test set.
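In code, the shuffling and the 70/30 split might look roughly like this (the random seed is an arbitrary assumption):

```python
# Shuffle X and y together through a shared index so features stay aligned
# with their labels, then take the first 70 rows for training, the rest for testing.
shuffle_idx = np.arange(y.shape[0])   # indices 0 .. 99
rng = np.random.RandomState(123)      # random number generator (seed is arbitrary)
rng.shuffle(shuffle_idx)              # shuffle the indices in place

X, y = X[shuffle_idx], y[shuffle_idx]
X_train, X_test = X[:70], X[70:]
y_train, y_test = y[:70], y[70:]
```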
08:06Later on, we will be using more convenient utilities to load data in PyTorch.
08:10So there are some loading utilities here.
08:14I'm just doing it step by step.
08:17So you get a feeling of what's basically going on.
08:20And then I'm normalizing the data.
08:22So this is sometimes also called standardization.
08:26Here I'm standardizing the data such that after standardization,
08:29it will have mean 0 and unit variance.
08:32So I'm subtracting the mean and dividing by the standard deviation.
08:37So here I'm computing the mean and standard deviation of my sample.
08:41And then I'm subtracting the mean and dividing by standard deviation.
08:44And then both will have mean 0 and standard deviation 1.
08:49So unit variance.
08:51So you can actually check that.
08:54Okay, very close to 0.
08:56So there are something like 17 zeros after the decimal point
08:59before the first nonzero digit.
09:04It's very small.
09:05It's essentially identical to 0.
09:09And then for the standard deviation, it should be around 1.
09:12So yeah, it is around 1.
09:15So the data is standardized.
09:17Well, why am I doing that?
09:20It kind of speeds up training a little bit.
09:23It's like stabilizing the training for perceptron.
09:27It's not that necessary.
09:29But it is a good practice to do that for other types of optimization algorithms.
09:34Later on when we talk about stochastic gradient descent.
09:38So standardization like this is usually recommended.
09:42The only type of machine learning model where this is really not that necessary is tree-based models.
09:48But all other machine learning and deep learning models that I know of really benefit from it.
09:56Especially stochastic gradient descent will just learn faster.
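A sketch of the standardization step just described, computing the mean and standard deviation on the training split and applying them to both splits:

```python
# Standardize: subtract the mean and divide by the standard deviation,
# so the features end up with (roughly) zero mean and unit variance.
mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
X_train = (X_train - mu) / sigma
X_test = (X_test - mu) / sigma

print(X_train.mean(axis=0))  # ~0 (tiny values like 1e-17 due to floating point)
print(X_train.std(axis=0))   # ~1
```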
10:01Okay, now let's take a look at our data.
10:05Now let me write some code for that.
10:07So here, this is what our training set looks like.
10:20You can see it's roughly centered at 0.
10:24We have two classes: class 0, the circles here, and class 1, the squares.
10:30So feature 1 and feature 2, that's our training set.
10:33And there should be 70 examples and then the remaining 30 examples in our test set down below.
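The plotting code being described might look roughly like this for the training split (marker choices follow the description; the rest is an assumption):

```python
# Training set: circles for class 0, squares for class 1.
plt.scatter(X_train[y_train == 0, 0], X_train[y_train == 0, 1],
            marker='o', label='class 0')
plt.scatter(X_train[y_train == 1, 0], X_train[y_train == 1, 1],
            marker='s', label='class 1')
plt.xlabel('feature 1')
plt.ylabel('feature 2')
plt.legend()
plt.show()
```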
10:45So what we want to do is train our model on the training set and then evaluate it on the test set.
10:51Okay, now let's implement our perceptron model.
10:58So yes, the perceptron code.
11:01Let's type this one and then I'll get back to you.
11:06Now you can see it's relatively short.
11:09I'm implementing it using a forward and backward method.
11:14And why I'm doing this is because that is also how things are done in PyTorch.
11:22And it will make things more familiar later on if I already use this pattern now.
11:26But let's start at the top.
11:28So I'm implementing it as a class here.
11:31And you should be, I think, familiar with Python classes.
11:35So here what I'm doing is writing the constructor.
11:39This is a special class method.
11:41I'm giving it the number of features, because that determines the number of weights.
11:47And I'm using here the implementation where the weights and the bias are separate because that is more convenient.
11:53So I don't have to modify the feature vector.
11:56And what I'm doing here is I'm initializing the weight vector and the bias unit.
12:03The bias unit is just a single value.
12:05That's just one number.
12:06And the weights, the weight vector, it depends on the number of features.
12:10So I make this a row vector.
12:13So this is then equal to M, the number of features.
12:19So here I'm just setting up my weights and bias and setting them to zero.
12:24Later on, for certain algorithms, for stochastic gradient descent,
12:31it's better to initialize them to small random numbers here for the perceptron.
12:36It's not necessary.
12:39But for neural networks, it will be necessary later on.
12:42We will see that.
12:44Now in the forward method, I'm computing the net input, here called linear.
12:49And then I'm computing the predictions.
12:53That's my threshold.
12:55So this is the net input.
12:57I'm calling it linear because later on, we will also see linear layers in PyTorch.
13:02They are called linear.
13:03And they are basically computing net input.
13:06So this, as you can see, is the dot product between the input vector and the weights.
13:11And then I'm adding the bias here.
13:13All right.
13:14Then here we have our threshold function.
13:17This threshold function is just using NumPy.
13:20Here's how this works: it's saying if linear,
13:25i.e. the net input, is greater than zero, then output one, otherwise output zero.
13:31So it's our forward method.
13:34And here is our backward method.
13:37In the backward method, why am I calling it like that?
13:40It is for computing the errors.
13:43So usually when we have deeper neural networks, we will use something called backpropagation
13:48where we look at the outputs.
13:50And then based on the outputs, we adjust the inputs.
13:53So in that way, we run the forward method to produce the predictions.
13:57And then we compute the errors and then update.
14:00So it will become clearer when we have a deeper network where there is really
14:06backpropagation going on.
14:07So these are our two methods.
14:10Backward is computing the errors, which is the difference between true class labels and
14:14the predictions.
14:15And forward is used to get the predictions in the first place.
14:18So here we implemented the prediction step that we also discussed in the
14:23slides.
14:26The prediction is step A,
14:28and step B is the backward computation, which gives us the errors.
14:34And now we have to put everything together.
14:37So I implemented this train method here.
14:41So this train method is basically the whole thing here in the slide as you can see.
14:47So for epoch in the number of epochs, so this is for every training epoch.
14:56And then for every training example, we perform the forward pass, the backward pass and update.
15:03Since backward is already doing all forward, we just call backward here.
15:09There's some reshape here going on as well.
15:12And that is because we are making the vector dimensions match.
15:14Otherwise, you will get some errors.
15:16So here, this will be one row and m columns.
15:20I think this is called a row vector, because it's just one row with multiple
15:25columns, so it looks like a row.
15:27And here, we have this row vector, and it has to have the same dimensions,
15:34so I'm just making the dimensions match so everything can be computed nicely.
15:41Otherwise, you will find there will be dimension mismatch.
15:44So there's just a reshaping going on here.
15:46And then here, we perform the update.
15:48So again, I'm doing the reshape afterwards so that we get the original dimensions back.
15:53Because the weights here, see, we are matching the original dimensions.
15:58So we are just reshaping so we can add it to it.
16:01Otherwise, there will also be a dimension mismatch if this is just a single number.
16:06Or if it's a 1-by-m vector instead of an m-by-1 vector.
16:12And then also, we update the bias.
16:14So the bias, it's just updating it by the errors.
16:19And then next thing is evaluating it.
16:21So evaluating the performance here, I'm just doing the forward pass and then compute the
16:26accuracy.
16:27The accuracy is computed by checking how many of the predictions match the true label and
16:33then divide by the data set size.
16:35So it will be giving me a number between 0 and 1.
16:38So this is my perceptron algorithm.
16:41Sorry, perceptron class that we just implemented.
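Putting the pieces together, a NumPy perceptron along the lines just described might look like this. It is a sketch that mirrors the forward/backward/train/evaluate structure from the video, not the notebook's code verbatim:

```python
class Perceptron:
    def __init__(self, num_features):
        self.num_features = num_features
        # weights as an (m x 1) vector, bias as a single value, both zero-initialized
        self.weights = np.zeros((num_features, 1), dtype=float)
        self.bias = np.zeros(1, dtype=float)

    def forward(self, x):
        # net input ("linear"): dot product of inputs and weights, plus bias
        linear = np.dot(x, self.weights) + self.bias
        # threshold: 1 if the net input is greater than 0, otherwise 0
        return np.where(linear > 0., 1, 0)

    def backward(self, x, y):
        # error = true label minus prediction (called per example in train below)
        return y - self.forward(x)

    def train(self, x, y, epochs):
        for _ in range(epochs):
            for i in range(y.shape[0]):
                # reshape one example to a (1 x m) row vector so dimensions match
                errors = self.backward(x[i].reshape(1, self.num_features),
                                       y[i]).reshape(-1)
                # perceptron update rule; reshape back to the weights' (m x 1) shape
                self.weights += (errors * x[i]).reshape(self.num_features, 1)
                self.bias += errors

    def evaluate(self, x, y):
        predictions = self.forward(x).reshape(-1)
        return np.sum(predictions == y) / y.shape[0]
```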
16:45And now I'm going to train it.
16:51So yeah, initializing it,
17:09and then training it for five epochs, and then I will print the model
17:26parameters afterwards. You can see that it's pretty fast. So we get the weights, the weight
17:32vector, and then the bias here. And now we can evaluate it, i.e. compute the accuracy. Let's do that.
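Continuing the sketch from above, the training and evaluation cells would look something like this (the variable name ppn and the exact print formatting are assumptions; the two features and five epochs follow the description):

```python
ppn = Perceptron(num_features=2)
ppn.train(X_train, y_train, epochs=5)

print('Weights:', ppn.weights)
print('Bias:', ppn.bias)

print('Train accuracy: %.2f%%' % (ppn.evaluate(X_train, y_train) * 100))
print('Test accuracy: %.2f%%' % (ppn.evaluate(X_test, y_test) * 100))
```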
17:53So the test set accuracy is 93%, so not quite 100%. On the training set it should actually have
18:01100%, right? Because this dataset is linearly separable, and the perceptron should converge if the data is linearly
18:07separable, so that everything is classified correctly. The test set is not as good, as you can
18:15see, because the perceptron may overfit. So let's take a look at the decision boundaries. Okay, so here is some
18:22complicated code to compute the decision boundaries. It's actually not that complicated;
18:30what I did is just rearrange things here. What we have is, if you think about it, the decision
18:37boundary is where the net input equals zero, so everything hinges upon zero. Our computation
18:42is x0 times w0, plus x1 times w1, plus the bias, and this hinges upon zero. What
18:55we are doing here is taking one fixed number; let's say for feature zero we take the
19:02value minus two, so we are going to the left-hand side here, and then we want to find,
19:10since x0 is the x-axis and x1 is the y-axis, so we take minus two here and we want to find the
19:22corresponding x1 value. So this is x0 at minus two, and what we want to find is the corresponding x1
19:30value. So we have to rearrange this, solving for x1, right? What we do is we move
19:37these terms to the right-hand side, so we get the x1 value here. I'm calling
19:45it min, because it's the left-hand side. Then I'm doing the same thing for the right-
19:51hand side: I'm again setting x0 to some value,
19:59setting it to 2 here, and then I'm finding the corresponding y-axis value, which is the x1 max
20:05here. So I'm doing the same thing, just rearranging, now using the max value, and then I'm
20:12connecting these points with a line, and that's how I get this. I've done this here on the left-hand side for the
20:18training set and on the right-hand side for the test set, so one plot is the training set and one is the test set.
20:23Now, about the decision boundary: it doesn't change, actually, because it's the same for the training and
20:30test set; only the data is different. The decision boundary only depends on w, right? We are
20:36providing these x0 values; they are fixed values we provide. The decision boundary only depends
20:41on the model parameters, so the decision boundary does not change. Here, this is for the training set,
20:47and this is for the test set on the right-hand side. You can see that on the training set it
20:53perfectly classifies these examples, and on the right-hand side, the test set,
20:59you can see it's maybe fitting some of the data too closely. I mean, there's no other way, actually,
21:05but it happens that in this particular case it doesn't perform well. Actually, there's a different
21:13way: if you fit the boundary more straight, something like this, then you may get these points
21:22right. But it just happens that these data points are not in the training set, so the model doesn't
21:27know that it should shift the boundary more to the right here. So the model actually does
21:33a good job on the training set, but on the test set it's not so good.
21:39This is called overfitting, because the model fits the training data a little bit too closely and
21:45doesn't generalize so well to the test set. So this is how the NumPy code works.
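In code, the boundary-plotting idea just described (solve w0*x0 + w1*x1 + b = 0 for x1 at two fixed x0 values, then connect the points) might look roughly like this sketch:

```python
# Decision boundary: w0*x0 + w1*x1 + b = 0  =>  x1 = (-w0*x0 - b) / w1
w, b = ppn.weights.reshape(-1), ppn.bias[0]

x0_min, x0_max = -2.0, 2.0
x1_at_min = (-w[0] * x0_min - b) / w[1]   # boundary's x1 value at x0 = -2
x1_at_max = (-w[0] * x0_max - b) / w[1]   # boundary's x1 value at x0 = +2

fig, (ax_train, ax_test) = plt.subplots(1, 2, sharex=True, sharey=True)
for ax, Xs, ys, title in [(ax_train, X_train, y_train, 'training set'),
                          (ax_test, X_test, y_test, 'test set')]:
    ax.plot([x0_min, x0_max], [x1_at_min, x1_at_max])  # same line on both panels
    ax.scatter(Xs[ys == 0, 0], Xs[ys == 0, 1], marker='o')
    ax.scatter(Xs[ys == 1, 0], Xs[ys == 1, 1], marker='s')
    ax.set_title(title)
plt.show()
```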
21:51And now for the PyTorch code. I will not write out the complete code, because,
21:58to be honest, most things are the same; I will only talk about the differences.
22:06So I don't need to talk about this much, I think, because the code is essentially the same;
22:12everything is the same except for the class. Okay, there are some differences though, so let's talk about
22:19only the differences. I've prepared a slide for that, covering the differences that I feel we need to discuss.
22:27Also note that we will be talking about this in more detail when I cover the PyTorch code
22:34in the coming videos. So here I highlighted the differences: on the left-hand side is the
22:42NumPy implementation and on the right-hand side is the PyTorch implementation. The left-hand
22:47side is the same code that we just wrote, and you can see that there are not that many
22:55differences. For the weights and biases, here we are using numpy zeros and here
23:00we are using torch zeros. Here we have to be a bit more specific: instead of
23:08NumPy's float we say torch.float32, a 32-bit float. I have this device here because of the way I implemented
23:17things, so it would also run on the GPU if one is available; if no GPU is available it will use the CPU.
23:25So there's this device argument, which is optional; it's not necessary though. And here it's a
23:32little bit different as well; I mean, there are multiple ways you can write
23:38this. You could also use a plus; to be honest, I just happen to use torch.add, but I could have
23:45also used a plus. And mm is for matrix multiplication: in NumPy we write dot,
23:54and in PyTorch we write mm for matrix multiplication, though PyTorch has other functions for matrix
24:05multiplication too. So it's kind of the same thing; it just
24:11looks a little bit different. The where function in PyTorch is a bit more involved, not that much
24:18more involved, but it needs placeholders, such as a one and a zero; it needs
24:26tensors there, so I'm creating these placeholders and providing them, but it's the same concept.
24:32And then what's a little bit different is the last part: instead of numpy.sum it's torch.
24:38sum, and here I'm converting the result to float, because otherwise it would be an integer, and an integer
24:43divided by some value gives an integer. What we want is a float, because the accuracy is a fraction
24:49between zero and one. If you don't do that, you will get back an integer, and that's not correct,
24:54because the value of the accuracy is between zero and one, which is why I am casting this to float.
25:00But again, the PyTorch code will be covered in more detail later. So that's what I wanted to say
25:07about the code. If you have any questions, feel free to drop a comment. I hope you understood how
25:12to code this perceptron algorithm using both NumPy and PyTorch.
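To illustrate the differences just listed, here is a rough PyTorch counterpart of the same class; treat it as a sketch (the device handling and names follow the description, but the notebook's exact code may differ):

```python
import torch

class PerceptronTorch:
    def __init__(self, num_features, device=torch.device('cpu')):
        self.num_features = num_features
        # torch.zeros instead of np.zeros; dtype and device are explicit
        self.weights = torch.zeros(num_features, 1,
                                   dtype=torch.float32, device=device)
        self.bias = torch.zeros(1, dtype=torch.float32, device=device)
        # tensor "placeholders" used by torch.where below
        self.ones = torch.ones(1, device=device)
        self.zeros = torch.zeros(1, device=device)

    def forward(self, x):
        # torch.mm for matrix multiplication, torch.add instead of '+'
        linear = torch.add(torch.mm(x, self.weights), self.bias)
        return torch.where(linear > 0., self.ones, self.zeros)

    def backward(self, x, y):
        # error = true label minus prediction (used per example during training)
        return y - self.forward(x)

    def evaluate(self, x, y):
        predictions = self.forward(x).reshape(-1)
        # cast to float so the division yields a fraction, not an integer
        return torch.sum(predictions == y).float() / y.shape[0]
```

The training loop is essentially unchanged apart from these calls, which is the point being made in the video.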
