Think about the difference between regression and classification. There are two types of supervised learning: one is regression and the other is classification. To start with a regression algorithm, I could take the output as sales, so I want to predict sales. The output could also be weight, or anything else that varies; whatever it is, it changes based on the input parameters.
We can take five input parameters. I don't want the output to be sales; I want the output parameter to be weight loss, so I want to predict the amount of weight loss. Weight loss is my output parameter; let me write it down: this is my output. How many inputs are there? One, two, three, four, five: there are five input parameters. To make a statement about the output, these input nodes need a connection to the output node. What can the input parameters be? For example, the number of hours you spend working out at the gym, your gender, your age group, the diet you follow, or the number of calories you consume. Based on those, I want to make a statement about the output. From all these input parameters you get your y predicted.
This is y predicted. So what is the equation for y predicted? We can't simply write x1 + x2 + x3 + x4 + x5 = y. We wouldn't call that a function we can fit; it is just an addition, where y equals the sum of the input parameters, and there is nothing in it we could adjust. So that is not our equation. Instead, every input gets its own coefficient: m1, m2, m3, m4, m5.

Now notice one thing: changing a coefficient changes how much its input contributes. Take x = 10. Multiply it by 1 and you get 10; multiply the same 10 by 2 and you get 20; multiply it by 0.5 and you get 5. So what does a coefficient do? Whenever you attach a coefficient to an input value, changing the coefficient changes the effective value of that input. To balance your output you can't change the input parameters themselves: the diet someone follows, their age, their gender are fixed. But the contribution of each input can be changed through the value of its coefficient. Is that right?
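The x = 10 example above can be sketched in a few lines of Python. The five input values and five coefficients in the second half are made up for illustration; they are not from the lecture.

```python
# The x = 10 example: the same input contributes differently
# depending on the coefficient it is multiplied by.
x = 10
for m in (1, 2, 0.5):
    print(f"m = {m}: m * x = {m * x}")

# Hypothetical values for x1..x5 and m1..m5 (assumptions for illustration).
inputs = [10, 3, 7, 2, 5]
coefficients = [1, 2, 0.5, 0, -1]

# A plain sum has nothing to adjust; a weighted sum changes with the coefficients.
plain_sum = sum(inputs)
weighted_sum = sum(m * xi for m, xi in zip(coefficients, inputs))

print("plain sum:", plain_sum)
print("weighted sum:", weighted_sum)
```

The inputs stay fixed in both cases; only the coefficients decide how much each input counts.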
For example, by changing a coefficient, the contribution of its input changes automatically. So what is the equation here? m1 x1 + m2 x2 + m3 x3 + m4 x4 + m5 x5, plus something more. Is it possible that the weighted sum alone matches the output every time? No, there is going to be some amount of error, and the constant term we add to account for it is called the intercept, c. So y predicted = m1 x1 + m2 x2 + m3 x3 + m4 x4 + m5 x5 + c. This is the standard equation of a regression algorithm; it is the linear regression equation.

The information flows from each input node and is summed at the output node, and there is going to be a certain amount of error. Take the same idea in a classroom: I have some knowledge and I am teaching it to you. Even though you receive the information, you don't receive the entire content; there is going to be a bit of error. Is that right? So how do we balance the equation in the presence of that error? Through the intercept. These are your coefficients and this is your intercept.

Now you have your y predicted from this equation. Based on the hours someone spends at the gym, their gender, their age, the food they eat and the calories they consume, we get one weight loss value: the predicted one. But there is also a weight loss value we have already collected from real people. Take Aditya, for example. With his age group, his gender, the time he spends at the gym, the food he eats and the calories he consumes, I trained him for three months, or observed him for a couple of months, and I noticed a certain amount of weight loss. That is called y actual.

Now, what is the loss function? Loss = sum of (y actual - y predicted)^2, divided by n, where n is the number of records. Is that right? Are you able to follow? Please confirm once. While I'm teaching machine learning I'll go a bit slow, because I don't know how many people are going to grasp it.

We want the left-hand side to equal the right-hand side, that is, we want the loss to be as small as possible, ideally zero. You can't change your input parameters and you can't change your actual output. Let me mark it in yellow: this is your output parameter, and these, x1, x2, x3, x4, x5, are your input parameters. You can't change the outputs or the inputs. So what is the only way to reduce the loss?
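The prediction equation and the loss function described above can be sketched in plain Python. The function names, the three records, and the coefficient values are all assumptions made up for illustration; only the formulas come from the lecture.

```python
def predict(x, coefficients, intercept):
    """y predicted = m1*x1 + m2*x2 + ... + c for one record."""
    return sum(m * xi for m, xi in zip(coefficients, x)) + intercept


def mean_squared_error(y_actual, y_predicted):
    """Sum of (y actual - y predicted)^2 divided by n."""
    n = len(y_actual)
    return sum((a - p) ** 2 for a, p in zip(y_actual, y_predicted)) / n


# Hypothetical records: [gym hours per week, gender (0/1), age,
# diet score, calories in thousands]. Values are invented.
X = [
    [5.0, 1.0, 30.0, 7.0, 2.0],
    [2.0, 0.0, 45.0, 4.0, 2.5],
    [8.0, 1.0, 25.0, 9.0, 1.8],
]
y_actual = [4.0, 1.5, 6.0]  # observed weight loss, invented

# Hypothetical m1..m5 and c, not fitted to anything.
coefficients = [0.5, 0.1, -0.02, 0.2, -0.5]
intercept = 1.0

y_predicted = [predict(row, coefficients, intercept) for row in X]
loss = mean_squared_error(y_actual, y_predicted)
print("predictions:", y_predicted)
print("loss:", loss)
```

Note that `X` and `y_actual` never change in this sketch; the loss depends on them only through the chosen coefficients and intercept.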
So how can you reduce your loss function? You want the equation to balance, left-hand side equal to right-hand side, and for that the loss needs to be zero. You can't change your actual output and you can't change your actual input parameters. So how can we bring the loss to zero without changing the outputs or the inputs? You can unmute and speak as well, no problem.

The only way to change the loss function, or to bring it to zero, is by changing the coefficient values (the slope values) and the intercept value. Changing the coefficients and the intercept changes y predicted, and when y predicted changes, the loss changes. That is the basic idea behind any algorithm in machine learning.

You can't change the inputs, the way a customer gives them, and you can't change the actual output. For example, say I am a patient. My data says I am in a particular age group, with a particular height, a particular weight, a particular maximum heart rate, a particular normal heart rate or blood pressure. I give all these parameters, and the output we want is the sugar level. The actual sugar level is measured with some instrument, a report or a blood test, whatever it may be. That actual output, and the input parameters the customer or patient has, we can't change.
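The idea that the data stays fixed and only the coefficients and intercept move can be sketched as follows. The lecture does not say how the right values are searched for, so this sketch simply compares a few hand-picked candidates; the data and candidates are invented, and the outputs were built from 2*x1 + 3*x2 + 1 so that one candidate reaches a loss of exactly zero.

```python
def predict(x, coefficients, intercept):
    """Weighted sum of the inputs plus the intercept."""
    return sum(m * xi for m, xi in zip(coefficients, x)) + intercept


def mse(X, y_actual, coefficients, intercept):
    """Mean of the squared differences between actual and predicted."""
    errors = [(y - predict(x, coefficients, intercept)) ** 2
              for x, y in zip(X, y_actual)]
    return sum(errors) / len(errors)


# Fixed inputs and fixed actual outputs (invented data with two inputs).
X = [[1, 2], [3, 1], [2, 2], [0, 4]]
y_actual = [9, 10, 11, 13]

# Hand-picked (coefficients, intercept) candidates, for illustration only.
candidates = [
    ([1, 1], 0),
    ([2, 2], 1),
    ([2, 3], 0),
    ([2, 3], 1),
]

for coefficients, intercept in candidates:
    print(coefficients, intercept, "loss:", mse(X, y_actual, coefficients, intercept))

# The candidate with the lowest loss gives the "right" coefficients and intercept.
best = min(candidates, key=lambda c: mse(X, y_actual, c[0], c[1]))
print("best:", best)
```

Real training procedures search for these values far more efficiently than trying candidates by hand, but what they change is the same: the coefficients and the intercept, never the data.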
The only way to make your loss zero is by changing the coefficients and the intercept value. At the point where the loss is zero (in practice, where it is at its lowest), we say: these are the right coefficients and the right intercept value for this equation. That is the regression model.

Then we come knocking on the door of classification. What does a classification model look like? In classification I can take two outcomes: I want to make a statement about whether there is cancer or no cancer. To make that statement I again rely on input parameters; we can take the same five. All of these input parameters pass to the cancer node, and all of them pass to the no cancer node.

Now, do these cancer and no cancer values need to be continuous values or probability values? In regression, y predicted is a continuous value. In classification, they need to be probability values. So I'll write it as P(C), the probability of cancer; that would be better. The inputs are again x1, x2, x3, x4, x5, and the equation is m1 x1 + m2 x2 + ... + m5 x5 + c1. Now look at it: these are continuous parameters. When they are mapped to P(C), is the value going to range between 0 and 1, as a probability should, or between minus infinity and plus infinity?
See, Aditya: this equation and the y predicted equation are the same. So how can one of them give a probability value while the other gives a value that ranges from minus infinity to plus infinity? That is the question. How can these values give you a probability? They can't, right? The value stays the same: for the probability of cancer we have the same equation, m1 x1 + m2 x2 + ... + m5 x5 + c1, and its value is again continuous. To translate this equation into a probability, you use a function, which I'm denoting f. As an example I'm using the sigmoid function: sigmoid(x) = 1 / (1 + e^(-x)).
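The sigmoid function just defined can be sketched in Python to see how it squeezes any continuous value into the range 0 to 1. The sample values are arbitrary, chosen only for illustration.

```python
import math


def sigmoid(x):
    """1 / (1 + e^(-x)): maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Arbitrary continuous values, from strongly negative to strongly positive.
for value in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(f"sigmoid({value}) = {sigmoid(value):.5f}")
```

Large negative inputs land close to 0, large positive inputs land close to 1, and an input of 0 lands exactly at 0.5.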
Whatever value you have from m1 x1 + m2 x2 + ... + m5 x5 + c1, you put it in place of this x, so the exponent becomes -(m1 x1 + m2 x2 + ... + m5 x5 + c1). That translates the value into the range 0 to 1. So to turn the input parameters into a value between 0 and 1 you use a function; that is what we do in classification.

The probability of no cancer is again a function of a weighted sum. We already used m1 to m5 for cancer, so now we have m6 x1 + m7 x2 + ... + m10 x5 + c2, and from this equation we get P(no cancer). Then, of the two probabilities, which one is higher? If the probability of cancer is the higher one, say 0.52, we classify it as cancer. If the probability of no cancer is the higher one, say 0.6, we classify it as no cancer. And if both probabilities are 0.5, how do you make a statement? Then we look at the counts: which class appears more often in the data, cancer or no cancer? We have value_counts in pandas for this. If the no cancer patients are more in count, we classify it as no cancer.

So this is how a classification algorithm works, and that was how a regression algorithm works. Did you get the difference at a high level? I am not speaking of a specific algorithm, but of the overall picture of how classification works and how regression works.
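The classification procedure described above (two weighted sums, each passed through a sigmoid, then a comparison with a majority-count tie-break) can be sketched as follows. All names, coefficient values and labels are hypothetical; `collections.Counter` stands in here for the pandas value_counts mentioned in the lecture so that the sketch needs only the standard library.

```python
import math
from collections import Counter


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear_score(x, coefficients, intercept):
    """m1*x1 + ... + m5*x5 + c for one record."""
    return sum(m * xi for m, xi in zip(coefficients, x)) + intercept


def classify(x, cancer_params, no_cancer_params, training_labels):
    """Pick the class with the higher probability; on a tie, the more frequent class."""
    p_cancer = sigmoid(linear_score(x, *cancer_params))
    p_no_cancer = sigmoid(linear_score(x, *no_cancer_params))
    if p_cancer > p_no_cancer:
        return "cancer"
    if p_no_cancer > p_cancer:
        return "no cancer"
    # Tie: fall back to whichever label has more records.
    return Counter(training_labels).most_common(1)[0][0]


# Hypothetical parameters: (m1..m5, c1) for cancer and (m6..m10, c2) for no cancer.
cancer_params = ([0.4, -0.2, 0.1, 0.3, -0.1], 0.0)
no_cancer_params = ([-0.4, 0.2, -0.1, -0.3, 0.1], 0.0)

# Hypothetical label column from the training data.
training_labels = ["no cancer", "no cancer", "cancer"]

print(classify([1, 0, 2, 1, 0], cancer_params, no_cancer_params, training_labels))
print(classify([0, 0, 0, 0, 0], cancer_params, no_cancer_params, training_labels))
```

The second call produces a score of zero for both classes, so both probabilities are 0.5 and the tie-break on label counts decides.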
Now again, if you look at all this, the loss function is just a metric to measure the gap between your predicted and actual values. But what are the actual drivers of the algorithm? What makes the prediction accurate? We already answered it: the drivers are your coefficients and your intercept values. If the coefficients and the intercept are not set the right way, your prediction, or your probability, is going to be affected.

From a complete top view, what differs between regression algorithms and classification algorithms? Let us tick things off. Both contain inputs. Both have coefficients. What is different? The coefficient values differ, and the loss function differs: there is one loss function in regression and another in classification. And to get a probability, classification adds one more mathematical equation on top.

How did people construct these algorithms? If you go back to the history of algorithms, it is through math itself: they use different equations that already exist in math. Who is a researcher in data science or machine learning? I'm not saying data scientist; I mean a researcher. It is a person who looks at real-world scenarios, tries to formulate them as an equation, and then, to measure the quality of that equation, searches for the right, relevant loss function.
That's the reason the standard loss function is not always entropy. Sometimes it can be hinge loss, sometimes you can use KL divergence, sometimes mean squared error, mean absolute error, Huber or pseudo-Huber; it differs. Based on the requirement, researchers pick the right, relevant math and use it so that they can capture reality through the math. For example, say I want to create something that produces a human voice, or recognizes my voice, or understands my voice. I can't change the input, my voice; I can only look for the proper coefficients. So there is an input, the voice, there are coefficients, there are intercepts, and the loss function may not be the same one we used for regression, or entropy. There is going to be another loss function, one that can measure the frequencies in our voices and be more accurate or more precise. So wherever you go, whether you're knocking on the door of machine learning or of deep learning, I think this is the basic idea, the top view, of regression algorithms and classification algorithms.
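Several of the loss functions named above can be sketched in plain Python to show that they are just different ways of scoring the gap between actual and predicted values. These are minimal textbook forms under stated assumptions: hinge loss expects labels of -1 or +1 and raw scores, binary cross-entropy expects labels of 0 or 1 and probabilities strictly between 0 and 1, and the `delta` default for Huber is an arbitrary choice.

```python
import math


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def huber_loss(y_true, y_pred, delta=1.0):
    """Quadratic for small errors, linear for errors larger than delta."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        r = abs(t - p)
        total += 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)
    return total / len(y_true)


def hinge_loss(y_true, scores):
    """Labels must be -1 or +1; scores are raw, unbounded model outputs."""
    return sum(max(0.0, 1.0 - t * s) for t, s in zip(y_true, scores)) / len(y_true)


def binary_cross_entropy(y_true, probs):
    """Labels must be 0 or 1; probs must lie strictly between 0 and 1."""
    total = sum(t * math.log(p) + (1 - t) * math.log(1 - p)
                for t, p in zip(y_true, probs))
    return -total / len(y_true)


print(mean_squared_error([1, 2], [1, 4]))
print(mean_absolute_error([1, 2], [1, 4]))
print(huber_loss([0.0], [3.0]))
print(hinge_loss([1, -1], [2.0, -0.5]))
print(binary_cross_entropy([1, 0], [0.9, 0.2]))
```

Which one fits depends on the problem: the first three score continuous predictions, while hinge loss and cross-entropy score classification outputs.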
So what is changing? The architecture is changing. For example, here we collected five inputs. If you collect six inputs instead, what happens? Six coefficients and one intercept are generated. And again, changing these coefficients and the intercept changes your accuracy. Clear, my friends? Please confirm once that you are able to understand this high-level picture. If you have any doubts, you can ask me here.