Linear Regression in Python
Learn how to build and evaluate a linear regression model using Python’s scikit-learn package.
What is regression in machine learning?
Regression is a branch of supervised machine learning. Its aim is to build a model that predicts a continuous target variable, also called the dependent variable (Y). Examples: salary, price, temperature.
What is linear regression?
Linear regression is a supervised learning algorithm commonly used for predictive analysis. It is widely used in the biological, behavioral and social sciences to describe possible relationships between a dependent (target) variable (Y) and one or more independent (feature) variables (X) using a best-fit straight line, also known as the regression line.
Types of linear regression
There are two types of linear regression:
1. Simple linear regression: the simplest case, with one dependent variable (Y) and one independent variable (X).
Equation: Y = b0 + b1*X
2. Multiple linear regression: a generalization of simple linear regression to more than one independent variable (X1, X2, X3, ...).
Equation: Y = b0 + b1*X1 + b2*X2 + b3*X3 + ...
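To make the two equations concrete, here is a small sketch that evaluates them with NumPy. The coefficient and input values are made up for illustration; they do not come from the dataset used later in this post.

```python
import numpy as np

# Hypothetical coefficients, chosen only for illustration
b0, b1 = 2.0, 3.0

# Simple linear regression: Y = b0 + b1 * X
X = np.array([1.0, 2.0, 3.0])
Y_simple = b0 + b1 * X           # values 5, 8, 11

# Multiple linear regression: Y = b0 + b1*X1 + b2*X2 + b3*X3
b = np.array([3.0, 0.5, -1.0])   # hypothetical b1, b2, b3
X_multi = np.array([
    [1.0, 2.0, 3.0],             # one row per observation: X1, X2, X3
    [4.0, 5.0, 6.0],
])
Y_multi = b0 + X_multi @ b       # values 3, 10.5

print(Y_simple)
print(Y_multi)
```

The matrix product `X_multi @ b` computes b1*X1 + b2*X2 + b3*X3 for every row at once, which is why the multiple regression case needs no explicit loop.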
Without data you’re just another person with an opinion.
Now we will learn how to build a simple linear regression model using data. You can download the data from here.
1. Import all the necessary libraries.
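The exact import list from the original notebook is not shown here, so the following is an assumed, typical set for this kind of workflow rather than the author's own list.

```python
# Assumed imports for a simple linear regression workflow
import numpy as np                                     # numerical arrays
import pandas as pd                                    # reading and cleaning tabular data
from sklearn.linear_model import LinearRegression      # the regression model
from sklearn.model_selection import train_test_split   # train/test splitting
```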
2. Read the data using the pandas read_csv function (use pandas.read_json to read JSON data).
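A minimal sketch of this step is shown below. Because the file name of the downloaded dataset is not given here, the example reads a tiny made-up CSV from memory; the column names `x` and `y` are assumptions.

```python
import io

import pandas as pd

# Stand-in for the downloaded file. With a real file you would pass its
# path to pd.read_csv instead of an in-memory buffer.
csv_text = "x,y\n1,2.1\n2,3.9\n3,6.2\n"
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())   # first rows of the data
print(df.shape)    # (3, 2)
```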
3. Check the data for null values.
There is a null value in the y column (the target variable), so we replace null values with the median of the y column. The fillna function is used here to replace the null value with the median.
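The null check and the median fill could look like the sketch below. The data frame is a small made-up example with one missing value in `y`; the column name `x` is an assumption.

```python
import numpy as np
import pandas as pd

# Tiny made-up frame with one missing value in the y column
df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [2.0, np.nan, 6.0, 8.0]})

# Count the null values in each column
null_counts = df.isnull().sum()
print(null_counts)

# Replace nulls in y with the median of the non-null y values (here 6.0)
df["y"] = df["y"].fillna(df["y"].median())
print(df)
```

The median is less sensitive to outliers than the mean, which makes it a common choice for filling a small number of missing values.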
4. Split the data into training and test sets in a 70:30 ratio, using train_test_split from the sklearn library.
Here X is the independent (feature) variable and Y is the dependent (target) variable.
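A sketch of the 70:30 split is shown below on made-up data. The column names and the `random_state` value are assumptions; `random_state` is only there to make the split reproducible.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Made-up data: 10 rows following y = 2x + 1
df = pd.DataFrame({"x": np.arange(10, dtype=float)})
df["y"] = 2.0 * df["x"] + 1.0

X = df[["x"]]   # features must be 2-D, hence the double brackets
Y = df["y"]     # target is a 1-D series

# test_size=0.3 keeps 70% of the rows for training and 30% for testing
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.3, random_state=42
)

print(len(X_train), len(X_test))  # 7 3
```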
5. Predict the values of the test data.
We fit the regression model on the training data and then use it to predict the test data.
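Fitting and predicting might look like the sketch below. The data is made up and noise-free (it follows y = 2x + 1 exactly), so the predictions match the true test values; real data will not fit this cleanly.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Made-up, noise-free data following y = 2x + 1
X = np.arange(20, dtype=float).reshape(-1, 1)   # shape (20, 1)
Y = 2.0 * X.ravel() + 1.0

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.3, random_state=0
)

# Fit on the training data only, then predict the held-out test data
model = LinearRegression()
model.fit(X_train, Y_train)
Y_pred = model.predict(X_test)

print(Y_pred)
print(Y_test)
```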
The fitted model is a simple linear equation, Y = b0 + b1*X, where b0 is the intercept and b1 is the slope (coefficient).
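In scikit-learn, the fitted intercept and slope are available as the `intercept_` and `coef_` attributes of the model. The sketch below reads them back from a model fitted on made-up data generated with b0 = 1 and b1 = 2.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data generated from Y = 1 + 2 * X
X = np.array([[0.0], [1.0], [2.0], [3.0]])
Y = 1.0 + 2.0 * X.ravel()

model = LinearRegression().fit(X, Y)

b0 = model.intercept_   # intercept of the fitted line
b1 = model.coef_[0]     # slope for the single feature

print(f"Y = {b0:.2f} + {b1:.2f} * X")
```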
The accuracy of the model is 99%.
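How the 99% figure was computed is not shown here. One common way to score a regression model, assumed in the sketch below, is the R² value returned by the model's `score` method, which equals `r2_score` applied to the predictions. The data is synthetic, so the printed score will not reproduce the figure above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic data: a straight line plus a little Gaussian noise
rng = np.random.default_rng(0)
X = np.linspace(0, 10, 100).reshape(-1, 1)
Y = 3.0 * X.ravel() + 2.0 + rng.normal(scale=0.5, size=100)

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.3, random_state=0
)
model = LinearRegression().fit(X_train, Y_train)

# score() returns R^2 on the given data; r2_score gives the same value
r2 = model.score(X_test, Y_test)
r2_manual = r2_score(Y_test, model.predict(X_test))

print(f"R^2 on test data: {r2:.3f}")
```

An R² close to 1 means the line explains almost all of the variation in Y. Unlike classification accuracy, it is not a percentage of correct predictions.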
You can find the code here.
Thank you for reading.
This is my first blog post. I would love to have your feedback.