Linear Regression in Python

Learn how to build and evaluate a linear regression model using Python’s scikit-learn package.

Khushipatidar
3 min read · Oct 22, 2020

What is regression in machine learning?

Regression is a branch of supervised machine learning. A regression model aims to predict a continuous target variable, also called the dependent variable (Y). Examples: salary, price, temperature.

What is linear regression?

Linear regression is a supervised learning algorithm commonly used for predictive analysis. It is widely used in the biological, behavioral and social sciences to describe possible relationships between a dependent (target) variable (Y) and one or more independent (feature) variables (X) using a best-fit straight line (also known as the regression line).

Types of linear regression

There are two types of linear regression:

1. Simple linear regression: the simplest case, where we have one dependent variable (Y) and only one independent variable (X).

Equation: Y = b0 + b1*X

2. Multiple linear regression: a generalization of simple linear regression to the case of more than one independent variable (X1, X2, X3, ...).

Equation: Y = b0 + b1*X1 + b2*X2 + b3*X3 + ...
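To make the two equations concrete, here is a minimal sketch that evaluates them by hand with NumPy. The coefficient values and the helper names `predict_simple` and `predict_multiple` are made up for illustration only; they do not come from the dataset used later in this post.

```python
import numpy as np

# Hypothetical coefficients, chosen only to illustrate the equations.
b0 = 2.0                          # intercept
b1 = 0.5                          # slope for the single-feature case
b = np.array([0.5, -1.0, 2.0])    # b1, b2, b3 for the three-feature case


def predict_simple(x):
    """Simple linear regression: Y = b0 + b1*X."""
    return b0 + b1 * x


def predict_multiple(x_row):
    """Multiple linear regression: Y = b0 + b1*X1 + b2*X2 + b3*X3."""
    return b0 + np.dot(b, x_row)


y_simple = predict_simple(10.0)                            # 2.0 + 0.5*10
y_multiple = predict_multiple(np.array([1.0, 2.0, 3.0]))   # 2.0 + 0.5 - 2.0 + 6.0

print(y_simple, y_multiple)
```

In practice the coefficients are not chosen by hand; they are estimated from data when the model is fitted.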

Without data you’re just another person with an opinion.

Now we will learn how to build a simple linear regression model using data. You can download the data from here.

Workflow of a linear regression model

1. Import all the necessary libraries.

Importing Python’s libraries
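The original code for this step is not reproduced in the text, so the following is only a plausible set of imports for the workflow described in this post (pandas for loading data, scikit-learn for splitting and modelling). Treat the exact selection as an assumption.

```python
# Data handling
import numpy as np
import pandas as pd

# Modelling and evaluation helpers from scikit-learn
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
```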

2. Read the data using the pandas read_csv function (use pandas.read_json to read JSON data).

Reading data using pandas
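A minimal sketch of this step follows. Because the actual file is not included here, the example reads a small in-memory CSV instead; the column names `x` and `y` and all the values are assumptions made for illustration, and the file path in the comment is hypothetical.

```python
import io

import pandas as pd

# Stand-in for the downloaded CSV file; these rows are invented for illustration.
csv_text = "x,y\n1,2.1\n2,3.9\n3,6.2\n4,\n5,10.1\n"

# With a real file this would be: df = pd.read_csv("path/to/data.csv")  (hypothetical path)
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())   # first rows of the data
print(df.shape)    # (number of rows, number of columns)
```

The empty field in the fourth row is read as a missing value (NaN), which is the situation handled in the next step.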

3. Check for null values in the data.

Null values

There is a null value in the y column (the target variable), so we replace null values with the median of the y column. The fillna function is used here to replace the null value with the median.
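The null check and the median fill can be sketched as below. The small DataFrame is invented for illustration (one missing value in `y`, as described above); only `isnull`, `median` and `fillna` are the pandas calls this step relies on.

```python
import numpy as np
import pandas as pd

# Toy data with one missing target value (values are made up).
df = pd.DataFrame({"x": [1, 2, 3, 4, 5],
                   "y": [2.0, 4.0, 6.0, np.nan, 10.0]})

# Count the null values in each column.
null_counts = df.isnull().sum()
print(null_counts)

# median() skips NaN by default, so it is computed from the known values only.
y_median = df["y"].median()

# Replace the missing value with the median of the column.
df["y"] = df["y"].fillna(y_median)
print(df)
```

The median is often preferred over the mean for this kind of fill because it is less affected by extreme values.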

4. Split the data into training and test sets in a 70:30 ratio, using train_test_split from the sklearn library.

Here X is the independent variable/feature variable and Y is the dependent variable/target variable.

Model Selection
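Here is a minimal sketch of the 70:30 split followed by fitting the model. The data is synthetic (a noisy straight line generated with NumPy), and the `random_state` value is an arbitrary choice to make the split repeatable; neither comes from the original post.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data for illustration: y is roughly 3 + 2*x plus a little noise.
rng = np.random.default_rng(0)
X = np.arange(100, dtype=float).reshape(-1, 1)   # features must be 2-D: (n_samples, n_features)
y = 3.0 + 2.0 * X.ravel() + rng.normal(0.0, 1.0, size=100)

# test_size=0.3 keeps 70% of the rows for training and 30% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Fit the regression line on the training data only.
model = LinearRegression()
model.fit(X_train, y_train)

print(X_train.shape, X_test.shape)
```

Note that scikit-learn expects X to be two-dimensional, which is why a single feature is reshaped into a column.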

5. Predict the values for the test data.

Accuracy

Predicting the test data using the regression model.

The fitted equation is a simple linear equation, Y = b0 + b1*X, where b0 is the intercept and b1 is the slope (coefficient).

The accuracy of the model is 99%.
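The sketch below shows one way to predict on the test set, read off b0 and b1, and compute a score. It uses synthetic data, so its numbers are not the 99% reported above. It also assumes the score in question is R² (the coefficient of determination), which is what scikit-learn's `LinearRegression.score` and `r2_score` return; the text does not say how the original figure was computed.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic data for illustration: y is roughly 3 + 2*x plus a little noise.
rng = np.random.default_rng(0)
X = np.arange(100, dtype=float).reshape(-1, 1)
y = 3.0 + 2.0 * X.ravel() + rng.normal(0.0, 1.0, size=100)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
model = LinearRegression().fit(X_train, y_train)

# Predict the target for the unseen test rows.
y_pred = model.predict(X_test)

# Parameters of the fitted line Y = b0 + b1*X.
b0 = model.intercept_
b1 = model.coef_[0]

# R^2 on the test data; model.score(X_test, y_test) gives the same value.
r2 = r2_score(y_test, y_pred)

print(f"Y = {b0:.2f} + {b1:.2f} * X")
print(f"R^2 on test data: {r2:.3f}")
```

An R² close to 1 means the line explains most of the variation in Y; for regression this is usually a more precise term than "accuracy".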

You can find the code here.

Thank you for reading.

This is my first blog post. I would love to have your feedback.

Khushipatidar

I am an undergraduate student exploring data science through different projects and platforms.