Showing posts with label multiple linear regression. Show all posts
Showing posts with label multiple linear regression. Show all posts

Sunday, September 17, 2017

Multiple linear regression in Python

Sometimes we need to do a linear regression, and we know most used spreadsheet software does not do it well nor easily.

In the other hand, a multiple regression in Python, using the scikit-learn library - sklearn - it is rather simple.


import matplotlib.pyplot as plt
import pandas as pd
from sklearn.linear_model import LinearRegression

# Importing the dataset
dataset = pd.read_csv('data.csv')
# separate last column of dataset as dependent variable - y
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values

# build the regressor and print summary results
regressor = LinearRegression()
regressor.fit(X,y)
print('Coefficients:\t','\t'.join([str(c) for c in regressor.coef_]))
print('R2 =\t',regressor.score(X,y, sample_weight=None))

#plot the results if you like
y_pred = regressor.predict(X)
plt.scatter(y_pred,y)
plt.plot([min(y_pred),max(y_pred)],[min(y_pred),max(y_pred)])
plt.legend()
plt.show()