How to Implement Linear Regression With Multiple Variables?

This article describes how to implement linear regression with multiple variables.

Problem Statement

Use the sklearn library to create a linear regression model with multiple variables. Load a well-known dataset, split it into training and testing sets, and then train the model to predict a target variable based on one or more features. For instance, you can use the following dataset.

https://raw.githubusercontent.com/sachinmotwani20/NPTEL-ML_Datasets/main/ScoresPrediction.csv

Solution

To create a linear regression model with multiple variables using the scikit-learn (sklearn) library, follow these steps.

Import the necessary libraries.
Load the dataset from the provided URL.
Split the dataset into training and testing sets.
Create and train the linear regression model.
Evaluate the model’s performance.
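
For reference, the model trained in the fourth step can be written as below. The notation is introduced here for illustration and does not come from the dataset: x_1 to x_p are the feature values, y is the target, and the beta terms are the intercept and coefficients. LinearRegression in scikit-learn fits them by ordinary least squares, that is, by minimizing the sum of squared differences between the actual and predicted targets over the n training rows.

```latex
\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p,
\qquad
(\hat{\beta}_0, \dots, \hat{\beta}_p)
  = \arg\min_{\beta_0, \dots, \beta_p}
    \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Bigr)^2
```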

The following Python code example carries out these steps.

# Import necessary libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
import matplotlib.pyplot as plt

# Load the dataset from the provided URL (pandas can read a CSV directly from a URL)
url = "https://raw.githubusercontent.com/sachinmotwani20/NPTEL-ML_Datasets/main/ScoresPrediction.csv"
data = pd.read_csv(url)

# Check the first few rows of the dataset
print(data.head())

# Separate the features (X) and target variable (y)
X = data.drop('FinalYrScore', axis=1)  # Assuming 'FinalYrScore' is the target variable
y = data['FinalYrScore']

# Split the dataset into training and testing sets (e.g., 80% training, 20% testing)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and train the linear regression model
model = LinearRegression()
model.fit(X_train, y_train)

# Make predictions on the test set
y_pred = model.predict(X_test)

# Evaluate the model's performance
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Mean Squared Error (MSE): {mse}")
print(f"R-squared (R2) Score: {r2}")

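# Plot actual and predicted scores against one feature.
# With several features the predictions do not lie on a single line along
# this axis, so they are drawn as points rather than as a connected line.
plt.figure(figsize=(10, 6))
plt.scatter(X_test['FirstYrScore'], y_test, color='blue', label='Actual Scores')
plt.scatter(X_test['FirstYrScore'], y_pred, color='red', marker='x', label='Predicted Scores')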
plt.xlabel('First Year Score')
plt.ylabel('Final Year Score')
plt.title('Linear Regression for Score Prediction')
plt.legend()
plt.grid(True)
plt.show()

Output

Figure: Linear Regression With Multiple Variables

This code loads the dataset, splits it into training and testing sets, trains a linear regression model, evaluates its performance with the mean squared error and R-squared score, and then plots the model's predictions alongside the actual data points.
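
Once a model with several features is trained, it can help to look at the coefficient learned for each feature. The following is a minimal sketch of that idea, not part of the original example: it uses small synthetic, noise-free data so that it runs without a network connection, and the column names feature_a and feature_b as well as the values 5.0, 0.3 and 0.6 are assumptions made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic, noise-free data (hypothetical columns, not the article's dataset)
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "feature_a": rng.uniform(0, 100, size=50),
    "feature_b": rng.uniform(0, 100, size=50),
})
# Target built from a known linear rule so the result can be checked
y = 5.0 + 0.3 * X["feature_a"] + 0.6 * X["feature_b"]

model = LinearRegression()
model.fit(X, y)

# Pair each learned coefficient with the name of its feature
coefficients = pd.Series(model.coef_, index=X.columns)
intercept = model.intercept_

print(coefficients)
print(f"Intercept: {intercept:.3f}")
```

Because the synthetic target has no noise, the fitted coefficients should match the rule used to generate it up to floating-point error. On a real dataset such as the one above, the same model.coef_ and model.intercept_ attributes are available after model.fit(X_train, y_train), but the values will only approximate any underlying relationship.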
