How to plot multiple linear regression in Python

Learn how to plot multiple linear regression in Python. Discover different methods, tips, real-world applications, and how to debug common errors.

Published on: Tue, Mar 10, 2026
Updated on: Fri, Mar 13, 2026
The Replit Team

Multiple linear regression helps you understand relationships between several variables. Python offers powerful libraries to visualize these complex models, which makes insights clear and accessible for any data analysis project.

Here, you'll learn essential techniques to plot your regression models. You'll get practical tips, see real-world applications, and receive advice to debug common issues, so you can master your data visualizations.

Basic plot with sklearn and matplotlib

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=100, n_features=2, noise=20, random_state=42)
model = LinearRegression().fit(X, y)
y_pred = model.predict(X)

plt.scatter(y, y_pred)
plt.plot([min(y), max(y)], [min(y), max(y)], 'k--')
plt.xlabel('Actual values'); plt.ylabel('Predicted values')
plt.show()

--OUTPUT--
[A scatter plot showing actual vs predicted values with a diagonal dashed line representing perfect predictions]

This code generates a classic "actual vs. predicted" plot, a fundamental tool for evaluating any regression model. The core of the visualization is plt.scatter(y, y_pred), which maps the true values against your model's predictions. A strong model will produce points that cluster tightly in a linear fashion.

To make the evaluation easier, plt.plot() adds a dashed diagonal line. This line isn't part of your data—it represents a perfect model where every prediction exactly matches the actual value. The closer your scatter points are to this reference line, the better your model's performance.
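Another quick diagnostic you can build from the same fit is a residual plot. This is a supplementary sketch (not part of the original walkthrough) that reuses the same synthetic data; if the linear model is adequate, the residuals should scatter randomly around zero with no visible pattern:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=100, n_features=2, noise=20, random_state=42)
model = LinearRegression().fit(X, y)
y_pred = model.predict(X)

# Residuals are the leftover errors: actual minus predicted
residuals = y - y_pred
plt.scatter(y_pred, residuals, alpha=0.6)
plt.axhline(0, color='k', linestyle='--')
plt.xlabel('Predicted values'); plt.ylabel('Residuals')
plt.show()
```

A funnel shape or a curve in this plot suggests the linear model is missing something, even when the actual vs. predicted chart looks reasonable.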

Common visualization techniques

While that first plot is essential, you can uncover more nuanced insights by using specialized libraries and adding dimensions to your visualizations.

Using seaborn for regression plots

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=100, n_features=2, noise=20, random_state=42)
df = pd.DataFrame(X, columns=['Feature 1', 'Feature 2'])
df['Target'] = y

sns.pairplot(df, x_vars=['Feature 1', 'Feature 2'], y_vars='Target', kind='reg', height=5)
plt.show()

--OUTPUT--
[A pair of plots showing the relationship between each feature and the target variable with regression lines]

Seaborn's pairplot function is a powerful tool for visualizing multiple relationships at once. It works best with pandas DataFrames, which is why the code first converts the data. The key is the kind='reg' argument—it tells seaborn to automatically overlay a linear regression model on each scatter plot.

  • Each plot shows the relationship between an individual feature and the target variable.
  • The regression line visualizes the trend, while the shaded area represents the model's confidence interval.
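Seaborn also pairs well with residual diagnostics. As a minimal sketch on the same DataFrame, its residplot function fits a simple regression internally and plots only what is left over:

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=100, n_features=2, noise=20, random_state=42)
df = pd.DataFrame(X, columns=['Feature 1', 'Feature 2'])
df['Target'] = y

# residplot regresses Target on Feature 1 and scatters the residuals
ax = sns.residplot(data=df, x='Feature 1', y='Target')
ax.axhline(0, color='k', linestyle=':')
plt.show()
```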

Creating 3D plots for multiple linear regression

from mpl_toolkits.mplot3d import Axes3D

X, y = make_regression(n_samples=100, n_features=2, noise=20, random_state=42)
model = LinearRegression().fit(X, y)

x_surf = np.linspace(X[:, 0].min(), X[:, 0].max(), 50)
y_surf = np.linspace(X[:, 1].min(), X[:, 1].max(), 50)
x_surf, y_surf = np.meshgrid(x_surf, y_surf)
z_surf = model.intercept_ + model.coef_[0] * x_surf + model.coef_[1] * y_surf

fig = plt.figure(figsize=(10, 8))
ax = fig.add_subplot(111, projection='3d')
ax.scatter(X[:, 0], X[:, 1], y, color='b')
ax.plot_surface(x_surf, y_surf, z_surf, color='r', alpha=0.3)
plt.show()

--OUTPUT--
[A 3D plot showing data points as blue dots and a red semi-transparent regression plane]

When your model uses two features, you can visualize the regression as a plane in 3D space. This code sets up a 3D canvas using projection='3d' to map your two features and the target variable.

  • The blue dots, plotted with ax.scatter, represent your actual data points scattered in three dimensions.
  • The red surface is the regression plane itself. It's generated using ax.plot_surface and shows your model's predicted values for every combination of the two features.
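If the default camera angle hides the structure, you can rotate the view with Axes3D's view_init method. The elevation and azimuth values below are just illustrative starting points; this sketch rebuilds the same plot with the camera repositioned:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=100, n_features=2, noise=20, random_state=42)
model = LinearRegression().fit(X, y)

x_surf, y_surf = np.meshgrid(np.linspace(X[:, 0].min(), X[:, 0].max(), 50),
                             np.linspace(X[:, 1].min(), X[:, 1].max(), 50))
z_surf = model.intercept_ + model.coef_[0] * x_surf + model.coef_[1] * y_surf

fig = plt.figure(figsize=(10, 8))
ax = fig.add_subplot(111, projection='3d')
ax.scatter(X[:, 0], X[:, 1], y, color='b')
ax.plot_surface(x_surf, y_surf, z_surf, color='r', alpha=0.3)
ax.view_init(elev=20, azim=45)  # tilt (elevation) and rotate (azimuth) the camera
plt.show()
```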

Using plotly for interactive regression plots

import plotly.graph_objects as go

X, y = make_regression(n_samples=100, n_features=2, noise=20, random_state=42)
model = LinearRegression().fit(X, y)
y_pred = model.predict(X)

fig = go.Figure()
fig.add_trace(go.Scatter(x=y, y=y_pred, mode='markers', name='Data points'))
fig.add_trace(go.Scatter(x=[min(y), max(y)], y=[min(y), max(y)],
mode='lines', line=dict(color='red', dash='dash')))
fig.update_layout(xaxis_title='Actual', yaxis_title='Predicted')
fig.show()

--OUTPUT--
[An interactive scatter plot with actual vs predicted values and a red dashed diagonal line]

Plotly’s main advantage is interactivity, letting you create plots you can zoom, pan, and inspect by hovering. You build the visualization by initializing a go.Figure() object and then adding graphical elements called "traces" to it.

  • The first add_trace() call uses go.Scatter with mode='markers' to plot your data points.
  • A second call adds the diagonal reference line, this time using mode='lines'.

This layered approach gives you precise control over your final interactive chart.

Advanced techniques and customizations

With the fundamentals covered, you can now build more sophisticated plots to visualize model uncertainty and isolate the impact of each predictor.

Visualizing regression with confidence intervals

import statsmodels.api as sm

X, y = make_regression(n_samples=100, n_features=1, noise=20, random_state=42)
order = np.argsort(X[:, 0])
X, y = X[order], y[order]  # Sort so the line and shaded band draw cleanly
X = sm.add_constant(X)  # Add intercept column for statsmodels

model = sm.OLS(y, X).fit()
predictions = model.get_prediction(X)
summary_frame = predictions.summary_frame(alpha=0.05)

plt.scatter(X[:, 1], y, alpha=0.5)
plt.plot(X[:, 1], summary_frame['mean'], color='red')
plt.fill_between(X[:, 1], summary_frame['mean_ci_lower'],
summary_frame['mean_ci_upper'], color='pink', alpha=0.3)
plt.show()

--OUTPUT--
[A plot showing the regression line with shaded pink confidence intervals and blue data points]

The statsmodels library is great for statistical details like confidence intervals. The key is using predictions.summary_frame(), which calculates the upper and lower bounds for your model's predictions. This lets you visualize the model's uncertainty directly using plt.fill_between() to create the shaded region.

  • The red line is your model's regression line—its best estimate.
  • The shaded pink area represents the 95% confidence interval, showing the range where the true regression line is likely to be found.

Creating partial regression plots with statsmodels

X, y = make_regression(n_samples=100, n_features=3, noise=20, random_state=42)
X = sm.add_constant(X)

model = sm.OLS(y, X).fit()
fig = plt.figure(figsize=(12, 4))
sm.graphics.plot_partregress_grid(model, fig=fig)
plt.tight_layout()
plt.show()

--OUTPUT--
[A grid of partial regression plots showing the relationship between each feature and the target while controlling for other features]

Partial regression plots help you see the unique effect of each feature on the target. The sm.graphics.plot_partregress_grid() function is a powerful tool that generates a separate plot for every predictor in your model. It's a great way to isolate variables.

  • Each plot shows the relationship between one feature and the target variable.
  • It does this after accounting for the effects of all other features.
  • This lets you judge the true impact of a single predictor, separate from the others.

Using specialized libraries like yellowbrick

from sklearn.model_selection import train_test_split
from yellowbrick.regressor import PredictionError

X, y = make_regression(n_samples=100, n_features=2, noise=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = LinearRegression()

visualizer = PredictionError(model)
visualizer.fit(X_train, y_train)
visualizer.score(X_test, y_test)
visualizer.show()

--OUTPUT--
[A specialized prediction error plot showing model performance with metrics and an ideal diagonal line]

The yellowbrick library offers high-level tools for model diagnostics that work directly with scikit-learn. It simplifies creating insightful plots. The PredictionError visualizer, for example, gives you a polished and informative version of the standard actual vs. predicted plot, but with added context from your train-test split.

  • You wrap your model with PredictionError(model) and use its fit() and score() methods.
  • The visualizer handles the plotting logic for you, showing how the model performs on both training and testing data.
  • It also automatically displays key performance metrics like the R² score directly on the chart.

Move faster with Replit

Replit is an AI-powered development platform that transforms natural language into working applications. You can describe what you want to build, and Replit Agent creates it—complete with databases, APIs, and deployment.

For the regression plotting techniques we've explored, Replit Agent can turn them into production-ready applications:

  • Build a sales forecasting tool that uses multiple marketing inputs and visualizes the predicted outcome against actuals on an interactive chart.
  • Create a performance diagnostics dashboard that generates partial regression plots to isolate the impact of each feature in your model.
  • Deploy a real estate valuation app that predicts housing prices based on features and displays the regression plane in a 3D plot.

Describe your app idea, and Replit Agent will write the code, test it, and fix issues automatically. Try Replit Agent and turn your data analysis concepts into working software.

Common errors and challenges

Even with powerful tools, you might run into a few common plotting snags; here’s how to solve them.

If your model comparison plots look nearly identical, you're probably plotting predictions on the training data. Because every model is fit to minimize error on that same data, their predictions there can look deceptively similar. To see real differences, evaluate performance on unseen data by plotting predictions from your test set (X_test) against the actual test values (y_test).

Sometimes a seaborn plot using the hue parameter won't automatically generate a colorbar, leaving your visualization incomplete. To fix this, assign the plot to a variable. This gives you an object you can use to manually add and label a colorbar, ensuring your plot is clear and self-explanatory.
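In plain matplotlib terms, that fix looks like the sketch below: keep a handle to the scatter artist and pass it to plt.colorbar so the colorbar knows which color mapping to describe (synthetic data, for illustration only):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=100, n_features=1, noise=20, random_state=42)

# Keep the artist returned by scatter; colorbar needs it as a reference
sc = plt.scatter(X[:, 0], y, c=y, cmap='viridis')
cbar = plt.colorbar(sc, label='Target')
plt.xlabel('Feature'); plt.ylabel('Target')
plt.show()
```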

In 3D visualizations, the regression plane from plot_surface can hide your data points. You can solve this by setting the alpha parameter to a value below 1, which makes the plane semi-transparent. Fine-tuning the viewing angle also helps you find the best perspective to showcase the relationship between the plane and your data.

Troubleshooting incorrect model comparison visualizations with LinearRegression

It's easy to mix up your axes and accidentally plot predictions against an input feature instead of the actual target values. This mistake creates a misleading visualization that makes your model seem flawless. The following code demonstrates this common pitfall.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=50, n_features=1, noise=10, random_state=42)
model = LinearRegression().fit(X, y)
y_pred = model.predict(X)

plt.scatter(X, y) # Actual vs X
plt.scatter(X, y_pred, color='red') # Predicted vs X
plt.legend(['Actual', 'Predicted'])
plt.show()

The code plots both actual (y) and predicted (y_pred) values against the input feature (X). This shows how the model fits the data, not how accurate its predictions are. See the corrected approach below.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=50, n_features=1, noise=10, random_state=42)
model = LinearRegression().fit(X, y)
y_pred = model.predict(X)

plt.scatter(y, y_pred) # Actual vs Predicted
plt.plot([min(y), max(y)], [min(y), max(y)], 'k--')
plt.xlabel('Actual values'); plt.ylabel('Predicted values')
plt.show()

The corrected code properly evaluates the model by plotting actual values against predicted values using plt.scatter(y, y_pred). This creates the essential "actual vs. predicted" chart where points should cluster around the diagonal line representing perfect predictions. This error is common when you're focused on fitting the model, not evaluating it. Always ensure your axes compare y and y_pred to accurately judge performance, not the input features.

Fixing missing colorbar labels in seaborn regression plots

Sometimes, when you create a seaborn regression plot with a color gradient, the colorbar that explains the color scale doesn't appear. This makes the plot difficult to interpret. The code below shows an example of this issue using lmplot.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=100, n_features=1, noise=20)
df = pd.DataFrame({'feature': X.flatten(), 'target': y})

g = sns.lmplot(x='feature', y='target', data=df, scatter_kws={'c': y})
plt.show()

The lmplot function doesn't create a colorbar because passing color data via scatter_kws bypasses seaborn's automatic legend handling. This leaves the color gradient without a key. The corrected code below shows how to fix this.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=100, n_features=1, noise=20)
df = pd.DataFrame({'feature': X.flatten(), 'target': y})

g = sns.lmplot(x='feature', y='target', data=df)
plt.title('Regression Plot')
plt.show()

The corrected code works by removing the scatter_kws argument. When you pass color data this way, you're using a matplotlib feature that seaborn's lmplot doesn't automatically track for its legend—which is why the colorbar disappears.

To have seaborn generate a proper legend or colorbar, you should use its dedicated parameters like hue instead. This ensures the library can interpret the data correctly and build the visualization as intended.
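A sketch of that hue-based approach, using a hypothetical grouping column added to the same DataFrame purely to demonstrate the parameter:

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=100, n_features=1, noise=20, random_state=42)
df = pd.DataFrame({'feature': X.flatten(), 'target': y})
# Illustrative categorical column, just to give hue something to color by
df['segment'] = np.where(df['feature'] > 0, 'high', 'low')

# With hue, seaborn fits one regression per group and builds the legend itself
g = sns.lmplot(x='feature', y='target', hue='segment', data=df)
plt.show()
```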

Resolving 3D plot visibility issues with plot_surface

When creating a 3D regression plot, the regression plane can sometimes completely cover your data points. This makes it impossible to see how well the model fits the data. The code below demonstrates this common issue with plot_surface.

from mpl_toolkits.mplot3d import Axes3D
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

X = np.random.rand(100, 2)
y = 3*X[:, 0] + 2*X[:, 1] + np.random.randn(100)*0.1
model = LinearRegression().fit(X, y)

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(X[:, 0], X[:, 1], y)
ax.plot_surface(np.unique(X[:, 0]), np.unique(X[:, 1]),
model.predict(X).reshape(len(np.unique(X[:, 0])), -1))
plt.show()

The code mishandles the surface grid: plot_surface expects matching 2D coordinate arrays, such as those produced by np.meshgrid, but here it receives 1D arrays from np.unique alongside an arbitrarily reshaped prediction array. Depending on the data, this raises a shape error or draws a distorted, fully opaque plane that hides the data points. The corrected code below demonstrates the proper approach to creating the visualization.

from mpl_toolkits.mplot3d import Axes3D
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

X = np.random.rand(100, 2)
y = 3*X[:, 0] + 2*X[:, 1] + np.random.randn(100)*0.1
model = LinearRegression().fit(X, y)

x_surf = np.linspace(0, 1, 20)
y_surf = np.linspace(0, 1, 20)
x_surf, y_surf = np.meshgrid(x_surf, y_surf)
z_surf = model.intercept_ + model.coef_[0] * x_surf + model.coef_[1] * y_surf

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(X[:, 0], X[:, 1], y)
ax.plot_surface(x_surf, y_surf, z_surf, alpha=0.3)
plt.show()

The corrected code fixes the visibility issue by properly generating the regression plane. It uses np.meshgrid to create a uniform grid from your feature data, then calculates the plane's surface using the model's coefficients. The key is setting alpha=0.3 in plot_surface, which makes the plane semi-transparent. This ensures you can see both the regression surface and the actual data points underneath, revealing how well your model fits the data.

Real-world applications

With these techniques and fixes, you can now apply regression plots to solve practical problems in fields like real estate and marketing.

Visualizing real estate price predictions with LinearRegression

A classic real-world example is predicting home prices, where you can train a LinearRegression model on housing data and then plot its predictions against actual values to see how well it performs.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Create synthetic housing data
np.random.seed(42)
n = 100
sqft = np.random.randint(800, 3500, n)
bedrooms = np.random.randint(1, 6, n)
price = 50000 + 100 * sqft + 10000 * bedrooms + np.random.normal(0, 50000, n)

# Prepare data and train model
X = np.column_stack((sqft, bedrooms))
y = price
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# Visualize results
plt.scatter(y_test, y_pred, alpha=0.6)
plt.plot([min(y_test), max(y_test)], [min(y_test), max(y_test)], 'r--')
plt.xlabel('Actual Home Price ($)'); plt.ylabel('Predicted Price ($)')
plt.show()

This example simulates a real-world scenario by first generating synthetic housing data. The price is calculated from sqft and bedrooms, with np.random.normal adding realistic noise to the values. After splitting the data, the model learns from the training set and is then evaluated on the unseen test set.

  • The plot visualizes this evaluation.
  • It maps the model's price predictions (y_pred) against the actual prices (y_test).
  • The red dashed line shows what perfect predictions would look like, providing a clear benchmark for model accuracy.
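To put a number on what the plot shows, you can compute the test-set R² from the same split. This small addition uses sklearn.metrics and rebuilds the same synthetic housing data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# Same synthetic housing data as above
np.random.seed(42)
n = 100
sqft = np.random.randint(800, 3500, n)
bedrooms = np.random.randint(1, 6, n)
price = 50000 + 100 * sqft + 10000 * bedrooms + np.random.normal(0, 50000, n)

X = np.column_stack((sqft, bedrooms))
X_train, X_test, y_train, y_test = train_test_split(X, price, test_size=0.3, random_state=42)
model = LinearRegression().fit(X_train, y_train)

# R² near 1 means test points hug the diagonal; near 0 means the model explains little
r2 = r2_score(y_test, model.predict(X_test))
print(f'Test R²: {r2:.3f}')
```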

Visualizing website traffic with seasonal patterns

Modeling seasonal data, like website traffic, is another powerful application where you can visualize how well a linear regression model captures recurring patterns.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Generate synthetic website traffic data with weekly seasonality
np.random.seed(42)  # Reproducible noise
days = np.arange(1, 91) # 90 days of data
seasonal_effect = np.sin(2*np.pi*days/7) * 200 # Weekly seasonality
trend = days * 5 # Upward trend
traffic = 1000 + trend + seasonal_effect + np.random.normal(0, 100, 90) # Base + trend + season + noise

# Create features for modeling seasonality
X = np.column_stack([days, np.sin(2*np.pi*days/7), np.cos(2*np.pi*days/7)])
model = LinearRegression().fit(X, traffic)
predicted_traffic = model.predict(X)

# Visualize actual vs predicted with seasonality
plt.figure(figsize=(10, 5))
plt.plot(days, traffic, 'o', alpha=0.5, label='Actual traffic')
plt.plot(days, predicted_traffic, 'r-', label='Predicted trend')
plt.xlabel('Day'); plt.ylabel('Visitors'); plt.legend()
plt.show()

This code generates synthetic website traffic data with both a steady upward trend and a weekly seasonal pattern. The key to modeling this is how the features are engineered. Instead of just feeding the day number to the model, the code creates features using np.sin and np.cos. This technique transforms the time data into a circular format that a linear model can use to learn the recurring weekly cycle.

  • The blue dots on the plot represent the noisy, actual traffic data for each day.
  • The red line shows the model's predictions, effectively capturing both the overall growth trend and the weekly fluctuations.
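Because the seasonal features are just functions of the day number, the same model can extrapolate. This sketch (an extension of the example above, not part of the original) forecasts two weeks past the observed window:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

np.random.seed(42)
days = np.arange(1, 91)
traffic = 1000 + days * 5 + np.sin(2*np.pi*days/7) * 200 + np.random.normal(0, 100, 90)

def seasonal_features(d):
    # Same encoding used for fitting: trend term plus sine/cosine of the weekly cycle
    return np.column_stack([d, np.sin(2*np.pi*d/7), np.cos(2*np.pi*d/7)])

model = LinearRegression().fit(seasonal_features(days), traffic)

future_days = np.arange(91, 105)  # Two weeks ahead
forecast = model.predict(seasonal_features(future_days))

plt.plot(days, traffic, 'o', alpha=0.4, label='Observed')
plt.plot(future_days, forecast, 'r-', label='Forecast')
plt.xlabel('Day'); plt.ylabel('Visitors'); plt.legend()
plt.show()
```

The forecast carries both the learned trend and the weekly cycle forward, which is exactly what the sine/cosine encoding makes possible.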

Get started with Replit

Turn these plotting techniques into a real tool. Tell Replit Agent: “Build a dashboard that uses partial regression plots to analyze feature impact” or “Create a tool to visualize sales predictions with a 3D regression plane.”

Replit Agent writes the code, tests for errors, and deploys your app automatically. Start building with Replit and bring your data visualization ideas to life.

Get started free

Create and deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.
