How to plot a logistic regression in Python

Learn how to plot logistic regression in Python. Discover different methods, tips, real-world applications, and how to debug common errors.

Published on: Tue, Mar 17, 2026
Updated on: Wed, Mar 18, 2026
By the Replit Team

Logistic regression is a key classification algorithm. To truly understand its performance, you need effective visualizations. Python’s powerful libraries help you plot model results and gain deeper insights.

In this article, you'll explore various techniques to visualize your model. You'll find practical tips, see real-world applications, and get advice on debugging common errors so you can master logistic regression plots with confidence.

Creating a basic logistic regression plot

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=100, n_features=1, n_informative=1,
                           n_redundant=0, n_clusters_per_class=1, random_state=42)
model = LogisticRegression().fit(X, y)
X_test = np.linspace(X.min(), X.max(), 100).reshape(-1, 1)
plt.scatter(X, y, color='black')
plt.plot(X_test, model.predict_proba(X_test)[:, 1], color='blue')
plt.axhline(y=0.5, color='red', linestyle='--')
plt.show()

[Output: A scatter plot showing black data points at y=0 and y=1, with a blue S-shaped curve representing the logistic regression probability, and a red horizontal dashed line at y=0.5 marking the decision boundary]

This code first generates synthetic data using make_classification to simulate a binary classification problem. A LogisticRegression model is then trained on this data. The resulting plot visualizes both the raw data and the model's predictive behavior.

  • The black dots, plotted with plt.scatter, represent the original data points.
  • The blue S-shaped curve shows the output of model.predict_proba(), which calculates the probability of belonging to the positive class for a smooth range of values.
  • The red dashed line, created with plt.axhline(y=0.5), marks the decision boundary—the threshold where the model's prediction flips.
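If you want the exact location of that crossing point rather than eyeballing it, you can solve for it from the fitted parameters: the predicted probability hits 0.5 where coef * x + intercept = 0. A minimal sketch:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Same style of 1-feature data as above (the extra arguments keep
# make_classification valid with a single feature)
X, y = make_classification(n_samples=100, n_features=1, n_informative=1,
                           n_redundant=0, n_clusters_per_class=1, random_state=42)
model = LogisticRegression().fit(X, y)

# The S-curve crosses p = 0.5 where coef * x + intercept = 0
boundary = -model.intercept_[0] / model.coef_[0][0]

# Sanity check: the predicted probability at the boundary should be ~0.5
p = model.predict_proba([[boundary]])[0, 1]
print(f"Boundary at x = {boundary:.3f}, P(class 1) there = {p:.3f}")
```

This is handy for annotating the plot with a vertical line (plt.axvline) at the true boundary instead of guessing from the curve.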

Customizing your logistic regression visualizations

Beyond the basic S-curve, you can create more powerful visualizations by customizing styles, plotting 2D decision boundaries, and generating probability heatmaps.

Using different colors and styles for clarity

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=100, n_features=1, n_informative=1,
                           n_redundant=0, n_clusters_per_class=1, random_state=42)
model = LogisticRegression().fit(X, y)
plt.scatter(X[y==0], [0.05]*sum(y==0), color='blue', s=60, label='Class 0')
plt.scatter(X[y==1], [0.95]*sum(y==1), color='red', s=60, label='Class 1')
X_test = np.linspace(X.min() - 1, X.max() + 1, 100).reshape(-1, 1)
plt.plot(X_test, model.predict_proba(X_test)[:, 1], color='green', linewidth=2)
plt.legend()
plt.show()

[Output: A plot with blue dots at the bottom (Class 0) and red dots at the top (Class 1), connected by a green S-shaped curve showing the probability transition]

You can make your plot much clearer with a few style adjustments. This code separates the data points by their actual class, making it easier to see how the model performs for each group. It’s a simple way to add a lot of context to your visualization.

  • The data points for Class 0 (blue) and Class 1 (red) are plotted with distinct colors and labels.
  • Points are slightly offset from the y-axis—at 0.05 and 0.95—to prevent them from overlapping with the plot's borders.
  • A call to plt.legend() displays a key, which helps identify what each color represents.

Visualizing decision boundaries for 2D features

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=100, n_features=2, n_redundant=0, random_state=42)
model = LogisticRegression().fit(X, y)
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.1), np.arange(y_min, y_max, 0.1))
Z = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
plt.contourf(xx, yy, Z, alpha=0.4)
plt.scatter(X[:, 0], X[:, 1], c=y, edgecolor='k')
plt.show()

[Output: A 2D plot with colored regions showing the decision boundary between classes, with scattered data points of different colors representing each class]

When your data has two features, you can plot the decision boundary that separates the classes. This code works by creating a dense grid of points that covers the entire plot area, allowing you to visualize how the model would classify any point in that space.

  • The np.meshgrid function generates this coordinate grid.
  • Your model then runs predict on every point in the grid to determine its class.
  • plt.contourf colors the background based on these predictions, creating shaded regions that represent the model's decision areas.
  • Finally, the original data points are scattered on top, showing how the model classifies them.

Creating probability heatmaps

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=100, n_features=2, n_redundant=0, random_state=42)
model = LogisticRegression().fit(X, y)
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.1), np.arange(y_min, y_max, 0.1))
probs = model.predict_proba(np.c_[xx.ravel(), yy.ravel()])[:, 1]
plt.contourf(xx, yy, probs.reshape(xx.shape), cmap='RdBu', alpha=0.8)
plt.scatter(X[:, 0], X[:, 1], c=y, edgecolor='k', cmap='RdBu')
plt.colorbar(label='Probability')
plt.show()

[Output: A color gradient heatmap showing probability values across the feature space, with darker red and blue regions indicating higher probability for each class]

A probability heatmap offers more nuance than a simple decision boundary. Instead of just showing which class a point belongs to, it visualizes the model's confidence in its predictions across the entire feature space.

  • The key is using model.predict_proba(), which returns the probability for each class instead of a hard classification.
  • plt.contourf then fills the plot with a color gradient based on these probabilities, showing a smooth transition.
  • A call to plt.colorbar() adds a legend, making it easy to see how colors map to specific probability values.

Advanced visualization techniques

With the basics covered, you're ready to visualize multi-class models, evaluate performance with ROC curves, and even peek inside the model by plotting coefficients.

Working with multi-class logistic regression

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=300, n_features=2, n_classes=3,
                          n_informative=2, n_redundant=0,
                          n_clusters_per_class=1, random_state=42)
model = LogisticRegression().fit(X, y)
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.1), np.arange(y_min, y_max, 0.1))
Z = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
plt.contourf(xx, yy, Z, alpha=0.4, cmap='viridis')
plt.scatter(X[:, 0], X[:, 1], c=y, edgecolor='k', cmap='viridis')
plt.show()

[Output: A plot showing three distinct colored regions representing the decision boundaries for three classes, with scattered points colored according to their class]

Visualizing a multi-class model is a natural extension of plotting 2D decision boundaries. The key difference is how you set up the data and the model itself. This allows you to see how the model separates three or more distinct groups across the feature space.

  • The data is generated with n_classes=3 to create three distinct categories.
  • Recent versions of LogisticRegression apply a multinomial formulation by default when there are more than two classes, so no extra configuration is needed (the older multi_class='multinomial' argument is deprecated).
  • The resulting plot shows separate decision regions for each class, colored to match the data points.

Creating ROC curves for model evaluation

import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
from sklearn.metrics import roc_curve, auc
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
model = LogisticRegression().fit(X_train, y_train)
y_pred_proba = model.predict_proba(X_test)[:, 1]
fpr, tpr, _ = roc_curve(y_test, y_pred_proba)
plt.plot(fpr, tpr, label=f'ROC curve (AUC = {auc(fpr, tpr):.2f})')
plt.plot([0, 1], [0, 1], 'k--')
plt.xlabel('False Positive Rate'); plt.ylabel('True Positive Rate')
plt.legend()
plt.show()

[Output: A graph showing the ROC curve as a line curving upward from (0,0) to (1,1), with the area under the curve (AUC) value displayed in the legend, and a diagonal dashed reference line]

An ROC (Receiver Operating Characteristic) curve is a standard way to evaluate your model's performance. It visualizes how well the classifier distinguishes between classes across all possible thresholds. The code first splits the data, trains the model, and then uses predict_proba to get probabilities for the test set.

  • The roc_curve function calculates the true positive rate (how many positives are correctly identified) versus the false positive rate (how many negatives are incorrectly labeled as positive).
  • A curve that bows toward the top-left corner indicates a better model.
  • The auc function computes the Area Under the Curve—a single score summarizing performance. A score of 1.0 is perfect, while the dashed line represents a random 0.5 baseline.
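The ROC curve can also guide threshold selection. One common heuristic, Youden's J statistic, picks the threshold that maximizes tpr - fpr; a sketch using the same data setup:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=42)
model = LogisticRegression().fit(X_train, y_train)
fpr, tpr, thresholds = roc_curve(y_test, model.predict_proba(X_test)[:, 1])

# Youden's J statistic: the threshold where tpr - fpr peaks balances
# sensitivity against the false positive rate
best = np.argmax(tpr - fpr)
print(f"Best threshold: {thresholds[best]:.3f}")
```

Marking this point on the ROC plot (plt.scatter(fpr[best], tpr[best])) is a quick way to show where your operating threshold sits.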

Visualizing model coefficients

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
X, y = data.data, data.target
model = LogisticRegression(C=1.0, max_iter=5000).fit(X, y)  # extra iterations aid convergence on unscaled data
coef = model.coef_[0]
feature_names = data.feature_names[:8]  # Using first 8 features for clarity
plt.figure(figsize=(10, 6))
plt.barh(feature_names, coef[:8])
plt.xlabel('Coefficient value')
plt.title('Logistic Regression Coefficients')
plt.show()

[Output: A horizontal bar chart showing the coefficient values for the first 8 features, with bars extending left (negative) or right (positive) from the center, indicating their influence on the model predictions]

Plotting coefficients lets you peek inside your model to see what it considers important. After training, the code extracts the learned weights for each feature using model.coef_. These values show how much each feature influences the final prediction, and the bar chart makes it easy to compare their impact.

  • A positive coefficient suggests that as a feature's value increases, so does the probability of the positive outcome.
  • A negative coefficient suggests the opposite.

It’s a direct way to interpret which features are the strongest predictors in your model.
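One caveat: raw coefficients are only comparable when features share a scale, and the breast cancer features don't. A common remedy (a sketch, not the only option) is to standardize inside a pipeline before reading the weights:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
X, y = data.data, data.target

# StandardScaler puts every feature on the same scale, so the learned
# weights become directly comparable (and the solver converges faster)
pipe = make_pipeline(StandardScaler(), LogisticRegression()).fit(X, y)
coef = pipe.named_steps['logisticregression'].coef_[0]

plt.figure(figsize=(10, 6))
plt.barh(data.feature_names[:8], coef[:8])
plt.xlabel('Coefficient value (standardized features)')
plt.show()
```

With standardized inputs, a larger bar genuinely means a stronger influence, not just a feature measured in bigger units.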

Move faster with Replit

The visualizations you've learned are powerful, but turning them into full-fledged applications is the next step. Replit is an AI-powered development platform that transforms natural language into working software. Describe what you want to build, and Replit Agent creates it—complete with databases, APIs, and deployment.

For the logistic regression techniques we've explored, Replit Agent can turn them into production tools:

  • Build a credit risk assessment dashboard that visualizes decision boundaries, showing how factors like income and loan amount impact approval.
  • Create a model evaluation utility that automatically generates and compares ROC curves for different classifiers, helping you pick the best-performing one.
  • Deploy an interactive feature importance analyzer that plots model coefficients from a dataset, like the breast cancer data, to explain which features are the strongest predictors.

With Replit Agent, you can go from a simple description to a deployed application faster than ever. It handles the coding, testing, and deployment, letting you focus on the core idea. Try building your next data visualization tool with Replit Agent.

Common errors and challenges

Even with powerful tools, you might run into a few common snags when visualizing your logistic regression models.

You’ll often see an error about input shapes, especially when working with a single feature. Scikit-learn expects a 2D array for its input, but a single feature is often a 1D array. The fix is simple: use reshape(-1, 1) on your data. This tells NumPy to create a column vector with as many rows as needed, which satisfies the model's input requirements.

The colormap you choose can either clarify or confuse your probability heatmaps. For probabilities that simply go from low to high, a sequential colormap like viridis works well. But for visualizing the decision boundary, a diverging colormap like RdBu is often better. It centers on a neutral color at the 0.5 probability mark and uses contrasting colors to show the model's lean toward each class.

When one class has far fewer samples than another, your plots can be misleading. The minority class might get lost in a sea of majority data points, making the model appear more effective than it is. To counter this, you can adjust your plotting style.

  • Use different marker sizes or symbols to make the minority class stand out visually.
  • Focus on plots that handle imbalance well, like ROC curves or Precision-Recall curves, which give a truer sense of performance.

Fixing input shape errors with reshape

A frequent stumbling block is the ValueError you get when your input data isn't shaped correctly. Scikit-learn expects a 2D array for features, even if you only have one. The code below demonstrates what happens when you pass a 1D array.

import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.random.randn(100)
y = np.random.randint(0, 2, 100)
model = LogisticRegression()
model.fit(X, y)  # Error: Expected 2D array, got 1D array

The model.fit function receives a 1D array for X but expects a 2D array, which causes the error. Even a single feature needs to be structured as a column. The following example demonstrates the fix.

import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.random.randn(100)
y = np.random.randint(0, 2, 100)
model = LogisticRegression()
model.fit(X.reshape(-1, 1), y)  # Reshape to 2D array

By calling X.reshape(-1, 1), you convert the 1D feature array into the 2D column vector that scikit-learn expects. This simple transformation satisfies the input requirements for the model.fit function, resolving the error. Keep an eye out for this issue whenever you're working with a single feature, as it's a common pitfall. Checking your array's dimensions before training can save you a lot of debugging time.
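A small defensive check along these lines can catch the problem before fit is ever called; this snippet is a sketch, not a required pattern:

```python
import numpy as np

X = np.random.randn(100)  # 1D array: shape (100,)

# Promote single-feature 1D arrays to the column vector scikit-learn expects
if X.ndim == 1:
    X = X.reshape(-1, 1)

print(X.shape)  # now a 2D column: (100, 1)
```

Putting a check like this at the top of a training function saves you from shape errors every time a caller passes a bare 1D feature array.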

Choosing appropriate colormaps for probability plots

Your choice of colormap can make or break a probability plot. While a sequential map like viridis works for showing a simple low-to-high progression, it can obscure the decision boundary, making it hard to see where the model's confidence shifts.

The following code demonstrates this issue by using viridis where a diverging colormap would be more effective.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=100, n_features=2, n_redundant=0, random_state=42)
model = LogisticRegression().fit(X, y)
xx, yy = np.meshgrid(np.linspace(-3, 3, 100), np.linspace(-3, 3, 100))
probs = model.predict_proba(np.c_[xx.ravel(), yy.ravel()])[:, 1]
plt.contourf(xx, yy, probs.reshape(xx.shape), cmap='viridis')
plt.scatter(X[:, 0], X[:, 1], c=y)

The viridis colormap creates a single-color gradient, which doesn't clearly distinguish between probabilities above and below the 0.5 threshold. This obscures the decision boundary. The following code demonstrates a better approach for this kind of plot.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=100, n_features=2, n_redundant=0, random_state=42)
model = LogisticRegression().fit(X, y)
xx, yy = np.meshgrid(np.linspace(-3, 3, 100), np.linspace(-3, 3, 100))
probs = model.predict_proba(np.c_[xx.ravel(), yy.ravel()])[:, 1]
plt.contourf(xx, yy, probs.reshape(xx.shape), cmap='RdBu_r')
plt.colorbar(label='Probability')

The corrected code uses a diverging colormap like RdBu_r. This is a much better choice for probability plots because it centers a neutral color around the 0.5 probability mark. The colors then diverge, clearly showing which class the model favors and how confident it is. This makes the decision boundary—the line where predictions flip—immediately obvious. You should use diverging colormaps whenever you need to visualize a value that deviates from a central point.

Visualizing imbalanced class data effectively

When your dataset is imbalanced, standard plots can be deceptive. The majority class can overwhelm the visualization, making it difficult to see how the model performs on the much smaller minority class. This can hide significant performance issues. The following code demonstrates this problem.

import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=2, n_redundant=0,
                           weights=[0.9, 0.1], random_state=42)
model = LogisticRegression().fit(X, y)
plt.scatter(X[:, 0], X[:, 1], c=y)
plt.title('Logistic Regression Results')

The weights=[0.9, 0.1] argument makes the minority class points nearly invisible, hiding potential classification issues. This makes it difficult to assess the model's true performance. The following example provides a clearer visualization.

import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=2, n_redundant=0,
                           weights=[0.9, 0.1], random_state=42)
model = LogisticRegression(class_weight='balanced').fit(X, y)
plt.scatter(X[y==0, 0], X[y==0, 1], label='Majority Class', alpha=0.5)
plt.scatter(X[y==1, 0], X[y==1, 1], label='Minority Class', s=80)
plt.legend()

This corrected code tackles imbalanced data in two ways. First, it sets class_weight='balanced' in the model, helping it pay more attention to the minority class. Second, it adjusts the plot itself to give you a more honest look at performance.

  • The majority class is made semi-transparent with alpha=0.5.
  • The minority class is plotted with a larger marker size using s=80, making it stand out.

This ensures both classes are clearly visible.

Real-world applications

Beyond the code, these plots help solve real problems, like visualizing patient risk with LogisticRegression and analyzing financial fraud with a confusion_matrix.

Visualizing patient risk factors with LogisticRegression

In healthcare, you can use logistic regression to plot the probability of a condition like malignancy against key patient data, allowing for a visual risk assessment.

import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression

data = load_breast_cancer()
X, y = data.data[:, :2], data.target
model = LogisticRegression(max_iter=5000).fit(X, y)  # extra iterations aid convergence on unscaled features
plt.scatter(X[:, 0], X[:, 1], c=model.predict_proba(X)[:, 1], cmap='RdYlBu_r')
plt.colorbar(label='Malignant Probability')
plt.xlabel(data.feature_names[0][:10]); plt.ylabel(data.feature_names[1][:10])
plt.show()

This code snippet demonstrates how to visualize a model's confidence on a scatter plot. It trains a LogisticRegression model using the first two features from the breast cancer dataset. The plot then visualizes each data point based on those two features.

  • Instead of coloring points by their true class, the code uses the output of model.predict_proba() to set the color.
  • This means each point's color directly represents the model's calculated probability for the positive class (malignancy).
  • The colorbar serves as a legend, mapping the color gradient to these probability scores.

Analyzing financial fraud patterns with confusion_matrix

In fraud detection, a confusion_matrix gives you a clear, at-a-glance summary of your model's performance, showing exactly how it handles both normal and fraudulent transactions.

import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
from sklearn.metrics import ConfusionMatrixDisplay
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=42, weights=[0.9, 0.1])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
model = LogisticRegression(class_weight='balanced').fit(X_train, y_train)
ConfusionMatrixDisplay.from_estimator(model, X_test, y_test, display_labels=['Normal', 'Fraud'])
plt.show()

This code simulates an imbalanced dataset using make_classification, where one class is ten times more common than the other. This setup is typical for problems like fraud detection, where fraudulent transactions are rare.

  • The LogisticRegression model uses class_weight='balanced', a key setting that tells the model to give more importance to the underrepresented class during training.
  • Finally, ConfusionMatrixDisplay.from_estimator generates a visual breakdown of the model's predictions on unseen test data, helping you quickly assess its accuracy for both 'Normal' and 'Fraud' categories.
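To complement the matrix with numbers, classification_report prints per-class precision, recall, and F1, which matter far more than raw accuracy on imbalanced data. A sketch using the same setup:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=42)
model = LogisticRegression(class_weight='balanced').fit(X_train, y_train)

# Per-class precision, recall, and F1 for the same split the matrix shows
report = classification_report(y_test, model.predict(X_test),
                               target_names=['Normal', 'Fraud'])
print(report)
```

Reading the 'Fraud' row tells you directly how many real frauds the model catches (recall) and how many alerts are false alarms (precision).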

Get started with Replit

Turn your knowledge into a real tool with Replit Agent. Try prompts like, “Build a utility that plots an ROC curve from a CSV” or “Create a dashboard that visualizes a model’s decision boundary.”

The agent writes the code, tests for errors, and deploys your app. Start building with Replit.

Get started free

Create and deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.
