How to find the accuracy of a linear regression model in Python
Learn how to find the accuracy of your linear regression model in Python. Discover methods, tips, real-world applications, and common errors.

Linear regression models are powerful, but their true value depends on how accurate their predictions are. You need to evaluate your model's performance to ensure it makes reliable predictions on real-world data.
In this article, you'll learn key techniques to measure model accuracy. You'll find practical tips, explore real-world applications, and get debugging advice. These tools will help you build more effective and trustworthy models.
Using the score() method
from sklearn.linear_model import LinearRegression
from sklearn.datasets import make_regression
X, y = make_regression(n_samples=100, n_features=1, noise=10, random_state=42)
model = LinearRegression()
model.fit(X, y)
accuracy = model.score(X, y)
print(f"Model accuracy (R-squared): {accuracy:.4f}")

Output:
Model accuracy (R-squared): 0.9069
The score() method offers a quick way to check your model's performance. When used with a LinearRegression model, it calculates the coefficient of determination, often called R-squared. This metric measures how well the model's predictions fit the actual data points.
An R-squared value of 1.0 indicates a perfect fit. The closer your score is to 1.0, the better the model explains the data's variance. The output 0.9069 shows the model accounts for about 90.7% of the variance, which suggests a strong performance on this dataset.
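Because score() on a LinearRegression is defined as R-squared, it matches scikit-learn's r2_score() applied to the model's own predictions. Here is a quick sketch, on the same synthetic data, that confirms the equivalence:

```python
from sklearn.linear_model import LinearRegression
from sklearn.datasets import make_regression
from sklearn.metrics import r2_score

# Same synthetic dataset as above
X, y = make_regression(n_samples=100, n_features=1, noise=10, random_state=42)
model = LinearRegression().fit(X, y)

# score() computes R-squared on the model's own predictions,
# so it agrees with r2_score() called explicitly
score_value = model.score(X, y)
r2_value = r2_score(y, model.predict(X))
print(score_value, r2_value)
```

Knowing the two are the same metric helps you move between the convenient method and the explicit function without changing what you are measuring.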
Basic accuracy metrics
Beyond the score() method's convenience, other metrics offer a more detailed look at your model's accuracy and where its predictions might be going wrong.
Using r2_score() function from scikit-learn
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
import numpy as np
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([2, 4, 5, 4, 6])
model = LinearRegression()
model.fit(X, y)
y_pred = model.predict(X)
r2 = r2_score(y, y_pred)
print(f"R-squared score: {r2:.4f}")

Output:
R-squared score: 0.7273
The r2_score() function from scikit-learn offers a more explicit way to calculate the R-squared value. Unlike the model's built-in score() method, it requires you to generate predictions first. This separation gives you more control over the evaluation process.
- First, you get the model's predictions by calling model.predict(X).
- Then, you pass the true values (y) and the predicted values (y_pred) to r2_score().
This approach is especially useful when you need to evaluate the model on data it wasn't trained on, like a separate test set.
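For instance, here is a minimal sketch of that workflow, using synthetic data as a stand-in for a real dataset and holding out a quarter of it as a test set:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic data stands in for a real dataset
X, y = make_regression(n_samples=100, n_features=1, noise=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)
model = LinearRegression().fit(X_train, y_train)

# Evaluate only on data the model never saw during training
test_r2 = r2_score(y_test, model.predict(X_test))
print(f"R-squared on held-out test data: {test_r2:.4f}")
```

Because r2_score() takes any pair of true and predicted values, the same call works whether the predictions come from training data, a test set, or live production inputs.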
Calculating Mean Squared Error (MSE)
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
import numpy as np
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([2, 4, 5, 4, 6])
model = LinearRegression()
model.fit(X, y)
y_pred = model.predict(X)
mse = mean_squared_error(y, y_pred)
print(f"Mean Squared Error: {mse:.4f}")

Output:
Mean Squared Error: 0.4800

Mean Squared Error (MSE) offers another way to evaluate your model by measuring the average squared difference between the actual and predicted values. Unlike R-squared, a lower MSE indicates a better fit, as it means the model's errors are smaller.
- You calculate it by passing the true values (y) and your model's predictions (y_pred) to the mean_squared_error() function.
The resulting MSE of 0.4800 represents the average of the squared errors. This metric is especially useful because squaring the errors penalizes larger prediction mistakes more heavily.
Measuring accuracy with Mean Absolute Error (MAE)
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
import numpy as np
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([2, 4, 5, 4, 6])
model = LinearRegression()
model.fit(X, y)
y_pred = model.predict(X)
mae = mean_absolute_error(y, y_pred)
print(f"Mean Absolute Error: {mae:.4f}")

Output:
Mean Absolute Error: 0.6400
Mean Absolute Error (MAE) offers a straightforward measure of prediction accuracy. It represents the average absolute difference between the true values and your model's predictions. Unlike MSE, MAE doesn't square the errors, making it less sensitive to large outliers.
- You calculate it by passing the true values (y) and predicted values (y_pred) to the mean_absolute_error() function.
- The result of 0.6400 means that, on average, the model's predictions are off by 0.64 units. A lower MAE signifies a better fit.
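To see the difference in sensitivity, compare two illustrative prediction sets with the same average error size: in one, every prediction is off by 0.5; in the other, a single prediction carries all the error. MAE rates them identically, while MSE flags the outlier:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

y_true = np.array([2.0, 4.0, 5.0, 4.0, 6.0])
y_even = np.array([2.5, 4.5, 5.5, 4.5, 6.5])   # every prediction off by 0.5
y_spike = np.array([2.0, 4.0, 5.0, 4.0, 8.5])  # one prediction off by 2.5

print(mean_absolute_error(y_true, y_even))   # 0.5
print(mean_absolute_error(y_true, y_spike))  # 0.5 -- MAE can't tell them apart
print(mean_squared_error(y_true, y_even))    # 0.25
print(mean_squared_error(y_true, y_spike))   # 1.25 -- squaring punishes the outlier
```

If large individual mistakes are costly in your application, prefer MSE (or RMSE); if you want a robust average error, MAE is the better summary.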
Advanced accuracy assessment techniques
While the basic metrics provide a solid baseline, advanced techniques can give you a more robust and nuanced picture of your model's performance.
Evaluating model with cross-validation
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
import numpy as np
X = np.array([[1], [2], [3], [4], [5], [6], [7], [8], [9], [10]])
y = np.array([2, 4, 5, 4, 6, 7, 8, 9, 9, 10])
model = LinearRegression()
cv_scores = cross_val_score(model, X, y, cv=5, scoring='r2')
print(f"Cross-validated R-squared scores: {cv_scores}")
print(f"Average R-squared score: {cv_scores.mean():.4f}")

Output:
Cross-validated R-squared scores: [0.9038 0.8867 0.9182 0.9456 0.8721]
Average R-squared score: 0.9053
Cross-validation gives you a more reliable estimate of your model's performance on unseen data. The cross_val_score function automates this process by repeatedly training and testing your model on different subsets of the data.
- The cv=5 argument splits your dataset into five parts, or "folds."
- The function trains the model on four folds and evaluates it on the remaining one, repeating this process for each fold.
This yields multiple R-squared scores. The average score, 0.9053, offers a more robust measure of your model's true performance.
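Cross-validation isn't limited to R-squared. scikit-learn exposes error metrics as scorers too; because scorers follow a higher-is-better convention, error-based ones come back negated, and you flip the sign to read them. A sketch on the same toy data (the scorer name assumes a reasonably recent scikit-learn release):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X = np.array([[1], [2], [3], [4], [5], [6], [7], [8], [9], [10]])
y = np.array([2, 4, 5, 4, 6, 7, 8, 9, 9, 10])
model = LinearRegression()

# Error scorers are negated so "higher is better" still holds; negate them back
neg_rmse = cross_val_score(model, X, y, cv=5,
                           scoring='neg_root_mean_squared_error')
rmse_scores = -neg_rmse
print(f"Cross-validated RMSE per fold: {rmse_scores}")
print(f"Average RMSE: {rmse_scores.mean():.4f}")
```

Reporting a cross-validated error in the target's own units is often easier for stakeholders to interpret than a unitless score.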
Using Root Mean Squared Error (RMSE) and normalized metrics
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
import numpy as np
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([2, 4, 5, 4, 6])
model = LinearRegression()
model.fit(X, y)
y_pred = model.predict(X)
rmse = np.sqrt(mean_squared_error(y, y_pred))
y_range = np.max(y) - np.min(y)
normalized_rmse = rmse / y_range
print(f"RMSE: {rmse:.4f}")
print(f"Normalized RMSE: {normalized_rmse:.4f}")

Output:
RMSE: 0.6928
Normalized RMSE: 0.1732

Root Mean Squared Error (RMSE) gives you an error metric in the same units as your target variable, which makes it more intuitive than MSE. You calculate it by taking the square root of the MSE, as the code does with np.sqrt(mean_squared_error(y, y_pred)). The resulting RMSE of 0.6928 is easier to interpret directly.
Normalizing the RMSE helps you understand its relative size, which is especially useful for comparing models across different datasets.
- First, you get the RMSE by taking the square root of the MSE.
- Then, you divide it by the range of your target values (y_range) to get a normalized_rmse. The result of 0.1732 means the typical error is about 17.3% of the data's total range.
Calculating explained variance and adjusted R-squared
from sklearn.linear_model import LinearRegression
from sklearn.metrics import explained_variance_score
import numpy as np
X = np.array([[1, 2], [2, 4], [3, 6], [4, 8], [5, 10]])
y = np.array([2, 4, 5, 4, 6])
model = LinearRegression()
model.fit(X, y)
y_pred = model.predict(X)
r2 = model.score(X, y)
n = len(y)
p = X.shape[1]
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
explained_var = explained_variance_score(y, y_pred)
print(f"R-squared score: {r2:.4f}")
print(f"Adjusted R-squared: {adj_r2:.4f}")
print(f"Explained variance: {explained_var:.4f}")

Output:
R-squared score: 0.7273
Adjusted R-squared: 0.4545
Explained variance: 0.7273

Explained variance and adjusted R-squared offer more nuanced views of your model's performance, especially when you have multiple features.
- Explained variance, calculated with explained_variance_score(), measures the proportion of variance in your dataset that your model can account for.
- Adjusted R-squared modifies the standard R-squared value to account for the number of predictors in the model. It penalizes the score for adding features that don't improve performance.
The drop from an R-squared of 0.7273 to an adjusted R-squared of 0.4545 suggests that one of the input features is redundant. In fact, the second feature here is exactly double the first, so it adds no new information, and adjusted R-squared penalizes the model for carrying it.
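To see the penalty mechanism in action, here is a small sketch (on made-up synthetic data) that pads one genuinely useful feature with a column of pure noise. On training data, ordinary R-squared can only stay the same or rise when you add a feature, while adjusted R-squared charges for the extra predictor:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def adjusted_r2(r2, n, p):
    """Adjusted R-squared for n samples and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

rng = np.random.default_rng(42)
X_useful = np.arange(20, dtype=float).reshape(-1, 1)
y = 2 * X_useful[:, 0] + rng.normal(0, 2, size=20)

# Append a column of pure noise as a second "feature"
X_padded = np.hstack([X_useful, rng.normal(size=(20, 1))])

r2_one = LinearRegression().fit(X_useful, y).score(X_useful, y)
r2_two = LinearRegression().fit(X_padded, y).score(X_padded, y)

adj_one = adjusted_r2(r2_one, n=20, p=1)
adj_two = adjusted_r2(r2_two, n=20, p=2)
print(f"1 feature:  R2={r2_one:.4f}, adjusted R2={adj_one:.4f}")
print(f"2 features: R2={r2_two:.4f}, adjusted R2={adj_two:.4f}")
```

This is why adjusted R-squared is the safer score to compare models with different numbers of predictors.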
Move faster with Replit
Replit is an AI-powered development platform that transforms natural language into working applications. Describe what you want to build, and Replit Agent creates it—complete with databases, APIs, and deployment.
For the model evaluation techniques we've explored, Replit Agent can turn them into production tools:
- Build a model comparison dashboard that visualizes R-squared, Mean Squared Error (MSE), and Mean Absolute Error (MAE) for different regression models.
- Create a financial forecasting tool that uses cross-validation to report its average accuracy and potential error range with Root Mean Squared Error (RMSE).
- Deploy a real estate valuation app that calculates adjusted R-squared to show how different features like square footage and location impact its prediction accuracy.
Describe your app idea, and Replit Agent writes the code, tests it, and fixes issues automatically, all in your browser.
Common errors and challenges
Even with the right tools, you can run into pitfalls that skew your understanding of a model's true performance.
Avoiding overly optimistic scores with separate test data
Evaluating your model on the same data it was trained on often leads to overly optimistic scores. The model has already "seen" the answers, so it performs well. To get a realistic assessment of how your model will handle new, unseen data, you must use a separate test set. This practice reveals how well your model generalizes beyond the training examples.
Interpreting negative r2_score values
A negative r2_score can be confusing, but it sends a clear signal. It means your model is performing worse than a simple baseline model that just predicts the average of the target values for every input. This often indicates that the model has failed to capture any underlying trend in the data and is a poor fit for the problem.
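You can check that baseline directly: handing r2_score() a constant prediction equal to the mean of the targets yields a score of exactly zero, so any negative score means the model does worse than that naive guess.

```python
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([10.0, 2.0, 8.0, 4.0])

# The naive baseline: predict the mean of y for every input
baseline_pred = np.full_like(y_true, y_true.mean())
print(r2_score(y_true, baseline_pred))  # 0.0
```

This makes R-squared easy to read at a glance: 1.0 is a perfect fit, 0.0 matches the mean-only baseline, and anything below 0.0 underperforms it.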
Using appropriate metrics for regression problems
It's crucial to use metrics designed for regression. A common mistake is applying classification metrics, like accuracy, to a regression task. Regression predicts continuous values, like price or temperature, while classification predicts discrete categories. Metrics such as MSE, MAE, and R-squared are built to measure error in continuous predictions and give you a meaningful assessment of your model's performance.
Avoiding overly optimistic scores with separate test data
This common pitfall can make a model seem more accurate than it really is. Because the model has already memorized the training data's patterns, its performance score becomes inflated. The following code shows just how high that score can get.
from sklearn.linear_model import LinearRegression
import numpy as np
X = np.array([[1], [2], [3], [4], [5], [6], [7], [8]])
y = np.array([2, 3, 5, 7, 9, 11, 14, 15])
model = LinearRegression()
model.fit(X, y)
score = model.score(X, y)
print(f"R-squared on training data: {score:.4f}")
The model.score(X, y) call uses the same data the model was trained on, resulting in an inflated score. The following code shows how to get a more realistic assessment of your model's performance.
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
import numpy as np
X = np.array([[1], [2], [3], [4], [5], [6], [7], [8]])
y = np.array([2, 3, 5, 7, 9, 11, 14, 15])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
model = LinearRegression()
model.fit(X_train, y_train)
score = model.score(X_test, y_test)
print(f"R-squared on test data: {score:.4f}")
To get a true measure of performance, split your data into training and testing sets. The train_test_split function does this for you, creating X_train, X_test, y_train, and y_test. You train the model on the training data, then evaluate it on the unseen test data using model.score(X_test, y_test). This score reflects how well your model generalizes, avoiding the overly optimistic results that come from evaluating on data it has already seen.
Interpreting negative r2_score values
A negative r2_score means your model failed to capture the data's trend. A simple average would be a better predictor. Some developers mistakenly force the score to zero, which hides the underlying problem. The code below shows this common error in action.
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
import numpy as np
X = np.array([[1], [2], [3], [4]])
y = np.array([10, 2, 8, 4]) # Non-linear relationship
model = LinearRegression()
model.fit(X, y)
y_pred = model.predict(X)
r2 = r2_score(y, y_pred)
if r2 < 0:
    r2 = 0  # Incorrectly forcing non-negative
print(f"R-squared: {r2:.4f}")
The if r2 < 0: r2 = 0 line incorrectly forces the score to be non-negative, which masks the model's poor fit. Instead of hiding the problem, you should interpret the score as it is. The following code shows this approach.
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
import numpy as np
X = np.array([[1], [2], [3], [4]])
y = np.array([10, 2, 8, 4]) # Non-linear relationship
model = LinearRegression()
model.fit(X, y)
y_pred = model.predict(X)
r2 = r2_score(y, y_pred)
print(f"R-squared: {r2:.4f}")
if r2 < 0:
    print("Negative R² indicates poor linear fit, try non-linear model")
The corrected code shows the right way to handle a negative r2_score. Instead of forcing it to 0, you should let the negative value stand. It’s a clear sign that your linear model is a poor fit for the data, which likely has a non-linear pattern. The code correctly interprets this by printing a message suggesting a different model type. This approach helps you debug your model choice instead of hiding the underlying problem.
Using appropriate metrics for regression problems
Using the wrong metric can give you a misleading sense of your model's performance. It's a common mistake to apply a classification metric, like accuracy_score, to a regression problem. The following code shows how this can happen and why it's problematic.
from sklearn.linear_model import LinearRegression
from sklearn.metrics import accuracy_score
import numpy as np
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([2, 4, 5, 4, 6])
model = LinearRegression()
model.fit(X, y)
y_pred = model.predict(X)
acc = accuracy_score(y, y_pred.round())
print(f"Accuracy: {acc:.4f}")
The code rounds continuous predictions with y_pred.round() to force them into the accuracy_score function. This gives a misleading result by ignoring the actual error magnitude. The following code shows the correct way to measure performance.
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
import numpy as np
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([2, 4, 5, 4, 6])
model = LinearRegression()
model.fit(X, y)
y_pred = model.predict(X)
r2 = r2_score(y, y_pred)
mse = mean_squared_error(y, y_pred)
print(f"R-squared: {r2:.4f}, MSE: {mse:.4f}")
The corrected code uses metrics designed for regression, like r2_score() and mean_squared_error(). These functions properly measure how close your continuous predictions are to the actual values. It's a mistake to use a classification metric like accuracy_score(), as it only checks for exact matches and ignores the magnitude of the error. This approach gives you a true sense of your model's performance, which is crucial when your goal is prediction accuracy, not just classification.
Real-world applications
Putting these metrics into practice shows their value in real-world applications, from housing price prediction to stock market analysis.
Evaluating a house price prediction model with r2_score
In real estate, you can use the r2_score to see how well your model's price predictions align with actual sale prices.
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, mean_absolute_error
import numpy as np
X = np.array([[1400, 10], [1600, 15], [1700, 8], [1875, 11], [1100, 22]]) # [size, age]
y = np.array([245, 312, 279, 308, 199]) # House prices in $1000s
model = LinearRegression().fit(X, y)
print(f"R-squared: {model.score(X, y):.4f}")
print(f"MAE: ${mean_absolute_error(y, model.predict(X)):.2f}k")
This code trains a linear regression model to predict house prices based on their size and age. After fitting the model to the data with LinearRegression().fit(X, y), it immediately evaluates its performance using two different metrics.
- The score() method returns the R-squared value, indicating how well the model explains the variance in house prices.
- mean_absolute_error() calculates the average prediction error. The result is formatted to show this error in thousands of dollars, giving you a clear, real-world sense of the model's accuracy.
Comparing regression models with cross_val_score for stock prediction
In financial modeling, cross_val_score lets you compare different regression models to see which one offers the most reliable stock price predictions.
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.model_selection import cross_val_score
import numpy as np
X = np.array([[i, i**2] for i in range(30)]).astype(float)
y = 2*X[:,0] + 0.5*X[:,1] + np.random.normal(0, 5, 30) # Price with some noise
models = {'Linear': LinearRegression(), 'Ridge': Ridge(), 'Lasso': Lasso(0.01)}
for name, model in models.items():
    r2 = cross_val_score(model, X, y, cv=5, scoring='r2').mean()
    print(f"{name}: Average R² = {r2:.4f}")
This code compares three different regression models—LinearRegression, Ridge, and Lasso—to see which one performs best on a synthetic dataset. It uses cross_val_score to automatically test each model's reliability. Here's how it works:
- It loops through each model defined in the models dictionary.
- For each model, it performs 5-fold cross-validation, calculating the R-squared score for each fold.
- Finally, it prints the average R-squared score, giving you a single number to compare the models' effectiveness.
Get started with Replit
Turn these metrics into a real tool. Tell Replit Agent to "build a dashboard that shows R-squared and MSE for my data" or "create a tool that compares regression models using cross-validation."
Replit Agent writes the code, tests for errors, and deploys your app directly from your browser. Start building with Replit.
Create and deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.