How to make AI in Python

Learn to make AI in Python. You'll find different methods, tips, real-world applications, and ways to debug common errors.

Published on: Tue, Mar 3, 2026
Updated on: Thu, Mar 5, 2026
The Replit Team

Artificial intelligence development in Python unlocks powerful possibilities, from data analysis to automation. The language's extensive libraries and simple syntax make it an ideal choice for AI projects.

In this article, you'll learn core AI techniques and discover practical tips for your own projects. You will also explore real-world applications and find essential advice to debug your models, which lets you build with confidence.

Using scikit-learn for your first AI model

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

model = RandomForestClassifier()
model.fit(X_train, y_train)

print(f"Model accuracy: {model.score(X_test, y_test):.2f}")

Output:
Model accuracy: 0.96

This example uses scikit-learn to train a model on the classic Iris dataset. The most important step is splitting the data with train_test_split. This lets you validate the model's performance on data it hasn't seen during training, which is a crucial practice to avoid overfitting.

  • The RandomForestClassifier is a powerful algorithm that builds multiple decision trees to improve predictive accuracy and control for errors.
  • You train the model on your data using the model.fit() method, then check its performance on the test set with model.score().
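Once the model is trained, it can label samples it has never seen. A short sketch continuing the Iris example above (the random_state values are added here only to make the run reproducible):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)

# Predict class labels and per-class probabilities for unseen samples
predictions = model.predict(X_test[:3])
probabilities = model.predict_proba(X_test[:3])

print(f"Predicted classes: {predictions}")
print(f"Class probabilities for first sample: {probabilities[0]}")
```

The predict_proba() output is often more useful than the bare label, since it tells you how confident the model is in each class.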

Building neural networks with popular frameworks

While scikit-learn is great for foundational models, frameworks like TensorFlow and PyTorch let you build sophisticated neural networks, and libraries like NLTK help you tackle natural language.

Creating a neural network with TensorFlow and Keras

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(128, activation='relu', input_shape=(4,)),
    Dense(64, activation='relu'),
    Dense(3, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.summary()

Output:
Model: "sequential"
_________________________________________________________________
Layer (type)                Output Shape              Param #
=================================================================
dense (Dense)               (None, 128)               640
dense_1 (Dense)             (None, 64)                8256
dense_2 (Dense)             (None, 3)                 195
=================================================================
Total params: 9,091
Trainable params: 9,091
Non-trainable params: 0
_________________________________________________________________

This code builds a neural network using Keras's Sequential API, which lets you stack layers in order. The model starts with an input layer expecting four features, defined by input_shape=(4,), and includes a hidden layer to process information.

  • The Dense layers are fully connected, and the relu activation function helps the model learn complex patterns.
  • The final layer uses a softmax activation to output a probability for each of the three possible classes.

Before training, you configure the model with model.compile(). This step sets the optimizer that guides the learning process, the loss function that measures error, and the metrics you'll use to evaluate performance.
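With the model compiled, a call to model.fit() runs the training loop. A minimal sketch using the Iris data from earlier (the epoch count and batch size here are arbitrary choices, not recommendations):

```python
from sklearn.datasets import load_iris
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

X, y = load_iris(return_X_y=True)

model = Sequential([
    Dense(128, activation='relu', input_shape=(4,)),
    Dense(64, activation='relu'),
    Dense(3, activation='softmax')
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train for a few epochs; the returned history records loss and accuracy per epoch
history = model.fit(X, y, epochs=5, batch_size=16, verbose=0)
print(f"Final training loss: {history.history['loss'][-1]:.3f}")
```

The history object is handy for plotting learning curves and spotting overfitting early.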

Implementing machine learning with PyTorch

import torch
import torch.nn as nn

class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.layer1 = nn.Linear(4, 64)
        self.layer2 = nn.Linear(64, 3)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.relu(self.layer1(x))
        x = self.layer2(x)
        return x

model = SimpleNN()
print(model)

Output:
SimpleNN(
  (layer1): Linear(in_features=4, out_features=64, bias=True)
  (layer2): Linear(in_features=64, out_features=3, bias=True)
  (relu): ReLU()
)

PyTorch offers a more flexible, object-oriented way to build models. You define your network as a class that inherits from nn.Module, giving you full control over its structure and behavior.

  • In the __init__() method, you set up the network's components, such as the nn.Linear layers.
  • The forward() method defines the data's path through the network, applying activation functions like nn.ReLU() between layers to process the information.
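You can confirm that data path by pushing a batch of random samples through the network. A quick sketch using the SimpleNN class defined above:

```python
import torch
import torch.nn as nn

class SimpleNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(4, 64)
        self.layer2 = nn.Linear(64, 3)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.relu(self.layer1(x))
        return self.layer2(x)

model = SimpleNN()

# A batch of 8 samples with 4 features each, matching layer1's in_features
batch = torch.randn(8, 4)
logits = model(batch)  # calling the model runs forward() via __call__
print(logits.shape)    # each sample gets 3 raw class scores
```

Note that you call the model itself rather than forward() directly; PyTorch's __call__ machinery handles hooks and bookkeeping for you.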

Natural language processing with NLTK

import nltk
from nltk.tokenize import word_tokenize
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download('punkt', quiet=True)
nltk.download('vader_lexicon', quiet=True)

text = "I love working with artificial intelligence in Python!"
tokens = word_tokenize(text)
analyzer = SentimentIntensityAnalyzer()
sentiment = analyzer.polarity_scores(text)

print(f"Tokens: {tokens[:5]}...")
print(f"Sentiment: {sentiment}")

Output:
Tokens: ['I', 'love', 'working', 'with', 'artificial']...
Sentiment: {'neg': 0.0, 'neu': 0.492, 'pos': 0.508, 'compound': 0.8074}

The Natural Language Toolkit (NLTK) is a powerful library for processing human language. The code first performs tokenization, using word_tokenize to split the sentence into a list of individual words. This step prepares the text for deeper analysis.

  • The SentimentIntensityAnalyzer then evaluates the emotional tone of the text.
  • Calling the polarity_scores() method calculates the sentiment, returning a dictionary with scores for positive, negative, and neutral feelings, plus a single compound score.
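A common convention (not part of NLTK itself) is to bucket the compound score into a coarse label. A small helper sketched around the dictionary polarity_scores() returns, using the often-cited but arbitrary ±0.05 cutoff:

```python
def label_sentiment(scores, threshold=0.05):
    """Map a VADER-style polarity dict to a coarse sentiment label."""
    compound = scores['compound']
    if compound >= threshold:
        return 'positive'
    if compound <= -threshold:
        return 'negative'
    return 'neutral'

# Using the scores printed in the example above
example = {'neg': 0.0, 'neu': 0.492, 'pos': 0.508, 'compound': 0.8074}
print(label_sentiment(example))  # → positive
```

Tuning the threshold lets you trade precision for recall depending on how strict your application needs to be.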

Advanced AI techniques and applications

Building on these foundational frameworks, you can now tackle more specialized fields like reinforcement learning, computer vision, and generative AI.

Reinforcement learning with OpenAI Gym

import gym

env = gym.make('CartPole-v1')
observation, info = env.reset()

total_reward = 0
for _ in range(100):
    action = env.action_space.sample()  # Random action
    observation, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    if terminated or truncated:
        break

print(f"Observation shape: {observation.shape}")
print(f"Total reward: {total_reward}")

Output:
Observation shape: (4,)
Total reward: 23.0

OpenAI Gym provides standardized environments for training reinforcement learning agents. This code initializes the CartPole-v1 environment, a classic task where an agent learns to balance a pole on a cart. The agent interacts with this world by taking actions and receiving feedback to guide its learning. Recent Gym releases (and the successor project, Gymnasium) use the five-value step API shown here; older versions returned a single done flag instead of terminated and truncated.

  • The core of the interaction is the env.step() method, which executes an action and returns the new state (observation), a reward, and terminated/truncated flags that signal the end of an episode.
  • In this example, the agent isn't smart yet; it's just taking random actions using env.action_space.sample().
  • The goal is to train an agent to choose actions that maximize its total reward over time.
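The learning itself usually comes from an update rule such as tabular Q-learning. A toy sketch of just the update step, independent of any Gym environment (the states, actions, and reward values here are made up for illustration):

```python
import numpy as np

# Q-table: 4 discretized states x 2 actions, initialized to zero
q_table = np.zeros((4, 2))
alpha, gamma = 0.1, 0.99  # learning rate and discount factor

def q_update(state, action, reward, next_state):
    """One Q-learning step: move Q(s, a) toward reward + discounted best future value."""
    best_next = np.max(q_table[next_state])
    q_table[state, action] += alpha * (reward + gamma * best_next - q_table[state, action])

# A fabricated transition: in state 0, action 1 earned reward 1.0 and led to state 2
q_update(state=0, action=1, reward=1.0, next_state=2)
print(q_table[0, 1])  # nudged from 0.0 toward the observed reward
```

Repeated over many episodes, these small nudges converge toward action values the agent can exploit instead of acting randomly.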

Computer vision with OpenCV

import cv2
import numpy as np

image = np.zeros((200, 200, 3), dtype=np.uint8)
cv2.rectangle(image, (50, 50), (150, 150), (0, 255, 0), -1)
cv2.circle(image, (100, 100), 30, (0, 0, 255), -1)

gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 30, 100)

print(f"Image shape: {image.shape}")
print(f"Detected edges: {np.sum(edges > 0)} pixels")

Output:
Image shape: (200, 200, 3)
Detected edges: 628 pixels

OpenCV is a powerful library for computer vision tasks. This code first creates a blank image as a NumPy array, then draws shapes on it using functions like cv2.rectangle() and cv2.circle(). This demonstrates how you can programmatically generate or modify visual content before analysis.

  • The image is converted to grayscale with cv2.cvtColor(), a common preprocessing step that simplifies image data.
  • Next, the cv2.Canny() function performs edge detection, identifying the outlines of the shapes by finding sharp changes in pixel intensity.
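The core idea behind Canny, finding sharp changes in pixel intensity, can be sketched in plain NumPy. A deliberately simplified gradient-magnitude check (real Canny adds Gaussian smoothing, non-maximum suppression, and hysteresis thresholding on top of this):

```python
import numpy as np

# A tiny grayscale image: dark left half, bright right half
gray = np.zeros((5, 6), dtype=np.float32)
gray[:, 3:] = 255.0

# Horizontal and vertical differences approximate the intensity gradient
dx = np.abs(np.diff(gray, axis=1))  # shape (5, 5)
dy = np.abs(np.diff(gray, axis=0))  # shape (4, 6)

# Pixels whose horizontal change exceeds a threshold are edge candidates
edge_pixels = np.argwhere(dx > 100)
print(f"Edge pixels found: {len(edge_pixels)}")  # → 5, one per row at the boundary
```

The two Canny thresholds (30 and 100 in the example above) play the same role as this cutoff, deciding how strong a gradient must be to count as an edge.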

Generative AI with GANs

import tensorflow as tf
from tensorflow.keras.layers import Dense, Reshape, Input
from tensorflow.keras.models import Model

def build_generator():
    input_layer = Input(shape=(100,))
    x = Dense(128, activation='relu')(input_layer)
    x = Dense(784, activation='sigmoid')(x)
    output = Reshape((28, 28, 1))(x)
    return Model(input_layer, output)

generator = build_generator()
print(f"Input shape: {generator.input_shape}")
print(f"Output shape: {generator.output_shape}")

Output:
Input shape: (None, 100)
Output shape: (None, 28, 28, 1)

This code defines the generator half of a Generative Adversarial Network (GAN), a model designed to create new, synthetic data. The generator's role is to produce convincing fakes—in this case, images. It starts with a random noise vector, which acts as a seed for the creative process.

  • The Dense layers transform this random input into a more structured format, and the sigmoid activation function scales the output to a pixel-friendly range between 0 and 1.
  • Finally, the Reshape layer converts the flat data array into a 28x28 pixel image, which is the generator's final output.
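A full GAN pairs this generator with a discriminator that learns to tell real images from the generator's fakes. A mirrored sketch (the layer sizes here are illustrative choices, not prescribed by any GAN recipe):

```python
from tensorflow.keras.layers import Dense, Flatten, Input
from tensorflow.keras.models import Model

def build_discriminator():
    # Accepts a 28x28x1 image and outputs a single real-vs-fake probability
    input_layer = Input(shape=(28, 28, 1))
    x = Flatten()(input_layer)
    x = Dense(128, activation='relu')(x)
    output = Dense(1, activation='sigmoid')(x)
    return Model(input_layer, output)

discriminator = build_discriminator()
print(f"Input shape: {discriminator.input_shape}")
print(f"Output shape: {discriminator.output_shape}")
```

During training the two models compete: the discriminator improves at spotting fakes, which forces the generator to produce ever more convincing images.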

Move faster with Replit

Replit is an AI-powered development platform that transforms natural language into working applications. You can describe what you want to build, and Replit Agent creates it—complete with databases, APIs, and deployment.

For the AI techniques we've explored, Replit Agent can turn them into production tools:

  • Build a sentiment analysis tool that processes customer reviews, using the principles of natural language processing.
  • Create an image recognition utility that identifies and categorizes objects in uploaded photos.
  • Deploy a simple game where an agent learns to navigate its environment through trial and error.

Describe your app idea, and Replit Agent writes the code, tests it, and fixes issues automatically, all in your browser.

Common errors and challenges

Building AI in Python involves navigating common errors, but with the right techniques, you can solve them efficiently.

  • Handling missing values in scikit-learn models is a crucial preprocessing step. Models can't process datasets with missing values, which often causes training to fail before it begins. You'll need to handle these gaps before feeding data to an algorithm like RandomForestClassifier.
  • The most straightforward solution is using an imputer. scikit-learn's SimpleImputer can replace missing entries with the mean, median, or most frequent value in a column, ensuring your dataset is complete.
  • Fixing input shape errors in TensorFlow models is a frequent challenge. You'll often see a ValueError when the data you're training on doesn't fit the dimensions your model's first layer was built to accept. This mismatch brings everything to a halt.
  • To fix this, first check your data's shape and make sure the input_shape parameter in your initial layer—like Dense(128, activation='relu', input_shape=(4,))—matches your data's feature count. The model.summary() output is your best friend here, as it lays out the expected dimensions for every layer.
  • Preventing memory leaks with PyTorch gradients is key for efficient model evaluation. PyTorch's automatic differentiation is powerful, but it builds a computation history that can cause memory to skyrocket if you don't disable it during inference.
  • The best practice is to wrap any code that doesn't need gradient tracking, like validation loops, inside a with torch.no_grad(): block. This simple context manager tells PyTorch to skip building the computation graph, saving significant memory. You can also call .detach() on a tensor to remove it from the graph when you only need its value.

Handling missing values in scikit-learn models

Missing data is a common roadblock in machine learning. Most scikit-learn models, like RandomForestRegressor, can't process datasets with empty cells, which will cause your training process to fail. The code below shows what happens when you try to train with missing values.

from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
import numpy as np

X, y = load_diabetes(return_X_y=True)
X[10:20, 0] = np.nan  # Create missing values

model = RandomForestRegressor()
model.fit(X, y)  # Will fail with missing values

The code fails because np.nan intentionally introduces missing values into the dataset. The RandomForestRegressor cannot process these gaps when model.fit() is called, which triggers an error. The following example shows how to resolve this.

from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
import numpy as np

X, y = load_diabetes(return_X_y=True)
X[10:20, 0] = np.nan  # Create missing values

imputer = SimpleImputer(strategy='mean')
X_imputed = imputer.fit_transform(X)
model = RandomForestRegressor()
model.fit(X_imputed, y)

The fix is to use scikit-learn's SimpleImputer to fill in the missing data before training. By setting strategy='mean', it replaces each np.nan value with the average of its column. The imputer.fit_transform() method applies this logic, creating a clean dataset. Now, when you call model.fit() on the imputed data, the training process runs without errors. It's a crucial preprocessing step to watch for with any dataset.

Fixing input shape errors in TensorFlow models

Input shape errors are a classic 'gotcha' in TensorFlow, usually triggering a ValueError when your data's dimensions don't align with the model's input_shape. This mismatch halts training. The following code shows what happens when the training data has an unexpected shape.

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(64, activation='relu', input_shape=(20,)),
    Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='mse')

X_train = tf.random.normal((100, 28))  # Mismatch with model input shape
model.fit(X_train, tf.random.normal((100, 10)), epochs=5)

The error occurs because the model's first layer expects 20 features via input_shape=(20,), but the training data X_train provides 28. This dimensional mismatch halts the process. The corrected code below shows how to fix this.

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

X_train = tf.random.normal((100, 28))
y_train = tf.random.normal((100, 10))

model = Sequential([
    Dense(64, activation='relu', input_shape=(28,)),  # Fixed input shape
    Dense(10)
])
model.compile(optimizer='adam', loss='mse')
model.fit(X_train, y_train, epochs=5)

The fix is to align the model's input_shape with your data's dimensions. In the corrected code, the first Dense layer's input_shape is set to (28,) to match the 28 features in the training data, ensuring the model knows what to expect. This error often appears when you first define your model or when data preprocessing steps change the number of features, so always double-check that your data's shape matches the input_shape argument.

Preventing memory leaks with PyTorch gradients

PyTorch's autograd engine accumulates gradients every time you call loss.backward(). Forgetting to clear them between training iterations is a common mistake that silently corrupts your model's updates, and holding on to graph-attached tensors in the same careless way is a classic source of memory leaks. The following code demonstrates the accumulation problem in action.

import torch

model = torch.nn.Linear(1000, 1000)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
x = torch.randn(100, 1000)

for i in range(10):
    y_pred = model(x)
    loss = y_pred.sum()
    loss.backward()  # Gradients accumulate without clearing
    optimizer.step()

In this loop, loss.backward() adds new gradients on top of old ones from previous iterations. Because they're never cleared, each optimizer.step() applies a running sum of stale and fresh gradients rather than the current gradient alone, corrupting the updates. See how to fix this below.

import torch

model = torch.nn.Linear(1000, 1000)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
x = torch.randn(100, 1000)

for i in range(10):
    optimizer.zero_grad()  # Clear previous gradients
    y_pred = model(x)
    loss = y_pred.sum()
    loss.backward()
    optimizer.step()

The fix is to call optimizer.zero_grad() at the start of each training loop. This clears out the old gradients from the previous pass. If you forget this step, PyTorch keeps adding new gradients to the old ones, which corrupts your model's learning and eats up memory. It's a crucial step to include right before you calculate the loss and run loss.backward() in every training iteration.
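Gradient bookkeeping is also worth switching off entirely when you're only evaluating, as noted earlier in this section. A minimal inference sketch:

```python
import torch

model = torch.nn.Linear(1000, 1000)
x = torch.randn(100, 1000)

# Inside no_grad, PyTorch skips recording the computation graph,
# so this forward pass uses far less memory than one during training
with torch.no_grad():
    predictions = model(x)

print(predictions.requires_grad)  # False: no graph was recorded
```

Because no graph is recorded, tensors produced here can be stored or logged freely without keeping the whole model's activations alive in memory.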

Real-world applications

Beyond fixing errors, you can tackle real-world tasks like time series forecasting with Prophet and hyperparameter tuning with Optuna.

Time series forecasting with Prophet

Prophet simplifies time series forecasting by automatically detecting trends and seasonality, making it an excellent tool for creating reliable predictions from time-based data.

from prophet import Prophet
import pandas as pd
import numpy as np

# Create example data
df = pd.DataFrame({
    'ds': pd.date_range(start='2020-01-01', periods=365),
    'y': np.random.normal(0, 1, 365).cumsum() + 100
})

model = Prophet()
model.fit(df)
future = model.make_future_dataframe(periods=30)
forecast = model.predict(future)
print(forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail(3))

The code prepares a pandas DataFrame with two specific columns: ds for dates and y for the values you want to predict. This structure is a requirement for using Prophet. The model then uses this data to learn patterns.

  • Training happens with a single call to model.fit().
  • You then create a placeholder for future dates using model.make_future_dataframe().
  • Finally, model.predict() fills this future data with predictions, including the forecast itself (yhat) and its confidence intervals.
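Real datasets rarely arrive with columns already named ds and y, so the usual first step is a rename. A short sketch with a hypothetical sales table (the column names date and sales are made up for this example):

```python
import pandas as pd

# Hypothetical raw data with arbitrary column names
raw = pd.DataFrame({
    'date': pd.date_range(start='2024-01-01', periods=5),
    'sales': [120, 135, 128, 150, 149],
})

# Prophet requires exactly these two column names: ds (datestamp) and y (value)
df = raw.rename(columns={'date': 'ds', 'sales': 'y'})
print(df.columns.tolist())  # → ['ds', 'y']
```

Once the columns match, the df is ready to pass straight to model.fit().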

Optimizing model hyperparameters with Optuna

Finding the right settings for your model can feel like guesswork, but Optuna turns it into a systematic process by automatically testing hyperparameter combinations to find the most effective ones.

import optuna
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score

def objective(trial):
    X, y = load_iris(return_X_y=True)
    n_estimators = trial.suggest_int('n_estimators', 10, 100)
    max_depth = trial.suggest_int('max_depth', 2, 10)

    model = RandomForestClassifier(n_estimators=n_estimators, max_depth=max_depth)
    return cross_val_score(model, X, y, cv=5).mean()

study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=20)
print(f"Best parameters: {study.best_params}")
print(f"Best accuracy: {study.best_value:.4f}")

This code automates finding the best settings for a RandomForestClassifier using Optuna. It works by defining an objective function that the library can call repeatedly to test different configurations.

  • Inside this function, the trial object suggests values for hyperparameters like n_estimators and max_depth within a specified range.
  • The function then trains a model with these settings and returns its average accuracy, which is calculated using cross_val_score.

The study object runs this process for 20 trials, intelligently searching for the hyperparameter combination that maximizes the model's accuracy.
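For contrast, the same search done exhaustively with scikit-learn's GridSearchCV shows what Optuna's sampling automates away: you must enumerate every candidate value up front, and the number of fits grows multiplicatively with each added hyperparameter (the specific grid values below are arbitrary):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# Every combination is tried: 3 x 3 = 9 candidates, each cross-validated 5 times
param_grid = {'n_estimators': [10, 50, 100], 'max_depth': [2, 5, 10]}
search = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=5)
search.fit(X, y)

print(f"Best parameters: {search.best_params_}")
print(f"Best accuracy: {search.best_score_:.4f}")
```

With only two hyperparameters the grid is manageable; Optuna's advantage shows up as the search space grows and exhaustive enumeration becomes impractical.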

Get started with Replit

Turn what you've learned into a real tool. Tell Replit Agent to "build a dashboard to visualize Optuna study results" or "create a utility that uses OpenCV to detect shapes in an image."

The agent writes the code, tests for errors, and deploys your app from a single prompt. Start building with Replit.

Get started free

Create and deploy websites, automations, internal tools, data pipelines and more in any programming language without setup, downloads or extra tools. All in a single cloud workspace with AI built in.
