Article·scikit-learn.org
machine-learning, data-analysis, classification, python, data-science, scikit-learn, logistic-regression

scikit-learn

Get started with scikit-learn for machine learning in Python. Learn to load data, train models, and make predictions with this powerful library.

beginner · 30 minutes · 6 steps
The play
  1. Install scikit-learn
    Install scikit-learn with pip by running `pip install scikit-learn` in a shell. This downloads and installs the latest release of the library along with its dependencies, such as NumPy and SciPy.
  2. Load the Iris dataset
    Load the Iris dataset, a classic dataset for classification, using scikit-learn's built-in datasets module.
  3. Split data into training and testing sets
    Split the dataset into training and testing sets using `train_test_split`. This allows you to evaluate the performance of your model on unseen data.
  4. Train a Logistic Regression model
    Create and train a Logistic Regression model using the training data. Logistic Regression is a linear model used for classification tasks.
  5. Make predictions
    Use the trained model to make predictions on the test data.
  6. Evaluate the model
    Evaluate the model on the test set using a metric such as accuracy, the fraction of test samples classified correctly. This indicates how well the model generalizes to new data.
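Step 6 mentions metrics like accuracy. As a minimal sketch of what that can look like beyond a single number, the snippet below runs the steps in compact form and prints a confusion matrix next to the accuracy. The use of `confusion_matrix`, `return_X_y=True`, and `max_iter=200` are assumptions added for illustration; they are not part of the original play.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Load features and labels directly as arrays
X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# max_iter=200 is an assumption: it gives the solver more room to converge
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Accuracy is the fraction of correct predictions; the confusion matrix
# breaks the same predictions down by true class (rows) and predicted
# class (columns)
accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
print(f"Accuracy: {accuracy:.3f}")
print(cm)
```

The diagonal of the matrix counts correct predictions per class, so its sum divided by the total number of test samples equals the accuracy.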
Starter code
# Install scikit-learn (run this in a shell, not in Python)
# pip install scikit-learn

# Load the Iris dataset
from sklearn.datasets import load_iris
iris = load_iris()
X, y = iris.data, iris.target
print(X.shape)  # (150, 4): 150 samples, 4 features
print(y.shape)  # (150,)

# Split data into training and testing sets (30% held out for testing)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
print(X_train.shape)  # (105, 4)
print(X_test.shape)   # (45, 4)

# Train a Logistic Regression model
from sklearn.linear_model import LogisticRegression
model = LogisticRegression(random_state=42)
model.fit(X_train, y_train)

# Make predictions (integer class labels 0, 1, 2)
y_pred = model.predict(X_test)
print(y_pred)

# Evaluate the model
from sklearn.metrics import accuracy_score
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
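
The predictions in the starter code are integer class labels. As a small sketch, the snippet below maps them back to species names through `iris.target_names` and inspects per-class probabilities with `predict_proba`. The variable names `predicted_names` and `probabilities`, and the `max_iter=200` setting, are assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)

# max_iter=200 is an assumption: it gives the solver more room to converge
model = LogisticRegression(max_iter=200).fit(X_train, y_train)

# Integer labels index into the array of species names
y_pred = model.predict(X_test)
predicted_names = iris.target_names[y_pred]

# One row per test sample, one column per class; each row sums to 1
probabilities = model.predict_proba(X_test)

# Show the first five predictions with their class probabilities
for name, probs in zip(predicted_names[:5], probabilities[:5]):
    print(name, probs.round(3))
```

Looking at the probabilities can help spot test samples the model is unsure about, which a bare label does not reveal.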
Source
scikit-learn — Action Pack