Unit-3 Packaging ML Model
Outcome: The model is saved in iris_model.pkl, which can be loaded later for predictions.
Step 2: Dependency Management
• Every ML model relies on libraries like scikit-learn, numpy, or
pandas.
• Packaging requires managing these dependencies to ensure
the model works in any environment.
✓requirements.txt: This file lists the libraries required to
run the ML model.
✓environment.yml: If using Conda, this file captures the
Python version, dependencies, and system packages.
Example requirements.txt:
scikit-learn==1.0.2
joblib==1.1.0
numpy==1.21.0
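Pinned versions only help if the running environment actually matches them. As a minimal sketch (the helper names `parse_pins` and `find_mismatches` are hypothetical, and only simple `name==version` lines are handled), the pins above could be compared with the installed packages like this:

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Return {package: version} for lines of the form name==version."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Compare pinned versions with what is installed in this environment."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


REQUIREMENTS = """\
scikit-learn==1.0.2
joblib==1.1.0
numpy==1.21.0
"""

pins = parse_pins(REQUIREMENTS)
for name, (wanted, installed) in find_mismatches(pins).items():
    print(f"{name}: pinned {wanted}, installed {installed}")
```

Nothing is printed when every pinned package is installed at exactly the listed version.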
Step 3: Project Structure
• Organizing your code, model, and resources is crucial for packaging.
• A well-structured project ensures code maintainability and makes it easy to build the package.
• Here’s an example project structure:
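One possible layout, assembled from the file names used elsewhere in this unit, can be scaffolded with a short script. The exact placement of `requirements.txt` and `setup.py` at the project root is an assumption (it is the common convention), and `scaffold` is a hypothetical helper:

```python
import tempfile
from pathlib import Path

# Hypothetical layout built from file names that appear in this unit.
LAYOUT = [
    "setup.py",
    "requirements.txt",
    "ml_model_package/__init__.py",
    "ml_model_package/train_model.py",
    "ml_model_package/predict.py",
    "ml_model_package/config.yaml",
]


def scaffold(root, layout=LAYOUT):
    """Create empty placeholder files for each path in the layout."""
    root = Path(root)
    for relative in layout:
        target = root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    # Return the created files as sorted, forward-slash relative paths.
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


# A temporary directory keeps the sketch from touching a real project.
with tempfile.TemporaryDirectory() as tmp:
    created = scaffold(tmp)
    print("\n".join(created))
```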
Step 4: Configuration Files
• Configuration files, typically in formats like .yaml, .json, or .ini, store environment-
specific settings (e.g., paths, thresholds, or model hyperparameters) so that the
package can be easily reconfigured in different environments.
Example config.yaml:
Usage: The config.yaml file defines where the model is stored and which features are used for
prediction.
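Reading such a file at start-up might look like the sketch below. JSON is used here only so the example runs with the standard library; for a .yaml file, `yaml.safe_load` from the PyYAML package would take the place of `json.load`. The key names follow the config.yaml shown later in this unit, and `load_config` is a hypothetical helper:

```python
import json
import tempfile
from pathlib import Path

REQUIRED_KEYS = ("model_path", "input_columns")


def load_config(path):
    """Read a JSON config file and check that the expected keys are present."""
    with open(path, encoding="utf-8") as handle:
        config = json.load(handle)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"missing config keys: {missing}")
    return config


# Write a small example file so the sketch runs on its own.
with tempfile.TemporaryDirectory() as tmp:
    config_file = Path(tmp) / "config.json"
    config_file.write_text(
        json.dumps(
            {
                "model_path": "data/iris_model.pkl",
                "input_columns": [
                    "sepal_length",
                    "sepal_width",
                    "petal_length",
                    "petal_width",
                ],
            }
        ),
        encoding="utf-8",
    )
    config = load_config(config_file)

print(config["model_path"])
```

Failing early on a missing key makes a misconfigured environment easier to diagnose than an error deep inside the prediction code.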
Step 5: Building the Package
• Once the model and scripts are in place, create a setup.py file to define how your
project can be installed as a Python package.
• The setup file specifies the package name, version, dependencies, and entry points
(if needed).
Example setup.py:
from setuptools import setup, find_packages

setup(
    name="ml_model_package",
    version="0.1",
    packages=find_packages(),
    install_requires=[
        "scikit-learn",
        "numpy",
        "joblib",
    ],
    entry_points={
        'console_scripts': [
            'predict=ml_model_package.predict:main',
        ]
    },
)
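The entry point string has the form `command=package.module:function`. As a simplified illustration of how such a string maps to a callable (this is not the launcher that pip actually generates, and `resolve_entry_point` is a hypothetical helper), a standard-library module stands in for `ml_model_package.predict` below:

```python
import importlib


def resolve_entry_point(spec):
    """Split 'command=package.module:function' and return (command, callable)."""
    command, target = (part.strip() for part in spec.split("=", 1))
    module_name, function_name = target.split(":", 1)
    module = importlib.import_module(module_name)
    return command, getattr(module, function_name)


# 'json.tool:main' is a stand-in for 'ml_model_package.predict:main'.
command, function = resolve_entry_point("pretty=json.tool:main")
print(command, callable(function))
```

The function named after the colon must therefore exist in the module, otherwise the installed command fails when it is run.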
Step 6: Distribution and Installation
Once the setup file is ready, you can build and install the package.
1. Install Locally:
pip install .
This command installs the package in your Python environment. The prediction script can
also be run directly:
python ml_model_package/predict.py
Example Dockerfile:
Build and run the container:
Step-by-Step Process to Build an ML Package:
Step 1: Create a Virtual Environment
• Before packaging, create a virtual environment to ensure that all
dependencies are isolated.
1. Open Command Prompt and create a virtual environment:
python -m venv ml_env
2. Activate the environment:
ml_env\Scripts\activate
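To confirm that the environment is really active before installing anything, a quick check from Python can help. This is a small sketch that relies on `sys.prefix` differing from `sys.base_prefix` inside an environment created with `venv`; the function name is hypothetical:

```python
import sys


def in_virtual_environment():
    """True when the interpreter runs inside a venv-style virtual environment."""
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtual_environment())
print("interpreter:", sys.executable)
```

When the environment is active, the interpreter path should point inside the ml_env directory.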
Step 2: Create a Project Directory
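# train_model.py
import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
# Load dataset
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train model
model = RandomForestClassifier()
model.fit(X_train, y_train)
# Save the trained model so it can be loaded later for predictions
joblib.dump(model, "iris_model.pkl")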
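Before a model is packaged, it is worth checking how it performs on the held-out split. The sketch below repeats the training steps and adds an accuracy check; `accuracy_score` and the `random_state` passed to the classifier are additions made here for a repeatable illustration, not part of the original script:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# random_state on the classifier is an added assumption for repeatable results.
model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)

# Fraction of held-out samples that the model classifies correctly.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Held-out accuracy: {accuracy:.2f}")
```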
# predict.py
import joblib

def predict_iris():
    # Load the trained model saved by train_model.py (path is relative to the working directory)
    model = joblib.load("iris_model.pkl")
    # Sample input for prediction
    sample_input = [[5.1, 3.5, 1.4, 0.2]]  # Example for Iris dataset
    # Make prediction
    prediction = model.predict(sample_input)
    print(f"Prediction: {prediction}")

if __name__ == "__main__":
    predict_iris()
Step 4: Define the Package with setup.py
The setup.py file defines how your project will be packaged and installed.
import sys
from setuptools import setup, find_packages

sys.path.append('.')

setup(
    name="ml_model_package",
    version="0.1",
    packages=find_packages(),
    install_requires=[
        "scikit-learn",
        "joblib",
        "numpy",
    ],
    entry_points={
        'console_scripts': [
            # Installs a "predict" command that calls predict_iris() in predict.py
            'predict=ml_model_package.predict:predict_iris',
        ]
    },
)
Step 5: Create __init__.py in ml_model_package
The __init__.py file is used to mark a directory as a Python package. It also allows you to
initialize or configure things when the package is imported. For your ml_model_package,
the __init__.py file can remain simple or can be used to import functions or classes to
make them accessible at the package level.
Example Contents for __init__.py
Suppose you want to make the predict_iris function from the predict.py module directly accessible
when someone imports the package. If you have multiple useful functions across different
modules (e.g., train_model.py and predict.py), you can import them in __init__.py to provide
access to everything at the top level of your package:
from .predict import predict_iris
Now, if someone imports your package, they can access the predict_iris function directly.
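The effect of a package-level import can be tried without touching the real project. The sketch below builds a throwaway package in a temporary directory; `demo_package` and its placeholder function body are hypothetical stand-ins for ml_model_package and the real model call:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / "demo_package"  # stand-in for ml_model_package
    package_dir.mkdir()

    # predict.py: a placeholder function instead of a real model call.
    (package_dir / "predict.py").write_text(
        "def predict_iris():\n    return 'placeholder prediction'\n",
        encoding="utf-8",
    )
    # __init__.py: re-export the function at the package level.
    (package_dir / "__init__.py").write_text(
        "from .predict import predict_iris\n", encoding="utf-8"
    )

    sys.path.insert(0, tmp)
    try:
        demo_package = importlib.import_module("demo_package")
        # Callable from the package itself, not only from demo_package.predict.
        result = demo_package.predict_iris()
    finally:
        sys.path.remove(tmp)

print(result)
```

Without the line in `__init__.py`, callers would need to import the function from the `predict` submodule explicitly.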
Step 6: Create config.yaml file
Create a config.yaml file in ml_model_package.
Ensure that the file has the correct path to the model.
Add the following to config.yaml:
# config.yaml
model_path: "data/iris_model.pkl"
input_columns: ["sepal_length", "sepal_width", "petal_length", "petal_width"]
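The `input_columns` entry fixes the order in which features are passed to the model. A minimal sketch of how prediction code might use it is shown below; the config is written as an already-parsed dictionary, and `to_feature_row` is a hypothetical helper rather than part of the package:

```python
# The config as it would look after parsing config.yaml.
CONFIG = {
    "model_path": "data/iris_model.pkl",
    "input_columns": ["sepal_length", "sepal_width", "petal_length", "petal_width"],
}


def to_feature_row(record, columns):
    """Order a dict of named measurements into the list layout the model expects."""
    missing = [name for name in columns if name not in record]
    if missing:
        raise ValueError(f"missing input columns: {missing}")
    return [float(record[name]) for name in columns]


# Keys can arrive in any order; the config decides the final ordering.
flower = {
    "petal_width": 0.2,
    "sepal_length": 5.1,
    "petal_length": 1.4,
    "sepal_width": 3.5,
}
row = to_feature_row(flower, CONFIG["input_columns"])
print([row])
```

Keeping the column order in configuration rather than in code means the same prediction function can be reused if the feature set changes.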
Step 7: Install the Package Locally
pip install .
This builds the project, including its scripts, and installs it together with the
dependencies listed in setup.py into your virtual environment.
Step 8: Run the Prediction Script
Once the package is installed, you can use the predict command directly from
the terminal:
predict