Profit Prediction Model Summary
This report presents the development and analysis of a profit prediction model for a retail
business using machine learning. The primary goal of this model is to forecast profits based on
key sales factors, which will help the business make informed decisions to enhance profitability.
The dataset contains order information, customer names, product categories, sales figures,
discounts, and profits. On reviewing it, I found no missing values, so no imputation was needed
before analysis. Summary statistics showed an average sales amount of ₹1,496.60, an average
discount of 22.68%, and an average profit of ₹374.94.
For the model, I selected six features to predict profit: Category, Sub Category, City, Region,
Sales, and Discount. The target variable, which is what we are trying to predict, was Profit. I
considered this selection appropriate because it combines quantitative data (Sales and Discount)
with categorical data (Category, Sub Category, City, and Region) relevant to profit outcomes.
To develop the model, I chose a Random Forest Regressor because it captures non-linear
relationships and works well with a mix of numerical and encoded categorical features. The
dataset was split into two sets: 80% for training and 20% for testing. During preprocessing, I
standardized the numerical features, Sales and Discount, and one-hot encoded the categorical
features, such as City and Category, so that the model could use them. Standardization is not
strictly required for tree-based models, but it does no harm here. I built the model with 100
decision trees and assessed its consistency through 5-fold cross-validation, which checks how
stable the error is across different subsets of the data.
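The workflow described above can be sketched roughly as follows. This is an illustration only,
not the code used for this report: it assumes scikit-learn and pandas (the report does not name
its libraries), uses randomly generated stand-in data, and the column names, random seeds, and
the choice to cross-validate on the training split are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical stand-in data; the real dataset is not reproduced here.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Category": rng.choice(["Food Grains", "Snacks", "Beverages"], n),
    "Sub Category": rng.choice(["Rice", "Chips", "Juice"], n),
    "City": rng.choice(["Vellore", "Kanyakumari", "Pudukkottai"], n),
    "Region": rng.choice(["East", "Central", "South"], n),
    "Sales": rng.uniform(500, 2500, n),
    "Discount": rng.uniform(0.10, 0.35, n),
})
df["Profit"] = df["Sales"] * rng.uniform(0.1, 0.4, n)

categorical = ["Category", "Sub Category", "City", "Region"]
numerical = ["Sales", "Discount"]

# Scale the numerical columns and one-hot encode the categorical ones.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), numerical),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

# Preprocessing and the 100-tree forest live in one pipeline, so each
# cross-validation fold fits the scaler and encoder on its own training part.
model = Pipeline([
    ("preprocess", preprocess),
    ("forest", RandomForestRegressor(n_estimators=100, random_state=42)),
])

# 80% training, 20% testing.
X_train, X_test, y_train, y_test = train_test_split(
    df[categorical + numerical], df["Profit"], test_size=0.2, random_state=42
)

# 5-fold cross-validation; scikit-learn reports RMSE negated, so flip the sign.
scores = cross_val_score(
    model, X_train, y_train, cv=5, scoring="neg_root_mean_squared_error"
)
cv_rmse = -scores
print(f"CV RMSE: {cv_rmse.mean():.2f} (+/- {cv_rmse.std():.2f})")

model.fit(X_train, y_train)
test_predictions = model.predict(X_test)
```

Wrapping the preprocessing in a pipeline is a design choice in this sketch: it keeps
information from the held-out fold out of the scaler and encoder during cross-validation.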
After training the model, I evaluated its performance using several key metrics. The
cross-validation RMSE was 200.60 (+/- 3.02), and the Mean Absolute Error (MAE) was 161.08.
The Root Mean Squared Error (RMSE) came out to be 202.04, while the R-squared score
was 0.3166. This score means that the model explains about 31.66% of the variance in profit.
For context, an MAE of 161.08 is about 43% of the average profit of ₹374.94. The model therefore
has some predictive power, but roughly two-thirds of the variance remains unexplained, which
leaves ample room for improvement.
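To make the three metrics concrete, here is a minimal sketch that computes MAE, RMSE, and
R-squared from their standard definitions. The function name and the four profit values are
made up for illustration and are not taken from the dataset.

```python
import math

def regression_metrics(actual, predicted):
    """Return (MAE, RMSE, R-squared) for two equal-length sequences."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    r2 = 1 - ss_res / ss_tot
    return mae, rmse, r2

# Made-up profits: every prediction is off by exactly 50.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [150.0, 150.0, 350.0, 350.0]
mae, rmse, r2 = regression_metrics(actual, predicted)
print(mae, rmse, r2)
```

Rearranging the R-squared definition gives RMSE = s * sqrt(1 - R-squared), where s is the
standard deviation of actual profit in the evaluation data. If the reported RMSE of 202.04 and
R-squared of 0.3166 were computed on the same data, they imply s of roughly ₹244.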
Several findings stood out. First, the feature importance scores indicated that Sales had the
highest impact on profit predictions, followed by Discount, and then factors such as region,
specific cities, and product categories. Feature importance shows how much the model relies on a
feature, not the direction of its effect, so I checked the relationships separately. I observed
a strong link between higher sales and higher profits, as expected. Moreover, discounts
significantly influenced profit predictions, which underlines the need for careful management of
discount strategies. Different regions, particularly the East, Central, and South, had varying
effects on profits. Cities like Pudukkottai, Kanyakumari, and Vellore were among the most
influential city features, which could help the business focus its marketing efforts.
Additionally, certain product categories, especially food grains, had a notable impact on
predicted profitability.
In terms of business implications, I believe the company should primarily focus on increasing
overall sales, given that Sales is the strongest predictor of profit in the model. I recommend
strategies such as promotions or product bundling to increase sales volume. Furthermore, since
discounts significantly affect profits, it is important to analyze which discount levels yield
the most profit. Implementing dynamic pricing, which adjusts prices based on demand, could also
be beneficial.
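One simple way to start the discount analysis suggested above is to group orders into discount
bands and compare average profit per band. The sketch below is a hypothetical illustration: the
function name, the 5-point band width, and the sample orders are assumptions, not values from
the dataset.

```python
from collections import defaultdict

def mean_profit_by_discount_band(orders, band_width=5):
    """Average profit per discount band.

    orders is an iterable of (discount_percent, profit) pairs; each band is
    labelled by its lower edge, e.g. 20 covers discounts from 20 up to 25.
    """
    totals = defaultdict(lambda: [0.0, 0])  # band -> [profit sum, order count]
    for discount_pct, profit in orders:
        band_start = int(discount_pct // band_width) * band_width
        totals[band_start][0] += profit
        totals[band_start][1] += 1
    return {band: total / count for band, (total, count) in sorted(totals.items())}

# Made-up orders: (discount in percent, profit in rupees).
orders = [(12, 300.0), (14, 340.0), (22, 410.0), (24, 390.0), (31, 250.0)]
bands = mean_profit_by_discount_band(orders)
best_band = max(bands, key=bands.get)
print(bands, best_band)
```

Averages per band are descriptive only. They do not show that a given discount level causes
higher profit, since product mix and region may differ between bands.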
On regional strategy, the model indicates that different regions affect profits in different
ways. The company should allocate resources to high-performing regions while addressing the
challenges in underperforming ones. Tailoring marketing efforts to the specific cities that
showed a strong influence on profit predictions could also improve outcomes. For category
management, maintaining adequate inventory for product categories like food grains is essential,
and I think the company should consider expanding or promoting these categories further.
Looking ahead, the current model is useful but can be improved. I suggest regularly updating the
model with new data to track changes and enhance accuracy. Additionally, exploring other
factors, such as seasonal trends or customer demographics, could further refine predictions.
Trying other algorithms or tuning model settings, such as the number of trees, might also lead
to better performance.
In conclusion, the profit prediction model offers valuable insights into what drives
profitability in retail. In a plot of actual against predicted profits, the trend line follows
the actual values fairly closely, which suggests that the model is on the right track, although
the R-squared of 0.3166 shows that individual predictions still vary considerably. By focusing
on key areas like increasing sales, managing discounts, and developing tailored strategies based
on regional insights, the company can make more informed decisions to improve profitability.
Although the model accounts for some of the variation in profits, I believe there is still
potential for refinement and for including additional data sources to enhance its predictive
power.