Zomato Data Analysis Using Python
Last Updated: 16 May, 2025
Understanding customer preferences and restaurant trends is important for making informed business decisions in the food industry. In this article, we will analyze Zomato's restaurant dataset using Python to find meaningful insights. We aim to answer questions such as:
- Do more restaurants provide online delivery compared to offline services?
- Which types of restaurants are most favored by the general public?
- What price range do couples prefer for dining out?
Implementation of Zomato Data Analysis Using Python
The implementation follows the steps below.
Step 1: Importing necessary Python libraries.
We will use the Pandas, NumPy, Matplotlib and Seaborn libraries.
Python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
Step 2: Creating the data frame.
You can download the dataset from here.
Python
dataframe = pd.read_csv("/content/Zomato-data-.csv")
print(dataframe.head())
Output:
Dataset
Step 3: Data Cleaning and Preparation
Before moving further we need to clean and process the data.
1. Convert the rate column to a float by removing denominator characters.
- dataframe['rate']=dataframe['rate'].apply(handleRate): Applies the handleRate function to clean and convert each rating value in the 'rate' column.
Python
def handleRate(value):
    # Keep the numerator of a rating such as "4.1/5" and convert it to float
    value = str(value).split('/')
    value = value[0]
    return float(value)

dataframe['rate'] = dataframe['rate'].apply(handleRate)
print(dataframe.head())
Output:
Converting rate column to float
2. Get a summary of the dataframe using dataframe.info().
Python
dataframe.info()
Output:
Summary of dataset
3. Check for missing or null values to identify any data gaps.
Conclusion: The dataframe contains no null values.
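The cleaning steps in this section can be sketched on a small made-up frame. This is a minimal illustration, not the article's code: the helper name parse_rate, the sample rows, and the placeholder strings "NEW" and "-" are assumptions and are not confirmed to occur in the dataset used here. It shows a rating parser that returns NaN instead of raising on unparsable values, followed by a per-column null count.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; "NEW" and "-" are assumed placeholder values
sample = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "rate": ["4.1/5", "3.8/5", "NEW", "-"],
})

def parse_rate(value):
    """Return the numerator of an 'x.y/5' rating as float, or NaN if unparsable."""
    numerator = str(value).split("/")[0].strip()
    try:
        return float(numerator)
    except ValueError:
        return np.nan

sample["rate"] = sample["rate"].apply(parse_rate)

# Count missing values in each column
null_counts = sample.isnull().sum()
print(null_counts)
```

If every count is zero on the real data, the simpler handleRate function is sufficient.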
Step 4: Exploring Restaurant Types
1. Let's look at the listed_in(type) column to identify popular restaurant categories.
Python
sns.countplot(x=dataframe['listed_in(type)'])
plt.xlabel("Type of restaurant")
Output:

Conclusion: The majority of the restaurants fall into the dining category.
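The bar heights in the count plot can also be read as numbers. The sketch below uses a small hypothetical frame (the rows and category names are made up for illustration) to show how value_counts gives the same counts as a table.

```python
import pandas as pd

# Hypothetical sample; category names are illustrative only
sample = pd.DataFrame({
    "listed_in(type)": ["Dining", "Cafes", "Dining", "Buffet", "Dining", "other"],
})

# Number of restaurants per category, sorted from most to least common
type_counts = sample["listed_in(type)"].value_counts()
print(type_counts)

# Category with the largest count
most_common_type = type_counts.idxmax()
print(most_common_type)
```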
2. Votes by Restaurant Type
Here we compute the total number of votes for each category.
Python
grouped_data = dataframe.groupby('listed_in(type)')['votes'].sum()
result = pd.DataFrame({'votes': grouped_data})
plt.plot(result, c='green', marker='o')
plt.xlabel('Type of restaurant', c='red', size=20)
plt.ylabel('Votes', c='red', size=20)
Output:
Text(0, 0.5, 'Votes')

Conclusion: Dining restaurants received the highest total number of votes, suggesting they are preferred by a larger number of individuals.
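A total per category reflects both how many restaurants the category has and how popular each one is. The sketch below, on made-up numbers, puts the total next to the restaurant count and the mean votes per restaurant so the two effects can be separated. The sample rows and the output column names are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample; vote counts are made up
sample = pd.DataFrame({
    "listed_in(type)": ["Dining", "Cafes", "Dining", "Buffet"],
    "votes": [775, 918, 88, 166],
})

# Total votes, number of restaurants, and mean votes per restaurant by type
votes_by_type = (
    sample.groupby("listed_in(type)")["votes"]
    .agg(total_votes="sum", restaurants="count", mean_votes="mean")
    .sort_values("total_votes", ascending=False)
)
print(votes_by_type)
```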
Step 5: Identify the Most Voted Restaurant
Find the restaurant with the highest number of votes.
Python
max_votes = dataframe['votes'].max()
restaurant_with_max_votes = dataframe.loc[dataframe['votes'] == max_votes, 'name']
print('Restaurant(s) with the maximum votes:')
print(restaurant_with_max_votes)
Output:
Highest number of votes
Step 6: Online Order Availability
Exploring the online_order column to see how many restaurants accept online orders.
Python
sns.countplot(x=dataframe['online_order'])
Output:

Conclusion: The majority of the restaurants do not accept online orders.
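To state the split as a proportion rather than reading it off the plot, value_counts can be normalized. This is a small sketch on a hypothetical column. The values below are invented and do not reflect the dataset's actual share.

```python
import pandas as pd

# Hypothetical sample of the online_order column
sample = pd.DataFrame({"online_order": ["Yes", "No", "No", "Yes", "No"]})

# Fraction of restaurants in each group (values sum to 1)
online_share = sample["online_order"].value_counts(normalize=True)
print(online_share)
```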
Step 7: Analyze Ratings
Checking the distribution of ratings from the rate column.
Python
plt.hist(dataframe['rate'],bins=5)
plt.title('Ratings Distribution')
plt.show()
Output:

Conclusion: The majority of restaurants received ratings ranging from 3.5 to 4.
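The histogram reading can be checked by counting ratings per interval. The sketch below uses made-up ratings and fixed bin edges of width 0.5. Those edges are an assumption and differ from plt.hist with bins=5, which splits the observed range into five equal-width bins.

```python
import pandas as pd

# Hypothetical ratings
ratings = pd.Series([2.9, 3.4, 3.6, 3.8, 3.9, 4.2, 4.6], name="rate")

# Assumed bin edges; each interval is open on the left and closed on the right
edges = [2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
binned = pd.cut(ratings, bins=edges)

# Number of ratings in each interval, in interval order
bin_counts = binned.value_counts().sort_index()
print(bin_counts)

# Interval holding the most ratings
busiest = bin_counts.idxmax()
print(busiest)
```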
Step 8: Approximate Cost for Couples
Analyze the approx_cost(for two people) column to find the preferred price range.
Python
couple_data=dataframe['approx_cost(for two people)']
sns.countplot(x=couple_data)
Output:

Conclusion: The most common approximate cost for two people is 300 rupees, which suggests that couples favor restaurants in this price range.
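The most frequent cost and its share of all restaurants can be computed directly, which helps distinguish "most common" from "majority". The values below are hypothetical and only illustrate the calculation.

```python
import pandas as pd

# Hypothetical sample of the cost column
sample = pd.DataFrame({
    "approx_cost(for two people)": [300, 800, 300, 600, 300, 800],
})

# Count restaurants at each cost level
cost_counts = sample["approx_cost(for two people)"].value_counts()

# Most frequent cost and the fraction of restaurants at that cost
most_common_cost = cost_counts.idxmax()
share = cost_counts.max() / len(sample)
print(most_common_cost, share)
```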
Step 9: Ratings Comparison - Online vs Offline Orders
Compare ratings between restaurants that accept online orders and those that don't.
Python
plt.figure(figsize = (6,6))
sns.boxplot(x = 'online_order', y = 'rate', data = dataframe)
Output:

Conclusion: Restaurants that do not accept online orders tend to receive lower ratings than restaurants that do.
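The box plot comparison can be backed by summary statistics per group. The following sketch assumes the rate column is already numeric and uses made-up ratings. It reports the count, median and mean for each online_order value.

```python
import pandas as pd

# Hypothetical sample with numeric ratings
sample = pd.DataFrame({
    "online_order": ["Yes", "Yes", "Yes", "No", "No", "No"],
    "rate": [4.1, 3.9, 4.3, 3.6, 3.8, 3.3],
})

# Count, median and mean rating for each group
rating_summary = sample.groupby("online_order")["rate"].agg(["count", "median", "mean"])
print(rating_summary)
```

The median corresponds to the center line of each box in the plot.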
Step 10: Order Mode Preferences by Restaurant Type
Find the relationship between order mode (online_order) and restaurant type (listed_in(type)).
- pivot_table = dataframe.pivot_table(index='listed_in(type)', columns='online_order', aggfunc='size', fill_value=0): Creates a pivot table counting restaurants by type and online order availability.
Python
pivot_table = dataframe.pivot_table(index='listed_in(type)', columns='online_order', aggfunc='size', fill_value=0)
sns.heatmap(pivot_table, annot=True, cmap='YlGnBu', fmt='d')
plt.title('Heatmap')
plt.xlabel('Online Order')
plt.ylabel('Listed In (Type)')
plt.show()
Output:

Conclusion: Dining restaurants primarily accept offline orders whereas cafes primarily receive online orders. This suggests that clients prefer to place orders in person at restaurants but prefer online ordering at cafes.
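Raw counts in the heatmap depend on how many restaurants each type has, so row-wise shares can make the comparison between types easier. The sketch below uses pd.crosstab as an alternative way to build the same kind of count table and normalizes each row. The sample rows are hypothetical.

```python
import pandas as pd

# Hypothetical sample
sample = pd.DataFrame({
    "listed_in(type)": ["Dining", "Dining", "Dining", "Cafes", "Cafes", "Buffet"],
    "online_order": ["No", "No", "Yes", "Yes", "Yes", "No"],
})

# Restaurants counted by type (rows) and online order availability (columns)
counts = pd.crosstab(sample["listed_in(type)"], sample["online_order"])
print(counts)

# Same table with each row scaled to sum to 1
row_shares = pd.crosstab(
    sample["listed_in(type)"], sample["online_order"], normalize="index"
)
print(row_shares)
```

Either table can be passed to sns.heatmap. For the shares table, a float format such as fmt='.2f' is needed instead of fmt='d'.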
You can download the source code from here: Zomato Data Analysis