Python Core Interview Questions
1. What are Python’s key features that make it popular for AI and automation?
• Simplicity and Readability: Python’s clean syntax makes it easy to write and
understand code, especially for complex algorithms in AI and automation.
• Rich Libraries: Python has a wide range of libraries like TensorFlow, PyTorch,
NumPy, Pandas, and Scikit-learn for AI, and libraries like Selenium, BeautifulSoup,
and requests for automation tasks.
• Extensibility: Python integrates well with other languages (e.g., C, C++) and tools,
making it suitable for building scalable solutions in AI and automation.
• == (Equality operator): Compares the values of two objects. It checks whether the data or content of the objects is the same. Example:
a = [1, 2, 3]
b = [1, 2, 3]
print(a == b)  # True: same contents, different objects
• is (Identity operator): Checks whether two variables refer to the same object in memory. Example:
a = [1, 2, 3]
b = a
print(a is b)  # True: both names point to the same object
• Mutable Data Types: These types allow modification of their content after creation. Examples: lists, sets, dictionaries.
lst = [1, 2, 3]
lst[0] = 100  # allowed: lists can be modified in place
• Immutable Data Types: These types cannot be modified after creation. Examples: strings, tuples, integers.
tup = (1, 2, 3)
# tup[0] = 100  # TypeError: 'tuple' object does not support item assignment
• Shallow Copy: Creates a new object but does not copy nested objects; it only copies references to them. Example:
import copy
a = [[1, 2], [3, 4]]
b = copy.copy(a)
b[0][0] = 100
print(a[0][0])  # 100: the nested lists are shared between a and b
• Deep Copy: Creates a new object and also recursively copies all objects nested within it. Example:
import copy
a = [[1, 2], [3, 4]]
b = copy.deepcopy(a)
b[0][0] = 100
print(a[0][0])  # 1: a is unaffected because the nested lists were copied
• List: Ordered, mutable collection that allows duplicate values.
lst = [1, 2, 3]
• Tuple: Ordered, immutable collection that allows duplicate values.
tup = (1, 2, 3)
• Set: Unordered, mutable collection that does not allow duplicate values.
st = {1, 2, 3}
Python supports multi-threading using the threading module, but due to the Global
Interpreter Lock (GIL), it is not suitable for CPU-bound tasks. Python threads are better
suited for I/O-bound tasks (e.g., file operations, network requests). For CPU-bound tasks,
multiprocessing is preferred because it bypasses the GIL by creating separate processes.
import threading

def print_numbers():
    for i in range(5):
        print(i)

thread = threading.Thread(target=print_numbers)
thread.start()
thread.join()
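For comparison, a minimal multiprocessing sketch for a CPU-bound task; the square function and the input range are illustrative:
import multiprocessing

def square(n):
    return n * n  # stand-in for CPU-heavy work

if __name__ == '__main__':
    # Each worker runs in its own process, so the GIL does not serialize the computation
    with multiprocessing.Pool(processes=4) as pool:
        results = pool.map(square, range(10))
    print(results)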
• with: A context manager that automatically handles setup and cleanup, such as opening and closing files. Example:
with open('example.txt', 'w') as f:
    f.write('Hello, World!')
• __init__: This method initializes an instance of the class after the object is created. It is called when a new object is instantiated.
class MyClass:
    def __init__(self, name):
        self.name = name
• __new__: This method is responsible for creating a new instance of the class. It is called before __init__.
class MyClass:
    def __new__(cls):
        return super().__new__(cls)
• __call__: This method allows an instance of the class to be called like a function.
class MyClass:
    def __call__(self):
        return 'Hello'
obj = MyClass()
print(obj())  # 'Hello'
• Encapsulation: Bundling the data (attributes) and methods (functions) that operate on the data into a single unit (class) and restricting access to the internals using access modifiers (private, public, etc.). Example:
class Car:
    def __init__(self, model):
        self.__model = model  # name-mangled "private" attribute
    def get_model(self):
        return self.__model
• Inheritance: A class acquires the attributes and methods of another class. Example:
class Vehicle:
    def start(self):
        return "Starting the vehicle"
class Car(Vehicle):
    def drive(self):
        return "Driving the car"
• Polymorphism: Objects of different classes can be used through the same interface; the same method name produces class-specific behavior. Example:
class Dog:
    def speak(self):
        return "Woof"
class Cat:
    def speak(self):
        return "Meow"
• Abstraction: Hiding the complex implementation details and exposing only the necessary functionality. Example:
from abc import ABC, abstractmethod
class Animal(ABC):
    @abstractmethod
    def speak(self):
        pass
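A concrete subclass must implement every abstract method before it can be instantiated:
class Dog(Animal):
    def speak(self):
        return "Woof"

print(Dog().speak())  # works; Animal() itself would raise TypeError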
• Instance method: The default kind of method in a class; it takes the instance (self) as the first parameter.
class MyClass:
    def instance_method(self):
        print(self)
• staticmethod: A method that does not require access to an instance or class and does not take self or cls as the first argument.
class MyClass:
    @staticmethod
    def static_method():
        print("No instance or class needed")
• classmethod: A method that takes cls as its first parameter, which represents the class itself, not an instance.
class MyClass:
    @classmethod
    def class_method(cls):
        print(cls)
The Method Resolution Order (MRO) is the order in which Python looks for a method in the
class hierarchy. Python uses the C3 Linearization algorithm to determine the MRO in the
case of multiple inheritance.
Example:
class A:
    def method(self):
        print("Method in A")

class B(A):
    def method(self):
        print("Method in B")

class C(A):
    def method(self):
        print("Method in C")

class D(B, C):
    pass

print(D.mro())
Output:
[<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.A'>, <class 'object'>]
• Inheritance: One class inherits the properties and methods of another class.
Example: Dog is a Mammal.
Duck typing in Python means that the type or class of an object is determined by its
behavior (methods and properties) rather than its explicit inheritance or interface. If an
object behaves like a certain type, it can be treated as that type.
Example:
class Dog:
    def speak(self):
        print("Woof")

class Duck:
    def speak(self):
        print("Quack")

def make_speak(animal):
    animal.speak()  # any object with a speak() method works

make_speak(Dog())   # Woof
make_speak(Duck())  # Quack
Operator overloading allows custom behavior for standard operators. You define special methods like __add__, __sub__, etc., to overload operators.
Example:
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __add__(self, other):
        return Point(self.x + other.x, self.y + other.y)

p = Point(1, 2) + Point(3, 4)
print(p.x, p.y)  # 4 6
The Singleton pattern ensures that only one instance of a class exists. You can implement it
by controlling the instantiation using a class variable.
Example:
class Singleton:
    _instance = None
    def __new__(cls):
        if not cls._instance:
            cls._instance = super().__new__(cls)
        return cls._instance
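Every call to the constructor then returns the same object:
a = Singleton()
b = Singleton()
print(a is b)  # True: both names refer to the single instance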
18. What is the difference between AI, ML, and Deep Learning?
• AI (Artificial Intelligence): The broad field of building systems that perform tasks normally requiring human intelligence, such as reasoning, perception, and decision-making.
• ML (Machine Learning): A subset of AI in which models learn patterns from data rather than being explicitly programmed. Example: a spam filter trained on labeled emails.
• Deep Learning: A subset of ML that deals with neural networks with many layers (also called deep neural networks). These models are capable of learning from large amounts of unstructured data like images, audio, and text. Examples include CNNs (Convolutional Neural Networks) for image recognition and RNNs (Recurrent Neural Networks) for sequential data such as time series.
• Supervised Learning: The algorithm learns from labeled data, mapping inputs to known outputs.
Example: Predicting house prices from historical sales records with known prices.
• Unsupervised Learning: Here, the algorithm is given unlabeled data and must find patterns or structures within the data.
Example: Clustering customers into different segments based on purchasing behavior, without any pre-labeled groups (see the clustering sketch below).
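A minimal scikit-learn sketch of unsupervised clustering; the data points are illustrative:
from sklearn.cluster import KMeans
import numpy as np

# Four unlabeled 2-D points; KMeans finds two groups on its own
X = np.array([[1, 2], [1, 4], [10, 2], [10, 4]])
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)  # e.g. [1 1 0 0]: cluster assignments, no labels were given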
• Overfitting occurs when a model learns the details and noise in the training data to such an extent that it performs poorly on new data (i.e., it fails to generalize).
Prevention Techniques: cross-validation, regularization (L1/L2), early stopping, dropout (for neural networks), pruning (for trees), and gathering more training data.
• Bias refers to errors due to overly simplistic models that cannot capture the
underlying data structure. Variance refers to errors due to a model being too
complex and sensitive to small fluctuations in the training data.
• The tradeoff: increasing model complexity (e.g., more features or deeper trees) decreases bias but increases variance. The goal is the balance point where the combined error from bias and variance is lowest, which yields the best generalization.
22. What are the main differences between logistic regression and decision trees?
• Logistic Regression: A linear model that estimates class probabilities with the sigmoid function; it works best when the decision boundary is approximately linear. Example: Predicting whether an email is spam from word frequencies.
• Decision Trees: A non-linear model that splits data into branches based on feature values to make predictions. Decision trees are interpretable and can handle both classification and regression tasks. Example: Predicting whether a loan will be approved based on features like income and credit score.
• Bagging: Trains models independently and in parallel on bootstrap samples of the data, then averages (or votes on) their predictions. Bagging mainly reduces variance. Example: Random Forest.
• Boosting: Involves sequentially training models, where each new model corrects the errors of the previous ones. Boosting mainly reduces bias. Examples: AdaBoost and Gradient Boosting (a short sketch of both styles follows).
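A minimal scikit-learn comparison of the two ensemble styles, using a synthetic dataset:
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

X, y = make_classification(n_samples=200, random_state=0)

# Bagging-style ensemble: many trees trained independently on bootstrap samples
bagged = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Boosting ensemble: trees trained sequentially, each correcting its predecessors
boosted = GradientBoostingClassifier(n_estimators=100, random_state=0).fit(X, y)

print(bagged.score(X, y), boosted.score(X, y))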
o ReLU (Rectified Linear Unit): Popular for hidden layers because it helps mitigate the vanishing gradient problem.
• Resampling Techniques: Oversample the minority class or undersample the majority class to balance the training data.
• Synthetic Data Generation: Use methods like SMOTE (Synthetic Minority Over-sampling Technique) to generate synthetic samples for the minority class, as in the sketch below.
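A minimal sketch using the imbalanced-learn package on a synthetic imbalanced dataset; the dataset parameters are illustrative:
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE

# Roughly 9:1 imbalanced synthetic dataset
X, y = make_classification(n_samples=1000, weights=[0.9], random_state=0)
print(Counter(y))  # e.g. Counter({0: 897, 1: 103})

# SMOTE synthesizes new minority-class samples until the classes are balanced
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print(Counter(y_res))  # balanced class counts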
• Filter Methods: Select features based on statistical tests (e.g., correlation, Chi-square test).
• Wrapper Methods: Evaluate subsets of features by training a model on each subset (e.g., recursive feature elimination).
• Embedded Methods: Feature selection occurs during the model training process (e.g., Lasso regression, decision trees).
4. Feature engineering: Create new features that may help the model.
• Use case: PCA is used with high-dimensional data to reduce the number of features while retaining most of the variance. Example: Reducing the number of features in an image dataset while keeping the important information (see the sketch below).
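A minimal PCA sketch with scikit-learn; the array shapes are illustrative:
from sklearn.decomposition import PCA
import numpy as np

X = np.random.rand(100, 50)          # 100 samples with 50 features
pca = PCA(n_components=10)           # keep the 10 directions of highest variance
X_reduced = pca.fit_transform(X)
print(X_reduced.shape)               # (100, 10)
print(pca.explained_variance_ratio_.sum())  # fraction of variance retained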
30. What evaluation metrics would you use for a classification model?
• Accuracy: The fraction of all predictions that are correct; can be misleading on imbalanced data.
• Precision: True positives divided by the total number of positive predictions (TP / (TP + FP)).
• Recall: True positives divided by the total number of actual positives (TP / (TP + FN)).
• F1-score: The harmonic mean of precision and recall; useful when a single balanced metric is needed.
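Computing these with scikit-learn; the label arrays are illustrative:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]

print(accuracy_score(y_true, y_pred))   # 0.833...
print(precision_score(y_true, y_pred))  # 1.0 (no false positives)
print(recall_score(y_true, y_pred))     # 0.75 (one positive missed)
print(f1_score(y_true, y_pred))         # 0.857...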
1. Train and save the model using libraries like scikit-learn or TensorFlow.
2. Create a Flask web application with routes to handle input data and return predictions.
3. Load the saved model when the application starts.
4. Pass input data from HTTP requests to the model and return the predictions as HTTP responses.
5. Deploy the Flask app on a server or cloud platform (e.g., AWS, Heroku). A minimal sketch follows.
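A minimal sketch of steps 2-4, assuming a scikit-learn model saved as model.pkl (the file name and feature layout are illustrative):
import pickle
from flask import Flask, request, jsonify

app = Flask(__name__)

# Step 3: load the trained model once at startup
with open('model.pkl', 'rb') as f:
    model = pickle.load(f)

@app.route('/predict', methods=['POST'])
def predict():
    # Step 4: read features from the request, predict, and return JSON
    features = request.get_json()['features']
    prediction = model.predict([features])
    return jsonify({'prediction': prediction.tolist()})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)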
• Type Checking: FastAPI uses Python's type hints to validate request and response
types automatically, reducing errors.
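A small illustration of how type hints drive validation in FastAPI; the Item model is made up for the example:
from fastapi import FastAPI
from pydantic import BaseModel

class Item(BaseModel):
    name: str
    price: float

app = FastAPI()

@app.post("/items/")
def create_item(item: Item):
    # A request whose body doesn't match Item (e.g., price is not a number)
    # is rejected automatically with a 422 validation error
    return {"name": item.name, "price": item.price}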
3. Batch Processing: For large datasets, process data in batches rather than
individually.
Web scraping involves extracting data from websites. It typically involves sending HTTP
requests to retrieve the HTML content of web pages, then parsing and extracting specific
information from that content (e.g., using regex, CSS selectors, or XPath). Scrapers
simulate human browsing to collect data in a structured format like JSON or CSV.
import requests
from bs4 import BeautifulSoup

url = 'https://news.ycombinator.com/'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
# The CSS selector below matches Hacker News story links at the time of writing
for headline in soup.select('span.titleline > a'):
    print(headline.text)
• Selenium: A tool for automating web browsers. It allows you to simulate user
interactions, like clicks and scrolling, and retrieve dynamic content generated by
JavaScript. Example: You can use Selenium to scrape a page that loads content
after the initial HTML is loaded (AJAX calls).
• BeautifulSoup: A Python library used to parse and extract data from static HTML. It
is not suited for dynamically loaded content. Example: BeautifulSoup can parse
HTML to extract information from a static page, but if content is dynamically loaded
(via JavaScript), it may not be sufficient.
To automate a browser in Python, you can use Selenium, which can control web browsers
like Chrome or Firefox. Selenium interacts with web elements and performs tasks like
clicking buttons, filling forms, and navigating between pages.
Example:
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('https://www.example.com')
button = driver.find_element(By.ID, 'submit')  # Selenium 4 locator API
button.click()
driver.quit()
37. What are headless browsers, and why are they useful?
A headless browser is a web browser that does not display a graphical user interface. It
can be controlled programmatically to interact with web pages, similar to a regular
browser, but without the need for a visible interface.
Use case: Headless browsers are useful in environments where displaying a browser
interface is not needed, such as in automated testing or web scraping.
Example: You can use Selenium with a headless browser to scrape a website in an
automated script:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument('--headless')  # run Chrome without a visible window
driver = webdriver.Chrome(options=options)
driver.get('https://www.example.com')
driver.quit()
Handling CAPTCHA is difficult as it's designed to prevent automation. However, you can
approach it in a few ways:
• Automated Interaction: For simple CAPTCHA challenges, tools like Selenium may
simulate user interaction if the CAPTCHA is not too complex.
• Request API access: Some websites provide an API for developers, allowing access
to the data without dealing with CAPTCHA.
An API (Application Programming Interface) is a set of rules that allow different software
applications to communicate with each other. APIs enable automation by allowing you to
interact with external systems programmatically, sending and receiving data without
manual intervention.
import requests
response = requests.get('https://api.example.com/data')
data = response.json()
To send a POST request in Python, you can use the requests library:
import requests

url = 'https://www.example.com/api'
payload = {'key': 'value'}  # illustrative request body
response = requests.post(url, json=payload)
print(response.text)
An API token is a unique identifier that grants access to an API. It is used for authentication
to verify the user or application making the request. Tokens are typically passed in the HTTP
headers to secure API endpoints.
Example:
import requests

# The token is typically sent in the Authorization header (placeholder value)
headers = {'Authorization': 'Bearer YOUR_API_TOKEN'}
response = requests.get('https://api.example.com/data', headers=headers)
• Check the rate limit headers: Many APIs return rate limit information in the
response headers (X-RateLimit-Remaining, X-RateLimit-Reset).
• Pause requests: If you hit the rate limit, implement a sleep or wait mechanism to
pause your requests until the limit is reset.
Example:
import time
import requests

response = requests.get('https://api.example.com/data')
remaining_requests = int(response.headers['X-RateLimit-Remaining'])
if remaining_requests == 0:
    reset_time = int(response.headers['X-RateLimit-Reset'])
    sleep_time = max(0, reset_time - time.time())  # seconds until the window resets
    time.sleep(sleep_time)
44. What are cron jobs, and how do you schedule Python scripts?
A cron job is a time-based job scheduler in Unix-like operating systems. It allows you to run
scripts or commands at specified times or intervals.
1. Open the crontab file for editing with crontab -e.
2. Add a cron job entry, specifying the time and command to run the Python script.
0 5 * * * /usr/bin/python3 /path/to/script.py
The entry above runs the script every day at 5:00 AM.
from celery import Celery

# Redis as the message broker (and, here, result backend); local instance assumed
app = Celery('tasks', broker='redis://localhost:6379/0', backend='redis://localhost:6379/0')

@app.task
def add(x, y):
    return x + y
You can use Redis as the message broker for Celery, allowing you to manage and distribute
tasks across worker nodes. Redis stores tasks in a queue, and Celery workers consume
them asynchronously.
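Enqueueing work then looks like this (the add task comes from the sketch above):
# Push the task onto the Redis queue; a running Celery worker picks it up
result = add.delay(2, 3)
print(result.get(timeout=10))  # 5, once a worker has processed the task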
Importance: Docker ensures that automation scripts and machine learning models run
reliably in any environment, reducing the risk of "works on my machine" issues.
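A minimal Dockerfile sketch for packaging a Python automation script; the file names and version are illustrative:
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "script.py"]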
1. Version Control: Keep code, tests, and training scripts in a Git repository.
2. CI Setup: Configure Jenkins or GitHub Actions to automatically run tests and build the project whenever code is pushed to the repository.
3. Model Training: Automate the training pipeline using tools like TensorFlow Extended
(TFX) or custom scripts.
4. Deployment: Use Docker to containerize the model and deploy it using Kubernetes
or a cloud platform.
49. Explain GitHub Actions and Jenkins.
• GitHub Actions: An integrated CI/CD tool within GitHub that automates workflows for building, testing, and deploying applications. Example: Automatically trigger tests when code is pushed to the repository (see the workflow sketch below).
• Jenkins: A self-hosted, open-source automation server that runs build, test, and deployment pipelines and has a large plugin ecosystem. Example: A Jenkins pipeline that builds a Docker image and deploys it after tests pass.
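A minimal GitHub Actions workflow sketch (the file path and Python version are illustrative):
# .github/workflows/ci.yml
name: CI
on: [push]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - run: pip install -r requirements.txt
      - run: pytest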
Coding Challenges:
from selenium import webdriver
from selenium.webdriver.common.by import By

# Set up WebDriver
driver = webdriver.Chrome()
driver.get('https://example.com/login')
driver.find_element(By.ID, 'username').send_keys('my_username')
driver.find_element(By.ID, 'password').send_keys('my_password')
driver.find_element(By.ID, 'submit').click()
# Grab the page content after login (illustrative target element)
data = driver.find_element(By.TAG_NAME, 'body').text
print(data)
driver.quit()
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from fastapi import FastAPI

app = FastAPI()

@app.post("/send-email/")
def send_email(recipient: str, subject: str, body: str):
    sender_email = "[email protected]"
    receiver_email = recipient
    password = "mypassword"

    msg = MIMEMultipart()
    msg['From'] = sender_email
    msg['To'] = receiver_email
    msg['Subject'] = subject
    msg.attach(MIMEText(body, 'plain'))

    # SMTP server address and port are placeholders
    server = smtplib.SMTP('smtp.example.com', 587)
    server.starttls()
    server.login(sender_email, password)
    text = msg.as_string()
    server.sendmail(sender_email, receiver_email, text)
    server.quit()
    return {"status": "sent"}
To design a scalable AI automation system, you should consider the following aspects:
1. Modular Architecture: Use microservices to break down the system into smaller,
independently deployable components. Each component should handle specific
tasks (e.g., data preprocessing, model training, inference, and logging).
4. Asynchronous Processing: Use task queues (e.g., Celery with Redis or RabbitMQ)
for long-running tasks, ensuring that the system can process tasks asynchronously.
5. Cloud Platforms: Deploy the system on cloud services (AWS, GCP, Azure) to
leverage auto-scaling, managed Kubernetes clusters, and GPU instances for deep
learning.
Example: You can design a system where users send requests for predictions. A web API
(Flask/FastAPI) receives these requests, stores them in a queue (Redis), and workers
asynchronously process them using a trained model stored in a model registry.
• Scalability: Different parts of the automation pipeline (e.g., data processing, model
training, and inference) can scale independently.
• Fault Isolation: If one service fails, others remain unaffected, improving system
reliability.
Example: In an AI pipeline, one microservice could handle data ingestion, another could
handle feature engineering, another could manage the model training, and another could
handle deployment and inference.
For an AI-powered chatbot, a NoSQL database like MongoDB or a relational database like
PostgreSQL can be used depending on the requirements.
• MongoDB: Ideal for storing unstructured data like conversations and user
messages, which can be flexible and schema-less.
• PostgreSQL: Suitable if you need structured data storage (e.g., user profiles,
interaction history) with complex querying capabilities.
"user_id": 1234,
"timestamp": "2025-02-05T10:00:00",
Use case: In an AI automation system, a message queue can be used to send training
tasks to different workers (e.g., a task to train a model on new data).
1. Rate Limiting: Prevent abuse and protect the system by limiting the number of
requests from each user within a given time frame.
2. Caching: Cache responses for repeated requests to reduce the load on the backend
and improve response times.
4. Horizontal Scaling: Scale out by adding more instances of the API service and using
a load balancer to distribute traffic evenly.
5. Asynchronous Processing: Use background jobs or queues (e.g., Celery,
RabbitMQ) for time-consuming tasks, ensuring the API can handle incoming
requests without delays.
1. Centralized Logging: Use tools like Elasticsearch, Logstash, and Kibana (ELK
stack) or Fluentd to collect, aggregate, and visualize logs from multiple services.
2. Log Shippers: Use agents like Filebeat or Fluentd to ship logs from individual
microservices to a central logging server.
3. Log Aggregators: Aggregate logs in real-time and store them in a central storage
system (e.g., Elasticsearch).
4. Alerting: Use monitoring tools (e.g., Prometheus, Grafana) to set up alerts based on
specific log patterns or thresholds (e.g., error rates exceeding a limit).
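A minimal sketch of structured (JSON) logging from a Python service, so a shipper like Filebeat can forward it; the service name is made up:
import json
import logging

class JsonFormatter(logging.Formatter):
    def format(self, record):
        # One JSON object per line is easy for log shippers to parse
        return json.dumps({
            "level": record.levelname,
            "service": "inference-api",  # hypothetical service name
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logging.basicConfig(level=logging.INFO, handlers=[handler])
logging.info("model prediction served")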
• SQL (Structured Query Language): Used with relational databases (e.g., MySQL, PostgreSQL). It requires a predefined schema, supports ACID transactions, and is ideal for structured data with relationships between entities. Example: A relational table with columns for employee ID, name, and salary.
• NoSQL: Used with non-relational databases (e.g., MongoDB, Cassandra). Schemas are flexible, scaling is typically horizontal, and it suits unstructured or semi-structured data. Example: A document store holding JSON conversation logs.
58. How would you design a database for an AI-based automation tool?
For an AI-based automation tool, the database design would depend on the type of tool
and its functionality. Generally:
• Task Queue Table: Store tasks with their statuses (queued, processing, completed).
• Logs Table: Store logs of automated actions (e.g., model predictions, API calls).
• User Table: Store user information if applicable.
• Model Metadata Table: Store details about trained models, including versions,
parameters, and performance metrics.
• Historical Data Table: Store historical data for training or evaluation purposes.
-- Likely intent of this snippet: find duplicate email addresses
SELECT email, COUNT(*) AS cnt
FROM users
GROUP BY email
HAVING COUNT(*) > 1;
Indexing is a database optimization technique used to speed up the retrieval of rows from a table. An index creates a data structure (typically a B-tree) that lets the database locate matching rows without scanning the whole table.
• Impact on Performance: Indexes make reads much faster when queries filter or join on the indexed columns (e.g., CREATE INDEX idx_users_email ON users (email);), but they slow down writes slightly, since every INSERT or UPDATE must also maintain the index, and they consume extra storage.
Coding Challenges:
-- Likely intent of this snippet: the five highest-paid employees
SELECT *
FROM employees
ORDER BY salary DESC
LIMIT 5;
Redis-Based Caching Layer for an API:
import redis
import time
from fastapi import FastAPI

app = FastAPI()
cache = redis.Redis(host='localhost', port=6379, db=0)  # local Redis assumed

@app.get("/data/")
def get_data():
    cached_data = cache.get("data_key")
    if cached_data:
        return {"data": cached_data.decode(), "source": "cache"}
    time.sleep(2)  # stand-in for a slow backend call
    data = "expensive result"
    cache.set("data_key", data, ex=60)  # cache the result for 60 seconds
    return {"data": data, "source": "backend"}
I am passionate about combining AI and automation to create scalable solutions that can
solve real-world problems efficiently. With my background in Python and AI, I am excited
about the opportunity to automate processes, improve productivity, and contribute to
innovative projects.
I prioritize tasks based on urgency and impact. I break down large projects into smaller,
manageable parts and use time management techniques like the Pomodoro method to
stay focused. When under pressure, I keep calm, communicate effectively with the team,
and adjust strategies as needed.
64. What would you do if you had to learn a new technology quickly?
I would start by identifying the key resources (documentation, tutorials, courses) and
dedicating focused time to learning the basics. I’d apply the concepts in small projects to
reinforce my understanding, and seek help from online communities if I encounter
difficulties.
I ensure that my automation scripts are well-tested by writing unit tests and integrating
them with a continuous integration pipeline. I also handle exceptions properly, log errors
for debugging, and use tools like linters to ensure clean and maintainable code.
What was the most challenging Python project you worked on? One of the most challenging Python projects I worked on involved designing a traffic light control system using ESP32. The system integrated real-time sensors, a pedestrian button, and an I2C LCD display to simulate traffic flow and manage pedestrian requests efficiently.
Prepare for these questions with examples from your experience, especially related to
Python and AI automation. Good luck with your interview!