Interview QNA-python
Python is a high-level, interpreted programming language that is widely used for a variety of
applications due to its simplicity, versatility, and vast library of modules and packages. One of
the main advantages of Python is its ease of use and readability, which makes it accessible for
beginners and allows for faster development times. Additionally, Python's syntax is concise and
clear, which reduces the likelihood of errors and makes it easier to maintain code.
In Python, a module is a file that contains Python code, which can be imported and used in
other Python code. A package, on the other hand, is a collection of modules that are organized
into a directory hierarchy. This allows for more complex projects to be easily managed and
maintained. To create a module, you typically create a Python script with a .py extension. For
example, you could have a module named my_module.py. Inside this file, you can define
functions, classes, or other code that you want to make available to other programs.
To import a module in Python, you can use the `import` statement followed by the name of the
module. For example, `import math`. Here are the common import statements:
a. Import the entire module:
import module_name
This allows you to access the module's contents using the module name as a prefix. For
example, after importing the math module, you can call its functions like this:
math.sqrt(25).
b. Import specific items from a module:
from module_name import item_name
With this syntax, you can import specific functions, classes, or variables directly into your
code, without needing to use the module name as a prefix. For example: from math
import sqrt. Now you can directly use sqrt(25).
c. Import the entire module with a custom name:
import module_name as alias_name
This imports the entire module but assigns it a custom name (alias) that you specify. This
can be helpful if the module name is long or conflicts with another name in your code.
For example: import math as m. Now you can use m.sqrt(25).
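The three import forms above can be compared side by side. This is a minimal sketch using the standard math module; the variable names are illustrative only.

```python
import math              # whole module, accessed with the module name as a prefix
from math import sqrt    # a single name, usable without a prefix
import math as m         # whole module under a shorter alias

root_prefixed = math.sqrt(25)
root_direct = sqrt(25)
root_alias = m.sqrt(25)

# All three calls reach the same function
print(root_prefixed, root_direct, root_alias)
```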
a. Mutability: Lists are mutable, which means you can add, remove, or modify elements
after the list is created. Tuples, on the other hand, are immutable, meaning that once a
tuple is created, you cannot change its elements. If you need to modify a tuple, you
would need to create a new tuple with the desired changes.
b. Syntax: Lists are defined using square brackets [], while tuples are defined using
parentheses ().
c. Usage: Lists are typically used for collections of elements where the order and individual
elements may change. They are commonly used for sequences of data and when you
need to perform operations such as appending, extending, or removing elements.
Tuples, on the other hand, are often used for collections of elements where the order
and values should not change, such as coordinates, database records, or function
arguments.
d. Performance: Tuples are generally slightly more memory-efficient and faster to access
compared to lists. Since tuples are immutable, Python can optimize them internally.
Lists, being mutable, require additional memory allocation and support for dynamic
resizing.
e. Common Operations: Both lists and tuples support common operations such as
indexing, slicing, and iterating over elements. However, lists have additional methods like
append(), extend(), and remove() that allow for in-place modifications, which are not
available for tuples due to their immutability.
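A short sketch of the mutability difference described above. The names `point`, `scores`, and `moved_point` are hypothetical and chosen only for illustration.

```python
point = (3, 4)       # tuple: a fixed pair of coordinates
scores = [70, 85]    # list: expected to grow or change

scores.append(90)    # in-place modification works on a list
scores[0] = 75

try:
    point[0] = 10    # tuples do not support item assignment
except TypeError as exc:
    tuple_error = type(exc).__name__

# To "change" a tuple, build a new one from the old parts
moved_point = (10,) + point[1:]
print(scores, point, moved_point, tuple_error)
```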
In Python, a lambda function is a small anonymous function that can be defined without a name.
Lambda functions are typically used for short and simple operations that can be defined inline in
the code, without the need to create a separate function.
Lambda functions are defined using the `lambda` keyword, followed by the function's arguments
and the operation that the function should perform. The syntax for a lambda function is as
follows:
```
lambda arguments: expression
```
For example, the following lambda function takes a number as input and returns its square:
```
square = lambda x: x**2
```
In Python, map() is a built-in function that applies a given function to each element of an
iterable (such as a list, tuple, or string) and returns an iterator that yields the results. If you
need a list rather than an iterator, wrap the call in list(). It provides a concise way to perform
the same operation on every item in a collection without writing an explicit loop. The syntax for
the map() function is as follows:
map(function, iterable)
Example:
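A minimal sketch of map() in use; the sample data is made up for illustration.

```python
numbers = [1, 2, 3, 4]

# map() is lazy: nothing is computed until the iterator is consumed
squares_iter = map(lambda x: x ** 2, numbers)
squares = list(squares_iter)

# Any callable works, including built-ins such as len
lengths = list(map(len, ["a", "bb", "ccc"]))

print(squares)
print(lengths)
```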
In Python, the filter() function is a built-in function that allows you to filter elements from an
iterable (such as a list, tuple, or string) based on a specified condition. It returns an iterator that
yields the elements from the iterable for which the condition evaluates to True. The syntax for
the filter() function is as follows:
filter(function, iterable)
The filter() function applies the provided function to each element of the iterable and returns an
iterator that yields the elements for which the condition is True.
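As a sketch of the behavior described above (sample values are illustrative), filter() can take either a predicate function or None, in which case it keeps the truthy elements:

```python
numbers = [1, 2, 3, 4, 5, 6]

# Keep only the elements for which the predicate returns True
evens = list(filter(lambda n: n % 2 == 0, numbers))

# Passing None as the function keeps elements that are truthy
truthy = list(filter(None, [0, "a", "", 3, None]))

print(evens)
print(truthy)
```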
10. What is the `reduce` function in Python?
In Python, the reduce() function is a part of the functools module and is used for performing a
cumulative computation on a sequence of elements. It applies a specified function to the
elements of an iterable in a cumulative way, reducing the sequence to a single value. To use the
reduce() function, you need to import it from the functools module:
from functools import reduce
The syntax for the reduce() function is as follows:
reduce(function, iterable, initializer)
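A small sketch of reduce() with and without the optional initializer; the inputs are illustrative.

```python
from functools import reduce
from operator import mul

# ((1 + 2) + 3) + 4
total = reduce(lambda acc, x: acc + x, [1, 2, 3, 4])

# The initializer is the starting value of the accumulator
product = reduce(mul, [1, 2, 3, 4], 1)

# With an empty iterable, the initializer is returned as-is
empty_sum = reduce(lambda acc, x: acc + x, [], 0)

print(total, product, empty_sum)
```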
In Python, a generator is a special type of iterator that produces values on the fly. You write
one by defining a function that uses the yield keyword instead of return to provide values one
at a time. Each yield pauses the function, and execution resumes from that point when the next
value is requested. Because values are produced only when they are needed, generators are
memory-efficient and provide a convenient way to work with large datasets or infinite
sequences. Here's an example of a simple generator function:
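The original listing is not available here, so the following is a reconstruction sketch of a count_up_to() generator. Whether the upper bound n is included is an assumption; this version includes it.

```python
def count_up_to(n):
    """Yield the integers 0, 1, ..., n one at a time (n inclusive, by assumption)."""
    i = 0
    while i <= n:
        yield i   # pause here and hand back the current value
        i += 1

print(list(count_up_to(3)))
```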
In this example, the count_up_to() function is a generator that generates numbers from 0 up to
a given n value. Instead of returning all the numbers at once, it yields them one by one using the
yield keyword. To use the generator and obtain its values, you can iterate over it or use the
next() function:
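A sketch of consuming a generator with next() and by iteration. The count_up_to() definition is repeated so the snippet runs on its own; the inclusive upper bound is an assumption.

```python
def count_up_to(n):
    """Hypothetical generator matching the description: yields 0..n inclusive."""
    i = 0
    while i <= n:
        yield i
        i += 1

gen = count_up_to(2)
first = next(gen)               # runs the function up to the first yield
second = next(gen)              # resumes right after the previous yield
remaining = list(gen)           # iteration consumes whatever is left
exhausted = next(gen, "done")   # a default avoids StopIteration when finished

print(first, second, remaining, exhausted)
```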
When the generator function encounters a yield statement, it temporarily suspends its execution
and returns the yielded value. The state of the generator function is saved, allowing it to resume
execution from where it left off the next time next() is called.
Pickling and unpickling are the processes of serializing and deserializing Python objects,
respectively. These processes allow you to convert complex objects into a byte stream
(serialization) and convert the byte stream back into the original object (deserialization). In
Python, the pickle module provides functionality for pickling and unpickling objects.
Pickling: Pickling is the process of converting a Python object into a byte stream, which can be
saved to a file, transmitted over a network, or stored in a database. The pickle.dump() function
is used to pickle an object by writing it to a file-like object. The pickle.dumps() function is used
to pickle an object and return the byte stream without writing it to a file. Pickled objects can be
saved with the file extension .pickle or .pkl.
Unpickling: Unpickling is the process of restoring a pickled byte stream back into the original
Python object. The pickle.load() function is used to unpickle an object from a file-like object.
The pickle.loads() function is used to unpickle an object from a byte stream. Unpickling
reconstructs the original object with the same state and data as it had before pickling.
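A minimal round-trip sketch of the four functions mentioned above. The `record` dictionary is made-up sample data, and an in-memory io.BytesIO buffer stands in for a real file.

```python
import io
import pickle

record = {"name": "Ada", "scores": [90, 85], "active": True}

# dumps()/loads(): object <-> bytes in memory
payload = pickle.dumps(record)
restored = pickle.loads(payload)

# dump()/load(): object <-> file-like object
buffer = io.BytesIO()
pickle.dump(record, buffer)
buffer.seek(0)                  # rewind before reading back
from_stream = pickle.load(buffer)

print(restored == record, from_stream == record)
```

Only unpickle data from sources you trust, since loading a pickle can execute arbitrary code.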
c. pass statement: It doesn't do anything and acts as a placeholder. It is used when you
need a statement syntactically but don't want any code to be executed. Here's an
example:
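A sketch of typical places where pass is used as a placeholder. All names here (`NotReadyYet`, `todo_handler`, `odds`) are hypothetical.

```python
class NotReadyYet:
    pass                  # class body required syntactically, nothing to define yet


def todo_handler(event):
    pass                  # function stub to be filled in later


odds = []
for n in range(4):
    if n % 2 == 0:
        pass              # deliberately do nothing for even numbers
    else:
        odds.append(n)

print(odds)
```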
16. How can you randomize the items of a list in place in Python?
To randomize the items of a list in place (i.e., modifying the original list), you can make use of
the random.shuffle() function from the random module in Python. The shuffle() function shuffles
the elements of a list randomly. Here's an example:
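A minimal sketch; the list contents are illustrative, and the resulting order differs from run to run.

```python
import random

deck = [1, 2, 3, 4, 5]
original = deck.copy()          # kept only to compare afterwards

result = random.shuffle(deck)   # shuffles in place and returns None

print(deck)                     # same elements, order varies between runs
```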
Output:
A Python iterator is an object that can be used to iterate over a sequence of values. It provides
an `__iter__()` method that returns the iterator itself and a `__next__()` method that returns the
next value in the sequence and raises a `StopIteration` exception when there are no more values.
Here's an example of a simple iterator:
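The original listing is missing, so this is a sketch of one possible iterator. The class name `Countdown` is hypothetical.

```python
class Countdown:
    """Iterator that yields start, start - 1, ..., 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self               # an iterator returns itself

    def __next__(self):
        if self.current <= 0:
            raise StopIteration   # signals that the values are exhausted
        value = self.current
        self.current -= 1
        return value


values = list(Countdown(3))
print(values)
```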
18. How do you handle exceptions in Python?
In Python, exceptions are used to handle errors and exceptional situations that may occur
during the execution of a program. Exception handling allows you to gracefully handle errors
and control the flow of your program when an exception is raised. To handle exceptions in
Python, you can use a combination of the try, except, else, and finally blocks.
Example:
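A sketch showing which blocks run in each case. The function `safe_divide` and the `steps` list are hypothetical and exist only to record the order of execution.

```python
def safe_divide(a, b):
    steps = []
    try:
        result = a / b              # may raise ZeroDivisionError
    except ZeroDivisionError:
        steps.append("except")      # runs only when the error occurs
        result = None
    else:
        steps.append("else")        # runs only when no exception occurred
    finally:
        steps.append("finally")     # runs in both cases
    return result, steps


print(safe_divide(6, 3))
print(safe_divide(1, 0))
```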
Here's a breakdown of the different parts of exception handling:
a. try: The try block is where you put the code that may raise an exception. If an exception
occurs within this block, the execution jumps to the appropriate except block.
b. except: The except block catches specific exceptions and provides the handling code for
each exception type. You can have multiple except blocks to handle different types of
exceptions.
c. else: The else block is optional and is executed if no exception occurs in the try block.
d. finally: The finally block is optional and is always executed, regardless of whether an
exception occurred or not.
19. What is the difference between `finally` and `else` in a Python `try`/`except` block?
In a Python try/except block, finally and else are optional clauses that serve different purposes:
a. finally block: The finally block is always executed, regardless of whether an exception
occurred or not. It is typically used for cleanup operations or releasing resources that need to be
performed regardless of the outcome of the try block. The finally block is executed even if an
exception is raised and not caught by any of the except blocks. It ensures that certain code is
executed regardless of exceptions or successful execution.
b. else block: The else block is executed only if the try block completes successfully without any
exceptions being raised. It is optional and provides a place to put code that should be executed
when no exceptions occur. If an exception is raised within the try block, the code in the else
block is skipped, and the program flow jumps to the appropriate except block or propagates the
exception up.
20. What is a list comprehension in Python?
List comprehension is a concise way to create lists in Python. It allows you to generate a new
list by specifying an expression, followed by one or more for and if clauses. The basic syntax of
list comprehension is as follows:
new_list = [expression for item in iterable if condition]
Here's a breakdown of the different parts:
a. expression: The expression to be evaluated for each item in the iterable.
b. item: A variable that represents each item in the iterable.
c. iterable: A sequence, such as a list, tuple, or string, that you want to iterate over.
d. condition (optional): A condition that filters the items based on a Boolean expression.
Only items for which the condition evaluates to True are included in the new list.
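A short sketch of the syntax above with a filter condition, plus a variant with two for clauses; the data is illustrative.

```python
numbers = range(10)

# expression: n * n, item: n, iterable: numbers, condition: n is even
even_squares = [n * n for n in numbers if n % 2 == 0]

# Multiple for clauses behave like nested loops, left to right
pairs = [(x, y) for x in (1, 2) for y in "ab"]

print(even_squares)
print(pairs)
```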
22. What is the difference between a shallow copy and a deep copy in Python?
In Python, a shallow copy and a deep copy are two different methods used to create copies of
objects, including lists, dictionaries, and custom objects. The main difference between a shallow
copy and a deep copy lies in how they handle nested objects or references within the original
object.
A shallow copy creates a new outer object but copies only the references to the nested objects
found in the original. If the original object contains mutable objects (e.g., lists or dictionaries),
changes made to those nested objects through either the original or the copy are visible in
both.
Example of shallow copy:
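A minimal sketch using copy.copy() on a nested list; the data is illustrative.

```python
import copy

original = [[1, 2], [3, 4]]
shallow = copy.copy(original)

shallow[0].append(99)    # inner list is shared, so the original sees this
shallow.append([5, 6])   # outer list is separate, so the original does not

print(original)
print(shallow)
```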
A deep copy creates a completely independent copy of the original object, including all the
nested objects. It recursively copies all objects found in the original object. Changes made to
the original object or its nested objects will not affect the deep copy, and vice versa.
Example of deep copy:
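A minimal sketch using copy.deepcopy() on a nested list; the data is illustrative.

```python
import copy

original = [[1, 2], [3, 4]]
deep = copy.deepcopy(original)

deep[0].append(99)       # nested lists were copied too, so the original is untouched

print(original)
print(deep)
```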
To sort a list in Python, you can use the `sorted()` function, which returns a new sorted list, or
the `sort()` method, which sorts the list in-place. Both functions take an optional `key` parameter,
which is used to specify a function that returns a value to use for sorting.
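A sketch contrasting sorted() with list.sort(), including the key parameter; the sample words are illustrative.

```python
words = ["pear", "fig", "banana"]

by_length = sorted(words, key=len)          # new list, original left unchanged
descending = sorted([3, 1, 2], reverse=True)

words.sort()                                # sorts in place and returns None

print(by_length)
print(descending)
print(words)
```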
24. How do you reverse a list in Python?
To reverse a list in Python, you can use either the reverse() method or slicing. Here's how you
can use each method:
reverse() method: The reverse() method is a list method that reverses the order of the elements
in the list in place, meaning it modifies the original list.
Slicing: You can reverse a list using slicing by specifying the step value as -1, which traverses
the list in reverse order. This method returns a new reversed list without modifying the original
list.
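Both approaches in one sketch; the list contents are illustrative.

```python
items = [1, 2, 3]

reversed_copy = items[::-1]       # new list; items is not modified
unchanged = items == [1, 2, 3]

items.reverse()                   # reverses items itself, returns None

print(reversed_copy, unchanged, items)
```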
To concatenate two lists in Python, you can use the + operator or the extend() method. Both
methods allow you to combine the elements of two lists into a single list. Here's how you can
use each method:
Using the + operator: The + operator concatenates two lists by creating a new list that contains
all the elements from both lists.
Using the extend() method: The extend() method modifies the original list by appending all the
elements from another list to the end of it.
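A sketch of both approaches; the lists are illustrative.

```python
a = [1, 2]
b = [3, 4]

combined = a + b      # new list; a and b are unchanged at this point

a_id = id(a)
a.extend(b)           # modifies a in place; b stays as it was

print(combined, a, b)
```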
To check if an element is in a list in Python, you can use the in operator. The in operator returns
True if the element is found in the list and False otherwise. Here's an example:
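The original listing is missing; this sketch reconstructs it from the description. The contents of my_list and the else branch are assumptions.

```python
my_list = [1, 2, 3, 4, 5]   # assumed sample data
element = 3

if element in my_list:
    message = "Element is in the list"
else:
    message = "Element is not in the list"

print(message)
```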
In this example, the element in my_list expression checks if element (which is set to 3) is
present in my_list. Since 3 is in my_list, the condition is true, and the statement "Element is in
the list" is printed.
In Python, there are multiple ways to remove an element from a list. Here are a few common
methods:
remove() method: The remove() method removes the first occurrence of a specified element
from the list. If the element is not found, it raises a ValueError.
Here's an example:
del statement: The del statement is used to remove an element from a list by its index. It can
also be used to remove a slice of elements from a list. Here's an example:
pop() method: The pop() method removes and returns an element from a list based on its index.
If no index is specified, it removes and returns the last element.
Here's an example:
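The three approaches can be sketched on one list; the sample data is illustrative.

```python
fruits = ["apple", "banana", "cherry", "banana", "date"]

fruits.remove("banana")   # removes the first "banana" only
del fruits[0]             # removes "apple" by index
last = fruits.pop()       # removes and returns the last element
first = fruits.pop(0)     # removes and returns the element at index 0

try:
    fruits.remove("kiwi") # value not present
except ValueError:
    missing = True

print(fruits, last, first, missing)
```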
To split a string into a list in Python, you can use the `split()` method, which splits a string into a
list of substrings based on a specified delimiter. For example:
You can also specify a custom delimiter for splitting the string. For example, to split the string
based on commas, you can pass ',' as the delimiter to the split() method:
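A sketch of both cases; the strings are illustrative. Note that split() with no argument collapses runs of whitespace, while an explicit delimiter keeps empty fields.

```python
sentence = "hello world  again"
words = sentence.split()          # splits on any run of whitespace

csv_line = "a,b,,c"
fields = csv_line.split(",")      # empty field between the two commas is kept

print(words)
print(fields)
```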
30. How do you join a list into a string in Python?
To join a list into a string in Python, you can use the `join()` method, which concatenates the
elements of a list using a specified separator string. For example:
```
my_list = ['hello', 'world']
my_string = ' '.join(my_list) # joins elements with a space separator
print(my_string) # outputs "hello world"
```
To convert a string to a number in Python, you can use the `int()` or `float()` function, depending
on the type of number you want to convert to. For example:
```
my_string = "42"
my_int = int(my_string) # converts to integer
print(my_int) # outputs 42
my_string = "3.14"
my_float = float(my_string) # converts to float
print(my_float) # outputs 3.14
```
To convert a number to a string in Python, you can use the `str()` function, which returns a string
representation of the number. For example:
```
my_int = 42
my_string = str(my_int) # converts to string
print(my_string) # outputs "42"
my_float = 3.14
my_string = str(my_float) # converts to string
print(my_string) # outputs "3.14"
```
To read input from the user in Python, you can use the `input()` function, which reads a line of
text from the user and returns it as a string. For example:
Note that the input() function always returns a string, even if the user enters a number or other
data type. If you need to convert the input to a different data type, such as an integer or float,
you can use appropriate conversion functions like int() or float().
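A minimal sketch of that conversion step. The helper name `read_int` and its `ask` parameter are hypothetical, introduced only so the example can run without a live prompt; by default it calls the built-in input().

```python
def read_int(prompt, ask=input):
    """Read one line via `ask` (input() by default) and convert it to int."""
    raw = ask(prompt)           # always a str, even if the user types digits
    return int(raw.strip())     # raises ValueError if the text is not an integer


# Simulated user input so the sketch runs non-interactively
age = read_int("Enter your age: ", ask=lambda _prompt: " 42 ")
print(age + 1)
```

In an interactive program you would simply call read_int("Enter your age: ").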
To open a file in Python, you can use the `open()` function, which returns a file object. The
function takes two arguments: the filename and the mode in which to open the file. For example:
```
file = open("example.txt", "r") # opens file for reading
```
```
# read entire file
file = open("example.txt", "r")
data = file.read()
print(data)
file.close()
```
To write data to a file in Python, you can use the `write()` method of the file object, which writes
a string to the file. Alternatively, you can use the `writelines()` method to write a list of strings to
the file. For example:
```
# write a string to file
file = open("example.txt", "w")
file.write("Hello, world!\n")
file.close()
```
To close a file in Python, you can call the `close()` method of the file object. For example:
```
file = open("example.txt", "r")
data = file.read()
file.close()
```
To check if a file exists in Python, you can use the `os.path.isfile()` function, which returns `True`
if the file exists and `False` otherwise. For example:
```
import os.path
if os.path.isfile("example.txt"):
    print("The file exists.")
else:
    print("The file does not exist.")
```
To get the current working directory in Python, you can use the `os.getcwd()` function, which
returns a string representing the current working directory. For example:
```
import os
cwd = os.getcwd()
print(cwd)
```
To change the current working directory in Python, you can use the `os.chdir()` function, which
changes the current working directory to the specified path. For example:
```
import os
os.chdir("/path/to/new/directory")
```
```
import os
files = os.listdir("/path/to/directory")
print(files)
```
To create a directory in Python, you can use the `os.mkdir()` function, which creates a new
directory with the specified name in the current working directory. For example:
```
import os
os.mkdir("new_directory")
```
To remove a directory in Python, you can use the `os.rmdir()` function, which removes the
directory with the specified name in the current working directory. For example:
```
import os
os.rmdir("directory_to_remove")
```
To filter files by extension in Python, you can use a list comprehension to create a new list that
contains only the files with the specified extension. For example, to filter all `.txt` files in a
directory:
```
import os
files = os.listdir("/path/to/directory")
txt_files = [f for f in files if f.endswith(".txt")]  # keep only names ending in .txt
```
To read a file line by line in Python, you can use a `for` loop to iterate over the lines of the file.
For example:
```
with open("file_to_read") as file:
    for line in file:
        print(line)
```
48. How do you read the contents of a file into a string in Python?
To read the contents of a file into a string in Python, you can use the `read()` method of the file
object. For example:
```
with open("file_to_read") as file:
    contents = file.read()
print(contents)
```
To write a string to a file in Python, you can use the `write()` method of the file object. For
example:
```
with open("file_to_write", "w") as file:
    file.write("Hello, world!")
```
To append a string to a file in Python, you can open the file in append mode (`"a"`) and use the
`write()` method of the file object. For example:
```
with open("file_to_append", "a") as file:
    file.write("Hello, world!")
```
51. Write a one-liner that will count the number of capital letters in a file.
To count the number of capital letters in a file using a one-liner in Python, you can combine file
reading, character filtering, and counting using a generator expression. Here's an example:
count = sum(1 for line in open('filename.txt') for char in line if char.isupper())
In the above code, 'filename.txt' represents the name or path of the file you want to count the
capital letters in. The open() function is used to open the file, and the file is iterated line by line
using the first for loop (for line in open('filename.txt')). Then, for each line, the characters are
iterated using the second for loop (for char in line). The char.isupper() condition checks if the
character is uppercase. The generator expression 1 for line in open('filename.txt') for char in line
if char.isupper() generates 1 for each uppercase character. Finally, the sum() function is used to
add up all the 1 occurrences, resulting in the count of capital letters, which is stored in the count
variable.
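The one-liner leaves closing the file to the interpreter. A variant that closes it explicitly can be sketched with a with statement; the temporary file and its contents are made up so the example is self-contained.

```python
import os
import tempfile

# Create a small throwaway file to count in (hypothetical sample text)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Hello World\nABC def\n")
    path = tmp.name

# Same generator expression, but the file is closed when the block ends
with open(path) as f:
    count = sum(1 for line in f for char in line if char.isupper())

os.remove(path)   # clean up the temporary file
print(count)
```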
52. What is NumPy? Why should we use it?
NumPy (short for Numerical Python) is a flexible, optimized, open-source package for array
processing. It provides a powerful N-dimensional array object together with tools for working on
it at high performance. It supports scientific computing, mathematical and logical operations,
sorting, I/O, basic statistics and linear algebra, random simulation, and broadcasting. This
breadth of capability has made NumPy one of the most widely used Python packages for
numerical work.
● Python lists can store heterogeneous data types, whereas a NumPy array stores
elements of a single data type. That homogeneity is what lets NumPy provide fast
element-wise functions, which could not operate in the same way on mixed-type data.
● NumPy arrays store their elements in one contiguous block of memory rather than as
separate Python objects, so they generally use less memory than an equivalent list.
● NumPy arrays support multi-dimensional arrays.
● NumPy provides various powerful and efficient functions for complex computations on
the arrays.
● NumPy also provides a wide range of functions for bitwise operations, string
operations, linear algebra, arithmetic operations, etc. These are not provided by
Python's built-in lists.
ndarray object is the core of the NumPy package. It consists of n-dimensional arrays storing
elements of the same data types and also has many operations that are done in compiled code
for optimised performance. These arrays have fixed sizes defined at the time of creation.
Following are some of the properties of ndarrays:
● When the size of ndarrays is changed, it results in a new array and the original array is
deleted.
● The ndarrays are bound to store homogeneous data.
● They provide functions to perform advanced mathematical operations in an efficient
manner.
np.mean() method calculates the arithmetic mean and provides additional options for input and
results. For example, it has the option to specify what data types have to be taken, where the
result has to be placed etc. np.average() computes the weighted average if the weights
parameter is specified. In the case of weighted average, instead of considering that each data
point is contributing equally to the final average, it considers that some data points have more
weightage than the others (unequal contribution).
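A small sketch of the difference, assuming NumPy is installed; the values and weights are illustrative.

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0])
weights = np.array([4, 3, 2, 1])

plain_mean = np.mean(values)                      # every point counts equally
unweighted = np.average(values)                   # same result when no weights are given
weighted = np.average(values, weights=weights)    # (1*4 + 2*3 + 3*2 + 4*1) / 10

print(plain_mean, unweighted, weighted)
```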
To reverse a NumPy array, you can use the indexing and slicing feature of NumPy. Here are two
common approaches:
a. Using indexing and slicing: For a 1D array, you can use the [::-1] slicing to reverse the
array:
b. Using the np.flip() function: The np.flip() function can be used to reverse an array along a
specified axis. By default, it reverses the array along all axes.
Here's an example:
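A sketch of both approaches, assuming NumPy is installed; the arrays are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4])
rev_slice = arr[::-1]            # reversed view of a 1D array
rev_flip = np.flip(arr)          # same result via np.flip

grid = np.array([[1, 2], [3, 4]])
flipped_all = np.flip(grid)            # no axis given: every axis is reversed
flipped_rows = np.flip(grid, axis=0)   # only the row order is reversed

print(rev_slice, rev_flip)
print(flipped_all)
print(flipped_rows)
```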
58. How do you count the frequency of a given positive value appearing in the NumPy array?
We can use the np.bincount() function to count how many times each value occurs in an
array. It operates on 1D arrays of non-negative integers and returns a new array in which the
element at index i is the number of times i occurs in the input. It is particularly useful when
dealing with discrete or integer-valued data.
Example:
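A minimal sketch, assuming NumPy is installed; the input values are illustrative.

```python
import numpy as np

data = np.array([0, 1, 1, 3, 3, 3])

counts = np.bincount(data)   # counts[i] is how often i appears in data
threes = counts[3]           # frequency of the value 3

print(counts)
print(threes)
```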
Output:
59. What is Pandas in Python?
Pandas is an open-source Python package that is most commonly used for data science, data
analysis, and machine learning tasks. It is built on top of another library named NumPy. It
provides various data structures and operations for manipulating numerical data and time series
and is very efficient in performing various functions like data visualization, data manipulation,
data analysis, etc.
Pandas has three different types of data structures. These simple and flexible data
structures are what make it fast and convenient to work with.
a. Series - It is a one-dimensional array-like structure with homogeneous data which means
data of different data types cannot be a part of the same series. It can hold any data type
such as integers, floats, and strings and its values are mutable i.e. it can be changed but
the size of the series is immutable i.e. it cannot be changed.
b. DataFrame - It is a two-dimensional array-like structure with heterogeneous data. It can
contain data of different data types and the data is aligned in a tabular manner. Both size
and values of DataFrame are mutable.
c. Panel - Earlier versions of Pandas had a third data structure known as Panel, a 3D
data structure capable of storing heterogeneous data. It was never widely used and was
removed in pandas 0.25.
The Pandas library is known for efficient data analysis and for its data visualization support. The
key features of the pandas library are as follows:
● Fast and efficient DataFrame object with default and customized indexing.
● High-performance merging and joining of data.
● Data alignment and integrated handling of missing data.
● Label-based slicing, indexing, and subsetting of large data sets.
● Reshaping and pivoting of data sets.
● Tools for loading data into in-memory data objects from different file formats.
● Columns from a data structure can be deleted or inserted.
● Group by data for aggregation and transformations.
● data - It represents various forms like series, map, ndarray, lists, dict, etc.
● index - It is an optional argument that represents an index to row labels.
● columns - Optional argument for column labels.
● dtype - It represents the data type of each column. It is an optional parameter.
64. What are the different ways in which a series can be created in pandas?
In Pandas, there are several ways to create a Series, which is a one-dimensional labeled array.
Here are some common methods:
a. From a Python list: You can create a Series by passing a Python list to the pd.Series()
constructor:
b. From a NumPy array: You can create a Series from a NumPy array by passing the array
to the pd.Series() constructor:
c. From a dictionary: You can create a Series from a dictionary, where the keys of the
dictionary will be the index labels of the Series and the values will be the data:
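The three creation methods can be sketched as follows, assuming NumPy and pandas are installed; the data is illustrative.

```python
import numpy as np
import pandas as pd

from_list = pd.Series([10, 20, 30])            # default integer index 0, 1, 2
from_array = pd.Series(np.array([1.5, 2.5]))   # values taken from a NumPy array
from_dict = pd.Series({"a": 1, "b": 2})        # dict keys become the index labels

print(from_list)
print(from_array)
print(from_dict)
```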
We can create a copy of a series by using the following syntax: Series.copy(deep=True). The
default value of the deep parameter is True. With deep=True, a new object is created with a
copy of the calling object's data and indices, so modifications to the data or indices of the copy
are not reflected in the original object. With deep=False, a new object is created without copying
the calling object's data or index; only references to the data and index are copied. In that
case, changes made to the data of the original object are reflected in the shallow copy and vice
versa. Note that when pandas' Copy-on-Write mode is enabled, a shallow copy no longer
propagates such changes.
Categorical data is a discrete set of values for a particular outcome and has a fixed range. The
data in a category need not be numerical; it can be textual in nature. Examples are
gender, social class, blood type, country affiliation, observation time, etc. There is no hard and
fast rule for how many values a categorical variable should have. One should apply one's domain
knowledge to make that determination for a given data set.
● Using read_csv(): CSV is a comma-separated file i.e. any text file that uses commas as
a delimiter to separate the record values for each field. Therefore, in order to load data
from a text file we use pandas.read_csv() method.
● Using read_table(): This function is very much like the read_csv() function; the major
difference is that the default delimiter in read_table() is a tab ('\t') rather than a comma.
A different separator, such as a single space (' '), can be passed through the sep
parameter.
● Using read_fwf(): The name stands for fixed-width formatted lines. This function loads a
DataFrame from a file whose columns are separated by fixed widths rather than by a
delimiter. It also supports optionally iterating over the file or breaking it into chunks.
The iloc[] and loc[] indexers in Pandas are used to access and retrieve data from a DataFrame
or Series. Although they are often referred to as functions, both are used with square brackets
rather than parentheses. They differ in the indexing method they use:
iloc[]: It allows you to access data by specifying the integer-based positions of rows and
columns. The indexing starts from 0 for both rows and columns. You can use integer-based
slicing and indexing ranges to select specific rows or columns. The iloc[] indexer does not
include the end value when slicing with ranges. Here's an example to illustrate the usage of
iloc[]:
loc[]: The loc[] indexer is primarily used for label-based indexing. It allows you to access data
by specifying labels or boolean conditions for rows and column names. You can use label-based
slicing and indexing ranges to select specific rows or columns. The loc[] indexer includes the
end value when slicing with ranges. Here's an example to illustrate the usage of loc[]:
In summary, iloc[] is used for integer-based indexing, while loc[] is used for label-based
indexing. The choice between iloc[] and loc[] depends on whether you want to access data
based on integer positions or label names.
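A sketch contrasting position-based and label-based access, assuming pandas is installed. The DataFrame contents and the row labels r1 to r3 are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["Ann", "Bob", "Cy"], "age": [30, 25, 41]},
    index=["r1", "r2", "r3"],
)

by_position = df.iloc[0:2]                  # positions 0 and 1; end is excluded
by_label = df.loc["r1":"r2"]                # labels r1 through r2; end is included
single = df.loc["r3", "age"]                # one cell by row and column label
adults = df.loc[df["age"] > 28, "name"]     # boolean condition on rows

print(by_position)
print(by_label)
print(single)
print(adults.tolist())
```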
69. How would you convert continuous values into discrete values in Pandas?
To convert continuous values into discrete values in Pandas, you can use the pd.cut() function.
The pd.cut() function allows you to divide a continuous variable into bins and assign discrete
labels to the values based on their bin membership. Here's an example of how you can use
pd.cut() to convert continuous values into discrete categories:
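A minimal sketch, assuming pandas is installed. The ages, bin edges, and labels are illustrative choices; with the default right=True, each bin includes its right edge.

```python
import pandas as pd

ages = pd.Series([5, 17, 30, 64, 80])

# Bins are (0, 18], (18, 65], (65, 120]
groups = pd.cut(ages, bins=[0, 18, 65, 120], labels=["child", "adult", "senior"])

print(groups.astype(str).tolist())
```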
Output:
In Pandas, both the interpolate() and fillna() functions are used to fill missing or NaN (Not a
Number) values in a DataFrame or Series. However, they differ in their approach to filling the
missing values:
interpolate(): It is primarily used for filling missing values in time series or other ordered data
where the values are expected to have a smooth variation. The function estimates the missing
values based on the values of neighboring data points, using various interpolation methods
such as linear, polynomial, spline, etc.
Output:
fillna(): The fillna() function in Pandas is used to fill missing values with a specified scalar value
or with values from another DataFrame or Series. The function replaces the missing values with
the provided scalar value or with values from a specified Series or DataFrame. Here's an
example of using fillna() to fill missing values with a constant value:
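The two approaches can be compared on the same data. This is a small sketch with a made-up Series; interpolate() is called with its default linear method:

```python
import numpy as np
import pandas as pd

# Hypothetical Series with two missing values.
s = pd.Series([1.0, np.nan, 3.0, np.nan, 5.0])

interpolated = s.interpolate()  # estimates each gap from its neighbors
filled = s.fillna(0)            # replaces every gap with the constant 0

print(interpolated.tolist())  # [1.0, 2.0, 3.0, 4.0, 5.0]
print(filled.tolist())        # [1.0, 0.0, 3.0, 0.0, 5.0]
```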
71. How to add a row to a Pandas DataFrame?
To add a row to a Pandas DataFrame, you can use pd.concat() or the loc[] indexer.
Using pd.concat(): Build a one-row DataFrame with the same columns and concatenate it with
the original. Older code uses DataFrame.append() for this, but append() was deprecated in
pandas 1.4 and removed in pandas 2.0, so pd.concat() is the approach that works on current
versions. Pass ignore_index=True to renumber the rows.
Using loc[] indexing: Another approach is to assign the values directly to a row label that
does not exist yet, for example df.loc[len(df)] = [...] with the default integer index. This
modifies the DataFrame in place. If the label already exists, that row is overwritten instead.
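A minimal sketch of both approaches, using a made-up two-column DataFrame:

```python
import pandas as pd

# Hypothetical starting data.
df = pd.DataFrame({"name": ["Ann", "Bob"], "age": [31, 25]})

# Approach 1: concatenate a one-row DataFrame (returns a new DataFrame).
new_row = pd.DataFrame({"name": ["Cid"], "age": [47]})
with_concat = pd.concat([df, new_row], ignore_index=True)

# Approach 2: assign to a new label with loc[] (modifies the copy in place).
df_loc = df.copy()
df_loc.loc[len(df_loc)] = ["Dee", 38]

print(with_concat)
print(df_loc)
```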
72. Write a Pandas program to find the positions of numbers that are multiples of 5 of a
given series
To find the positions of numbers that are multiples of 5 in a given pandas Series, you
can build a boolean condition with the % operator and pass it to numpy.where(). Here's
an example program:
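A minimal sketch, assuming a made-up Series named series:

```python
import numpy as np
import pandas as pd

# Hypothetical input values.
series = pd.Series([3, 10, 7, 25, 40, 12])

# Boolean Series: True where the value is a multiple of 5.
is_multiple = series % 5 == 0

# np.where(condition) returns a tuple of arrays; element 0 holds the positions.
positions = np.where(is_multiple)[0]
print(positions)  # [1 3 4]
```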
In the above program, we create a sample pandas Series named series containing some
numbers. The condition series % 5 == 0 returns a boolean Series that is True where a
number is a multiple of 5 and False otherwise. When numpy.where() is called with only a
condition, it returns a tuple holding one array of positions per dimension, so we take
element [0] of that tuple to get the positions for our one-dimensional Series. The result
is an array of the integer positions at which the numbers are multiples of 5.
73. Write a Pandas program to display the most frequent value in a given series and
replace everything else as “replaced” in the series.
To display the most frequent value in a given pandas Series and replace everything else
with "replaced", you can use the value_counts() function to find the most frequent value
and then overwrite the remaining values through a boolean mask. Here's an example
program:
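A minimal sketch, assuming a made-up Series of strings named series:

```python
import pandas as pd

# Hypothetical input values; 'a' occurs most often.
series = pd.Series(["a", "b", "a", "c", "a", "b"])

# value_counts() is indexed by the values, so idxmax() returns the top value.
most_frequent = series.value_counts().idxmax()
print(most_frequent)  # a

# Overwrite every entry that differs from the most frequent value.
series[series != most_frequent] = "replaced"
print(series.tolist())  # ['a', 'replaced', 'a', 'replaced', 'a', 'replaced']
```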
In the above program, we create a sample pandas Series named series with some
values. We use the value_counts() function to count the occurrences of each value in
the Series and then retrieve the most frequent value using idxmax(). Because
value_counts() is indexed by the values themselves, idxmax() returns the value with the
highest count. Next, we build a boolean mask (series != most_frequent) that is True for
every value other than the most frequent one, and assign "replaced" to the masked
positions. Finally, we print the Series, in which only the most frequent value is kept and
everything else reads "replaced".
74. Write a Python program that removes vowels from a string.
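A minimal sketch of one way to write this, using the names that the explanation below refers to (remove_vowels, vowels, vowels_removed, input_string, result); the sample string is an arbitrary choice:

```python
def remove_vowels(text):
    """Return text with all ASCII vowels removed."""
    vowels = "aeiouAEIOU"
    # Keep only the characters that are not vowels.
    vowels_removed = "".join(ch for ch in text if ch not in vowels)
    return vowels_removed


input_string = "Hello, World!"
result = remove_vowels(input_string)
print(result)  # Hll, Wrld!
```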
In the above program, the remove_vowels() function takes a string as input and removes
all the vowels from it. The vowels variable stores a string containing all the vowel
characters in both lowercase and uppercase. Within the function, a generator
expression is used along with the join() function to construct a new string
(vowels_removed) by iterating over each character in the input string and only including
characters that are not vowels. You can replace the input_string variable with any string
you want to remove the vowels from. The result will be stored in the result variable and
printed to the console.
75. Write a Python program that rotates an array by two positions to the right.
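A minimal sketch based on slicing, using the names from the explanation below (rotate_array, n, rotated_arr); the sample list and the empty-input guard are assumptions:

```python
def rotate_array(arr):
    """Return a new list rotated two positions to the right."""
    n = len(arr)
    if n == 0:
        return []
    # Last two elements move to the front; the rest follow.
    rotated_arr = arr[-2:] + arr[:-2]
    return rotated_arr


print(rotate_array([1, 2, 3, 4, 5]))  # [4, 5, 1, 2, 3]
```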
In the above program, the rotate_array() function takes an array as input and rotates it
by two positions to the right. The variable n stores the length of the array. To rotate the
array, a new array rotated_arr is created by concatenating the last two elements of the
input array (arr[-2:]) with the remaining elements of the input array excluding the last two
elements (arr[:-2]).
76. Write Python code to find all non-repeating characters in a string.
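One possible approach, sketched under the assumption that "non-repeating" means a character that occurs exactly once in the string; the function name find_non_repeating and the sample word are hypothetical:

```python
from collections import Counter


def find_non_repeating(text):
    """Return the characters that occur exactly once, in original order."""
    counts = Counter(text)
    return [ch for ch in text if counts[ch] == 1]


print(find_non_repeating("programming"))  # ['p', 'o', 'a', 'i', 'n']
```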
77. Write a program to calculate the power of a number using a while loop.
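A minimal sketch of the loop described below, using the same names (calculate_power, base, exponent, result, count). It assumes a non-negative integer exponent:

```python
def calculate_power(base, exponent):
    """Compute base ** exponent by repeated multiplication."""
    result = 1
    count = 0
    while count < exponent:
        result *= base
        count += 1
    return result


print(calculate_power(2, 5))  # 32
```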
In the above program, the calculate_power() function takes two parameters: base and
exponent, and calculates the power of base raised to the exponent using a while loop.
Inside the while loop, the result variable is multiplied by base in each iteration, and the
count variable is incremented by 1. The loop continues until count reaches the
exponent. Finally, the result is returned as the calculated power.
78. Write a program to check and return the pairs of a given array A whose sum value is
equal to a target value N.
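A minimal sketch of the set-based approach described below, using the same names (find_pairs, array, target, pairs, seen, complement); the sample input is made up:

```python
def find_pairs(array, target):
    """Return (num, complement) tuples whose sum equals target."""
    pairs = []
    seen = set()
    for num in array:
        complement = target - num
        # A previously seen number completes the pair.
        if complement in seen:
            pairs.append((num, complement))
        seen.add(num)
    return pairs


print(find_pairs([2, 7, 4, 5, 1, 8], 9))  # [(7, 2), (5, 4), (8, 1)]
```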
In the above program, the find_pairs() function takes two parameters: array, which
represents the given array, and target, which represents the target sum value. We
initialize an empty list pairs to store the pairs whose sum equals the target value. We
also create an empty set seen to keep track of the numbers seen so far. We iterate over
each number num in the array. For each number, we calculate the complement by
subtracting it from the target value (complement = target - num). If the complement is
already in the seen set, it means we have found a pair whose sum equals the target
value. We create a tuple (num, complement) and append it to the pairs list. We add the
current number num to the seen set to ensure we can find pairs with it as a complement
later. Finally, the pairs list is returned as the result.
79. Write a Program to match a string that has the letter 'a' followed by 4 to 8 'b's.
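A minimal sketch using re.search(), with the function and parameter names taken from the explanation below; the second test string is an added illustration:

```python
import re


def match_string(pattern, string):
    """Return True if pattern is found anywhere in string."""
    # re.search() returns a match object, or None when nothing matches.
    match = re.search(pattern, string)
    return match is not None


print(match_string(r"a[b]{4,8}", "abbbb"))  # True: 'a' followed by 4 'b's
print(match_string(r"a[b]{4,8}", "abbb"))   # False: only 3 'b's
```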
In the above program, the match_string() function takes two parameters: pattern, which
represents the regular expression pattern, and string, which represents the input string
to be matched. We use the re.search() function from the re module to search for a
match of the pattern in the input string. The pattern a[b]{4,8} looks for an 'a' followed
by 4 to 8 'b's; the character class [b] contains a single character, so the pattern is
equivalent to ab{4,8}. If a match is found, match_string() returns True. Otherwise, it
returns False. In the example usage, we provide the input string "abbbb" and the pattern
a[b]{4,8}. The pattern matches because the string has an 'a' followed by 4 'b's, and the
program prints True. Note that re.search() accepts a match anywhere in the string, so an
'a' followed by more than 8 'b's also returns True, because its first 8 'b's satisfy the
pattern. To require that the whole string be an 'a' followed by 4 to 8 'b's, use
re.fullmatch() or anchor the pattern as ^ab{4,8}$.
80. Write a Program to convert date from yyyy-mm-dd format to dd-mm-yyyy format
using regular expression.
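A minimal sketch using re.sub() with backreferences, following the function name, pattern, and sample date given in the explanation below:

```python
import re


def convert_date(date_string):
    """Convert 'yyyy-mm-dd' to 'dd-mm-yyyy' using capture groups."""
    pattern = r"(\d{4})-(\d{2})-(\d{2})"
    # \3, \2, \1 refer to the day, month, and year groups respectively.
    return re.sub(pattern, r"\3-\2-\1", date_string)


print(convert_date("2023-05-24"))  # 24-05-2023
```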
In the above program, the convert_date() function takes a date string in the
"yyyy-mm-dd" format as input and returns the date string in the "dd-mm-yyyy" format.
We define a regular expression pattern (\d{4})-(\d{2})-(\d{2}) that matches the
"yyyy-mm-dd" format. The pattern uses three capturing groups to capture the year,
month, and day. The re.sub() function replaces each match with the replacement string
r"\3-\2-\1", whose backreferences insert the captured groups in day-month-year order.
In the example usage, the input date string "2023-05-24" is converted to "24-05-2023",
which is then printed to the console.
81. Write a Program to combine two different dictionaries. While combining, if you find
the same keys, you can add the values of these same keys. Output the new dictionary
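A minimal sketch of the copy-then-merge approach described below. The names dict1, dict2, and combined_dict come from that description; the function name combine_dicts and the sample dictionaries are assumptions:

```python
def combine_dicts(dict1, dict2):
    """Merge two dicts, adding the values of keys present in both."""
    combined_dict = dict1.copy()
    for key, value in dict2.items():
        if key in combined_dict:
            combined_dict[key] += value
        else:
            combined_dict[key] = value
    return combined_dict


print(combine_dicts({"a": 1, "b": 2}, {"b": 3, "c": 4}))  # {'a': 1, 'b': 5, 'c': 4}
```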
We create a new dictionary combined_dict and initially copy the key-value pairs from
dict1 using the copy() method. Then, we iterate over the key-value pairs in dict2. For
each key, if it already exists in combined_dict, we add the corresponding value to the
existing value. If the key doesn't exist, we simply add the key-value pair to
combined_dict. Finally, the combined_dict is returned as the result.
83. Can you provide me examples of when a scatter graph would be more appropriate
than a line chart or vice versa?
A scatter graph is more appropriate than a line chart when you want to show the
relationship between two numeric variables measured on separate observations that
have no natural order. For example, to show the relationship between people's ages and
their weights, each person is one point, and joining the points with a line would suggest
a sequence that does not exist. A line chart is more appropriate when the x-axis is
ordered and continuous, most often time, and you want to show a trend. For example,
the monthly sales of a company over the course of a year are better shown as a line
chart than as a scatter graph.
In Matplotlib, you can customize the appearance of your plots in various ways. Here are
some common customization options:
a. Titles and Labels: Set the title of the plot using plt.title() or ax.set_title(). Set
labels for the x-axis and y-axis using plt.xlabel() and plt.ylabel() or ax.set_xlabel()
and ax.set_ylabel().
b. Legends: Add a legend to your plot using plt.legend() or ax.legend(). Customize
the legend location, labels, and other properties.
c. Grid Lines: Display grid lines on the plot using plt.grid(True) or ax.grid(True).
Customize the grid appearance with options like linestyle, linewidth, and color.
d. Colors, Line Styles, and Markers: Control the colors of lines, markers, and other
plot elements using the color parameter in plotting functions. Customize line
styles (e.g., solid, dashed, dotted) using the linestyle parameter. Specify markers
(e.g., dots, triangles, squares) using the marker parameter.
e. Axis Limits and Ticks: Set custom axis limits using plt.xlim() and plt.ylim() or
ax.set_xlim() and ax.set_ylim(). Customize the appearance of ticks on the x-axis
and y-axis using plt.xticks() and plt.yticks() or ax.set_xticks() and ax.set_yticks().
f. Background and Plot Styles: Change the background color of the figure using
plt.figure(facecolor='color'), and of the plotting area using
ax.set_facecolor('color'). Apply a predefined or custom style globally using
plt.style.use('style_name'), or temporarily inside a with block using
plt.style.context('style_name').
g. Annotations and Text: Add annotations and text to your plot using plt.text() or
ax.text(). Customize the font size, color, and other properties of the text.
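Several of the options above can be combined in one figure. The following is a sketch using the Axes methods named in the list; the data, colors, and label text are arbitrary choices:

```python
import matplotlib.pyplot as plt

# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

fig, ax = plt.subplots(facecolor="whitesmoke")       # figure background
ax.set_facecolor("white")                            # plotting-area background
ax.plot(x, y, color="tab:blue", linestyle="--", marker="o", label="y = x^2")
ax.set_title("Customized plot")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend(loc="upper left")
ax.grid(True, linestyle=":", linewidth=0.5, color="gray")
ax.set_xlim(0, 6)
ax.set_ylim(0, 30)
ax.set_xticks([1, 2, 3, 4, 5])
ax.text(3, 9, "  (3, 9)", fontsize=9, color="red")   # annotation near a point
# In a script, finish with plt.show() to display the figure.
```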
86. How do you create a figure with multiple subplots using Matplotlib?
To create a figure with multiple subplots using Matplotlib, you can use the plt.subplots()
function. Here's an example that demonstrates how to create a figure with two subplots
side by side:
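A minimal sketch with one row and two columns of subplots; the plotted data and titles are made up:

```python
import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4]

# One row, two columns: plt.subplots() returns the figure and both axes.
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(8, 3))

ax_left.plot(x, [v ** 2 for v in x])
ax_left.set_title("Squares")

ax_right.plot(x, [v ** 3 for v in x])
ax_right.set_title("Cubes")

fig.tight_layout()  # avoid overlapping labels
# In a script, finish with plt.show() to display the figure.
```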
87. What is the difference between Seaborn and Matplotlib?
Seaborn is built on top of Matplotlib and provides a higher-level interface for creating
statistical graphics. While Matplotlib offers more flexibility and control over the plot
elements, Seaborn simplifies the creation of common statistical plots by providing
intuitive functions and sensible default settings. Seaborn also integrates well with
Pandas data structures.
88. How do you create a heatmap in Seaborn?
To create a heatmap in Seaborn, you can use the heatmap() function. Here's an example
that demonstrates how to create a heatmap:
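A minimal sketch following the steps explained below; the 5x5 array size is an assumption:

```python
import numpy as np
import seaborn as sns

# Random 5x5 array of values in [0, 1) to serve as heatmap data.
data = np.random.rand(5, 5)

# annot=True writes each value into its cell; cmap selects the color map.
ax = sns.heatmap(data, annot=True, cmap="YlGnBu")
# In a script, finish with plt.show() (from matplotlib.pyplot) to display it.
```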
In this example: We import the necessary modules, including seaborn as sns and numpy
as np. We create a 2D array of random values using np.random.rand(). This will serve as
our data for the heatmap. We use the sns.heatmap() function to create the heatmap.
The data array is passed as the first argument. Additional parameters can be used to
customize the appearance of the heatmap. In this example, annot=True enables the
display of data values on the heatmap, and cmap='YlGnBu' sets the color map. Finally,
we use plt.show() to display the heatmap.
89. How do you create a categorical plot (catplot) in Seaborn?
To create a categorical plot (catplot) in Seaborn, you can use the catplot() function. This
function provides a high-level interface for creating various types of categorical plots.
Here's an example that demonstrates how to use catplot():
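A minimal sketch of a box-style catplot. To stay self-contained it uses a small hand-made DataFrame as a stand-in for the 'tips' dataset (sns.load_dataset("tips") downloads the real one); the values are invented:

```python
import pandas as pd
import seaborn as sns

# Stand-in for the 'tips' dataset with the two columns used here.
tips = pd.DataFrame({
    "day": ["Thur"] * 4 + ["Fri"] * 4 + ["Sat"] * 4 + ["Sun"] * 4,
    "total_bill": [10.3, 14.8, 18.2, 27.2,
                   12.0, 15.4, 22.5, 28.9,
                   9.6, 17.1, 25.3, 44.3,
                   13.4, 20.7, 24.6, 35.8],
})

# kind="box" draws one box plot of total_bill per day.
grid = sns.catplot(x="day", y="total_bill", data=tips, kind="box")
```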
In this example: We import the necessary modules, including seaborn as sns. We load
the built-in 'tips' dataset from Seaborn using the sns.load_dataset() function. This
dataset contains information about restaurant tips. We use the sns.catplot() function to
create a categorical plot. The x parameter specifies the variable to be plotted on the
x-axis ('day' in this example), the y parameter specifies the variable to be plotted on the
y-axis ('total_bill' in this example), the data parameter specifies the dataset to use (the
'tips' dataset in this example), and the kind parameter specifies the type of categorical
plot to create ('box' plot in this example). Finally, we use plt.show() to display the
categorical plot.
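90. How do you create a distribution plot in Seaborn?
The distplot() function in Seaborn creates a distribution plot, which displays the
distribution of a univariate set of observations. Note that distplot() has been deprecated
since Seaborn 0.11; in current code, use histplot() (axes-level) or displot() (figure-level)
instead. Here's an example that demonstrates how to create a distribution plot: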
In this example: We import the necessary modules, including seaborn as sns and numpy
as np. We generate a random dataset using np.random.randn() and pass the data array
as the first argument to the plotting function. With the older sns.distplot() function,
kde=True enables the kernel density estimation line and hist=True enables the
histogram; with sns.histplot(), the histogram is always drawn and kde=True adds the
density line. Finally, we use plt.show() to display the distribution plot. The plot shows a
histogram of the data together with a kernel density estimation (KDE) curve, which is a
smooth estimate of the probability density function (PDF).
91. You have a time series dataset, and you want to visualize the trend and seasonality
in the data using Matplotlib. What type of plot would you use, and how would you create
it?
In this scenario, I would use a line plot to visualize the trend and seasonality in the time
series data. To create the line plot in Matplotlib, I would import the necessary libraries,
create a figure and axes object, and then use the plot function to plot the data points
connected by lines.
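The steps above can be sketched as follows. The monthly series is synthetic (a linear trend plus a yearly sine wave), and the 12-month rolling mean is an added assumption that makes the trend easier to see:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic monthly data: upward trend plus a 12-month seasonal cycle.
dates = pd.date_range("2020-01-01", periods=36, freq="MS")
t = np.arange(36)
series = pd.Series(100 + 2 * t + 10 * np.sin(2 * np.pi * t / 12), index=dates)

fig, ax = plt.subplots(figsize=(8, 3))
ax.plot(series.index, series.values, label="monthly value")
# A centered rolling mean smooths out the seasonal swings.
ax.plot(series.index, series.rolling(12, center=True).mean(), label="12-month rolling mean")
ax.set_xlabel("Date")
ax.set_ylabel("Value")
ax.legend()
# In a script, finish with plt.show() to display the figure.
```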
92. You have a dataset with a single continuous variable, and you want to visualize its
distribution. Would you choose a histogram or a box plot, and why?
93. You have a dataset with two continuous variables, and you want to visualize their
joint distribution and the individual distributions of each variable. Would you choose a
joint plot or a pair plot in Seaborn, and why?
In this case, I would choose a joint plot in Seaborn. A joint plot is built for exactly two
variables: it shows their joint distribution in a central scatter (or density) plot and the
individual distribution of each variable in histograms or kernel density plots along the
margins, all in a single figure. A pair plot, on the other hand, creates a grid of subplots
with one scatter plot for every pair of variables and each variable's distribution on the
diagonal. It is the better choice when the dataset has more than two variables and you
want to explore all pairwise relationships at once; with only two variables, it repeats the
same information that a joint plot shows more compactly.
94. How do you create a pie chart with an exploded slice in Matplotlib?
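A minimal sketch of such a pie chart, using the parameters named in the explanation below (explode, autopct, startangle); the sizes, colors, and title are made-up values:

```python
import matplotlib.pyplot as plt

labels = ["A", "B", "C", "D"]
sizes = [30, 25, 20, 25]
colors = ["gold", "lightcoral", "lightskyblue", "yellowgreen"]
explode = (0, 0.1, 0, 0)  # pull the second slice (B) away from the center

fig, ax = plt.subplots()
wedges, texts, autotexts = ax.pie(
    sizes,
    labels=labels,
    colors=colors,
    explode=explode,
    autopct="%1.1f%%",   # percentage label with one decimal place
    startangle=90,       # start drawing from the top
)
ax.set_title("Example pie chart")
# In a script, finish with plt.show() to display the figure.
```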
In this example, we have a pie chart with four slices represented by the values in the
sizes list. Each slice is labeled with the corresponding label from the labels list. The
colors for each slice are defined in the colors list. We explode the second slice (B) by
specifying explode=(0, 0.1, 0, 0), causing it to be slightly separated from the rest. The
autopct='%1.1f%%' formats the percentage values displayed on each slice. The
startangle=90 rotates the chart to start from the 90-degree angle (top). After creating
the chart and adding a title, we display the pie chart using plt.show().
95. What is a violin plot in Seaborn?
A violin plot in Seaborn is a data visualization that combines elements of a box plot and
a kernel density plot. It is used to visualize the distribution and density of a continuous
variable across different categories or groups. The violin plot gets its name from its
shape, which resembles a violin. The width of each violin at a given value represents the
estimated density of data points at that value. The density curve is mirrored around the
center line purely as a drawing convention; the mirroring does not mean that the data
are symmetric. With Seaborn's default inner="box" setting, a miniature box plot is drawn
inside each violin: the thick bar spans the interquartile range, the marker in the middle
of the bar is the median, and the thin line shows the whiskers. Optionally, individual
data points can be displayed instead, for example with inner="point" or inner="stick", or
by overlaying a strip plot.
Example:
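A minimal sketch of sns.violinplot(). It uses a small hand-made DataFrame as a stand-in for the 'tips' dataset (sns.load_dataset("tips") downloads the real one); the values, labels, and title are invented:

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Stand-in for the 'tips' dataset with the two columns used here.
tips = pd.DataFrame({
    "day": ["Thur"] * 4 + ["Fri"] * 4 + ["Sat"] * 4 + ["Sun"] * 4,
    "total_bill": [10.3, 14.8, 18.2, 27.2,
                   12.0, 15.4, 22.5, 28.9,
                   9.6, 17.1, 25.3, 44.3,
                   13.4, 20.7, 24.6, 35.8],
})

# One violin of total_bill per day.
ax = sns.violinplot(x="day", y="total_bill", data=tips)
plt.xlabel("Day")
plt.ylabel("Total bill")
plt.title("Total bill by day")
# In a script, finish with plt.show() to display the figure.
```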
In this example, we use the tips dataset provided by Seaborn. We create a violin plot
using sns.violinplot(), specifying the x and y variables to visualize. In this case, we plot
the "total_bill" variable on the y-axis and group it by the "day" variable on the x-axis. After
creating the plot, we add labels and a title using plt.xlabel(), plt.ylabel(), and plt.title().
Finally, we display the plot using plt.show().
96. What is a joint plot in Seaborn?
A joint plot in Seaborn is a visualization that combines a bivariate plot with two
univariate plots to explore the relationship between two variables. It shows the joint
distribution and the individual (marginal) distributions of two continuous variables in a
single figure.
The key features of a joint plot include:
a. Scatter plot: It displays the joint distribution of the two variables using a scatter
plot, where each data point is represented by a marker on a 2D plane.
b. Histograms: It shows the marginal distribution of each variable along the x and y
axes using histograms. These histograms represent the frequency or count of
the variable values.
c. Kernel density estimation (KDE) plot: With kind='kde', the scatter plot and
histograms are replaced by smooth density estimates. KDE is a non-parametric
technique for estimating the probability density function of a random variable.
The joint plot helps to visualize the relationship between two variables, identify patterns,
clusters, and potential outliers, and understand their individual distributions. The shape
of the point cloud also gives a visual impression of how strongly the variables are
correlated. To create a joint plot in Seaborn, you can use the sns.jointplot() function; its
kind parameter selects the style of the central plot, such as 'scatter' (the default), 'kde',
'hist', 'hex', or 'reg'. Joint plots are particularly useful in exploratory data analysis of
two continuous variables, because they show the joint and individual distributions in a
single figure.
97. What are the conditions where heat maps are used?
Heatmaps are commonly used in various scenarios to visualize and analyze data. Here
are some conditions and use cases where heatmaps are often employed:
a. Correlation Analysis: Heatmaps are widely used to visualize the correlation
between variables in a dataset. Each cell in the heatmap represents the
correlation coefficient between two variables, and the color intensity or shade
represents the strength of the correlation.
b. Confusion Matrix: Heatmaps are used to display confusion matrices, particularly
in machine learning classification tasks. Each cell in the heatmap represents the
count or percentage of correct and incorrect predictions for different classes.
c. Geographic Data: Heatmaps are useful for visualizing geographic or spatial data.
By mapping data onto a geographical region, such as a map, heatmaps can
represent the intensity or density of a variable across different regions.
d. Performance Monitoring: Heatmaps can be employed to monitor and analyze
performance metrics over time or across different dimensions. For example, in
web analytics, a heatmap can represent user engagement or click-through rates
for different website sections or page elements.
e. Gene Expression Analysis: In genomics, heatmaps are widely used to analyze
gene expression data. Heatmaps display the expression levels of different genes
across multiple samples or conditions. This helps identify patterns, clusters, or
specific gene expression profiles related to certain biological conditions or
treatments.
f. Financial Analysis: Heatmaps find applications in financial analysis to visualize
market data, such as stock price movements or correlation between different
assets.
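As a sketch of the first use case in the list, the snippet below draws a correlation heatmap. The DataFrame is synthetic, with column "d" constructed to correlate strongly with column "a":

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data: three independent columns plus one derived from "a".
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(100, 3)), columns=["a", "b", "c"])
df["d"] = 2 * df["a"] + rng.normal(scale=0.1, size=100)

corr = df.corr()  # pairwise Pearson correlation coefficients

# Fix the color scale to [-1, 1] so colors are comparable across plots.
ax = sns.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1)
```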
98. You have a dataset with multiple categories, and you want to compare their
proportions. Would you choose a bar chart or a pie chart, and why?
When comparing the proportions of multiple categories, I would choose a bar chart over
a pie chart. A bar chart encodes each value as a length along a common axis, which
makes it easy to compare and rank categories even when their values are close. A pie
chart encodes values as angles and areas, which are harder to compare by eye,
especially when there are many slices or the slices are similar in size. A pie chart can
still work when there are only a few categories and the goal is to show how each
contributes to a whole.
99. How do you create an area plot in Matplotlib?
To create an area plot in Matplotlib, you can use the fill_between() function to fill the
area between a curve and the x-axis, or between two curves. Here's an example of how
to create an area plot:
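A minimal sketch with two overlapping filled areas, using the names from the explanation below (months, values1, values2); the numbers and colors are made up:

```python
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
values1 = [10, 14, 12, 18, 20, 17]
values2 = [6, 9, 8, 11, 15, 13]
x = list(range(len(months)))

fig, ax = plt.subplots()
# Each call fills from the x-axis (y = 0) up to the given values.
ax.fill_between(x, values1, color="skyblue", alpha=0.5, label="values1")
ax.fill_between(x, values2, color="salmon", alpha=0.5, label="values2")
ax.set_xticks(x)
ax.set_xticklabels(months)
ax.legend()
# In a script, finish with plt.show() to display the figure.
```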
In this example, we have two sets of data represented by values1 and values2 for each
month specified in the months list. We create a figure and axes object using
plt.subplots(). Then, we use the fill_between() function twice to fill the area between the
curves formed by values1 and values2. To customize the appearance of the area plot,
you can specify the colors using the color parameter and adjust the transparency with
the alpha parameter.
100. What is a regplot in Seaborn?
A regplot is a function in Seaborn that allows you to create a scatter plot with a linear
regression line fit to the data. It is used to visualize the relationship between two
continuous variables and estimate the linear trend between them. The regplot function
in Seaborn combines the scatter plot and the linear regression line in a single plot. It
helps you understand the correlation and strength of the linear relationship between the
variables, as well as identify any potential outliers or deviations from the trend line. Here
are some key features of a regplot:
a. Scatter Plot: The regplot function creates a scatter plot by plotting the data
points of the two variables on a Cartesian coordinate system.
b. Linear Regression Line: It fits a regression line to the scatter plot using the least
squares method. The regression line represents the best-fit line that minimizes
the squared differences between the predicted and actual values.
c. Confidence Interval: By default, regplot also displays a shaded confidence
interval around the regression line. The confidence interval provides an estimate
of the uncertainty of the regression line.
d. Residual Plot: regplot itself does not draw residuals. Seaborn provides a separate
function, sns.residplot(), that plots the residuals, which are the differences between
the observed and predicted values. That plot helps identify patterns or
heteroscedasticity in the residuals.
Regplots are commonly used in exploratory data analysis and data visualization tasks
to understand the relationship between two continuous variables. They are helpful for
detecting trends, outliers, and deviations from the linear relationship. Regplots are often
used in various fields, including statistics, social sciences, economics, and machine
learning, whenever analyzing the relationship between variables is of interest. To create
a regplot in Seaborn, you can use the sns.regplot() function.
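A minimal sketch of sns.regplot() on synthetic data; the underlying line y = 2x + 1 and the noise level are arbitrary choices:

```python
import numpy as np
import seaborn as sns

# Synthetic data scattered around the line y = 2x + 1.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 2.0 * x + 1.0 + rng.normal(scale=2.0, size=100)

# Scatter plot, fitted regression line, and a 95% confidence band.
ax = sns.regplot(x=x, y=y, ci=95)
# In a script, finish with plt.show() (from matplotlib.pyplot) to display it.
```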