SQL Interview Guide for Three Previous Posts
(0-3 Years)
17-19 lpa
SQL Questions
1. Write a query to find duplicate rows in a table.
To detect duplicates, identify columns that should be unique and group by them.
Example:
SELECT column1, column2, COUNT(*) AS count
FROM your_table
GROUP BY column1, column2
HAVING COUNT(*) > 1;
Explanation:
GROUP BY combines rows with the same values in the specified columns.
HAVING COUNT(*) > 1 filters those combinations that occur more than once, indicating
duplicates.
Tip: Use ROW_NUMBER() or RANK() in a CTE to flag or delete duplicates if needed.
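One way the tip could look in practice is sketched below, using Python's built-in sqlite3 module so it runs without a database server. The `id` primary key and the sample rows are assumptions added for illustration, and DELETE syntax with a CTE varies between databases, so treat this as a sketch rather than a portable statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE your_table (id INTEGER PRIMARY KEY, column1 TEXT, column2 TEXT);
    INSERT INTO your_table (column1, column2) VALUES
        ('a', 'x'), ('a', 'x'), ('b', 'y'), ('a', 'x'), ('c', 'z');
""")

# Number the rows inside each (column1, column2) group.
# Row 1 is the copy we keep; rn > 1 marks the extra copies.
conn.execute("""
    WITH ranked AS (
        SELECT id,
               ROW_NUMBER() OVER (
                   PARTITION BY column1, column2 ORDER BY id
               ) AS rn
        FROM your_table
    )
    DELETE FROM your_table
    WHERE id IN (SELECT id FROM ranked WHERE rn > 1)
""")

remaining = conn.execute(
    "SELECT column1, column2 FROM your_table ORDER BY id"
).fetchall()
print(remaining)
```

ORDER BY id inside the window decides which copy survives; here the earliest inserted row is kept.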
◆ Example: ROW_NUMBER()
Assigns a unique sequential number to each row within a partition.
SELECT name, department, salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num
FROM employees;
Each employee within the same department gets a row number based on salary rank
(highest first).
◆ Example: RANK()
Assigns the same rank to rows with equal values, but skips the next rank(s).
SELECT name, department, salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rank_num
FROM employees;
If the two highest-paid employees in a department have the same salary, both get rank 1 and the next employee gets rank 3.
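The difference between the two functions is easiest to see on tied rows. The sketch below runs both against a small, made-up employees table (names and salaries are illustrative) using Python's sqlite3 module, which supports window functions in SQLite 3.25 and later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Asha', 'Sales', 900), ('Ben', 'Sales', 900), ('Chen', 'Sales', 700);
""")

# name is added as a tie-breaker so ROW_NUMBER() is deterministic;
# RANK() orders by salary only, so the tied rows share a rank.
rows = conn.execute("""
    SELECT name,
           ROW_NUMBER() OVER (
               PARTITION BY department ORDER BY salary DESC, name
           ) AS row_num,
           RANK() OVER (
               PARTITION BY department ORDER BY salary DESC
           ) AS rank_num
    FROM employees
    ORDER BY row_num
""").fetchall()

for row in rows:
    print(row)
```

Asha and Ben tie on salary: ROW_NUMBER() still gives them 1 and 2, while RANK() gives both 1 and jumps to 3 for Chen.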
Example: UNION vs. UNION ALL
SELECT city FROM customers
UNION
SELECT city FROM vendors;
→ Returns a unique list of cities.
SELECT city FROM customers
UNION ALL
SELECT city FROM vendors;
→ Returns all cities, including duplicates.
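To check the two behaviours side by side, the sketch below loads a few made-up cities into in-memory customers and vendors tables with Python's sqlite3 module. The city values are assumptions chosen so that one city appears in both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (city TEXT);
    CREATE TABLE vendors (city TEXT);
    INSERT INTO customers VALUES ('Pune'), ('Delhi');
    INSERT INTO vendors VALUES ('Delhi'), ('Mumbai');
""")

# UNION removes duplicate rows across both inputs.
union_rows = sorted(r[0] for r in conn.execute(
    "SELECT city FROM customers UNION SELECT city FROM vendors"))

# UNION ALL keeps every row, so 'Delhi' appears twice.
union_all_rows = sorted(r[0] for r in conn.execute(
    "SELECT city FROM customers UNION ALL SELECT city FROM vendors"))

print(union_rows)
print(union_all_rows)
```

UNION ALL skips the de-duplication step, so it is usually the cheaper choice when duplicates are acceptable or impossible.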
WITH cte_name AS (
SELECT ...
)
SELECT * FROM cte_name;
-- Bad
SELECT * FROM orders;
-- Good
SELECT order_id, customer_id FROM orders;
◆ 2. Create proper indexes
Index columns that are frequently used in JOIN, WHERE, and ORDER BY clauses.
◆ 3. Avoid functions on indexed columns
-- Better
WHERE order_date BETWEEN '2024-01-01' AND '2024-12-31'
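The effect of wrapping an indexed column in a function can be observed in the query plan. The sketch below uses SQLite through Python's sqlite3 module; the orders schema, the index name idx_order_date, and the function-wrapped "before" query are assumptions for illustration, and other databases report plans in a different format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        order_date TEXT
    );
    CREATE INDEX idx_order_date ON orders(order_date);
""")

def plan(sql):
    """Return the detail text of SQLite's query plan for one statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

# Function applied to the indexed column: the index cannot be used for lookup.
wrapped = plan(
    "SELECT * FROM orders WHERE strftime('%Y', order_date) = '2024'")

# Plain range on the column: the index can be searched directly.
ranged = plan(
    "SELECT * FROM orders "
    "WHERE order_date BETWEEN '2024-01-01' AND '2024-12-31'")

print(wrapped)
print(ranged)
```

If order_date also stores a time of day, a half-open range (order_date >= '2024-01-01' AND order_date < '2025-01-01') avoids missing rows from late on December 31.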
◆ 4. Use EXISTS instead of IN (for subqueries)
15. Write a query to find all customers who have not made
any purchases in the last 6 months.
Assume:
customers(customer_id, name)
transactions(customer_id, transaction_date)
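Under the two tables assumed above, one possible answer is a NOT EXISTS check, sketched here with Python's sqlite3 module. The sample rows and the fixed AS_OF reference date are assumptions that keep the result reproducible; a live query would typically use the current date, and date arithmetic syntax differs between databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE transactions (customer_id INTEGER, transaction_date TEXT);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
    INSERT INTO transactions VALUES (1, '2024-05-20'), (2, '2023-09-01');
""")

AS_OF = "2024-06-30"  # hypothetical reference date instead of "today"

# Keep customers for whom no transaction exists inside the 6-month window.
# This also catches customers who have never purchased at all.
inactive = conn.execute("""
    SELECT c.customer_id, c.name
    FROM customers c
    WHERE NOT EXISTS (
        SELECT 1
        FROM transactions t
        WHERE t.customer_id = c.customer_id
          AND t.transaction_date >= date(?, '-6 months')
    )
    ORDER BY c.customer_id
""", (AS_OF,)).fetchall()

print(inactive)
```

Ben's last purchase is older than six months and Chen has no purchases, so both are returned; Asha is excluded.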
⬛ 4. Conditional checks:
SELECT name,
CASE
WHEN salary IS NULL THEN 'Unknown'
ELSE 'Known'
END AS salary_status
FROM employees;
⬛ Example:
-- Creating index
CREATE INDEX idx_customer_id ON transactions(customer_id);
This helps queries like:
SELECT * FROM transactions WHERE customer_id = 101;
Important notes:
Too many indexes can slow down INSERT, UPDATE, and DELETE, because every index must be maintained on each write.
Avoid indexing columns with low cardinality (e.g., gender).
Use composite indexes when querying multiple columns together.
25. How would you determine the Average Revenue Per User
(ARPU) from transaction data?
◆ ARPU = Total Revenue / Total Number of Users
⬛ Assume a transactions table:
(transaction_id, customer_id, amount, transaction_date)
⬛ SQL Query:
SELECT
SUM(amount) * 1.0 / COUNT(DISTINCT customer_id) AS ARPU
FROM transactions;
Explanation:
SUM(amount) gets total revenue.
COUNT(DISTINCT customer_id) counts unique users who appear in the transactions table.
Multiplying by 1.0 forces decimal division instead of integer division.
You can also compute monthly ARPU by grouping by month. DATE_TRUNC('month', ...) is PostgreSQL-style syntax; other databases use different date functions.
SELECT
DATE_TRUNC('month', transaction_date) AS month,
SUM(amount) * 1.0 / COUNT(DISTINCT customer_id) AS monthly_arpu
FROM transactions
GROUP BY month
ORDER BY month;
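A small worked example makes the arithmetic concrete. The sketch below uses Python's sqlite3 module with made-up transaction rows; since SQLite has no DATE_TRUNC, strftime('%Y-%m', ...) stands in for the month bucket.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (
        transaction_id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        amount REAL,
        transaction_date TEXT
    );
    INSERT INTO transactions VALUES
        (1, 101, 100.0, '2024-01-05'),
        (2, 101,  50.0, '2024-01-20'),
        (3, 102, 150.0, '2024-01-11'),
        (4, 101,  80.0, '2024-02-02');
""")

# Overall ARPU: 380 total revenue / 2 distinct customers.
overall_arpu = conn.execute("""
    SELECT SUM(amount) * 1.0 / COUNT(DISTINCT customer_id)
    FROM transactions
""").fetchone()[0]

# Monthly ARPU: same ratio, computed per calendar month.
monthly_arpu = conn.execute("""
    SELECT strftime('%Y-%m', transaction_date) AS month,
           SUM(amount) * 1.0 / COUNT(DISTINCT customer_id) AS monthly_arpu
    FROM transactions
    GROUP BY month
    ORDER BY month
""").fetchall()

print(overall_arpu)
print(monthly_arpu)
```

Note that the denominator only counts customers with at least one transaction in the period. If ARPU should be measured against all registered users, the count would need to come from a customers table instead.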
◆ Real-life Scenario:
Question: List all customers and their transactions — even if they haven't made any.
⬛ Query:
SELECT c.customer_id, c.name, t.transaction_id, t.amount
FROM customers c
LEFT JOIN transactions t
ON c.customer_id = t.customer_id;
Why LEFT JOIN?
It returns all customers, including those with no transactions (the transaction columns are NULL for those rows).
An INNER JOIN would exclude customers with zero activity.
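The contrast between the two joins can be checked on a tiny dataset. In the sketch below (Python's sqlite3 module, made-up rows), one customer has a transaction and one does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE transactions (
        transaction_id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        amount REAL
    );
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO transactions VALUES (10, 1, 250.0);
""")

SELECT_LIST = "SELECT c.customer_id, c.name, t.transaction_id, t.amount "

# LEFT JOIN keeps Ben, with NULL (None in Python) in the transaction columns.
left_rows = conn.execute(
    SELECT_LIST +
    "FROM customers c LEFT JOIN transactions t "
    "ON c.customer_id = t.customer_id ORDER BY c.customer_id"
).fetchall()

# INNER JOIN drops Ben because no transaction row matches.
inner_rows = conn.execute(
    SELECT_LIST +
    "FROM customers c INNER JOIN transactions t "
    "ON c.customer_id = t.customer_id ORDER BY c.customer_id"
).fetchall()

print(left_rows)
print(inner_rows)
```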
29. Write a query to find customers who have used more than
2 credit cards for transactions in a given month.
Assume a transactions table:
(customer_id, card_id, transaction_date)
⬛ Query:
SELECT customer_id,
TO_CHAR(transaction_date, 'YYYY-MM') AS txn_month,
COUNT(DISTINCT card_id) AS cards_used
FROM transactions
GROUP BY customer_id, TO_CHAR(transaction_date, 'YYYY-MM')
HAVING COUNT(DISTINCT card_id) > 2;
Explanation:
Groups by customer_id and month.
Counts the distinct card_id values used.
Keeps only groups where more than 2 cards were used in a month.
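The same grouping logic can be tried on sample data. The sketch below uses Python's sqlite3 module with made-up rows; TO_CHAR is not available in SQLite, so strftime('%Y-%m', ...) is used as an equivalent month key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (
        customer_id INTEGER,
        card_id TEXT,
        transaction_date TEXT
    );
    INSERT INTO transactions VALUES
        (1, 'A', '2024-03-01'), (1, 'B', '2024-03-10'),
        (1, 'C', '2024-03-15'), (1, 'A', '2024-03-20'),
        (1, 'D', '2024-04-01'),
        (2, 'X', '2024-03-02'), (2, 'Y', '2024-03-05');
""")

# One group per customer per month; DISTINCT ignores repeat use of a card.
multi_card = conn.execute("""
    SELECT customer_id,
           strftime('%Y-%m', transaction_date) AS txn_month,
           COUNT(DISTINCT card_id) AS cards_used
    FROM transactions
    GROUP BY customer_id, strftime('%Y-%m', transaction_date)
    HAVING COUNT(DISTINCT card_id) > 2
    ORDER BY customer_id, txn_month
""").fetchall()

print(multi_card)
```

Customer 1 used cards A, B, and C in March (A twice, counted once), so only that group passes the filter. Customer 2 used exactly 2 cards, which does not satisfy "more than 2".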
◆ Step 5: Segmentation
Use clustering or thresholds to group premium customers into:
o High spenders
o Frequent spenders
o Category loyalists (e.g., only travel)
Identify anomalies or subgroups with unique patterns.
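A threshold-based version of this segmentation could be sketched as below. Everything specific here is an assumption for illustration: the transactions(customer_id, category, amount) layout, the sample rows, and the cut-offs HIGH_SPEND and FREQUENT_TXNS. Real thresholds would come from the data, and clustering would replace the CASE expression entirely.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (customer_id INTEGER, category TEXT, amount REAL);
    INSERT INTO transactions VALUES
        (1, 'travel', 1200),
        (2, 'grocery', 50), (2, 'grocery', 50), (2, 'grocery', 50),
        (2, 'fuel', 50), (2, 'fuel', 50),
        (3, 'travel', 100), (3, 'travel', 100),
        (4, 'grocery', 50), (4, 'fuel', 50);
""")

HIGH_SPEND = 1000    # hypothetical total-spend threshold
FREQUENT_TXNS = 5    # hypothetical transaction-count threshold

# Aggregate per customer, then assign the first matching segment.
segments = conn.execute("""
    WITH per_customer AS (
        SELECT customer_id,
               SUM(amount) AS total_spend,
               COUNT(*) AS txn_count,
               COUNT(DISTINCT category) AS categories
        FROM transactions
        GROUP BY customer_id
    )
    SELECT customer_id,
           CASE
               WHEN total_spend >= ? THEN 'High spender'
               WHEN txn_count >= ? THEN 'Frequent spender'
               WHEN categories = 1 THEN 'Category loyalist'
               ELSE 'Other'
           END AS segment
    FROM per_customer
    ORDER BY customer_id
""", (HIGH_SPEND, FREQUENT_TXNS)).fetchall()

print(segments)
```

The segments can overlap (customer 1 is also a single-category buyer), so the order of the WHEN branches decides which label wins.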