Warehouse and SQL QUESTIONS
4. Types of dimensions
5. Types of measures
6. Surrogate key
7. Constraints
9. SCD types
SQL:
24. DML, DDL, TCL
SQL
What is normalization?
SalesData table:
| Product   | Region | Sales |
|-----------|--------|-------|
| Product A | North  | 100   |
| Product B | North  | 150   |
| Product A | South  | 200   |
| Product B | South  | 250   |
| Product A | East   | 180   |
| Product B | East   | 220   |
Now, let's say we want to pivot this data so that we have one row for each product, with columns for sales in each region. The desired output would look like this:
| Product   | North | South | East |
|-----------|-------|-------|------|
| Product A | 100   | 200   | 180  |
| Product B | 150   | 250   | 220  |
Here's the SQL query to achieve this pivot:
SELECT
    Product,
    MAX(CASE WHEN Region = 'North' THEN Sales END) AS North,
    MAX(CASE WHEN Region = 'South' THEN Sales END) AS South,
    MAX(CASE WHEN Region = 'East' THEN Sales END) AS East
FROM SalesData
GROUP BY Product;
This query uses conditional aggregation to pivot the data. It creates columns for each region and fills in the sales amount for each product in the corresponding column. Finally, it groups the results by product.
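To check the pivot end to end, here is a minimal sketch that loads the sample rows into an in-memory SQLite database and runs the conditional-aggregation query. Using Python's sqlite3 module is an assumption made only to keep the example runnable; the column types are also assumed.

```python
import sqlite3

# In-memory database; table and column names follow the example above,
# the column types are an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SalesData (Product TEXT, Region TEXT, Sales INTEGER)")
conn.executemany(
    "INSERT INTO SalesData VALUES (?, ?, ?)",
    [
        ("Product A", "North", 100), ("Product B", "North", 150),
        ("Product A", "South", 200), ("Product B", "South", 250),
        ("Product A", "East", 180), ("Product B", "East", 220),
    ],
)

# Each CASE keeps Sales only for its own region (NULL otherwise);
# MAX then collapses the rows of a product into a single value per column.
pivot_rows = conn.execute(
    """
    SELECT Product,
           MAX(CASE WHEN Region = 'North' THEN Sales END) AS North,
           MAX(CASE WHEN Region = 'South' THEN Sales END) AS South,
           MAX(CASE WHEN Region = 'East'  THEN Sales END) AS East
    FROM SalesData
    GROUP BY Product
    ORDER BY Product
    """
).fetchall()

print(pivot_rows)
# [('Product A', 100, 200, 180), ('Product B', 150, 250, 220)]
```

MAX works here because each product has at most one row per region; if a product could have several rows per region, SUM would likely be the better aggregate.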
You have a table with employee information and a separate table with department information. You want to find all employees who belong to a department with more than a certain number of employees.
SELECT *
FROM employees
WHERE department_id IN (
    -- Count employees per department; 10 is the example threshold.
    SELECT department_id
    FROM employees
    GROUP BY department_id
    HAVING COUNT(*) > 10
);
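The head count has to come from the rows of employees, since a departments table normally holds one row per department. A small sketch of that idea follows, assuming Python's sqlite3 module, made-up sample rows, hypothetical emp_id and emp_name columns, and a threshold of 2 instead of 10 so the tiny data set produces a result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and sample rows, for illustration only.
conn.execute(
    "CREATE TABLE employees (emp_id INTEGER, emp_name TEXT, department_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Alice", 10), (2, "Bob", 10), (3, "Charlie", 10), (4, "David", 20)],
)

min_size = 2  # "more than a certain number of employees"

# The subquery counts employees per department and keeps the large ones;
# the outer query returns the employees of those departments.
big_dept_employees = conn.execute(
    """
    SELECT emp_name
    FROM employees
    WHERE department_id IN (
        SELECT department_id
        FROM employees
        GROUP BY department_id
        HAVING COUNT(*) > ?
    )
    ORDER BY emp_id
    """,
    (min_size,),
).fetchall()

print(big_dept_employees)
# [('Alice',), ('Bob',), ('Charlie',)]
```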
You have a table with sales data and want to
calculate the cumulative sum of sales over
time.
SELECT date,
       sales_amount,
       SUM(sales_amount) OVER (ORDER BY date) AS cumulative_sales
FROM your_table_name;
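A quick way to see the running total is to execute the window function on a few rows. The sketch below assumes Python's sqlite3 module with SQLite 3.25 or newer (the first version with window functions), a hypothetical table name daily_sales, and made-up sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# daily_sales and its rows are hypothetical sample data.
conn.execute("CREATE TABLE daily_sales (date TEXT, sales_amount INTEGER)")
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?)",
    [("2024-01-01", 100), ("2024-01-02", 50), ("2024-01-03", 25)],
)

# SUM(...) OVER (ORDER BY date) adds up every row from the first date
# through the current one, which gives the cumulative sum.
running = conn.execute(
    """
    SELECT date,
           sales_amount,
           SUM(sales_amount) OVER (ORDER BY date) AS cumulative_sales
    FROM daily_sales
    ORDER BY date
    """
).fetchall()

print(running)
# [('2024-01-01', 100, 100), ('2024-01-02', 50, 150), ('2024-01-03', 25, 175)]
```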
desc emp;
I/P
A
B
C
D
O/P
AB
AC
AD
BC
BD
CD
SELECT
    l1.letter, l2.letter
FROM
    (VALUES ('A'), ('B'), ('C'), ('D')) l1 (letter)
JOIN
    (VALUES ('A'), ('B'), ('C'), ('D')) l2 (letter) ON l1.letter < l2.letter;
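The self-join with l1.letter < l2.letter keeps each unordered pair once, which is the same as taking 2-element combinations of the sorted input. As a cross-check outside SQL (a sketch in Python, not the query itself), itertools.combinations produces the full list of pairs:

```python
from itertools import combinations

letters = ["A", "B", "C", "D"]

# combinations() yields each pair once, first element before the second
# in input order, which matches the join condition l1.letter < l2.letter
# when the input is sorted.
pairs = [first + second for first, second in combinations(letters, 2)]

print(pairs)
# ['AB', 'AC', 'AD', 'BC', 'BD', 'CD']
```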
Suppose you have a database with two tables: employees and performance_reviews. The employees table contains information about employees, and the performance_reviews table contains performance review scores for each employee. Your task is to retrieve the names of employees who have a higher average performance score than the overall average score for all employees.
employees table:
+--------+----------+
| emp_id | emp_name |
+--------+----------+
| 1      | Alice    |
| 2      | Bob      |
| 3      | Charlie  |
| 4      | David    |
+--------+----------+
performance_reviews table:
+--------+-------+
| emp_id | score |
+--------+-------+
| 1      | 85    |
| 2      | 92    |
| 3      | 78    |
| 4      | 90    |
| 1      | 88    |
| 2      | 95    |
| 3      | 80    |
| 4      | 92    |
+--------+-------+
Question:
Write an SQL query to retrieve the names of employees who have an average performance score higher than the overall average score for all employees.
Answer:
SELECT e.emp_name
FROM employees e
JOIN (
    SELECT emp_id, AVG(score) AS avg_score
    FROM performance_reviews
    GROUP BY emp_id
) pr ON e.emp_id = pr.emp_id
WHERE pr.avg_score > (SELECT AVG(score) FROM performance_reviews);
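With the sample data, the per-employee averages are 86.5, 93.5, 79 and 91, and the overall average is 700 / 8 = 87.5, so Bob and David should qualify. The sketch below runs the derived-table query against those rows, assuming Python's sqlite3 module as the engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (emp_id INTEGER, emp_name TEXT)")
conn.execute("CREATE TABLE performance_reviews (emp_id INTEGER, score INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(1, "Alice"), (2, "Bob"), (3, "Charlie"), (4, "David")],
)
conn.executemany(
    "INSERT INTO performance_reviews VALUES (?, ?)",
    [(1, 85), (2, 92), (3, 78), (4, 90), (1, 88), (2, 95), (3, 80), (4, 92)],
)

# The derived table pr holds one average per employee; the scalar subquery
# in WHERE computes the overall average across all reviews.
above_average = conn.execute(
    """
    SELECT e.emp_name
    FROM employees e
    JOIN (
        SELECT emp_id, AVG(score) AS avg_score
        FROM performance_reviews
        GROUP BY emp_id
    ) pr ON e.emp_id = pr.emp_id
    WHERE pr.avg_score > (SELECT AVG(score) FROM performance_reviews)
    ORDER BY e.emp_id
    """
).fetchall()

print(above_average)
# [('Bob',), ('David',)]
```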
max-core Value -
There is no single optimal value for the max-core
parameter. A “good” value depends on your
particular graph and the environment in which it
runs, as well as on the data.
Details -
A component’s max-core parameter determines
the maximum amount of memory the component
will consume per partition before it spills to disk.
When the value of max-core is exceeded, all input
(in the case of SORT) or the excess input (in the
cases of the other components) is dropped to
disk in the form of temporary files. This can have
a dramatic impact on performance, but it does
not mean that it is always better to increase the
value of max-core in these situations.
SORT component -
For the SORT component, 96 MB (100663296
bytes) is the default value for max-core.
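The byte figure quoted for SORT corresponds to binary megabytes (1 MB = 1024 * 1024 bytes). A tiny sketch of that conversion, with mb_to_bytes as a hypothetical helper name used only for illustration:

```python
BYTES_PER_MB = 1024 * 1024  # binary megabyte


def mb_to_bytes(megabytes: int) -> int:
    """Convert a size in binary megabytes to bytes."""
    return megabytes * BYTES_PER_MB


default_sort_max_core = mb_to_bytes(96)
print(default_sort_max_core)
# 100663296
```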
employees table:
employee_id (int, primary key)
employee_name (varchar)
department (varchar)
salaries table:
employee_id (int, foreign key referencing employees.employee_id)
salary_amount (decimal)
Write an SQL query to find the average salary for each department. Utilize an aggregate subquery to calculate the average salary for each department.
SELECT
    e.department,
    AVG(s.salary_amount) AS average_salary
FROM employees e
JOIN salaries s ON e.employee_id = s.employee_id
GROUP BY e.department;
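The question asks for an aggregate subquery, while the answer uses a join with GROUP BY. One possible subquery formulation is sketched below and compared against the join version. It assumes Python's sqlite3 module and made-up sample rows; the correlated-subquery form is one option among several, not necessarily the intended answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, "
    "employee_name VARCHAR, department VARCHAR)"
)
conn.execute(
    "CREATE TABLE salaries (employee_id INTEGER REFERENCES employees(employee_id), "
    "salary_amount DECIMAL)"
)
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Ann", "Sales"), (2, "Ben", "Sales"), (3, "Cy", "IT")],
)
conn.executemany("INSERT INTO salaries VALUES (?, ?)", [(1, 100), (2, 200), (3, 300)])

# Join + GROUP BY, as in the answer above.
join_version = conn.execute(
    """
    SELECT e.department, AVG(s.salary_amount) AS average_salary
    FROM employees e
    JOIN salaries s ON e.employee_id = s.employee_id
    GROUP BY e.department
    ORDER BY e.department
    """
).fetchall()

# Correlated aggregate subquery: one AVG computed per distinct department.
subquery_version = conn.execute(
    """
    SELECT d.department,
           (SELECT AVG(s.salary_amount)
            FROM salaries s
            JOIN employees e ON e.employee_id = s.employee_id
            WHERE e.department = d.department) AS average_salary
    FROM (SELECT DISTINCT department FROM employees) d
    ORDER BY d.department
    """
).fetchall()

print(subquery_version)
# [('IT', 300.0), ('Sales', 150.0)]
```

Both forms return the same rows on this data; they differ only for departments that have no salary rows, which the join drops and the subquery form reports with a NULL average.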
employee_ratings table:
employee_id (int, primary key)
performance_rating (int)
department (varchar)
Write an SQL query to categorize employees based on their average performance rating within their department. Assign labels such as "High Performer" if the average rating is greater than or equal to 4, "Medium Performer" if the average rating is at least 3 but less than 4, and "Low Performer" if the average rating is less than 3.
SELECT
    department,
    AVG(performance_rating) AS avg_rating,
    CASE
        WHEN AVG(performance_rating) >= 4 THEN 'High Performer'
        WHEN AVG(performance_rating) >= 3 AND AVG(performance_rating) < 4 THEN 'Medium Performer'
        ELSE 'Low Performer'
    END AS performance_category
FROM employee_ratings
GROUP BY department;
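CASE branches are evaluated in order, so the second branch only needs the lower bound: any average that reaches it has already failed the >= 4 test. The sketch below shows that shorter form, assuming Python's sqlite3 module and made-up sample ratings chosen to hit all three labels.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_ratings (employee_id INTEGER PRIMARY KEY, "
    "performance_rating INTEGER, department VARCHAR)"
)
# Hypothetical sample rows: Eng averages 4.5, Ops 3.0, Support 2.0.
conn.executemany(
    "INSERT INTO employee_ratings VALUES (?, ?, ?)",
    [
        (1, 5, "Eng"), (2, 4, "Eng"),
        (3, 3, "Ops"), (4, 3, "Ops"),
        (5, 2, "Support"), (6, 2, "Support"),
    ],
)

# The first matching WHEN wins, so '>= 3' here effectively means
# 'at least 3 but less than 4'.
categories = conn.execute(
    """
    SELECT department,
           AVG(performance_rating) AS avg_rating,
           CASE
               WHEN AVG(performance_rating) >= 4 THEN 'High Performer'
               WHEN AVG(performance_rating) >= 3 THEN 'Medium Performer'
               ELSE 'Low Performer'
           END AS performance_category
    FROM employee_ratings
    GROUP BY department
    ORDER BY department
    """
).fetchall()

print(categories)
# [('Eng', 4.5, 'High Performer'), ('Ops', 3.0, 'Medium Performer'),
#  ('Support', 2.0, 'Low Performer')]
```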
Scenario 1: Counting Male and Female Customers
I/P
INSERT INTO customers (customer_id, customer_name, gender)
VALUES
(1, 'John Doe', 'M'),
(2, 'Jane Smith', 'F'),
(3, 'Bob Johnson', 'M'),
(4, 'Alice Brown', 'F'),
(5, 'Charlie Davis', 'M'),
(6, 'Eva White', 'F');
SELECT
    COUNT(CASE WHEN gender = 'M' THEN 1 END) AS male_count,
    COUNT(CASE WHEN gender = 'F' THEN 1 END) AS female_count
FROM customers;
Expected Result:
O/P
| male_count | female_count |
|------------|--------------|
| 3          | 3            |
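COUNT ignores NULLs, and a CASE with no ELSE returns NULL for non-matching rows, which is why each expression counts only its own gender. A minimal runnable check with the six sample customers, assuming Python's sqlite3 module and assumed column types:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER, customer_name TEXT, gender TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (1, "John Doe", "M"), (2, "Jane Smith", "F"), (3, "Bob Johnson", "M"),
        (4, "Alice Brown", "F"), (5, "Charlie Davis", "M"), (6, "Eva White", "F"),
    ],
)

# CASE yields 1 for a match and NULL otherwise; COUNT skips the NULLs.
gender_counts = conn.execute(
    """
    SELECT COUNT(CASE WHEN gender = 'M' THEN 1 END) AS male_count,
           COUNT(CASE WHEN gender = 'F' THEN 1 END) AS female_count
    FROM customers
    """
).fetchone()

print(gender_counts)
# (3, 3)
```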
Scenario 2: Counting Orders with Different Status
I/P
INSERT INTO orders (order_id, customer_id, order_date, order_status)
VALUES
(101, 1, '2024-02-01', 'Pending'),
(102, 2, '2024-02-02', 'Shipped'),
(103, 3, '2024-02-03', 'Delivered'),
(104, 4, '2024-02-04', 'Shipped'),
(105, 5, '2024-02-05', 'Pending'),
(106, 6, '2024-02-06', 'Delivered');
SELECT
    COUNT(CASE WHEN order_status = 'Pending' THEN 1 END) AS pending_count,
    COUNT(CASE WHEN order_status = 'Shipped' THEN 1 END) AS shipped_count,
    COUNT(CASE WHEN order_status = 'Delivered' THEN 1 END) AS delivered_count
FROM orders;
Expected Result:
O/P
| pending_count | shipped_count | delivered_count |
|---------------|---------------|-----------------|
| 2             | 2             | 2               |
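The conditional counts return one row with a column per status. When the set of statuses is not known in advance, a plain GROUP BY gives one row per status instead. The sketch below runs both forms on the sample orders, assuming Python's sqlite3 module; the GROUP BY variant is an alternative shown for comparison, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, "
    "order_date TEXT, order_status TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (101, 1, "2024-02-01", "Pending"), (102, 2, "2024-02-02", "Shipped"),
        (103, 3, "2024-02-03", "Delivered"), (104, 4, "2024-02-04", "Shipped"),
        (105, 5, "2024-02-05", "Pending"), (106, 6, "2024-02-06", "Delivered"),
    ],
)

# One row, one column per status (conditional aggregation).
status_columns = conn.execute(
    """
    SELECT COUNT(CASE WHEN order_status = 'Pending'   THEN 1 END) AS pending_count,
           COUNT(CASE WHEN order_status = 'Shipped'   THEN 1 END) AS shipped_count,
           COUNT(CASE WHEN order_status = 'Delivered' THEN 1 END) AS delivered_count
    FROM orders
    """
).fetchone()

# One row per status (plain GROUP BY).
status_rows = conn.execute(
    """
    SELECT order_status, COUNT(*) AS order_count
    FROM orders
    GROUP BY order_status
    ORDER BY order_status
    """
).fetchall()

print(status_columns)  # (2, 2, 2)
print(status_rows)     # [('Delivered', 2), ('Pending', 2), ('Shipped', 2)]
```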
Jio
jioTV
JioCloud
AJio
MyJio
JioMart
Write an SQL query to fetch all Jio-related apps…
and write a Unix command for the same.
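One possible reading of "Jio-related" is any name that contains "jio" regardless of case. Under that assumption, a case-insensitive LIKE would do on the SQL side, and something like `grep -i 'jio' apps.txt` on the Unix side (apps.txt is a hypothetical file with one name per line). The sketch below uses Python's sqlite3 module, a hypothetical apps table with an app_name column, and one extra made-up row ('OtherApp') so the filter has something to exclude.

```python
import sqlite3

names = ["Jio", "jioTV", "JioCloud", "AJio", "MyJio", "JioMart", "OtherApp"]

conn = sqlite3.connect(":memory:")
# apps / app_name are hypothetical names for illustration.
conn.execute("CREATE TABLE apps (app_name TEXT)")
conn.executemany("INSERT INTO apps VALUES (?)", [(name,) for name in names])

# LOWER() makes the match case-insensitive on any engine;
# the % wildcards allow 'jio' anywhere in the name.
sql_matches = [
    row[0]
    for row in conn.execute(
        "SELECT app_name FROM apps "
        "WHERE LOWER(app_name) LIKE '%jio%' ORDER BY rowid"
    )
]

# Same filter in plain Python, mirroring what grep -i 'jio' would keep.
grep_like_matches = [name for name in names if "jio" in name.lower()]

print(sql_matches)
# ['Jio', 'jioTV', 'JioCloud', 'AJio', 'MyJio', 'JioMart']
```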