Advanced Interview Q&A: ADF, Databricks, Power BI
Azure Data Factory (ADF)
Scenario-Based Questions:
1. Q: You have a pipeline that loads millions of records daily from an on-premises SQL Server to Azure SQL
Database. One day, the copy activity fails without any code changes. How would you troubleshoot and
resolve the failure?
2. Q: Your client wants to implement a CDC-based delta load from SAP to Azure SQL via ADF, but SAP only
provides full extracts. How would you design a delta load strategy with minimal load time?
Technical-Based Questions:
1. Q: Explain how integration runtime works in ADF. When would you use self-hosted IR over Azure IR?
2. Q: How can you parameterize Linked Services and Datasets for reusability in ADF across multiple
environments (dev/test/prod)?
3. Q: How does ADF handle retry policies, and what are the best practices for configuring them in
mission-critical pipelines?
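ADF exposes retry count and retry interval at the activity level; a common best practice for mission-critical pipelines is bounded retries with increasing (backoff) delays, alerting only after the final attempt fails. The retry-with-backoff pattern itself can be sketched in Python (function and parameter names here are illustrative, not ADF APIs):

```python
import random
import time

def run_with_retry(action, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Retry a transient-failure-prone action with exponential backoff and jitter."""
    for attempt in range(max_retries + 1):
        try:
            return action()
        except Exception:
            if attempt == max_retries:
                raise  # surface to monitoring/alerting only after the last attempt
            # Double the wait each attempt; jitter avoids synchronized retries.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

The `sleep` parameter is injected so the behavior is testable without real waits; the same idea applies when an ADF activity's retry settings are tuned per environment.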
Databricks
Scenario-Based Questions:
1. Q: You are implementing a CDC pipeline using Delta Lake. The source system provides both insert and
delete records. How would you design the pipeline in Databricks to handle this efficiently using DLT or Auto
Loader?
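In Databricks this is usually Auto Loader streaming the change feed into a Delta `MERGE` (or DLT's `APPLY CHANGES INTO`): matched delete events remove the row, everything else upserts. The merge semantics can be sketched dependency-free in plain Python (the `op`/`id` event schema is an assumption for illustration):

```python
def apply_cdc(target: dict, events: list[dict], key: str = "id") -> dict:
    """Apply ordered CDC events to a keyed target (dict of key -> row).

    Mirrors Delta MERGE semantics: deletes remove the matched row,
    inserts/updates upsert by key.
    """
    for ev in events:
        if ev["op"] == "delete":
            target.pop(ev[key], None)  # WHEN MATCHED AND op = 'delete' THEN DELETE
        else:
            target[ev[key]] = {k: v for k, v in ev.items() if k != "op"}  # upsert
    return target
```

Ordering matters: events must be applied in change-sequence order (DLT's `APPLY CHANGES` takes a sequencing column for exactly this reason).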
2. Q: A data team complains that a notebook job is running slower after new columns were added to the
source table. How would you diagnose and resolve the performance regression?
Technical-Based Questions:
1. Q: Explain the differences between OPTIMIZE, VACUUM, and ZORDER BY in Delta Lake. When and how
should each be used?
2. Q: How would you implement a slowly changing dimension Type 2 (SCD2) logic using PySpark in
Databricks?
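In Databricks, SCD2 is typically a Delta `MERGE` that expires the current row (set `is_current = false`, stamp `end_date`) when tracked attributes change, then inserts the new version. The core logic can be sketched without Spark so it runs anywhere (column names `start_date`/`end_date`/`is_current` are conventional choices, not fixed APIs):

```python
from datetime import date

def scd2_upsert(dim: list[dict], incoming: dict, key: str = "id",
                today: date = date(2024, 1, 1)) -> list[dict]:
    """Type 2 upsert: close the current row when attributes change, append a new version."""
    for row in dim:
        if row[key] == incoming[key] and row["is_current"]:
            current_attrs = {k: row[k] for k in incoming if k != key}
            if current_attrs == {k: incoming[k] for k in incoming if k != key}:
                return dim  # no attribute change; keep the current row as-is
            row["is_current"] = False  # expire the old version
            row["end_date"] = today
    dim.append({**incoming, "start_date": today, "end_date": None, "is_current": True})
    return dim
```

In PySpark the same comparison is expressed as a `MERGE` with two branches: `WHEN MATCHED AND attributes differ` updates the flags, and a unioned insert adds the new version row.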
3. Q: What are the pros and cons of using Delta Live Tables (DLT) over traditional notebooks for data
pipeline orchestration?
Power BI
Scenario-Based Questions:
1. Q: Your report is slow when filtering on slicers, and visuals take 10+ seconds to render. How would you go
about diagnosing and improving performance?
A: - Disable auto-date/time.
2. Q: A user requests row-level security (RLS) based on department and region. Departments may span
multiple regions. How do you implement this dynamic RLS in Power BI?
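The usual design is a security (bridge) table mapping each user principal name to the (department, region) pairs they may see, with a DAX role filter driven by `USERPRINCIPALNAME()`. The filtering semantics can be sketched in Python (the table contents and column names below are illustrative assumptions):

```python
def rows_visible_to(user: str, facts: list[dict], security: list[dict]) -> list[dict]:
    """Keep only fact rows whose (department, region) pair the user is mapped to,
    mimicking a bridge-table-based dynamic RLS filter."""
    allowed = {(s["department"], s["region"]) for s in security if s["user"] == user}
    return [r for r in facts if (r["department"], r["region"]) in allowed]
```

Because a department can span multiple regions, the bridge table simply holds one row per allowed pair; the role filter then tests pair membership rather than filtering department and region independently (which would wrongly grant the cross product).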
Technical-Based Questions:
1. Q: Explain the differences between Import, DirectQuery, and Composite models. When should each be
used?
2. Q: How do you handle circular dependency errors in complex DAX measures or calculated columns?
A: - Use DAX variables (VAR) to break the circular reference chain.
3. Q: What are the best practices for designing a Power BI data model for large-scale datasets (e.g., over 1
billion rows)?
A: - Apply aggregations.
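The aggregations practice named above means pre-summarizing the fact table at a coarser grain so most visuals never scan the billion detail rows. A minimal sketch of building such an aggregate in Python (the `region`/`sales` schema is an assumption for illustration):

```python
from collections import defaultdict

def build_agg(facts: list[dict], dims: tuple = ("region",), measure: str = "sales"):
    """Pre-aggregate a detail fact table to the given dimension grain."""
    totals = defaultdict(float)
    for row in facts:
        totals[tuple(row[d] for d in dims)] += row[measure]
    return dict(totals)
# Visuals at region grain then hit the small aggregate instead of the detail table.
```

In Power BI this corresponds to an Import-mode aggregation table sitting in front of a DirectQuery detail table, with the engine transparently redirecting queries that the aggregate can answer.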