Mastering SQL Joins: Exclusive Scenarios Using CTEs, Window Functions, and Aggregations

The Data Engineer’s Guide to Joins: Enhanced with CTEs, Window Functions, and Aggregations

Introduction

Today, fast-paced industries operate in competitive digital arenas. Studios and platforms are constantly under pressure to differentiate themselves in a crowded market where user loyalty can shift with a single trend. This environment drives a consistent need for robust data frameworks that power strategic and operational decision-making. Among the critical demands are: Real-Time Player Behavior Analytics, Dynamic Monetization Pipelines, Data-Driven Game Design Optimization, Churn Prediction & Engagement Models, Scalable Predictive Trend Systems, Streamlined ETL for Cost Efficiency, and Competitive Intelligence Data Frameworks.

Behind all of these high-level capabilities lies a simple but powerful truth: data must be integrated, transformed, and analyzed efficiently. This is where SQL, and particularly joins combined with aggregate functions, subqueries, and conditional logic, becomes indispensable. Whether it is merging player activity logs with monetization data, linking demographic datasets with churn predictions, or combining market intelligence with in-game performance metrics, SQL joins enable the seamless connectivity of data sources that fuel these frameworks.

The following set of SQL join scenarios is designed with this reality in mind. They not only test technical proficiency in joining and querying relational data but also echo the very challenges faced by fast-paced industries, where data-driven solutions are the cornerstone of innovation, engagement, and market competitiveness.

Question 1 – Basic INNER JOIN with Aggregation

Scenario: A company tracks salaries across European offices. The HR manager wants the average salary per country only for countries with employees in both IT and Finance.

Question: List the country and average salary of employees working in both IT and Finance departments.

Tables:

  • employees(emp_id, name, salary, dept_id, country)
  • departments(dept_id, dept_name)

Answer:

SELECT e.country, AVG(e.salary) AS avg_salary
FROM employees e
JOIN departments d ON e.dept_id = d.dept_id
WHERE d.dept_name IN ('IT', 'Finance')
GROUP BY e.country
HAVING COUNT(DISTINCT d.dept_name) = 2; 

Practical Tips:

  1. Use HAVING COUNT(DISTINCT dept_name) = 2 to keep only countries that have employees in both departments.
  2. Join on the correct key (dept_id).
  3. Group by country, not department.
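
To see how the HAVING filter behaves, here is a minimal sketch that runs the query with Python's built-in sqlite3 module. The sample rows are invented for illustration; only the table and column names come from the scenario.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the scenario.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, salary REAL,
                        dept_id INTEGER, country TEXT);
INSERT INTO departments VALUES (1, 'IT'), (2, 'Finance'), (3, 'Sales');
INSERT INTO employees VALUES
    (1, 'Ana',   5000, 1, 'Spain'),    -- IT
    (2, 'Luis',  7000, 2, 'Spain'),    -- Finance
    (3, 'Marta', 9000, 3, 'Spain'),    -- Sales: removed by the WHERE filter
    (4, 'Jonas', 6000, 1, 'Germany');  -- Germany has IT staff only
""")

rows = conn.execute("""
    SELECT e.country, AVG(e.salary) AS avg_salary
    FROM employees e
    JOIN departments d ON e.dept_id = d.dept_id
    WHERE d.dept_name IN ('IT', 'Finance')
    GROUP BY e.country
    HAVING COUNT(DISTINCT d.dept_name) = 2
""").fetchall()

print(rows)  # [('Spain', 6000.0)]
```

Germany drops out because it has only one of the two departments, and the Spanish average covers IT and Finance staff only, since the Sales row is filtered out before grouping.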

Question 2 – LEFT JOIN with Filtering

Scenario: A company in Spain wants all employees and whether they belong to any project.

List all Spanish employees and show the project name if assigned, otherwise “No Project”.

  • employees(emp_id, name, country)
  • projects(proj_id, proj_name)
  • employee_projects(emp_id, proj_id)

Answer can be like this:

SELECT e.name,
       COALESCE(p.proj_name, 'No Project') AS project_name
FROM employees e
LEFT JOIN employee_projects ep ON e.emp_id = ep.emp_id
LEFT JOIN projects p ON ep.proj_id = p.proj_id
WHERE e.country = 'Spain'; 

Tips to observe:

  1. Use LEFT JOIN for unmatched rows.
  2. COALESCE handles NULLs.
  3. Filter on base table country.
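
The following sketch runs the same query against a few invented rows using Python's sqlite3 module (the data and the ORDER BY are assumptions added to make the output deterministic). It shows one detail worth knowing: an employee assigned to several projects appears once per project.

```python
import sqlite3

# Hypothetical rows; only the table and column names come from the scenario.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, country TEXT);
CREATE TABLE projects (proj_id INTEGER PRIMARY KEY, proj_name TEXT);
CREATE TABLE employee_projects (emp_id INTEGER, proj_id INTEGER);
INSERT INTO employees VALUES (1, 'Ana', 'Spain'), (2, 'Luis', 'Spain'),
                             (3, 'Jonas', 'Germany');
INSERT INTO projects VALUES (10, 'Atlas'), (11, 'Borealis');
INSERT INTO employee_projects VALUES (1, 10), (1, 11), (3, 10);
""")

rows = conn.execute("""
    SELECT e.name,
           COALESCE(p.proj_name, 'No Project') AS project_name
    FROM employees e
    LEFT JOIN employee_projects ep ON e.emp_id = ep.emp_id
    LEFT JOIN projects p ON ep.proj_id = p.proj_id
    WHERE e.country = 'Spain'
    ORDER BY e.name, project_name
""").fetchall()

for name, project in rows:
    print(name, project)
# Ana Atlas
# Ana Borealis
# Luis No Project
```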

Question 3 – Subquery with Aggregation

Scenario: In Germany, management wants employees earning above the departmental average.

Return employee names, salaries, department, and country if above department average.

  • employees(emp_id, name, salary, dept_id, country)
  • departments(dept_id, dept_name)

Answer:

SELECT e.name, e.salary, d.dept_name, e.country
FROM employees e
JOIN departments d ON e.dept_id = d.dept_id
WHERE e.country = 'Germany'
  AND e.salary > (
    SELECT AVG(e2.salary)
    FROM employees e2
    WHERE e2.dept_id = e.dept_id
);

Tips for this one:

  1. The correlated subquery ensures a per-department comparison.
  2. Alias the table inside the subquery so the correlation is unambiguous.
  3. Use > to find above-average earners.
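
Here is a small runnable sketch of the correlated subquery using Python's sqlite3 module. It assumes the scenario's "In Germany" means filtering the outer query on country while the average is still taken over the whole department; the sample rows are invented.

```python
import sqlite3

# Hypothetical rows; the department average here spans all countries.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER,
                        dept_id INTEGER, country TEXT);
INSERT INTO departments VALUES (1, 'IT');
INSERT INTO employees VALUES
    (1, 'Jonas', 4000, 1, 'Germany'),
    (2, 'Lena',  8000, 1, 'Germany'),
    (3, 'Ana',   9000, 1, 'Spain');   -- above average, but not in Germany
""")

rows = conn.execute("""
    SELECT e.name, e.salary, d.dept_name, e.country
    FROM employees e
    JOIN departments d ON e.dept_id = d.dept_id
    WHERE e.country = 'Germany'
      AND e.salary > (
        SELECT AVG(e2.salary)        -- IT average is 7000
        FROM employees e2
        WHERE e2.dept_id = e.dept_id -- correlation with the outer row
    )
""").fetchall()

print(rows)  # [('Lena', 8000, 'IT', 'Germany')]
```

If the average should instead cover German employees only, the subquery would also need a country condition.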

Question 4 – CASE with Joins

Scenario: In Italy and Poland, categorize employees based on salary ranges.

Show name, department, country, and salary category.

  • employees(emp_id, name, salary, dept_id, country)
  • departments(dept_id, dept_name)

Answer:

SELECT e.name,
       d.dept_name,
       e.country,
       CASE
           WHEN e.salary > 7000 THEN 'High'
           WHEN e.salary BETWEEN 4000 AND 7000 THEN 'Medium'
           ELSE 'Low'
       END AS salary_category
FROM employees e
JOIN departments d ON e.dept_id = d.dept_id
WHERE e.country IN ('Italy', 'Poland'); 

Tips in-hand:

  1. Use CASE for categorization.
  2. BETWEEN is inclusive at both ends, so salaries of exactly 4000 and 7000 both fall into 'Medium'.
  3. Use IN for multiple countries.

Question 5 – FULL OUTER JOIN

Scenario: In Luxembourg and UK, employee and salary data may be inconsistent.

List all employees and their salaries, showing unmatched records too.

  • employees(emp_id, name, country)
  • salaries(emp_id, amount)

Answer:

SELECT COALESCE(e.name, 'Unknown Employee') AS employee_name,
       COALESCE(e.country, 'Unknown Country') AS country,
       COALESCE(s.amount, 0) AS salary
FROM employees e
FULL OUTER JOIN salaries s ON e.emp_id = s.emp_id
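WHERE e.country IN ('Luxembourg', 'UK') OR e.country IS NULL;

Tips:

  1. FULL OUTER JOIN preserves all rows from both tables.
  2. COALESCE replaces NULLs.
  3. Keep the OR e.country IS NULL condition; without it, the country filter discards salary rows that have no matching employee.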
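
Some engines and older versions do not support FULL OUTER JOIN. As a hedged sketch, the same "keep unmatched rows from both sides" result can be emulated with a LEFT JOIN plus a UNION ALL of the orphaned salary rows. The example below uses Python's sqlite3 module with invented data, and it leaves NULLs visible instead of applying COALESCE so the unmatched rows are easy to spot.

```python
import sqlite3

# Hypothetical rows: emp_id 99 has a salary but no employee record,
# and Owen has an employee record but no salary.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, country TEXT);
CREATE TABLE salaries (emp_id INTEGER, amount INTEGER);
INSERT INTO employees VALUES (1, 'Claire', 'Luxembourg'), (2, 'Owen', 'UK');
INSERT INTO salaries VALUES (1, 5200), (99, 4100);
""")

# First branch: every matching employee, with a salary when one exists.
# Second branch: salary rows that match no employee at all.
rows = conn.execute("""
    SELECT e.name AS employee_name, e.country, s.amount AS salary
    FROM employees e
    LEFT JOIN salaries s ON e.emp_id = s.emp_id
    WHERE e.country IN ('Luxembourg', 'UK')
    UNION ALL
    SELECT NULL, NULL, s.amount
    FROM salaries s
    WHERE NOT EXISTS (SELECT 1 FROM employees e WHERE e.emp_id = s.emp_id)
    ORDER BY salary
""").fetchall()

for row in rows:
    print(row)
# ('Owen', 'UK', None)
# (None, None, 4100)
# ('Claire', 'Luxembourg', 5200)
```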

Question 6 – Self JOIN

Scenario: In Greece, employees are mentors to juniors. Each mentor is also an employee.

List each mentor with their juniors’ names.

  • employees(emp_id, name, country, mentor_id)

Answer:

SELECT m.name AS mentor_name,
       j.name AS junior_name
FROM employees j
JOIN employees m ON j.mentor_id = m.emp_id
WHERE j.country = 'Greece'; 

Question 7 – Window Function (RANK)

Scenario: In Malta, management wants the top 3 highest salaries per department.

Show employee name, department, salary, and rank within their department.

  • employees(emp_id, name, salary, dept_id, country)
  • departments(dept_id, dept_name)

Answer:

SELECT e.name, d.dept_name, e.salary,
       RANK() OVER (PARTITION BY d.dept_name ORDER BY e.salary DESC) AS salary_rank
FROM employees e
JOIN departments d ON e.dept_id = d.dept_id
WHERE e.country = 'Malta'
QUALIFY salary_rank <= 3; 

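Tips:

  1. Use RANK() for ordered ranking; tied salaries share a rank and the next rank is skipped.
  2. PARTITION BY department resets the rank per department.
  3. QUALIFY is not standard SQL (it is available in, for example, Snowflake, BigQuery, and Teradata). Elsewhere, wrap the ranked query in a subquery or CTE and filter on salary_rank.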
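
For databases without QUALIFY, a subquery gives the same result. The sketch below runs that variant with Python's sqlite3 module (window functions need SQLite 3.25 or later); the rows are invented, and the tie at 8000 shows how RANK() leaves a gap.

```python
import sqlite3

# Hypothetical rows with a salary tie to show how RANK() handles it.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER,
                        dept_id INTEGER, country TEXT);
INSERT INTO departments VALUES (1, 'IT');
INSERT INTO employees VALUES
    (1, 'Ada',  9000, 1, 'Malta'),
    (2, 'Ben',  8000, 1, 'Malta'),
    (3, 'Cara', 8000, 1, 'Malta'),
    (4, 'Dov',  7000, 1, 'Malta'),
    (5, 'Eli',  9500, 1, 'Spain');  -- filtered out before ranking
""")

rows = conn.execute("""
    SELECT name, dept_name, salary, salary_rank
    FROM (
        SELECT e.name, d.dept_name, e.salary,
               RANK() OVER (PARTITION BY d.dept_name
                            ORDER BY e.salary DESC) AS salary_rank
        FROM employees e
        JOIN departments d ON e.dept_id = d.dept_id
        WHERE e.country = 'Malta'
    ) ranked
    WHERE salary_rank <= 3
    ORDER BY dept_name, salary_rank, name
""").fetchall()

for row in rows:
    print(row)
# ('Ada', 'IT', 9000, 1)
# ('Ben', 'IT', 8000, 2)
# ('Cara', 'IT', 8000, 2)
```

Dov receives rank 4 because of the tie, so he is excluded. If exactly three rows per department are required, ROW_NUMBER() would be the function to reach for; DENSE_RANK() would instead keep the three highest distinct salaries.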

Question 8 – CROSS JOIN

Scenario: A survey is run in Spain pairing each employee with all projects to test assignment capacity. List all possible employee-project combinations in Spain.

  • employees(emp_id, name, country)
  • projects(proj_id, proj_name)

Answer:

SELECT e.name, p.proj_name
FROM employees e
CROSS JOIN projects p
WHERE e.country = 'Spain'; 

Tips:

  1. CROSS JOIN produces Cartesian product.
  2. Always filter to avoid explosion.
  3. Use for simulations/testing assignments.

Question 9 – EXISTS Clause

Scenario: In Germany, list employees who are assigned to at least one project. Find employees in Germany that exist in employee_projects.

  • employees(emp_id, name, country)
  • employee_projects(emp_id, proj_id)

Answer:

SELECT e.name
FROM employees e
WHERE e.country = 'Germany'
AND EXISTS (
    SELECT 1 FROM employee_projects ep
    WHERE ep.emp_id = e.emp_id
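);

Tips:

  1. EXISTS can stop at the first match, which makes it efficient.
  2. Use a correlated subquery.
  3. It is often at least as fast as a JOIN for existence checks, and it never duplicates rows when an employee has several projects.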
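
A short sketch of why EXISTS suits existence checks, using Python's sqlite3 module and invented rows: a plain JOIN returns an employee once per project, while EXISTS returns each employee at most once.

```python
import sqlite3

# Hypothetical rows: Jonas has two projects, Lena has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, country TEXT);
CREATE TABLE employee_projects (emp_id INTEGER, proj_id INTEGER);
INSERT INTO employees VALUES (1, 'Jonas', 'Germany'), (2, 'Lena', 'Germany');
INSERT INTO employee_projects VALUES (1, 10), (1, 11);
""")

# JOIN: one output row per matching project assignment.
join_rows = conn.execute("""
    SELECT e.name
    FROM employees e
    JOIN employee_projects ep ON ep.emp_id = e.emp_id
    WHERE e.country = 'Germany'
""").fetchall()

# EXISTS: one output row per employee, regardless of project count.
exists_rows = conn.execute("""
    SELECT e.name
    FROM employees e
    WHERE e.country = 'Germany'
      AND EXISTS (SELECT 1 FROM employee_projects ep
                  WHERE ep.emp_id = e.emp_id)
""").fetchall()

print(join_rows)    # [('Jonas',), ('Jonas',)]
print(exists_rows)  # [('Jonas',)]
```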

Question 10 – NOT EXISTS

Scenario: In UK, management wants employees not assigned to any project. List employees without any project.

  • employees(emp_id, name, country)
  • employee_projects(emp_id, proj_id)

Answer:

SELECT e.name
FROM employees e
WHERE e.country = 'UK'
AND NOT EXISTS (
    SELECT 1 FROM employee_projects ep
    WHERE ep.emp_id = e.emp_id
); 

Tips for the query:

  1. NOT EXISTS is clean for anti-joins.
  2. Ensure correlated subquery.
  3. Alternative: LEFT JOIN + IS NULL.
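
The two anti-join forms mentioned in the tips can be compared side by side. This is a minimal sketch with Python's sqlite3 module and invented rows; the ORDER BY is added only to make the comparison deterministic.

```python
import sqlite3

# Hypothetical rows: only Owen is assigned to a project.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, country TEXT);
CREATE TABLE employee_projects (emp_id INTEGER, proj_id INTEGER);
INSERT INTO employees VALUES (1, 'Owen', 'UK'), (2, 'Priya', 'UK'),
                             (3, 'Tom', 'UK');
INSERT INTO employee_projects VALUES (1, 10);
""")

not_exists_rows = conn.execute("""
    SELECT e.name
    FROM employees e
    WHERE e.country = 'UK'
      AND NOT EXISTS (SELECT 1 FROM employee_projects ep
                      WHERE ep.emp_id = e.emp_id)
    ORDER BY e.name
""").fetchall()

# Same result via LEFT JOIN: unmatched employees have NULL in ep.emp_id.
left_join_rows = conn.execute("""
    SELECT e.name
    FROM employees e
    LEFT JOIN employee_projects ep ON ep.emp_id = e.emp_id
    WHERE e.country = 'UK' AND ep.emp_id IS NULL
    ORDER BY e.name
""").fetchall()

print(not_exists_rows)  # [('Priya',), ('Tom',)]
print(left_join_rows)   # [('Priya',), ('Tom',)]
```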

Question 11 – Aggregate with GROUP BY

Scenario: In Italy, calculate total salary cost per department.

Now, return department and sum of salaries.

  • employees(emp_id, name, salary, dept_id, country)
  • departments(dept_id, dept_name)

Answer:

SELECT d.dept_name, SUM(e.salary) AS total_salary
FROM employees e
JOIN departments d ON e.dept_id = d.dept_id
WHERE e.country = 'Italy'
GROUP BY d.dept_name; 

Tips to observe:

  1. Always GROUP BY non-aggregated columns.
  2. Use SUM for totals.
  3. Restrict rows with WHERE before grouping.

Question 12 – HAVING vs WHERE

Scenario: In Poland, return departments with an average salary above 5000.

Show dept_name, avg_salary for qualifying departments.

  • employees(emp_id, name, salary, dept_id, country)
  • departments(dept_id, dept_name)

Answer:

SELECT d.dept_name, AVG(e.salary) AS avg_salary
FROM employees e
JOIN departments d ON e.dept_id = d.dept_id
WHERE e.country = 'Poland'
GROUP BY d.dept_name
HAVING AVG(e.salary) > 5000; 

Tips:

  1. WHERE filters rows before aggregation.
  2. HAVING filters after aggregation.
  3. Always group by dept_name.
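
The order of operations (WHERE first, then GROUP BY, then HAVING) is easier to see with data. This sketch uses Python's sqlite3 module with invented rows: the low Spanish salary is removed by WHERE before the averages are computed, and HAVING then drops the department whose average is too low.

```python
import sqlite3

# Hypothetical rows. Without the WHERE filter, Ana's 2000 would pull the
# IT average below 5000.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER,
                        dept_id INTEGER, country TEXT);
INSERT INTO departments VALUES (1, 'IT'), (2, 'Sales');
INSERT INTO employees VALUES
    (1, 'Ola',   6000, 1, 'Poland'),
    (2, 'Piotr', 5000, 1, 'Poland'),
    (3, 'Ewa',   4000, 2, 'Poland'),
    (4, 'Jan',   5500, 2, 'Poland'),
    (5, 'Ana',   2000, 1, 'Spain');
""")

rows = conn.execute("""
    SELECT d.dept_name, AVG(e.salary) AS avg_salary
    FROM employees e
    JOIN departments d ON e.dept_id = d.dept_id
    WHERE e.country = 'Poland'      -- row-level filter, before grouping
    GROUP BY d.dept_name
    HAVING AVG(e.salary) > 5000     -- group-level filter, after aggregation
""").fetchall()

print(rows)  # [('IT', 5500.0)]  (Sales averages 4750 and is dropped)
```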

Question 13 – UNION

Scenario: Combine employee lists from Spain and Germany. Return unique employee names.

  • employees(emp_id, name, country)

Answer:

SELECT name FROM employees WHERE country = 'Spain'
UNION
SELECT name FROM employees WHERE country = 'Germany'; 

Tips:

  1. UNION removes duplicates.
  2. Columns must align.
  3. For all rows, use UNION ALL.
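
A quick comparison of UNION and UNION ALL, sketched with Python's sqlite3 module on invented rows. Note that deduplicating on name alone merges two different people who happen to share a name; selecting emp_id as well would keep them apart.

```python
import sqlite3

# Hypothetical rows: two different employees are both called Alex.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, country TEXT);
INSERT INTO employees VALUES (1, 'Alex', 'Spain'), (2, 'Alex', 'Germany'),
                             (3, 'Marta', 'Spain');
""")

union_rows = conn.execute("""
    SELECT name FROM employees WHERE country = 'Spain'
    UNION
    SELECT name FROM employees WHERE country = 'Germany'
    ORDER BY name
""").fetchall()

union_all_rows = conn.execute("""
    SELECT name FROM employees WHERE country = 'Spain'
    UNION ALL
    SELECT name FROM employees WHERE country = 'Germany'
    ORDER BY name
""").fetchall()

print(union_rows)      # [('Alex',), ('Marta',)]
print(union_all_rows)  # [('Alex',), ('Alex',), ('Marta',)]
```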

Question 14 – INTERSECT

Scenario: Find employees who worked in both UK and Luxembourg offices.

Return names present in both countries.

Tables:

  • employees(emp_id, name, country)

Answer:

SELECT name FROM employees WHERE country = 'UK'
INTERSECT
SELECT name FROM employees WHERE country = 'Luxembourg'; 

Tips:

  1. INTERSECT finds common rows.
  2. Supported in some SQL dialects only.
  3. Alternatives: JOIN or IN queries.
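
Where INTERSECT is unavailable, an IN subquery is one of the alternatives the tips mention. The sketch below, using Python's sqlite3 module and invented rows, checks that both forms return the same names. It assumes, as the scenario does, that a person who worked in both offices appears as two rows with the same name.

```python
import sqlite3

# Hypothetical rows: Claire has a record in both offices.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, country TEXT);
INSERT INTO employees VALUES (1, 'Claire', 'UK'), (2, 'Claire', 'Luxembourg'),
                             (3, 'Owen', 'UK');
""")

intersect_rows = conn.execute("""
    SELECT name FROM employees WHERE country = 'UK'
    INTERSECT
    SELECT name FROM employees WHERE country = 'Luxembourg'
    ORDER BY name
""").fetchall()

# Alternative without INTERSECT; DISTINCT mirrors its duplicate removal.
in_rows = conn.execute("""
    SELECT DISTINCT name
    FROM employees
    WHERE country = 'UK'
      AND name IN (SELECT name FROM employees WHERE country = 'Luxembourg')
    ORDER BY name
""").fetchall()

print(intersect_rows)  # [('Claire',)]
print(in_rows)         # [('Claire',)]
```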

Question 15 – EXCEPT

Scenario: List employees in Greece who are not in Poland.

Return employees present only in Greece.

  • employees(emp_id, name, country)

Answer:

SELECT name FROM employees WHERE country = 'Greece'
EXCEPT
SELECT name FROM employees WHERE country = 'Poland';

Tips:

  1. EXCEPT removes rows that also appear in the second query.
  2. The result set is distinct.
  3. Not available in all databases (Oracle, for example, traditionally uses MINUS).

Question 16 – JOIN + DISTINCT

Scenario: In Spain, find distinct departments employees belong to.

Return list of departments in Spain.

  • employees(emp_id, name, dept_id, country)
  • departments(dept_id, dept_name)

Answer:

SELECT DISTINCT d.dept_name
FROM employees e
JOIN departments d ON e.dept_id = d.dept_id
WHERE e.country = 'Spain';

Tips:

  1. DISTINCT ensures unique rows.
  2. Use JOIN to link dept_id.
  3. Filter base table by country.

Question 17 – Multi-table JOIN

Scenario: In Germany, show employees, their project names, and department names.

Return employee name, dept_name, proj_name.

Tables:

  • employees(emp_id, name, dept_id, country)
  • departments(dept_id, dept_name)
  • employee_projects(emp_id, proj_id)
  • projects(proj_id, proj_name)

Answer (be careful about typos):

SELECT e.name, d.dept_name, p.proj_name
FROM employees e
JOIN departments d ON e.dept_id = d.dept_id
JOIN employee_projects ep ON e.emp_id = ep.emp_id
JOIN projects p ON ep.proj_id = p.proj_id
WHERE e.country = 'Germany';

Tips:

  1. Multiple JOINs combine data.
  2. Always map foreign keys.
  3. Filter by country at end.

Question 18 – CTE

Scenario: In UK, list employees who earn more than the average salary.

Question: Use a CTE to calculate the average, then filter employees.

  • employees(emp_id, name, salary, country)

Answer:

WITH avg_salary AS (
    SELECT AVG(salary) AS avg_sal
    FROM employees
    WHERE country = 'UK'
)
SELECT e.name, e.salary
FROM employees e
CROSS JOIN avg_salary a
WHERE e.country = 'UK' AND e.salary > a.avg_sal;

Tips:

  1. CTE improves readability.
  2. Use aggregate inside CTE.
  3. Join CTE with main query.
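
As a runnable sketch of this pattern, the example below uses Python's sqlite3 module with invented rows. It joins the single-row CTE with an explicit CROSS JOIN, which is one reasonable way to write the join; an employee earning exactly the average is not returned because the comparison is strict.

```python
import sqlite3

# Hypothetical rows: the UK average is 6000; Ana is outside the UK.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER,
                        country TEXT);
INSERT INTO employees VALUES (1, 'Owen', 4000, 'UK'), (2, 'Priya', 6000, 'UK'),
                             (3, 'Tom', 8000, 'UK'), (4, 'Ana', 20000, 'Spain');
""")

rows = conn.execute("""
    WITH avg_salary AS (
        SELECT AVG(salary) AS avg_sal   -- one row: the UK average
        FROM employees
        WHERE country = 'UK'
    )
    SELECT e.name, e.salary
    FROM employees e
    CROSS JOIN avg_salary a             -- attach the average to every row
    WHERE e.country = 'UK' AND e.salary > a.avg_sal
""").fetchall()

print(rows)  # [('Tom', 8000)]
```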

Question 19 – Recursive CTE

Scenario: In Luxembourg, the company tracks its hierarchy of managers. Find the reporting chains.

Kindly list each employee and their top manager 🙂

  • employees(emp_id, name, manager_id, country)

Answer:

WITH RECURSIVE hierarchy AS (
    SELECT emp_id, name, manager_id, name AS top_manager
    FROM employees
    WHERE manager_id IS NULL AND country = 'Luxembourg'
    UNION ALL
    SELECT e.emp_id, e.name, e.manager_id, h.top_manager
    FROM employees e
    JOIN hierarchy h ON e.manager_id = h.emp_id
)
SELECT name AS employee_name, top_manager FROM hierarchy;

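Tips:

  1. Recursive CTEs navigate hierarchies.
  2. The anchor query starts the recursion; here it selects the top managers (rows with no manager_id).
  3. Combine the anchor and recursive parts with UNION ALL.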
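
To return each employee together with their top manager, one option is to carry the top manager's name down through the recursion. The sketch below does this with Python's sqlite3 module; the top_manager column and the sample rows are assumptions added for illustration.

```python
import sqlite3

# Hypothetical rows: Claire heads the Luxembourg chain, Owen a separate UK one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT,
                        manager_id INTEGER, country TEXT);
INSERT INTO employees VALUES
    (1, 'Claire', NULL, 'Luxembourg'),
    (2, 'Marc',   1,    'Luxembourg'),
    (3, 'Nina',   2,    'Luxembourg'),
    (4, 'Owen',   NULL, 'UK'),
    (5, 'Pia',    4,    'UK');
""")

rows = conn.execute("""
    WITH RECURSIVE hierarchy AS (
        -- Anchor: top managers in Luxembourg are their own top manager.
        SELECT emp_id, name, name AS top_manager
        FROM employees
        WHERE manager_id IS NULL AND country = 'Luxembourg'
        UNION ALL
        -- Recursive step: each report inherits the top manager of its manager.
        SELECT e.emp_id, e.name, h.top_manager
        FROM employees e
        JOIN hierarchy h ON e.manager_id = h.emp_id
    )
    SELECT name, top_manager FROM hierarchy ORDER BY emp_id
""").fetchall()

for row in rows:
    print(row)
# ('Claire', 'Claire')
# ('Marc', 'Claire')
# ('Nina', 'Claire')
```

Only the anchor is filtered by country, so a report based elsewhere would still appear if their chain leads up to a Luxembourg top manager.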

Question 20 – LAG Function

Scenario: In Italy, track salary changes over time.

List each employee's salary alongside their previous salary.

  • salary_history(emp_id, salary, change_date)

Answer:

SELECT emp_id, salary, 
       LAG(salary) OVER (PARTITION BY emp_id ORDER BY change_date) AS prev_salary
FROM salary_history; 
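
Here is a small sketch of LAG in action, using Python's sqlite3 module (window functions need SQLite 3.25 or later) and invented rows inserted out of order. The first record for each employee has no predecessor, so prev_salary is NULL. The salary_history table as listed has no country column, so restricting the result to Italy would require a join to employees, which is not shown here.

```python
import sqlite3

# Hypothetical rows, deliberately inserted out of chronological order.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE salary_history (emp_id INTEGER, salary INTEGER, change_date TEXT);
INSERT INTO salary_history VALUES
    (1, 3000, '2022-01-01'),
    (1, 3500, '2023-01-01'),
    (2, 4000, '2022-06-01'),
    (1, 3300, '2022-07-01');
""")

rows = conn.execute("""
    SELECT emp_id, salary,
           LAG(salary) OVER (PARTITION BY emp_id
                             ORDER BY change_date) AS prev_salary
    FROM salary_history
    ORDER BY emp_id, change_date
""").fetchall()

for row in rows:
    print(row)
# (1, 3000, None)
# (1, 3300, 3000)
# (1, 3500, 3300)
# (2, 4000, None)
```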

Conclusion

Agile industries thrive on their ability to harness data for innovation, competitiveness, and long-term player engagement. With challenges like real-time analytics, churn prediction, monetization strategies, and competitive intelligence, the demand for scalable and efficient data frameworks continues to grow. At the core of these frameworks is the ability to query, integrate, and analyze relational data effectively.

The SQL scenarios explored here — from basic joins to advanced techniques like window functions, recursive CTEs, and subqueries — reflect the real-world complexities data engineers face in this industry. They simulate situations where insights must be drawn from diverse datasets spanning multiple countries. That said, mastering SQL joins is more than a functional skill; it is a direct enabler of data-driven storytelling, operational excellence, and strategic foresight in highly competitive markets. As the industry continues to evolve, engineers who can transform raw data into actionable intelligence through SQL will remain indispensable in shaping the future of fast-paced industries.
