0% found this document useful (0 votes)
14 views4 pages

FAANG SQL Questions & Solutions Guide

The document provides a collection of SQL practice questions and solutions, focusing on various topics such as joins, aggregations, window functions, and common table expressions (CTEs). It includes queries for tasks like finding the highest sales, consecutive login streaks, and top spenders. The content is tailored for FAANG-level SQL practice, showcasing essential SQL techniques and functions.
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
14 views4 pages

FAANG SQL Questions & Solutions Guide

The document provides a collection of SQL practice questions and solutions, focusing on various topics such as joins, aggregations, window functions, and common table expressions (CTEs). It includes queries for tasks like finding the highest sales, consecutive login streaks, and top spenders. The content is tailored for FAANG-level SQL practice, showcasing essential SQL techniques and functions.
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Priyansh Malviya - FAANG SQL Practice Questions & Solutions

1. Employees and Departments Join

Query:

SELECT [Link], d.dept_name


FROM employees e
JOIN Departments d ON e.department_id = [Link];

2. Customer Total Amount

Query:

SELECT customer_id, SUM(amount) as total_amount


FROM Orders
GROUP BY customer_id
ORDER BY total_amount DESC;

3. Scores Dense Rank

Query:

SELECT score, DENSE_RANK() OVER (PARTITION BY exam_id ORDER BY score DESC) AS


rank
FROM Scores;

4. Lead Example for Logs

Query:

SELECT LEAD(num, 1) OVER (ORDER BY id) AS next_num


FROM Logs;

1
5. Highest Sales per Seller

Query:

SELECT seller_id, MAX(amount) AS highest_sales


FROM Sales
GROUP BY seller_id;

6. Second Highest Sale per Seller

Query:

SELECT seller_id, amount AS second_highest_sale


FROM (
SELECT seller_id, amount, ROW_NUMBER() OVER (PARTITION BY seller_id ORDER
BY amount DESC) AS rn
FROM Sales
) t
WHERE rn = 2;

7. Consecutive Login Streak

Query:

SELECT DISTINCT user_id


FROM (
SELECT user_id, activity_date,
LEAD(activity_date,1) OVER (PARTITION BY user_id ORDER BY
activity_date) AS next1,
LEAD(activity_date,2) OVER (PARTITION BY user_id ORDER BY
activity_date) AS next2
FROM Activity
) t
WHERE DATEDIFF(next1, activity_date)=1 AND DATEDIFF(next2, activity_date)=2;

8. Top Spender per Customer (Last 30 Days)

Query:

2
WITH cte AS (
SELECT user_id, SUM(amount) AS total_spent,
DENSE_RANK() OVER (ORDER BY SUM(amount) DESC) AS rank
FROM Transactions
WHERE transaction_date >= DATE_SUB(CURDATE(), INTERVAL 30 DAY)
GROUP BY user_id
)
SELECT user_id, total_spent
FROM cte
WHERE rank <= 3;

9. Longest Order Streak per Customer

Query:

WITH streaks AS (
SELECT customer_id, order_date,
order_date - INTERVAL ROW_NUMBER() OVER (PARTITION BY customer_id
ORDER BY order_date) DAY AS grp
FROM Orders
)
SELECT customer_id, COUNT(*) AS longest_streak
FROM streaks
GROUP BY customer_id, grp
ORDER BY customer_id;

10. Top Spender in Last 90 Days

Query:

WITH cte AS (
SELECT customer_id, SUM(amount) AS total_spent,
DENSE_RANK() OVER (ORDER BY SUM(amount) DESC) AS rnk
FROM Orders
WHERE order_date >= DATE_SUB(CURDATE(), INTERVAL 90 DAY)
)
SELECT customer_id, total_spent
FROM cte
WHERE rnk = 1;

3
11. Second Highest Salary per Department

Query:

WITH cte AS (
SELECT name, salary, department_id,
DENSE_RANK() OVER (PARTITION BY department_id ORDER BY salary DESC)
AS rnk
FROM Employees
)
SELECT [Link], [Link], d.dept_name
FROM cte e
JOIN Departments d ON e.department_id = [Link]
WHERE rnk = 1;

This PDF contains all SQL questions with their solutions we discussed so far, including FAANG-level
practice, window functions, ranking, lead/lag, CTEs.

Common questions

Powered by AI

To find the top spender among customers in the last 90 days, a CTE is used to filter transactions within this timeframe and calculate the total spent by each customer. The query uses SUM(amount) to aggregate spending and DENSE_RANK() to rank customers based on total spending in descending order. The CTE's result filters for customers with a rank of 1, indicating the highest spender: WITH cte AS (SELECT customer_id, SUM(amount) AS total_spent, DENSE_RANK() OVER (ORDER BY SUM(amount) DESC) AS rnk FROM Orders WHERE order_date >= DATE_SUB(CURDATE(), INTERVAL 90 DAY)) SELECT customer_id, total_spent FROM cte WHERE rnk = 1 .

The retrieval of the second highest salary per department uses a Common Table Expression (CTE) with DENSE_RANK as the primary SQL function. The query assigns a rank to salaries within each department using DENSE_RANK(), ordered by salary in descending order. The CTE filters out the salaries to keep only those ranked second. The final selection joins the results with a Departments table to fetch department names via a join operation: WITH cte AS (SELECT name, salary, department_id, DENSE_RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rnk FROM Employees) SELECT e.name, e.salary, d.dept_name FROM cte e JOIN Departments d ON e.department_id = d.id WHERE rnk = 1 .

Window functions like DENSE_RANK() provide significant benefits over traditional sorting and grouping because they allow for more nuanced data manipulation and analysis without reducing the dataset's size. Unlike grouping (GROUP BY), which collapses rows, window functions calculate rankings and other aggregates across the entire dataset within specified partitions. This permits retaining complete row details while adding additional analytical layers, such as rankings or moving averages. DENSE_RANK() specifically helps address ties more effectively by assigning the same rank to equal values, unlike ROW_NUMBER(), which would differentiate them sequentially, thus providing more meaningful insights in scenarios requiring rank-based analysis .

Effective SQL strategies for dealing with large datasets to identify top performers include using index optimizations to expedite data retrieval, applying window functions like RANK() or DENSE_RANK() to dynamically analyze data, and employing partitioning to manage data subsets efficiently. Implementing CTEs for logical separation of query components also aids in managing performance. Strategic use of WHERE clauses narrows datasets before processing, decreasing computation loads. Finally, leveraging analytical functions over raw data aggregation permits insight without limiting dataset visibility, which is crucial in performance and high-value entry analysis .

To calculate the longest order streak for each customer, a window function combined with the concept of grouping sequential data through row numbering is used. A CTE is created to assign each order a group number by subtracting the row number from the order date, ensuring orders on consecutive dates group together. The query counts the number of orders per group per customer to determine the length of each streak. The result is then ordered by customer and grouped sequence: WITH streaks AS (SELECT customer_id, order_date, order_date - INTERVAL ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date) DAY AS grp FROM Orders) SELECT customer_id, COUNT(*) AS longest_streak FROM streaks GROUP BY customer_id, grp ORDER BY customer_id .

SQL can differentiate between tied scores within a dataset while maintaining a consistent ranking sequence using the DENSE_RANK() function. Unlike RANK(), which creates gaps in ranks for ties, DENSE_RANK() assigns the same rank to identical scores but does not skip subsequent ranks. This ensures a continuous sequence where each score ranks correctly without leaving gaps. The syntax typically used is DENSE_RANK() OVER (ORDER BY [score_column] DESC), and the dataset must be ordered based on the scoring criteria to apply ranks appropriately .

Potential issues with using LEAD() and LAG() functions include performance hits due to large data volumes and logic errors when partitioning is improperly configured. Since these functions access subsequent or previous rows, they can cause inefficiencies without proper indexing or limiting datasets using WHERE clauses. Logical errors occur when incorrect partitioning or ordering happens, leading to unintended cross-reference across unrelated groupings. Mitigation strategies involve optimizing query conditions to limit unnecessary calculations, using appropriate indices for partition sorting, and ensuring correct partition and order criteria to maintain logical consistency in time series or sequential comparison tasks .

The LEAD() function is crucial for identifying consecutive login streaks as it allows the query to compare each user's login date with the subsequent two login dates. It computes the next two activity dates for each user ordered by the activity_date field. The logic then checks if the difference between these dates is exactly one and two days, respectively, confirming a consecutive streak. When both conditions are met, the user ID is selected as part of the result. This window function is essential for prefetching the next row's data and simplifies comparing consecutive rows: SELECT DISTINCT user_id FROM (SELECT user_id, activity_date, LEAD(activity_date,1) OVER (PARTITION BY user_id ORDER BY activity_date) AS next1, LEAD(activity_date,2) OVER (PARTITION BY user_id ORDER BY activity_date) AS next2 FROM Activity) t WHERE DATEDIFF(next1, activity_date) = 1 AND DATEDIFF(next2, activity_date) = 2 .

SQL's CTE enhances the readability and manageability of complex queries by allowing the breaking down of queries into modular parts. It provides a way to write more organized and understandable queries by creating temporary result sets referenced within the main query. Through CTEs, particularly when handling recursive or deeply nested SQL logic, a programmer can define separate logic units that clarify each step's function and contribute to overall query clarity and maintenance. Furthermore, CTEs can be used to avoid repeated subquery computations, which optimizes the query execution plan by focusing on necessary data transformations in a clear, orderly manner .

The ORDER BY clause enables ordering of results by their aggregated totals when calculating total sales per customer, using the syntax ORDER BY total_amount DESC. This is beneficial because it allows for prioritization of customers based on their contribution to sales, thereby facilitating insights into customer spending behavior or identifying significant revenue sources. In cases where total amounts are aggregated, such as SUM(amount), this clause ensures that the most critical data is presented prominently, which supports targeted marketing or prioritization in business strategies .

You might also like