Union, Union All & Intersect Operators for Advanced SQL

Advanced SQL

SQL, a powerful language for managing relational databases, provides various operators to manipulate and combine data from multiple tables. Among these, the UNION, UNION ALL, and INTERSECT are advanced SQL operators that are essential for performing set operations. These operators allow us to combine, merge, and intersect result sets from different SELECT statements, providing flexibility and efficiency in data analysis.

If you wish to learn SQL and other essential technologies, you can enrol in Imarticus Learning’s postgraduate data science course.

Understanding Set Operations for Advanced SQL

Set operations in advanced SQL treat result sets as sets of rows, where each row is unique. We can combine, intersect, or exclude rows from multiple result sets by applying set operations.

The SQL Union Operator

The SQL UNION operator combines the result sets of two or more SELECT statements, eliminating duplicate rows. It's like merging two sets of data, keeping only the unique elements.

Syntax:

SELECT column1, column2, ...

FROM table1

UNION

SELECT column1, column2, ...

FROM table2;

Example: Consider two tables, customers_usa and customers_europe, each with columns customer_id and customer_name. To combine the unique customers from both regions, you can use the UNION operator:

SELECT customer_id, customer_name

FROM customers_usa

UNION

SELECT customer_id, customer_name

FROM customers_europe;

The SQL UNION ALL Operator

The UNION ALL operator combines the result sets of two or more SELECT statements, including duplicate rows. It's like concatenating the results of multiple queries.

Syntax:

SELECT column1, column2, ...

FROM table1

UNION ALL

SELECT column1, column2, ...

FROM table2;

Example: To combine all customers from both regions, including duplicates, you can use the UNION ALL operator:

SELECT customer_id, customer_name

FROM customers_usa

UNION ALL

SELECT customer_id, customer_name

FROM customers_europe;

The SQL INTERSECT Operator

The INTERSECT operator returns the rows that are present in both result sets of two SELECT statements. It's like finding the intersection of two sets.

Syntax:

SELECT column1, column2, ...

FROM table1

INTERSECT

SELECT column1, column2, ...

FROM table2;

Example: To find customers who are present in both the customers_usa and customers_europe tables, you can use the INTERSECT operator:

SELECT customer_id, customer_name

FROM customers_usa

INTERSECT

SELECT customer_id, customer_name

FROM customers_europe;

Important Considerations 

  • Column Compatibility: The SELECT statements in UNION or INTERSECT operations must consist of the same number of columns while the corresponding columns must have compatible data types.
  • Order of Rows: The order of rows in the result set is not guaranteed.
  • Performance Implications: UNION ALL operations can be more efficient than UNION, as they avoid the overhead of removing duplicates.
  • Null Values: Null values are treated as distinct values in set operations.

Advanced SQL Techniques and Optimisation

Here are some advanced SQL techniques and optimisation methods:

  • Combining Multiple Set Operations: You can combine multiple UNION, UNION ALL, and INTERSECT operations to create complex queries.
  • Using Subqueries: You can use subqueries to create temporary result sets and combine them with set operations.
  • Indexing: Create appropriate indexes on the columns involved in the set operations to improve query performance.
  • Query Optimisation: Use query optimisation techniques to minimise execution time and resource usage.

Combining Set Operations with Joins

Set operations can be combined with join operations to create complex queries involving multiple tables. We can perform sophisticated data analysis and reporting tasks by joining tables based on specific conditions and then applying set operations to the joined result sets.

Example: Consider two tables: orders and order_items. You want to find the top 10 customers who have placed the most orders in both the "US" and "EU" regions.

WITH us_orders AS (

  SELECT customer_id, COUNT(*) AS order_count

  FROM orders

  WHERE region = 'US'

  GROUP BY customer_id

),

eu_orders AS (

  SELECT customer_id, COUNT(*) AS order_count

  FROM orders

  WHERE region = 'EU'

  GROUP BY customer_id

)

SELECT customer_id, SUM(order_count) AS total_orders

FROM (

  SELECT customer_id, order_count

  FROM us_orders

  UNION ALL

  SELECT customer_id, order_count

  FROM eu_orders

) AS combined_orders

GROUP BY customer_id

ORDER BY total_orders DESC

LIMIT 10;

In this example, we first use JOIN to combine the orders and order_items tables. Then, we use UNION ALL to combine the results from the two regions. Finally, we use GROUP BY and ORDER BY to identify the top 10 customers.

Set Operations and Window Functions

Window functions can be combined with set operations to perform calculations and rankings within result sets. This allows us to analyse data in a more granular way and gain deeper insights.

Example: Consider a table of sales data with columns for product_id, region, and sales_amount. You want to find the top-selling product in each region.

WITH product_rankings AS (

  SELECT product_id, region, SUM(sales_amount) AS total_sales,

         ROW_NUMBER() OVER (PARTITION BY region ORDER BY SUM(sales_amount) DESC) AS rank

  FROM sales_data

  GROUP BY product_id, region

)

SELECT product_id, region, total_sales

FROM product_rankings

WHERE rank = 1;

In this example, we use the ROW_NUMBER() window function to rank products within each region by total sales. Then, we use a WHERE clause to filter for the top-ranked product in each region.

Real-World Applications of Set Operations

Set operations have numerous real-world applications across various industries. Some common use cases include:

  • Data Cleaning and Deduplication: Identifying and removing duplicate records from datasets.
  • Data Integration: Combining data from multiple sources into a unified view.
  • Financial Analysis: Analysing financial data to identify trends, anomalies, and potential fraud.
  • Marketing Analysis: Analysing customer data to identify target segments and optimise marketing campaigns.
  • Supply Chain Management: Optimising inventory levels and logistics operations.
  • Fraud Detection: Identifying suspicious patterns in financial transactions.

Wrapping Up

We can effectively manipulate and combine data from multiple sources to gain valuable insights by mastering the UNION, UNION ALL, and INTERSECT operators. These operators are powerful tools for data analysis and reporting, enabling you to extract the information you need.

If you wish to become an expert in SQL and other tools for data science, enrol in Imarticus Learning’s Postgraduate Program In Data Science And Analytics.

Frequently Asked Questions

What is the difference between SQL UNION ALL vs INTERSECT?

When it comes to SQL UNION ALL vs INTERSECT, UNION ALL combines the result sets of two or more SELECT statements, including all rows, even duplicates. It's like stacking the results of multiple queries on top of each other. INTERSECT, on the other hand, returns only the rows that are present in both result sets. It's like finding the common elements between two sets.

How can I optimise the performance of queries involving set operations?

To optimise performance, consider creating indexes on the columns involved in the set operations, using query optimisation techniques, and partitioning large tables. Additionally, materialising the results of complex subqueries can improve query execution time.

Can I use set operations with other SQL clauses like WHERE and GROUP BY?

Yes, you can combine set operations with other SQL clauses to create complex queries. For example, you can use a WHERE clause to filter the results of a UNION or INTERSECT operation.

What are some common mistakes to avoid when using set operations?

Common mistakes include forgetting to include all necessary columns in the SELECT statements, using incompatible data types, and not considering the order of rows in the result set. It's important to carefully plan and test your queries to avoid errors.

Share This Post

Subscribe To Our Newsletter

Get updates and learn from the best

More To Explore

Our Programs

Do You Want To Boost Your Career?

drop us a message and keep in touch