Subquery vs JOIN in SQL — performance and readability tradeoffs

Most query planners (Postgres, MySQL 8+, SQL Server) optimize them identically for simple cases. Pick based on readability. For complex correlated subqueries, JOINs are usually faster and more predictable.

Side-by-side — same result

-- Subquery (IN)
SELECT name FROM customers
WHERE id IN (SELECT customer_id FROM orders WHERE total > 10000);

-- JOIN
SELECT DISTINCT c.name
FROM customers c
JOIN orders o ON o.customer_id = c.id
WHERE o.total > 10000;

Postgres typically rewrites both to the same execution plan. Test with EXPLAIN.

Where JOIN clearly wins — multi-column lookup

-- Painful as subquery
SELECT id FROM orders WHERE customer_id IN (
  SELECT id FROM customers WHERE city = 'mumbai' AND tier = 'gold'
);

-- Cleaner as JOIN
SELECT o.id FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE c.city = 'mumbai' AND c.tier = 'gold';

JOIN also lets you SELECT columns from both tables; subquery cannot.

Where subquery wins — anti-join (NOT EXISTS)

-- Customers with NO orders — cleaner as subquery
SELECT name FROM customers c
WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id);

-- LEFT JOIN equivalent — works but less self-documenting
SELECT c.name FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
WHERE o.id IS NULL;

When subquery is a trap — correlated subquery on a big table

-- ❌ Runs the inner query ONCE PER ROW of customers — N×M scans
SELECT name,
       (SELECT SUM(total) FROM orders WHERE customer_id = customers.id) AS total
FROM customers;

Rewrite as JOIN + GROUP BY:

-- ✅ Single scan + hash join
SELECT c.name, COALESCE(SUM(o.total), 0) AS total
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
GROUP BY c.id, c.name;

IN vs EXISTS

-- IN — best when the subquery is small or static
WHERE customer_id IN (1, 5, 9, 12)

-- EXISTS — best when the subquery is large; stops at first match
WHERE EXISTS (SELECT 1 FROM big_table WHERE big_table.fk = main.id)

Rule I use in code review

Use JOIN when you need columns from both tables.
Use EXISTS / NOT EXISTS for "does a related row exist".
Use IN with a literal list.
Avoid correlated subqueries in the SELECT list — they're the silent killer of slow queries.

-- Subquery (IN) SELECT name FROM customers WHERE id IN (SELECT customer_id FROM orders WHERE total > 10000); -- JOIN SELECT DISTINCT c.name FROM customers c JOIN orders o ON o.customer_id = c.id WHERE o.total > 10000;

-- Painful as subquery SELECT id FROM orders WHERE customer_id IN ( SELECT id FROM customers WHERE city = 'mumbai' AND tier = 'gold' ); -- Cleaner as JOIN SELECT o.id FROM orders o JOIN customers c ON c.id = o.customer_id WHERE c.city = 'mumbai' AND c.tier = 'gold';

-- Customers with NO orders — cleaner as subquery SELECT name FROM customers c WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id); -- LEFT JOIN equivalent — works but less self-documenting SELECT c.name FROM customers c LEFT JOIN orders o ON o.customer_id = c.id WHERE o.id IS NULL;

-- IN — best when the subquery is small or static WHERE customer_id IN (1, 5, 9, 12) -- EXISTS — best when the subquery is large; stops at first match WHERE EXISTS (SELECT 1 FROM big_table WHERE big_table.fk = main.id)

Subquery vs JOIN in SQL — performance and readability tradeoffs

Side-by-side — same result

Where JOIN clearly wins — multi-column lookup

Where subquery wins — anti-join (NOT EXISTS)

When subquery is a trap — correlated subquery on a big table

IN vs EXISTS

Rule I use in code review

Subquery vs JOIN in SQL — performance and readability tradeoffs

Side-by-side — same result

Where JOIN clearly wins — multi-column lookup

Where subquery wins — anti-join (NOT EXISTS)

When subquery is a trap — correlated subquery on a big table

IN vs EXISTS

Rule I use in code review

Tokens, context windows, and the O(n²) attention cost — what every dev should know

Cold starts in Azure Functions — what they are and how to mitigate

useMemo vs useCallback in React — when each one actually helps

Tokens, context windows, and the O(n²) attention cost — what every dev should know

Cold starts in Azure Functions — what they are and how to mitigate

useMemo vs useCallback in React — when each one actually helps

Related questions

Related questions