Tools: How SQL JOINs Finally Clicked for Me (and How They Can for You Too)

Tools: How SQL JOINs Finally Clicked for Me (and How They Can for You Too)

Source: Dev.to

The Big Block That Held Me Back from Understanding ## The Setup: Two Simple Tables ## Customers ## Orders ## CROSS JOIN — Why This One Feels Wrong (and That’s Normal) ## INNER JOIN — The Join That made Joins To Finally Make Sense ## The rule ## What happens ## LEFT JOIN ## Result ## RIGHT JOIN ## FULL OUTER JOIN ## JOIN Conditions Are Not Always Equality ## Everything came together but there was a Mistake I Discovered I Was Making ## The mistake ## The fix I learnt ## My Conclusions ## Quick Summary Today's post is not a “perfect” explanation of SQL joins. It’s the explanation I wish I had when joins first confused and kept me in place for some time. And I’m writing this from the perspective of someone who: If SQL joins have ever made you feel lost, overwhelmed this is for you. At first, joins felt like this: “Some magic SQL thing where tables are smashed together somehow.” Truly that's how it felt at the time. I understood the idea of what it was supposed to do. Merge tables together based on a common column. All of that was wrong. All assumptions from simply just seeing the syntax and expecting my brain to follow along. I didn’t understand that joins are just comparisons between rows. It wasn't magic. It was just row-by-row matching with rules. Let's take a very a very simple and normal business scenario: customers placing orders. In this table we have: Customer 4 does not exist and Customer 3 has no orders Cross joins confused me at first but after actually applying it and seeing what it does. I finally understood that it: Pairs every row in the first table with every row in the second. This is something called a Cartesian Product. For example: This query produces 12 rows: How? 3 customers × 4 orders = 12 rows This was my biggest breakthrough. Only keep rows where the join condition matches in both tables INNER JOIN is strict. So if there's no match, the row is completely dropped. LEFT JOIN clicked faster for me. Keep everything from the left table. In this case it's the customer table that's on the left. So we keep the left table in place and match the customer_id of both tables and then join them into one table. So in this join table, any order that doesn't match the left table is filled in with Nulls Charlie finally shows up — with NULLs. LEFT JOIN answers questions like: “Show me all customers, even those who haven’t ordered yet.” RIGHT JOIN is just LEFT JOIN from the other direction. So it keeps the right table in place, matches based on the join condition and joins the table and any row on the left that doesn't match the condition is filled out with Nulls. This is where both Sides Matter. FULL JOIN keeps all matching rows and all unmatched rows from both tables. This can be useful for data reconciliation or finding missing or orphaned records This surprised me as most of the examples I'd seen were equalities. Rather I discovered you can join on: So I kept practicing and learning and I started thinking of filtering results from the join tables and well I discovered how a lack of understanding of how SQL works could cause silent bugs. This turns your LEFT JOIN into an INNER JOIN. This is because the WHERE runs after the join and the rows with NULL orders are filtered out. But that was not what I was trying to query from my data. Unmatched customers are preserved. Adding the filtering to the join ensures that I get the left join needed but I also filter for the data need. Joins are not scary. I kept pushing my learning of SQL at joins because most of the tutorials I used weren't explaining it in a way I understood. But after getting over my fears, I'm so happy that I was able to fully understand this concept so well. Row-by-row comparisons with rules about what to keep. If joins ever made you feel lost, that’s normal. It doesn’t mean you’re bad at SQL. It means you’re finally learning how it actually works. — Jessica Aki Data and Database Engineering Enthusiast I’m documenting my journey into data engineering learning SQL, databases, and the systems behind modern data platforms. I would love you to join me on this. Templates let you quickly answer FAQs or store snippets for re-use. Are you sure you want to hide this comment? It will become hidden in your post, but will still be visible via the comment's permalink. Hide child comments as well For further actions, you may consider blocking this person and/or reporting abuse CODE_BLOCK: select * from customers; Enter fullscreen mode Exit fullscreen mode CODE_BLOCK: select * from customers; CODE_BLOCK: select * from customers; CODE_BLOCK: select * from orders; Enter fullscreen mode Exit fullscreen mode CODE_BLOCK: select * from orders; CODE_BLOCK: select * from orders; CODE_BLOCK: select * from customers cross join orders; Enter fullscreen mode Exit fullscreen mode CODE_BLOCK: select * from customers cross join orders; CODE_BLOCK: select * from customers cross join orders; CODE_BLOCK: select * from customers inner join orders on customers.customer_id = orders.customer_id; Enter fullscreen mode Exit fullscreen mode CODE_BLOCK: select * from customers inner join orders on customers.customer_id = orders.customer_id; CODE_BLOCK: select * from customers inner join orders on customers.customer_id = orders.customer_id; CODE_BLOCK: select * from customers left join orders on customers.customer_id = orders.customer_id; Enter fullscreen mode Exit fullscreen mode CODE_BLOCK: select * from customers left join orders on customers.customer_id = orders.customer_id; CODE_BLOCK: select * from customers left join orders on customers.customer_id = orders.customer_id; CODE_BLOCK: select * from orders left join customers on customers.customer_id = orders.customer_id; Enter fullscreen mode Exit fullscreen mode CODE_BLOCK: select * from orders left join customers on customers.customer_id = orders.customer_id; CODE_BLOCK: select * from orders left join customers on customers.customer_id = orders.customer_id; CODE_BLOCK: select * from customers full join orders on customers.customer_id = orders.customer_id; Enter fullscreen mode Exit fullscreen mode CODE_BLOCK: select * from customers full join orders on customers.customer_id = orders.customer_id; CODE_BLOCK: select * from customers full join orders on customers.customer_id = orders.customer_id; CODE_BLOCK: select * from customers c join orders o on length(c.customer_name) = length(o.product); Enter fullscreen mode Exit fullscreen mode CODE_BLOCK: select * from customers c join orders o on length(c.customer_name) = length(o.product); CODE_BLOCK: select * from customers c join orders o on length(c.customer_name) = length(o.product); CODE_BLOCK: select * from customers c left join orders o on c.customer_id = o.customer_id where o.product = 'Laptop'; Enter fullscreen mode Exit fullscreen mode CODE_BLOCK: select * from customers c left join orders o on c.customer_id = o.customer_id where o.product = 'Laptop'; CODE_BLOCK: select * from customers c left join orders o on c.customer_id = o.customer_id where o.product = 'Laptop'; CODE_BLOCK: select * from customers c left join orders o on c.customer_id = o.customer_id and o.product = 'Laptop'; Enter fullscreen mode Exit fullscreen mode CODE_BLOCK: select * from customers c left join orders o on c.customer_id = o.customer_id and o.product = 'Laptop'; CODE_BLOCK: select * from customers c left join orders o on c.customer_id = o.customer_id and o.product = 'Laptop'; - Understood tables and primary keys - Had an understanding of foreign keys - Knew joins were important - Understood Normalization and Denormalization - But absolutely did not understand what was actually happening during a join - The database “knew” how to join tables - JOIN meant “merge tables” - LEFT and RIGHT were arbitrary keywords - Alice → has an order → keep - Bob → has orders → keep - Charlie → no orders → gone - Order with customer_id 4 → gone - All customers appear - Orders appear if they exist - No order? → NULL values - Equal values - Expressions - CROSS JOIN → everything × everything - INNER JOIN → only matches - LEFT JOIN → keep left, fill NULLs - RIGHT JOIN → keep right, fill nulls - FULL JOIN → keep everything