Explain on Join
Explain on Join
Hash Join
nimalanm@/tmp:join_test> explain analyse select * from base b join child ON child.id = b.child_id;
+-----------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
|-----------------------------------------------------------------------------------------------------------|
| Hash Join (cost=38.58..63.74 rows=1200 width=76) (actual time=0.006..0.006 rows=0 loops=1) |
| Hash Cond: (b.child_id = child.id) |
| -> Seq Scan on base b (cost=0.00..22.00 rows=1200 width=40) (actual time=0.005..0.005 rows=0 loops=1) |
| -> Hash (cost=22.70..22.70 rows=1270 width=36) (never executed) |
| -> Seq Scan on child (cost=0.00..22.70 rows=1270 width=36) (never executed) |
| Planning Time: 0.097 ms |
| Execution Time: 0.024 ms |
+-----------------------------------------------------------------------------------------------------------+
EXPLAIN
Time: 0.017s
Two operations take plan, Hash
and the other sub operation
Next
- The sub operation will run
seq scan
on base - Check if join key exists in result of hash
- No, ignore the key
- Yes, fetch the rows
Aggregate
nimalanm@/tmp:join_test> explain analyse select count(*) from base join child ON child.id = base.child_id where length(base.name) > 20;
+--------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
|--------------------------------------------------------------------------------------------------------------------|
| Aggregate (cost=141.58..141.59 rows=1 width=8) (actual time=1.329..1.329 rows=1 loops=1) |
| -> Hash Join (cost=26.16..140.74 rows=333 width=0) (actual time=0.191..1.311 rows=86 loops=1) |
| Hash Cond: (child.id = base.child_id) |
| -> Seq Scan on child (cost=0.00..80.00 rows=5000 width=4) (actual time=0.008..0.560 rows=5000 loops=1) |
| -> Hash (cost=22.00..22.00 rows=333 width=4) (actual time=0.155..0.156 rows=86 loops=1) |
| Buckets: 1024 Batches: 1 Memory Usage: 12kB |
| -> Seq Scan on base (cost=0.00..22.00 rows=333 width=4) (actual time=0.009..0.142 rows=86 loops=1) |
| Filter: (length(name) > 20) |
| Rows Removed by Filter: 914 |
| Planning Time: 0.138 ms |
| Execution Time: 1.381 ms |
+--------------------------------------------------------------------------------------------------------------------+
EXPLAIN
Time: 0.019s
When a criteria is present some additional steps will occur
seq scan
on base and create buckets- Apply filter and remove unnecessary rows
- Join if key exists on base
HashAggregate
nimalanm@/tmp:join_test> explain analyse select child_id, count(*) c from base group by child_id;
+-----------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
|-----------------------------------------------------------------------------------------------------------|
| HashAggregate (cost=22.00..30.93 rows=893 width=12) (actual time=0.422..0.563 rows=893 loops=1) |
| Group Key: child_id |
| -> Seq Scan on base (cost=0.00..17.00 rows=1000 width=4) (actual time=0.010..0.116 rows=1000 loops=1) |
| Planning Time: 0.046 ms |
| Execution Time: 0.695 ms |
+-----------------------------------------------------------------------------------------------------------+
EXPLAIN
Time: 0.027s
For every row find GROUP BY
key, then put the given row in designated bucket
Note:
HashAggregate
has to scan all the rows before it can return a single row
If there are a huge number of rows the hash will be too big to fit into worker_mem
, it will spill into disk
Note: If we have
HashAggregate
andSort
, we can use up to 2 x worker_mem
nimalanm@/tmp:join_test> explain analyse select child_id, count(*) c from base group by child_id order by child_id;
+-----------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
|-----------------------------------------------------------------------------------------------------------------|
| Sort (cost=74.70..76.93 rows=893 width=12) (actual time=1.032..1.097 rows=893 loops=1) |
| Sort Key: child_id |
| Sort Method: quicksort Memory: 66kB |
| -> HashAggregate (cost=22.00..30.93 rows=893 width=12) (actual time=0.641..0.790 rows=893 loops=1) |
| Group Key: child_id |
| -> Seq Scan on base (cost=0.00..17.00 rows=1000 width=4) (actual time=0.032..0.173 rows=1000 loops=1) |
| Planning Time: 0.137 ms |
| Execution Time: 1.272 ms |
+-----------------------------------------------------------------------------------------------------------------+
EXPLAIN
Time: 0.020s
Nested Loop
nimalanm@/tmp:join_test> explain analyse select * from base join child ON child.id = base.child_id where base.child_id in (100, 1001) ;
+------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
|------------------------------------------------------------------------------------------------------|
| Nested Loop (cost=0.28..36.10 rows=2 width=36) (actual time=0.102..0.102 rows=0 loops=1) |
| -> Seq Scan on base (cost=0.00..19.50 rows=2 width=20) (actual time=0.102..0.102 rows=0 loops=1) |
| Filter: (child_id = ANY ('{100,1001}'::integer[])) |
| Rows Removed by Filter: 1000 |
| -> Index Scan using child_pkey on child (cost=0.28..8.30 rows=1 width=16) (never executed) |
| Index Cond: (id = base.child_id) |
| Planning Time: 0.139 ms |
| Execution Time: 0.117 ms |
+------------------------------------------------------------------------------------------------------+
EXPLAIN
Time: 0.026s
- Run
seq scan
on base filtering based on criteria - Loop through the child table for each row filtered
- In this case since the Pk is the join column the looping is skiped
Merge Join
nimalanm@/tmp:join_test> explain analyse select * from (select * from base order by child_id) b join (select * from child order by id) c on b.child_id = c.id;
+---------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
|---------------------------------------------------------------------------------------------------------------------------------|
| Merge Join (cost=67.35..329.17 rows=1000 width=36) (actual time=0.377..2.685 rows=1000 loops=1) |
| Merge Cond: (child.id = base.child_id) |
| -> Index Scan using child_pkey on child (cost=0.28..172.43 rows=5000 width=16) (actual time=0.024..1.063 rows=4999 loops=1) |
| -> Materialize (cost=66.83..81.83 rows=1000 width=20) (actual time=0.345..0.712 rows=1000 loops=1) |
| -> Sort (cost=66.83..69.33 rows=1000 width=20) (actual time=0.343..0.507 rows=1000 loops=1) |
| Sort Key: base.child_id |
| Sort Method: quicksort Memory: 93kB |
| -> Seq Scan on base (cost=0.00..17.00 rows=1000 width=20) (actual time=0.018..0.122 rows=1000 loops=1) |
| Planning Time: 0.215 ms |
| Execution Time: 2.796 ms |
+---------------------------------------------------------------------------------------------------------------------------------+
EXPLAIN
Time: 0.026s
If both data sets are sorted Merge Join is done
Steps
- If join column is same on both sides.
- Get new joined row, based on current rows on both sides
- get next row on right, or left is right is over
- End if no more rows left
- If right column is smaller
- Get row from right
- End if no more rows left
- If left column is smaller
- Get row from left
- End if no more rows left