How to Parallel Index Scan in YSQL For Temporal Joins

Author: Mark Peacock

Originally Sourced from: https://www.yugabyte.com/blog/parallel-index-scan-in-ysql/

In this blog, we explore how a common temporal join pattern in analytics used to require extra tweaks in distributed Postgres-compatible systems, and how this is no longer necessary with full support for native PostgreSQL Parallel Query (PQ).

Teams often had to implement bucketized indexes with UNION ALL views for Parallel Append. With current YSQL optimizer features, this same pattern can run with a Parallel Index Scan on the temporal table and a YB Batched Nested Loop Join into PK lookups (using PostgreSQL indexes and native SQL).

Parallel Index Scan shows gains over serial execution, but YugabyteDB already provides tablet-level parallelism, which adds parallelism to non-PostgreSQL PQ execution. This means it effectively provides double parallelism.

The Workaround

Before this improved path, the common workaround was a mix of bucketized indexes and query rewrites to match those structures and enable Parallel Append. This works well, but adds complexity with additional schema and query changes.

What Changed?

YSQL optimizer improvements (in version 2025.2.2) now allow parallel execution for this query pattern, up to worker caps.

When the leading index column aligns with the range predicate (for example, tt_to ASC for time-window access), the planner can choose:

Gather/Gather Merge
worker processes launched
Parallel Index Scan on the temporal index

How to Set up Your GUCs to Use the Feature

These features can be enabled by GUC, here we have performed database-level feature enablement by setting the following GUCs to ON: Note that we plan to make these ‘on’ by default, so there will be no configuration changes required to enable the feature in the future.

ALTER DATABASE yugabyte SET yb_enable_cbo = on;
ALTER DATABASE yugabyte SET yb_enable_parallel_scan_colocated = on;
ALTER DATABASE yugabyte SET yb_enable_parallel_scan_range_sharded = on;
ALTER DATABASE yugabyte SET yb_enable_parallel_scan_hash_sharded = on;
ALTER DATABASE yugabyte SET yb_enable_parallel_append = on;

The session-level DOP used for the test path:

SET max_parallel_workers_per_gather = 6;

Everything else stayed at defaults, including parallel_tuple_cost and parallel_setup_cost, so the planner still assumes relatively high parallel overhead and won’t choose parallel plans unless the workload is large enough to justify it.

DDL and Query Shape

Here is the anonymized schema we are using to demonstrate PQ activating:

CREATE SCHEMA pq_anon_parallel_demo;

CREATE TABLE entity_payload (
  version_ref BIGINT NOT NULL,
  entity_type_id INT NOT NULL,
  payload JSONB NOT NULL,
  PRIMARY KEY ((version_ref) HASH)
);

CREATE TABLE entity_validity (
  version_ref BIGINT NOT NULL,
  entity_ref TEXT NOT NULL,
  tt_from TIMESTAMPTZ NOT NULL,
  tt_to TIMESTAMPTZ NOT NULL,
  vt_from TIMESTAMPTZ NOT NULL,
  vt_to TIMESTAMPTZ NOT NULL,
  aux_metric INT NOT NULL,
  PRIMARY KEY ((version_ref) HASH)
);

CREATE INDEX idx_entity_validity_tt_to_asc_vkey
ON entity_validity (tt_to ASC, version_ref ASC);

Query used:

SELECT count(*)
FROM entity_validity v
JOIN entity_payload p ON p.version_ref = v.version_ref
WHERE v.tt_to > timestamptz '2025-11-17 06:06:09.391+00'
 AND v.tt_to <= timestamptz '2025-11-17 06:06:09.391+00' + interval '180 day'
 AND v.vt_to > timestamptz '2025-11-17 06:06:09.391+00'
 AND v.tt_from <= timestamptz '2025-11-17 06:06:09.391+00'
 AND v.vt_from <= timestamptz '2025-11-17 06:06:09.391+00'
 AND ((p.payload->>'hasNonFlatPosition')::boolean = true);

How to Verify PQ is Working

Use:

EXPLAIN (ANALYZE, DIST, COSTS OFF)
...query...

Plan the shape to look for:

Finalize Aggregate
 -> Gather
    Workers Planned: 6
    Workers Launched: 6
    -> Partial Aggregate
       -> YB Batched Nested Loop Join
          -> Parallel Index Scan using idx_entity_validity_tt_to_asc_vkey on entity_validity v
          -> Index Scan using entity_payload_pkey on entity_payload p

If you see that node chain, the temporal side is running as a Parallel Index Scan, not a Parallel Seq Scan, for this query pattern.

The Parallel Index Scan was roughly 50–55% faster than the non PQ plan. These gains should increase as the amount of qualifying work increases.

Where This Helps Most

Best fit:

Large time-window analytics over range-indexed columns
HTAP-style reads where you need live data and cannot rely on pre-aggregation only

Less helpful:

tiny ranges
point lookups
CPU-saturated systems where extra workers add contention

Conclusion

This temporal join pattern allows you to keep your original schema and query and let YSQL execute a Parallel Index Scan directly. This achieves latency gains that scale better as the amount of qualifying work increases.

For more information on parallel queries, check out this blog: Delivering Postgres Parallel Queries in YugabyteDB.