BigQuery: why we keep three copies of every reward and stake table

BigQuery bills by the byte. On-demand, you pay for the data a query scans, not the rows it returns — $6.25 per TiB, with the first 1 TiB each month free (per Google Cloud's published rates; regional pricing varies). So how a table is physically laid out is not an implementation detail. It is the difference between a query that scans 30 MB and the identical query scanning 30 GB.

We index the full Cardano mainnet. Two of the larger datasets are staking-related: reward (every staking reward, ~427M rows) and epoch_stake (every per-epoch stake snapshot, ~466M rows). Each carries three identity columns people filter on: epoch_no, stake_addr_hash, and pool_hash.

Three columns, three ways people query. One physical table serves only one of them well. So we keep three copies of each:

Table name	Optimised for a filter on	Layout
`reward`	`epoch_no`	partitioned by `epoch_no`, clustered by `stake_addr_hash`
`reward_addr`	`stake_addr_hash`	clustered by `stake_addr_hash`
`reward_pool`	`pool_hash`	clustered by `pool_hash`
`epoch_stake`	`epoch_no`	clustered by `epoch_no`
`epoch_stake_addr`	`stake_addr_hash`	clustered by `stake_addr_hash`
`epoch_stake_pool`	`pool_hash`	clustered by `pool_hash`

One sort order, three questions

Clustering sorts a table by a chosen key. BigQuery then skips the blocks that can't contain your filter value. The catch: clustering only prunes on a prefix of the cluster columns, left to right. A table sorted by pool_hash does nothing for a query that filters only on stake_addr_hash — the rows you want are spread across every block, so BigQuery reads them all.

The obvious workaround — cluster by all three, (epoch_no, stake_addr_hash, pool_hash) — doesn't escape this. Prefix-only pruning helps a filter on epoch_no, or on epoch_no and stake_addr_hash, but it does nothing for a filter on pool_hash alone. Whichever column leads wins; the rest barely prune.

And our three access patterns are independent:

An operator checks their pool — filter on pool_hash.
A delegator pulls their reward history for taxes — filter on stake_addr_hash, across all time.
An analyst reads a single epoch — filter on epoch_no.

No single ordering serves all three. So each copy leads with the key its audience actually filters on.

The partition trap

Here's the part that surprised us.

Our first instinct for the address table was the textbook move: partition by epoch_no, cluster by stake_addr_hash. Partitioning splits a table into physical segments — one per epoch here — and a query touching a few epochs reads only those segments. It is the right tool when queries filter a narrow range of the partition key.

It backfired. A delegator's tax query isn't epoch-bounded — it wants every reward they ever earned. A staking address earns roughly one reward per epoch, so its ~300 rewards land in ~300 different partitions. To gather them, BigQuery has to dip into 300 segments and read at least one block from each. The per-partition overhead dominated the scan.

The same address lookup, three layouts:

Layout of the reward table	Bytes billed
Clustered by `epoch_no` (the wrong key for this query)	30.29 GB
Partitioned by `epoch_no`, clustered by `stake_addr_hash`	8.08 GB
Clustered by `stake_addr_hash`, no partitioning	30.45 MB

All three returned the identical answer: 307 rewards totalling 937.70 ADA for the test address. Dropping the partition cut the scan roughly 1,000×.

The lesson: partition a column when your queries filter narrow ranges of it; cluster a column when your queries sweep across it in full. For a key like a stake address — read across all epochs at once — partitioning by epoch fragments the very rows you're trying to gather. Cluster on the key, skip the partition.

The opposite holds for the epoch-keyed copies. An epoch query wants one contiguous slice of time, so there partitioning earns its keep — reward is range-partitioned on epoch_no. (epoch_stake is clustered on epoch_no instead; both prune an epoch filter hard, partitioning slightly harder.)

What this costs

At $6.25 per TiB scanned, the numbers above translate directly into money. The full reward table is ~67 GB; scanned naively, a single query runs about $0.40. The optimised address lookup scans 30 MB — a small fraction of one cent. Same answer.

One query, and it hardly matters. But a dashboard re-runs on every page load, and an API endpoint gets hit thousands of times a day. At that volume, the table layout is the bill. That is the whole reason we maintain the copies: the cost lives in bytes scanned, and querying the right copy makes the same answer about a thousand times cheaper.

These queries are not free. We lay out the schema so yours stay close to it.

Which table should you query?

Match your WHERE clause to the copy whose key it leads with:

Filtering by epoch_no → epoch_stake, reward
Filtering by stake_addr_hash → epoch_stake_addr, reward_addr
Filtering by pool_hash → epoch_stake_pool, reward_pool

Example queries

SELECT epoch_no, type, amount / 1000000 AS ada
FROM `blockchain-analytics-392322.cardano_mainnet.reward_addr`
WHERE stake_addr_hash = @stake_addr_hash
ORDER BY epoch_no;

A delegator's full reward history

SELECT stake_addr_hash, type, amount / 1000000 AS ada
FROM `blockchain-analytics-392322.cardano_mainnet.reward_pool`
WHERE pool_hash = @pool_hash
  AND epoch_no = @epoch_no;

A pool's rewards in a single epoch

Before you run anything, check the estimate. The BigQuery console shows bytes processed before execution, and the bq CLI has --dry_run. If a filter on a table's lead key isn't shrinking the estimate, you're probably on the wrong copy.

One more reason the reward tables carry a type column: it separates operator rewards (the pool's own pledged stake) from delegator rewards. That distinction matters for accounting and tax, and it's preserved in all three reward copies.

Keeping the copies in sync

Three copies means writing each row three times. We append rather than rebuild: every refresh inserts only the epochs newer than a copy's current maximum. Rewards and stake snapshots are immutable once an epoch settles, so an append-by-epoch watermark is safe and avoids rescanning history on each run. BigQuery reclusters new data in the background — no extra step.

The schemas are machine-readable

Every table — columns, types, partitioning, and clustering — is published at the BCA Schema Registry: schemas.blockchain-applied.com.
JSON Schema 2020-12, YAML, and JSON-LD, refreshed regularly.

It's built to be read by tools, not just people. Point an AI agent or LLM at the Cardano on BigQuery catalogue and it can see which column leads each table — and build a query that prunes, instead of one that scans the whole chain.