---
title: "314 broken stats pages and one bad cursor"
canonical: https://dxdev.com/blog/2026-05-22_314-stats-pages-one-bad-cursor/
datePublished: 2026-05-22
---
Team stats pages were rendering empty across 314 customer sites, one shared codepath routing every affected surface through the same helper. The count came out of an A/B I ran the next day on a separate ticket: every site I sampled was broken on master and clean on the develop branch carrying the fix.

The first instinct was to look at the SQL. The queries used `GROUP BY` across a large activity table, and the timeouts looked like a query-performance problem. I spent about an hour down that path before a better symptom pointed somewhere else: the failure was identical on every page, with no per-site variance, which is not what you see when a query plan goes bad under load. It is what you see when a shared wrapper is doing something the queries never agreed to.

## What the DataTable wrapper was actually doing

HomeTeamsONLINE has a helper called `DataTable` that most stats pages use to fetch recordsets. It is old enough that I cannot tell you exactly when it was written. It opens an ADO `Recordset` against the connection with `adOpenKeyset` and `adLockReadOnly`.

Keyset cursors are a relic of a time when server-side result iteration made sense. The driver opens a cursor on the server, holds it open, and fetches rows on demand. For a simple `SELECT` returning a few hundred rows from a single table, this works fine: slightly wasteful, but functional.

For a `GROUP BY` aggregate that needs to scan millions of activity rows and hash them into counts before returning anything, it is not fine. The driver attempts to open the cursor before the query has produced its first row, and past a few seconds of wait the ADO layer blows up and returns no data at all. The page gets an empty recordset and renders nothing.

The stats pages had not always been broken. They worked at smaller data volumes. At the volume the site operates now, the aggregate queries take long enough that the cursor contract fails every time.

## Migrating from cursor semantics to SQL.get and SQL.top

Two newer helpers in the codebase do not carry the keyset baggage. `SQL.get` executes a query and returns a disconnected array; `SQL.top` is the same thing with a row limit. Both open `adOpenForwardOnly`, fetch everything into memory, and close the connection. No server-side cursor state, no timeout from the driver waiting on a half-finished aggregate.

Rewriting `DataTable` calls to `SQL.get` calls sounds mechanical. In practice it required touching consumer code in several places, because `DataTable` returns an ADO `Recordset` object and `SQL.get` returns a plain array. Pages that called `.Fields("column_name")` on the result had to be rewritten to array-index access, and pages that used `.MoveNext` loops had to be rewritten to `For` loops.

None of that is conceptually hard, but one missed reference produces a runtime error on a page that used to work. You have to read every consumer carefully and test each one.

## The ninety-second timeout is a labeled band-aid

The aggregate query itself is still slow. `SQL.get` does not magically make a `GROUP BY` across millions of rows fast. It makes the failure mode survivable: the page waits, then renders, instead of failing immediately with an empty result.

I added a 90-second command timeout on the affected queries as an explicit band-aid. The number is not derived from profiling. It is large enough that the query usually finishes and small enough that a runaway query does not hold a connection open forever. I labeled it in a comment with the date and the follow-up ticket that owns the real work: ITEM-7006, which is scoped around either a covering index on the activity table or a pre-aggregated summary table the stats pages read instead of computing on demand. Until that ships, the 90-second number is the bridge, and it is in the code as a bridge, not as a setting.

## Why fixing the abstraction beat fixing the pages

Some bugs are fixed at the call site. This one wasn't, and the 314-surface number tells you why. Starting from "this specific stats page is slow" would have led to profiling one query, maybe adding one index, and moving on. The other 313 surfaces would still be broken. The fact that all of them trace to the same `DataTable` codepath is the information that tells you to fix the codepath and let the fix propagate.

The reason I ended up at the right place is that I looked at the error pattern before I looked at any individual query. Every affected page failed with the same symptom, and that symmetry is the clue. Once I had it, finding `DataTable` and reading how it opened its cursor took about ten minutes.

## What the data volume collected

The `DataTable` function had been in the codebase for years, doing its job quietly enough that nobody had reason to revisit the cursor mode it chose at the start. It became invisible the way working infrastructure becomes invisible.

It stopped working not because it changed, but because the data volume around it changed. The keyset cursor contract that holds at 50,000 rows does not hold at 5,000,000 rows. The assumption baked into `adOpenKeyset` was always conditional on the query returning its first row before the ADO layer gave up waiting, and the activity table eventually grew past the point where any aggregate over it could meet that condition.

A `DataTable` wrapper whose cursor contract assumes the query returns its first row quickly is deferred outage debt. The volume collected on it on May 22, all 314 surfaces in the same pass.
