The NYC Department of Consumer and Worker Protection has published
detailed rules on what a bias audit under Local Law 144 must contain.
The DCWP rules (6 RCNY §5-300 et seq.) specify impact-ratio analysis
across race / ethnicity, sex, and intersectional combinations. The
specifications are tight enough that an auditor who follows them
mechanically produces an audit that satisfies the letter of the law.
That mechanical conformance is not the same as a defensible audit. As
the second amended complaint in Mobley v. Workday
makes clear, an audit can be formally complete and substantively
contestable at the same time. Here is what to look at when you score
one.
1. Auditor independence — and what the auditor is paid for
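The DCWP rules require the auditor to be independent. That means no
involvement in using, developing, or distributing the AEDT, no
employment relationship with the employer or the AEDT vendor during the
audit, and no direct or material indirect financial interest in either.
Auditor independence is bright-line: there is no "mostly independent."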
In practice the biggest gray area is what the auditor is paid for.
Some auditors are paid a flat fee per audit; others are paid per
deliverable on a longer engagement that may include consulting work for
the same vendor. The latter raises questions the rule does not yet
directly address.
The defensible posture is to commission audits on flat fees, from
auditors with no other commercial relationship with the vendor or
employer, and to publish the engagement terms.
Auditors that appear in the directory's evidence: BABL AI, DCI
Consulting, ORCAA (O'Neil Risk Consulting & Algorithmic Auditing),
ConductorAI, Holistic AI, Warden AI, Credo.AI, Secretariat, Parity.
Each has a different methodology. None is uniformly "the right" choice;
the right choice depends on the AEDT's design.
2. Data window: long enough, recent enough, representative enough
The DCWP rules call for historical data (outcomes collected from actual
use of the AEDT) and permit test data (any other data, such as a
vendor's reference set) only where there is too little historical data
for a statistically significant audit. The defensible choice for a new
vendor with no deployment history is test data; for an established
vendor, historical data from real use is stronger.
The data window is where most disputes start. A 6-month window is the
floor; 12+ months is more defensible. Crucially:
- The window should not coincide with an unusual hiring period.
Mobley's challenge to Workday's Secretariat audit reads, in part, as a
challenge to the choice of a window in which hiring volumes were declining.
- The window should include the high-volume hiring quarters for the
relevant employer or industry. An audit that covers only the slow
season misses where most disparate impact would show up.
- The window should be recent. A 2022 audit shown to a 2026
candidate is not a defence; it is theatrics.
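The window checks above can be roughed out in a few lines. This is a minimal sketch, not a DCWP-specified procedure: the function names, the `peak_quarters` input, and the 365-day recency limit are our own assumptions, and the "unusual hiring period" test is left out because it needs employer-specific context.

```python
from datetime import date

def months_touched(start, end):
    """Return every (year, month) the window touches, in order."""
    months = []
    y, m = start.year, start.month
    while (y, m) <= (end.year, end.month):
        months.append((y, m))
        y, m = (y + 1, 1) if m == 12 else (y, m + 1)
    return months

def window_issues(start, end, as_of, peak_quarters,
                  floor_months=6, preferred_months=12, max_age_days=365):
    """List the problems with an audit data window (empty list = none found).

    peak_quarters: calendar quarters (1-4) that are high-volume for the
    employer or industry. This input is hypothetical; it has to come
    from the employer's own hiring data.
    """
    issues = []
    months = months_touched(start, end)
    if len(months) < floor_months:
        issues.append(f"window covers {len(months)} months, below the {floor_months}-month floor")
    elif len(months) < preferred_months:
        issues.append(f"window covers {len(months)} months, below the preferred {preferred_months}")

    covered = {(m - 1) // 3 + 1 for _, m in months}
    missing = sorted(set(peak_quarters) - covered)
    if missing:
        issues.append(f"peak hiring quarters not covered: {missing}")

    age = (as_of - end).days
    if age > max_age_days:
        issues.append(f"window ended {age} days before {as_of}; audit data is too old")
    return issues

# A January-June window for an employer whose peak is Q4, reviewed in May 2026.
for issue in window_issues(date(2025, 1, 1), date(2025, 6, 30), date(2026, 5, 1), {4}):
    print(issue)
```

The example window passes the six-month floor but is flagged twice: it is shorter than the preferred twelve months and it misses the fourth quarter.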
3. Job-profile selection
Where the AEDT is used across multiple job profiles (call centre vs
engineering vs sales), the audit must cover representative profiles.
The DCWP rules speak of the "five highest-volume" profiles, but the
defensible posture is broader: include profiles that show the largest
demographic differences in application volume, even when those profiles
are not the highest by total count.
Mobley argues that Workday's five-profile selection systematically
excluded profiles where the disparate impact would have been most
visible. Whether or not the court accepts that argument, the underlying
critique — that profile selection can drive audit conclusions — is a
real one. Defensible audits document the selection logic explicitly.
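One way to make the selection logic explicit is to compute it and keep the reason next to each chosen profile. The sketch below is illustrative only: `select_profiles`, the skew measure (largest gap between a group's share of a profile's applications and its share of all applications), and the counts are our assumptions, not anything taken from the DCWP rules or from a vendor's audit.

```python
from collections import defaultdict

def select_profiles(apps, top_by_volume=5, top_by_skew=2):
    """Pick profiles by volume, then add the most demographically skewed.

    apps maps profile -> {group: application count}. Returns
    {profile: reason} so the selection logic is recorded with the result.
    """
    totals = {p: sum(groups.values()) for p, groups in apps.items()}
    profiles = [p for p in apps if totals[p] > 0]  # ignore empty profiles

    overall = defaultdict(int)
    for p in profiles:
        for group, n in apps[p].items():
            overall[group] += n
    grand_total = sum(overall.values())

    def skew(p):
        # Largest gap between a group's share of this profile's
        # applications and its share of all applications.
        return max(
            abs(apps[p].get(g, 0) / totals[p] - overall[g] / grand_total)
            for g in overall
        )

    by_volume = sorted(profiles, key=lambda p: totals[p], reverse=True)[:top_by_volume]
    by_skew = sorted(profiles, key=skew, reverse=True)[:top_by_skew]

    selection = {p: "volume" for p in by_volume}
    for p in by_skew:
        selection[p] = "volume and skew" if p in selection else "skew"
    return selection

# Made-up application counts for three job profiles.
apps = {
    "call centre": {"F": 600, "M": 400},
    "sales": {"F": 250, "M": 250},
    "engineering": {"F": 20, "M": 180},
}
selection = select_profiles(apps, top_by_volume=2, top_by_skew=1)
print(selection)
```

With these made-up counts, engineering is the smallest profile and would be dropped by a volume-only rule, but it is added back because its applicant mix differs most from the overall pool.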
4. Subgroup and intersectional analysis
The DCWP rules require analysis across race / ethnicity, sex, and the
intersection of those categories. Intersectional analysis is the
single most-skipped step in practice. It is also the step where the
most disparate impact actually shows up.
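To show why the intersectional step matters, here is a minimal sketch of the impact-ratio arithmetic: each category's selection rate divided by the rate of the most-selected category. The function names, record layout, and counts are assumptions made for illustration; real audits work from the employer's or vendor's own data.

```python
from collections import defaultdict

def impact_ratios(records, key):
    """Selection rate per category, divided by the highest category's rate.

    records: iterable of dicts with a boolean "selected" field.
    key: maps a record to its category (a sex, a race/ethnicity group,
    or a (sex, race) tuple for an intersectional cell).
    """
    counts = defaultdict(lambda: [0, 0])  # category -> [selected, total]
    for rec in records:
        cell = counts[key(rec)]
        cell[0] += int(rec["selected"])
        cell[1] += 1
    rates = {cat: sel / n for cat, (sel, n) in counts.items()}
    best = max(rates.values())
    return {
        cat: {
            "n": counts[cat][1],
            "rate": rates[cat],
            # Undefined when nobody in any category was selected.
            "impact_ratio": rates[cat] / best if best > 0 else None,
        }
        for cat in rates
    }

def make_records(sex, race, total, selected):
    """Expand aggregate counts into one record per applicant."""
    return [
        {"sex": sex, "race": race, "selected": i < selected}
        for i in range(total)
    ]

# Made-up counts: (sex, race group, applicants, selected).
records = (
    make_records("F", "A", 100, 40)
    + make_records("M", "A", 100, 50)
    + make_records("F", "B", 50, 15)
    + make_records("M", "B", 50, 25)
)

by_sex = impact_ratios(records, lambda r: r["sex"])
by_race = impact_ratios(records, lambda r: r["race"])
by_cell = impact_ratios(records, lambda r: (r["sex"], r["race"]))

for cell, row in sorted(by_cell.items()):
    print(cell, row["n"], round(row["rate"], 3), round(row["impact_ratio"], 3))
```

With these made-up counts the sex-only ratio for F is about 0.73 and the race-only ratio for B is about 0.89, while the F/B intersection comes out at 0.60, lower than either single-category view suggests.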
A defensible audit:
- Reports impact ratios, measured against the four-fifths (0.80)
benchmark, for each individual subgroup AND each intersection.
- Reports the n for each cell — small cells with extreme ratios
should be flagged as inconclusive rather than treated as evidence of
no impact.
- Includes statistical-significance testing (Fisher's exact, or the
appropriate equivalent) alongside the impact ratio. An impact ratio
of 0.82 in a small cell is not the same as 0.82 in a large cell.
- Discloses the underlying methodology in enough detail that an
independent reader could in principle reproduce the result.
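The point about cell size can be checked directly. The sketch below implements a two-sided Fisher's exact test from the hypergeometric distribution using only the standard library, then applies it to two made-up tables that share an impact ratio of 0.82. The function names and counts are ours; an auditor would more likely use a vetted statistics package and should state which two-sided convention it applies.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    a, b = selected / not selected in the subgroup under review
    c, d = selected / not selected in the comparison group
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x selections in the first row,
        # holding all row and column totals fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)

def impact_ratio(a, b, c, d):
    """Subgroup selection rate divided by comparison-group selection rate."""
    return (a / (a + b)) / (c / (c + d))

# Made-up cells with the same impact ratio but different sizes.
small = (41, 59, 50, 50)      # 41/100 vs 50/100 selected
large = (410, 590, 500, 500)  # 410/1000 vs 500/1000 selected

for name, table in (("small", small), ("large", large)):
    print(name, round(impact_ratio(*table), 2), fisher_exact_two_sided(*table))
```

Both tables report 0.82, yet the small one is not significant at the 0.05 level and the large one is far below it, which is why the ratio and the test belong side by side in the report.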
The vendors in our directory who publish the most defensible intersectional
analyses, as of May 2026, are Eightfold (BABL AI, 29M+ records, 7
race / ethnicity groups intersected with sex), Harver (BABL AI for both
the Harver Platform and pymetrics products), and Beamery (Warden AI,
continuous, publicly viewable).
5. Published artifact: summary vs full report
The DCWP rules require a public summary on the employer's website.
The summary must include the date of the audit, the source of the data,
and the impact-ratio results across required subgroups.
The defensible posture goes further: publish a summary detailed enough
that a prospective customer or candidate can understand what was tested,
on what data, by which auditor, with what methodology. Vendors whose
documentation is gated behind trust portals (SafeBase, Conveyor) may
achieve technical compliance but undercut the disclosure spirit of the
rule.
A defensible artifact:
- Identifies the auditor and the auditor's independence statement.
- States the data window, the population, and the profile selection.
- Reports the impact ratios per the DCWP rules.
- Reports the methodology in enough detail to allow critique.
- Is linked from the vendor's website and the employer's website (LL
144 requires the latter), with the most recent audit prominently
featured and prior audits archived for trend visibility.
6. Cadence
LL 144 requires that the bias audit was conducted no more than one year
before the AEDT is used. The defensible cadence is annual at minimum,
and more frequent if the model is materially updated or the deployer
pool shifts substantially.
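A cadence check of this kind is easy to automate. The following is a rough sketch under our own assumptions: `reaudit_reasons` is a hypothetical helper, "one year" is approximated as 365 days, and a material model update is represented by a single date supplied by the caller. Shifts in the deployer pool are not modelled.

```python
from datetime import date

def reaudit_reasons(audit_date, today, model_updated_on=None, max_age_days=365):
    """Return the reasons a fresh audit is due (empty list = still current)."""
    reasons = []
    age = (today - audit_date).days
    if age > max_age_days:
        reasons.append(f"audit is {age} days old (limit {max_age_days})")
    if model_updated_on is not None and model_updated_on > audit_date:
        reasons.append(f"model materially updated on {model_updated_on}, after the audit")
    return reasons

# An audit dated June 2025, checked in May 2026, with a model update in between.
print(reaudit_reasons(date(2025, 6, 1), date(2026, 5, 1)))
print(reaudit_reasons(date(2025, 6, 1), date(2026, 5, 1), model_updated_on=date(2025, 9, 1)))
```

The first call returns an empty list because the audit is under a year old; the second flags the audit because the model changed after it was completed.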
Beamery's Warden AI continuous monthly monitoring is, in our view, the
emerging best practice: the audit becomes a dashboard rather than a
PDF, which makes drift visible in close to real time. Most other vendors
in the directory commission annual audits.
How this maps to our scoring
The Bias Audit Transparency category in our rubric weights:
- Whether an audit exists at all.
- Whether it is independent.
- Whether the methodology is disclosed.
- Whether the results are public, not gated.
- Whether the cadence is annual or better.
Across the directory, scores in this category range from 35
(Paradox, where the audit is gated and the vendor positions itself
outside the LL 144 scope) to 85 (Eightfold and Beamery). The
methodology page describes the full rubric; the
vendor directory shows the cited evidence behind each
vendor's bias-audit score.