What makes a defensible NYC LL 144 bias audit

The NYC Department of Consumer and Worker Protection (DCWP) has published detailed rules on what a bias audit under Local Law 144 must contain. The DCWP rules (6 RCNY § 5-300 et seq.) specify impact-ratio analysis across race / ethnicity, sex, and intersectional combinations of the two. The specifications are tight enough that an auditor who follows them mechanically produces an audit that satisfies the letter of the law.

That mechanical conformance is not the same as a defensible audit. As the second amended complaint in Mobley v. Workday makes clear, an audit can be formally complete and substantively contestable at the same time. Here is what to look at when you score one.

1. Auditor independence — and what the auditor is paid for

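The DCWP rules require the auditor to be independent. Under the rules, that means the auditor was not involved in using, developing, or distributing the AEDT, has no employment relationship with the employer or the AEDT vendor during the audit, and holds no direct or material indirect financial interest in either. Auditor independence is a bright line: there is no "mostly independent."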

In practice the biggest gray area is what the auditor is paid for. Some auditors are paid a flat fee per audit; others are paid per deliverable on a longer engagement that may include consulting work for the same vendor. The latter raises questions the rule does not yet directly address.

The defensible posture is to commission audits on flat fees, from auditors with no other commercial relationship with the vendor or employer, and to publish the engagement terms.

Auditors that appear in the directory's evidence: BABL AI, DCI Consulting, ORCAA (O'Neil Risk Consulting & Algorithmic Auditing), ConductorAI, Holistic AI, Warden AI, Credo.AI, Secretariat, Parity. Each has a different methodology. None is uniformly "the right" choice; the right choice depends on the AEDT's design.

2. Data window: long enough, recent enough, representative enough

The DCWP rules permit either historical data (real outcomes from an employer's own use of the AEDT) or, where historical data is insufficient for a statistically significant audit, test data (any other data, such as a vendor's reference set). Both are valid when used as the rules allow. For a new vendor with no deployment history, test data is the defensible choice; for an established vendor, historical data is stronger.

The data window is where most disputes start. A 6-month window is the floor; 12+ months is more defensible. Crucially:

The window should not coincide with an unusual hiring period. Mobley's challenge to Workday's Secretariat audit reads, in part, as a challenge to the choice of a declining-hiring-volumes window.
The window should include the high-volume hiring quarters for the relevant employer or industry. An audit that covers only the slow season misses where most disparate impact would show up.
The window should be recent. A 2022 audit shown to a 2026 candidate is not a defence; it is theatrics.
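The window checks above can be sketched in code. This is a minimal illustration, not anything prescribed by the DCWP rules: the function names, the 365-day staleness cutoff, and the idea of expressing an employer's peak season as a set of calendar months are all assumptions made for the example.

```python
from datetime import date, timedelta

MIN_MONTHS = 6            # floor named in the text above
PREFERRED_MONTHS = 12     # "more defensible" length named in the text above
MAX_STALENESS_DAYS = 365  # assumption: hypothetical recency cutoff


def whole_months(start: date, end_inclusive: date) -> int:
    """Number of whole months covered, counting the last day as included."""
    end = end_inclusive + timedelta(days=1)
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1
    return months


def months_touched(start: date, end_inclusive: date) -> set:
    """Calendar months (1-12) that the window overlaps at least partly."""
    touched = set()
    year, month = start.year, start.month
    while (year, month) <= (end_inclusive.year, end_inclusive.month):
        touched.add(month)
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return touched


def review_window(start, end_inclusive, as_of, peak_months):
    """Return a list of human-readable concerns about a data window."""
    flags = []
    length = whole_months(start, end_inclusive)
    if length < MIN_MONTHS:
        flags.append(f"window is {length} months, below the {MIN_MONTHS}-month floor")
    elif length < PREFERRED_MONTHS:
        flags.append(f"window is {length} months, short of {PREFERRED_MONTHS}")
    missing = sorted(set(peak_months) - months_touched(start, end_inclusive))
    if missing:
        flags.append(f"peak hiring months not covered: {missing}")
    if (as_of - end_inclusive).days > MAX_STALENESS_DAYS:
        flags.append("data ends more than a year before the review date")
    return flags


# Example: a four-month window that misses an October hiring peak.
print(review_window(date(2025, 1, 1), date(2025, 4, 30), date(2026, 5, 1), {10}))
```

A check like this cannot tell whether a window coincided with an unusual hiring period; that still needs a comparison against the employer's own volume history.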

3. Job-profile selection

Where the AEDT is used across multiple job profiles (call centre vs engineering vs sales), the audit must cover representative profiles. The DCWP rules speak of the "five highest-volume" profiles, but the defensible posture is broader: include profiles that show the largest demographic differences in application volume, even when those profiles are not the highest by total count.

Mobley argues that Workday's five-profile selection systematically excluded profiles where the disparate impact would have been most visible. Whether or not the court accepts that argument, the underlying critique — that profile selection can drive audit conclusions — is a real one. Defensible audits document the selection logic explicitly.
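One way to make the selection logic explicit is to generate it rather than describe it after the fact. The sketch below is an assumption-laden illustration: the input shape, the function name, and the "group-share gap" used as a stand-in for demographic difference in application volume are all hypothetical, and a real audit would likely use a richer measure.

```python
def select_profiles(applications, top_by_volume=5, top_by_skew=2):
    """Pick audit profiles and record why each one was picked.

    applications: {profile: {group: application_count}}, with every profile
                  listing the same groups (hypothetical input shape).
    """
    totals = {p: sum(groups.values()) for p, groups in applications.items()}

    def skew(profile):
        # Gap between the largest and smallest group share of applications:
        # one simple stand-in for "demographic difference in application volume".
        total = totals[profile]
        if total == 0:
            return 0.0
        shares = [count / total for count in applications[profile].values()]
        return max(shares) - min(shares)

    by_volume = sorted(totals, key=totals.get, reverse=True)[:top_by_volume]
    remaining = [p for p in applications if p not in by_volume]
    by_skew = sorted(remaining, key=skew, reverse=True)[:top_by_skew]

    # Keep the reason next to each profile so the logic can be published.
    selection = [(p, f"top {top_by_volume} by volume ({totals[p]} applications)")
                 for p in by_volume]
    selection += [(p, f"largest group-share gap ({skew(p):.2f})") for p in by_skew]
    return selection


# Toy counts, invented purely for illustration.
applications = {
    "call_centre": {"A": 500, "B": 480},
    "sales": {"A": 300, "B": 250},
    "engineering": {"A": 90, "B": 10},
    "warehouse": {"A": 60, "B": 55},
}
for profile, reason in select_profiles(applications, top_by_volume=2, top_by_skew=1):
    print(profile, "->", reason)
```

In the toy data, the engineering profile is low-volume and would be missed by a pure volume ranking, but it is picked up by the second pass.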

4. Subgroup and intersectional analysis

The DCWP rules require analysis across race / ethnicity, sex, and the intersection of those categories. Intersectional analysis is the single most-skipped step in practice. It is also the step where the most disparate impact actually shows up.
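As we read the DCWP definition, the impact ratio for a category is its selection rate divided by the selection rate of the most-selected category. A minimal sketch of that calculation, run once per required breakdown, might look like the following. The record schema and field names (sex, race_ethnicity, selected) are assumptions for the example, and the scoring-rate variant of the calculation is not shown.

```python
from collections import defaultdict


def impact_ratios(records, keys):
    """Selection rate and impact ratio per category.

    records: iterable of dicts holding the demographic fields named in `keys`
             plus a boolean "selected" field (hypothetical schema).
    keys:    fields that define a category, e.g. ("sex",) or
             ("sex", "race_ethnicity") for an intersection.
    """
    counts = defaultdict(lambda: [0, 0])  # category -> [selected, total]
    for record in records:
        category = tuple(record[k] for k in keys)
        counts[category][0] += int(record["selected"])
        counts[category][1] += 1

    rates = {cat: sel / n for cat, (sel, n) in counts.items()}
    if not rates:
        return {}
    best = max(rates.values())  # rate of the most-selected category
    return {
        cat: {
            "n": counts[cat][1],
            "selection_rate": rate,
            "impact_ratio": rate / best if best > 0 else None,
        }
        for cat, rate in rates.items()
    }


# Toy data, invented purely for illustration.
records = (
    [{"sex": "M", "race_ethnicity": "X", "selected": i < 5} for i in range(10)]
    + [{"sex": "F", "race_ethnicity": "X", "selected": i < 4} for i in range(10)]
)
for keys in [("sex",), ("race_ethnicity",), ("sex", "race_ethnicity")]:
    print(keys, impact_ratios(records, keys))
```

Running the same function over the intersectional key is what surfaces cells that look fine in the single-attribute breakdowns.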

A defensible audit:

Reports impact ratios, read against the four-fifths (0.80) benchmark, for each individual subgroup AND each intersection.
Reports the n for each cell. Ratios from small cells should be flagged as inconclusive rather than treated as evidence of no impact.
Includes statistical-significance testing (Fisher's exact, or the appropriate equivalent) alongside the impact ratio. An impact ratio of 0.82 in a small cell is not the same as 0.82 in a large cell.
Discloses the underlying methodology in enough detail that an independent reader could in principle reproduce the result.
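The small-cell and significance points above can be illustrated with a short sketch that pairs each impact ratio with a two-sided Fisher's exact p-value. Everything here is an assumption made for illustration: the 30-record small-cell cutoff, the 0.05 significance level, the function names, and the wording of the labels are not taken from the DCWP rules or from any auditor's published method.

```python
from math import comb

MIN_CELL_N = 30  # hypothetical small-cell cutoff
ALPHA = 0.05     # conventional significance level, also an assumption


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)
    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    # Hypergeometric probability of each table sharing the observed margins.
    probs = {x: comb(row1, x) * comb(n - row1, col1 - x) / denom
             for x in range(lo, hi + 1)}
    p_obs = probs[a]
    # Sum every table that is at most as likely as the observed one.
    p_value = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-7))
    return min(1.0, p_value)


def assess_cell(selected, n, ref_selected, ref_n):
    """Compare one cell against the most-selected (reference) category."""
    if n == 0 or ref_selected == 0:
        raise ValueError("cell and reference must both have usable counts")
    ratio = (selected / n) / (ref_selected / ref_n)
    p_value = fisher_exact_two_sided(
        selected, n - selected, ref_selected, ref_n - ref_selected
    )
    if n < MIN_CELL_N:
        verdict = "inconclusive: small cell"
    elif ratio < 0.8:
        verdict = ("below four-fifths and statistically significant"
                   if p_value < ALPHA
                   else "below four-fifths, not statistically significant")
    else:
        verdict = "at or above four-fifths"
    return {"n": n, "impact_ratio": ratio, "p_value": p_value, "verdict": verdict}


# Two cells with an impact ratio of roughly 0.82 against a 50% reference rate.
print(assess_cell(9, 22, 500, 1000))      # small cell
print(assess_cell(410, 1000, 500, 1000))  # large cell
```

The two example cells have nearly the same ratio, but only the larger one supports a conclusion. The exact test here enumerates every possible table, so it is practical for cells in the thousands but would need a library implementation for very large populations.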

The vendors in our directory who publish the most defensible intersectional analyses, as of May 2026, are Eightfold (BABL AI, 29M+ records, 7 race / ethnicity groups intersected with sex), Harver (BABL AI for both the Harver Platform and pymetrics products), and Beamery (Warden AI, continuous, publicly viewable).

5. Published artifact: summary vs full report

The DCWP rules require a public summary on the employer's website. The summary must include the date of the audit, the source of the data, and the impact-ratio results across required subgroups.

The defensible posture goes further: publish a summary detailed enough that a prospective customer or candidate can understand what was tested, on what data, by which auditor, and with what methodology. Vendors whose documentation is gated behind trust portals such as SafeBase or Conveyor may achieve technical compliance, but they undercut the disclosure spirit of the rule.

A defensible artifact:

Identifies the auditor and the auditor's independence statement.
States the data window, the population, and the profile selection.
Reports the impact ratios per the DCWP rules.
Reports the methodology in enough detail to allow critique.
Is linked from the vendor's website and the employer's website (LL 144 requires the latter), with the most recent audit prominently featured and prior audits archived for trend visibility.

6. Cadence

LL 144 requires that the most recent audit have been conducted no more than one year before the AEDT is used. The defensible cadence is annual at minimum, and more frequent if the model is materially updated or the deployer pool shifts substantially.
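The one-year requirement and the update trigger can be expressed as a simple date check. This is a sketch under stated assumptions: the function names are hypothetical, an audit dated exactly one year before use is treated as still current, and deciding whether a model update is "material" is left to the caller.

```python
from datetime import date


def audit_is_current(audit_date: date, use_date: date) -> bool:
    """True when the audit was conducted no more than one year before use_date."""
    try:
        cutoff = use_date.replace(year=use_date.year - 1)
    except ValueError:
        # use_date is 29 February and the previous year has no such day.
        cutoff = use_date.replace(year=use_date.year - 1, day=28)
    return cutoff <= audit_date <= use_date


def needs_new_audit(audit_date, use_date, model_updated_on=None):
    """True when the audit is stale or predates a material model update.

    model_updated_on is the date of the last update the caller has already
    judged to be material (an assumption; the check cannot judge that itself).
    """
    if not audit_is_current(audit_date, use_date):
        return True
    return model_updated_on is not None and model_updated_on > audit_date


print(audit_is_current(date(2022, 6, 1), date(2026, 5, 1)))
print(needs_new_audit(date(2025, 6, 1), date(2026, 5, 1), date(2025, 9, 1)))
```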

Beamery's Warden AI continuous monthly monitoring is, in our view, the emerging best practice: the audit becomes a dashboard rather than a PDF, which makes drift visible in close to real time. Most other vendors in the directory commission annual audits.

How this maps to our scoring

The Bias Audit Transparency category in our rubric weights:

Whether an audit exists at all.
Whether it is independent.
Whether the methodology is disclosed.
Whether the results are public, not gated.
Whether the cadence is annual or better.

Across the directory, scores in this category range from 35 (Paradox, where the audit is gated and the vendor positions itself outside the LL 144 scope) to 85 (Eightfold and Beamery). The methodology page describes the full rubric; the vendor directory shows the cited evidence behind each vendor's bias-audit score.