A Dhwani × CIFF working paper · v0.1
We wired Fluxx, Dynamics 365 and Power BI into one automation pipeline — and let the QA spreadsheet drive it. Here's what we built, what we learned, and where it goes next.
§1 · The setup
Fluxx
Grants · disbursements
Proposals, fund requests, donor approvals. Configurable forms, event-driven status changes, async API.
Dynamics 365
CRM · leads · service
Lead management, deduplication, bulk imports, customer service flows. SLA-sensitive surfaces.
Power BI
Donor analytics · KPIs
Dashboards on top of the other two. The bit your board actually looks at. Filtering, drill-through, export.
Each system has its own console, login wall and quirks. Integration bugs hide in the seams.
§1 · The math
Across the three systems, the manual QA catalogue looks like this — and grows quarter on quarter as new flows ship.
One QA. A spreadsheet. Several days a release. The cases that pass don't generate evidence; the cases that fail generate Slack messages.
Counts above are from the POC's demo fixtures — placeholder numbers we wired in for the runner. Worth replacing with CIFF's real catalogue counts before the readout.
§1 · The thesis
Manual test cases already describe inputs, steps, and expected results. That's a spec. The only thing missing is a runtime that understands it — and a way to feed the results back.
If the spreadsheet is the spec, the browser is the compiler, and the run is the assertion — we don't need to rewrite QA, we need to plug it in.
Upload a sheet. Walk away. Come back to results, screenshots, and a diagnosis on every failure.
Built for QA, not engineers. Re-run a single case, regroup cases into a suite, replay a run — without touching the CLI.
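The "spreadsheet is the spec" idea can be sketched in a few lines. Everything here is illustrative: the field names, the `ManualCase` type, and the `toSpecSkeleton` helper are ours, not the POC's schema, and the real pipeline uses an LLM for the translation step rather than string templating.

```typescript
// Sketch: a manual QA case as a typed record. Field names are
// illustrative, not the POC's actual column schema.
interface ManualCase {
  module: string;
  testId: string;
  title: string;
  steps: string[];
  expected: string;
}

// Render the record into a Playwright test skeleton. The real pipeline
// uses an LLM for this step; templating just shows the shape of the output.
function toSpecSkeleton(c: ManualCase): string {
  const stepComments = c.steps
    .map((s, i) => `  // step ${i + 1}: ${s}`)
    .join('\n');
  return [
    `test('${c.testId} ${c.title.toLowerCase()}', async ({ page }) => {`,
    stepComments,
    `  // expected: ${c.expected}`,
    `});`,
  ].join('\n');
}

const tc015: ManualCase = {
  module: 'Fluxx',
  testId: 'TC-015',
  title: 'Reject grant submission',
  steps: [
    'Login as program officer',
    'Open submission #4821',
    'Click "Reject", give reason',
    'Confirm rejection',
  ],
  expected: 'Status = Rejected',
};

console.log(toSpecSkeleton(tc015));
```

The point of the sketch: every piece of the generated test is already present in the spreadsheet row; the runtime only has to read it.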
§2 · See it run
~38 seconds · captions on

§2 · What we built
QA lands on a personal dashboard: total / passed / failed / skipped, modules with pass-rate bars, and the last five runs at a glance.
§2 · Upload → generate
Manual case · Excel row
| Module | Fluxx |
| Test ID | TC-015 |
| Title | Reject grant submission |
| Steps | 1. Login as program officer 2. Open submission #4821 3. Click "Reject", give reason 4. Confirm rejection |
| Expected | Status = Rejected |
| Priority | P1 |
Generated · Playwright
import { test, expect } from '@playwright/test';

test('TC-015 reject grant submission', async ({ page }) => {
  await page.goto('https://cif.fluxx.io/login');
  await page.getByLabel('Username').fill(process.env.FLUXX_USER!);
  await page.getByLabel('Password').fill(process.env.FLUXX_PASS!);
  await page.getByRole('button', { name: 'Sign in' }).click();

  await page.goto('https://cif.fluxx.io/grants/4821');
  await page.getByRole('button', { name: 'Reject' }).click();
  await page.getByLabel('Reason').fill('Out of programme scope');
  await page.getByRole('button', { name: 'Confirm rejection' }).click();

  await expect(page.getByTestId('grant-status')).toHaveText('Rejected');
});
Column-mapper figures out which column is "title", which is "steps", which is the test ID — even when the QA file has been re-saved nine times.
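A plausible core of that mapper, sketched with invented alias tables and function names (the POC's version may use fuzzier matching, edit distance, or an LLM fallback): normalize each messy header string, then match it against known aliases per field.

```typescript
// Sketch of a header-to-field column mapper. The alias lists are
// illustrative; a production mapper would likely be fuzzier.
const FIELD_ALIASES: Record<string, string[]> = {
  testId: ['test id', 'tc id', 'id', 'case id'],
  title: ['title', 'test case', 'name', 'scenario'],
  steps: ['steps', 'test steps', 'procedure'],
  expected: ['expected', 'expected result', 'expected results'],
  module: ['module', 'system', 'application'],
  priority: ['priority', 'prio', 'severity'],
};

// Lowercase, strip punctuation, collapse whitespace: survives headers
// like " Test_ID " or "Expected  Results:" after nine re-saves.
function normalize(header: string): string {
  return header
    .toLowerCase()
    .replace(/[_:.*#-]+/g, ' ')
    .replace(/\s+/g, ' ')
    .trim();
}

// Map each known field to the first column whose header matches an alias.
function mapColumns(headers: string[]): Record<string, number> {
  const mapping: Record<string, number> = {};
  headers.forEach((h, col) => {
    const n = normalize(h);
    for (const [field, aliases] of Object.entries(FIELD_ALIASES)) {
      if (!(field in mapping) && aliases.includes(n)) mapping[field] = col;
    }
  });
  return mapping;
}
```

So a header row like `[' Test_ID ', 'Scenario', 'Test Steps', 'Expected  Results:', 'Module']` resolves to the same canonical fields regardless of how the sheet was last saved.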
§2 · The run
Five pipeline stages, live console straight from the runner. No SSH, no CLI, no "did the build pass?" Slack message at 11pm.
Trace, screenshot, video and the generated .spec.ts file — all linked from the run detail page. Re-run a single case in one click.
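The fail-fast shape of those stages can be sketched as sequential child processes whose output is captured as console lines. Stage names and commands below are placeholders, and the real runner streams output live rather than buffering it as this sketch does.

```typescript
import { spawnSync } from 'node:child_process';

interface Stage { name: string; cmd: string; args: string[] }
interface StageResult { name: string; ok: boolean; lines: string[] }

// Run stages in order, capturing each one's stdout as console lines.
// Stops at the first failing stage, mirroring a fail-fast pipeline.
function runPipeline(stages: Stage[]): StageResult[] {
  const results: StageResult[] = [];
  for (const s of stages) {
    const r = spawnSync(s.cmd, s.args, { encoding: 'utf8' });
    const lines = (r.stdout ?? '').split('\n').filter(Boolean);
    const ok = r.status === 0;
    results.push({ name: s.name, ok, lines });
    if (!ok) break; // fail fast: later stages never run
  }
  return results;
}

// Placeholder stages; the real five would be roughly: parse sheet,
// generate specs, lint, run Playwright, collect artifacts.
const demo: Stage[] = [
  { name: 'parse', cmd: process.execPath, args: ['-e', "console.log('parsed 3 cases')"] },
  { name: 'run', cmd: process.execPath, args: ['-e', "console.log('3 passed')"] },
];

console.log(JSON.stringify(runPipeline(demo), null, 2));
```

Because every stage's lines are kept per stage, the UI can show exactly which step produced which output, which is what replaces the 11pm "did the build pass?" message.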
§2 · A real failure caught
Async race condition. The assertion fired before the API responded. The kind of bug a manual QA might miss because the screen "looks right" after a second.
The error
AssertionError
expect(locator).toHaveText('Rejected')
Expected: Rejected
Received: Pending review
At: grant_approval.spec.ts:142
Claude's analysis
The reject button triggered the request, but the status update is async. The assertion fired before the API responded.
Suggestion:
Add waitForResponse('**/api/grants/*/reject')
before the status check.
Three failure modes the runner has already surfaced: async race, brittle locator, real backend bug. We file the third with the dev team — the first two, we fix in seconds.
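A first-pass triage over those three failure modes can be sketched as string heuristics on the error message. In the POC the diagnosis comes from Claude; the checks below are our own illustrative stand-ins, not the runner's logic.

```typescript
type FailureKind = 'async-race' | 'brittle-locator' | 'backend-bug' | 'unknown';

// Heuristic triage of a Playwright failure message. Illustrative only:
// the real runner asks Claude; these checks just show the three buckets.
function classifyFailure(message: string): FailureKind {
  const m = message.toLowerCase();
  // Assertion saw a stale value: likely asserted before an async update.
  if (m.includes('tohavetext') && m.includes('received:')) return 'async-race';
  // Locator never resolved: selector has drifted from the DOM.
  if (m.includes('waiting for') && m.includes('locator')) return 'brittle-locator';
  // Server-side error codes point at the backend, not the test.
  if (/\b5\d\d\b/.test(message) || m.includes('internal server error')) return 'backend-bug';
  return 'unknown';
}
```

The value of the split: the first two buckets get auto-suggested patches, while the third is routed to the dev team with the trace attached.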
§2 · Claude's fix
grant_approval.spec.ts · diff
  await page.getByRole('button', { name: 'Confirm rejection' }).click();
+ await page.waitForResponse(r =>
+   r.url().includes('/api/grants/') && r.url().includes('/reject')
+ );
  await expect(page.getByTestId('grant-status')).toHaveText('Rejected');
The runner doesn't just say "TC-015 failed." It explains why, proposes a fix, and lets QA re-run with one click. Bug found → patched → re-verified in under three minutes. That used to be a half-day investigation.
§2 · So far
The target is not just the three flows we hand-picked for the POC, but the whole catalogue: 1,284 cases converted, grouped into suites, and rerun on every PR.
Today · per case illustrative
QA writes the manual case. QA executes it. Every release. Defects surface late — sometimes after the donor has seen the dashboard.
With Phase 2 · per case illustrative
QA writes the manual case once. AI translates it. The runner enforces it forever. QA reviews failures — not green runs.
Numbers on this slide are illustrative projections — not measured. They assume average case complexity and 2× concurrency. Real numbers will land somewhere in this ballpark once we have a baseline week of runs.
§3 · The conversion loop
Every converted case stays converted. Every new manual case gets the same treatment. The library grows; the cost per case falls.
§3 · What changes
CIFF QA
Manual → exploratory
Stops re-running the same 1,284 cases. Spends time on new flows, exploratory testing, and edge cases that actually need a human.
CIFF engineering
Find bugs sooner
Regression runs on every PR. Failures land in chat with a Claude diagnosis and a proposed patch. Less back-and-forth.
Programme leads
Release with confidence
Dashboard goes from "QA says it's fine" to a real-time pass-rate per module, per release, with trace evidence.
§3 · Roadmap
No big-bang migration. Each milestone ships value on its own. If we stop at M3, CIFF still keeps the 200-case library.
§4 · The ask
What we need to start Phase 2: a green light, access to staging for the three systems, and one QA hour a week to validate the converted cases. We bring the engineer, the AI, and the runner.
CIFF × DHWANI · MAY 2026 · v0.1