Dingo & Co. Knowledge Work
A 23-deliverable consulting brief: research, financial reconciliation, regulatory analysis, decks and spreadsheets. Tests whether a model can run an entire knowledge-work engagement end to end.
The run recognizes parts of the benchmark's odd business and legal premise, but it fails the central job of producing artifacts. Eight required DOCX/PPTX/XLSX/PDF deliverables are HTML or plain-text stand-ins rather than valid files, image references are broken, the workbooks have no real sheets, formulas, or charts, the research is shallow, and multiple planning numbers drift. Under strict normalization, this does not count as a comparable, production-grade knowledge-work package.
What it nailed
- Recognized several core absurdities: dingoes are not normal pets, Alaska does not solve legality, and the import program creates ethics and TAM risk.
- Used consistent main planning assumptions in many files, including $380K recognized revenue, 200 units, $899 MSRP, $799 early-bird, and $740K launch budget.
- Persona set separates curiosity traffic from real buyers.
Where it slipped
- Eight required DOCX/PPTX/XLSX/PDF artifacts are HTML or plain-text stand-ins, not valid files in those formats.
- Provided image assets are referenced through broken paths, so deck, sales one-pager, dashboard, blog, and emails do not actually render those images.
- Regulatory research is shallow and often not tied to official jurisdiction-specific sources.
- No real Excel sheets, formulas, or charts exist in the required workbooks.
- The dashboard's channel math does not reconcile with the source totals or with the broader 200-unit story.
- Raw model output/transcript evidence is absent from the evaluation package.
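
The invalid-artifact finding above can be checked mechanically. What follows is a minimal sketch of one way to do that, not the harness this evaluation actually used. It relies on two format facts: a PDF begins with the bytes `%PDF-`, and DOCX/PPTX/XLSX files are ZIP containers that hold a main part at a conventional path. The function name `classify_artifact` and the reason strings are hypothetical, chosen only for illustration.

```python
import io
import os
import zipfile

# Conventional main part inside each Office Open XML container.
OOXML_MAIN_PART = {
    ".docx": "word/document.xml",
    ".pptx": "ppt/presentation.xml",
    ".xlsx": "xl/workbook.xml",
}


def classify_artifact(name: str, data: bytes) -> str:
    """Return 'valid' or a short reason the bytes do not match the extension.

    Hypothetical helper: it checks container structure only, not content quality.
    """
    ext = os.path.splitext(name)[1].lower()

    if ext == ".pdf":
        # Real PDFs start with the "%PDF-" header.
        if data.startswith(b"%PDF-"):
            return "valid"
        return "not a PDF (missing %PDF- header)"

    if ext in OOXML_MAIN_PART:
        # HTML or plain text saved under an Office extension is not a ZIP archive.
        if not zipfile.is_zipfile(io.BytesIO(data)):
            return "not a ZIP container"
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            members = set(zf.namelist())
        main_part = OOXML_MAIN_PART[ext]
        if main_part not in members:
            return f"missing {main_part}"
        return "valid"

    return "unchecked extension"
```

An HTML table saved as `model.xlsx` would come back as `not a ZIP container`. A file that passes this check is only structurally plausible. Confirming real sheets, formulas, and charts would still require opening the workbook and inspecting its contents.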