# One-Shot Prompt

**Topic**: Crest Alpha One-Shot Benchmark Catalog
**Theme**: Signal Forge
**Generated**: 2026-04-23
**Model**: GPT-5.4

## Prompt

Write a complete Node.js ES module named `generate.mjs` that uses `pptxgenjs` to generate a polished 15-slide PowerPoint deck named `presentation.pptx`.

The deck topic is **Crest Alpha: the one-shot benchmark catalog**. The story is not "AI is changing everything." The story is that benchmark design quality depends on isolation, artifact discipline, route variety, and visible outputs. The deck should feel like a research-and-product review prepared by a strong design systems team: concise, structured, and deliberate.

No external images. No templates. No screenshots. Use only PptxGenJS text, shapes, lines, charts, fills, and tables. Every slide needs at least one real visual composition, not just text.
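
For orientation, a minimal scaffold of the expected generator might look like this (a sketch only; the theme, the 15 slides, and the speaker notes specified below still need to be built out):

```js
// generate.mjs: minimal scaffold for the generator described above.
import pptxgen from "pptxgenjs";

const pptx = new pptxgen();
pptx.layout = "LAYOUT_WIDE"; // 13.33" x 7.5", per the layout rules

// One dark bookend slide as a placeholder for the full build.
const title = pptx.addSlide();
title.background = { color: "18181B" };
title.addText("Crest Alpha", {
  x: 0.5, y: 2.8, w: 9.0, h: 1.2,
  fontFace: "Arial Black", fontSize: 34, color: "F7F1E8",
});

await pptx.writeFile({ fileName: "presentation.pptx" });
```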

### Theme

Use a custom theme called **Signal Forge** with a distinctive editorial feel:

- Primary: `#18181B`
- Secondary: `#2F3A45`
- Accent rust: `#C75C2A`
- Accent gold: `#E1B24C`
- Support sage: `#8DAA91`
- Surface: `#F7F1E8`
- Panel: `#FFFDFC`
- Border: `#D8CCBC`
- Text: `#191919`
- Muted text: `#5F625B`
- Inverse text: `#F7F1E8`

Typography must be sans serif only. Use `Arial Black` for slide titles and major numbers. Use `Arial` for body, labels, and notes. Title and closing slides should be dark. Content slides should be warm light surfaces with dark text. Avoid generic tech gradients and blue-heavy "AI" styling.
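
One way to centralize the Signal Forge palette, reusing the `pptx` instance from the scaffold above (a sketch; PptxGenJS color values omit the leading `#`, and the `DARK` and `LIGHT` master names are illustrative):

```js
// Signal Forge palette as bare hex strings (no "#" prefix).
const THEME = {
  primary: "18181B", secondary: "2F3A45",
  rust: "C75C2A", gold: "E1B24C", sage: "8DAA91",
  surface: "F7F1E8", panel: "FFFDFC", border: "D8CCBC",
  text: "191919", muted: "5F625B", inverse: "F7F1E8",
};

// Illustrative slide masters so the dark bookends and warm content
// slides stay consistent across all 15 slides.
pptx.defineSlideMaster({ title: "DARK", background: { color: THEME.primary } });
pptx.defineSlideMaster({ title: "LIGHT", background: { color: THEME.surface } });

// Usage: const slide = pptx.addSlide({ masterName: "LIGHT" });
```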

### Benchmark facts to encode

Use these grounded benchmark facts:

- The public catalog spans **12 routes**.
- The benchmark policy is **one isolated run per route**.
- The fairness rule is **zero retries**.
- The PowerPoint route is a **download-first landing route** that ships prompt, generator script, and deck.

Any other numerical chart may use modeled values, but every modeled chart or forecast must be explicitly labeled as modeled, illustrative, or composite.

### Slide plan

Build all 15 slides in this exact order.

1. **Title slide**
   - Title: `Crest Alpha`
   - Subtitle: `The one-shot benchmark catalog`
   - Supporting line: `How isolated generation exposes product sense`
   - Include date `23 April 2026`
   - Include `Prepared by GPT-5.4`
   - Dark background with layered frames, route rails, and one bold rust accent block

2. **Agenda / Overview**
   - Five agenda blocks:
     - Benchmark frame
     - Route mix
     - Scoring signals
     - Failure modes
     - Next horizon
   - Add a right-side vertical stack of three cards labeled:
     - Prompt
     - Generate
     - Judge

3. **Context / Why this matters**
   - Headline: `A benchmark only matters if the artifact survives contact with reality`
   - Big stat callout: `12 routes`
   - Three friction points:
     - Generic demos hide weak product judgment
     - Retry-heavy workflows distort first-pass quality
     - Benchmarks need visible outputs, not just scores
   - Add a small caption: `Catalog scope: public routes only`

4. **Key data point**
   - Large number: `0 retries`
   - Supporting line: `Every route runs once in an isolated context`
   - Add a segmented benchmark flow bar (sketched below) with these labeled phases and relative widths:
     - Brief
     - Prompt
     - Generate
     - Validate
     - Publish
   - Add a small note: `Fairness rule: no recovery loop`
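
PptxGenJS has no segmented-bar chart type, so the flow bar can be composed from rounded rectangles. A sketch, assuming the `pptx`, `slide`, and `THEME` bindings from the earlier sketches; the widths are illustrative:

```js
// Segmented flow bar built from shapes; relative widths are assumed.
const phases = [
  { label: "Brief", w: 1.6 }, { label: "Prompt", w: 2.0 },
  { label: "Generate", w: 3.4 }, { label: "Validate", w: 2.6 },
  { label: "Publish", w: 1.6 },
];
let px = 0.75;
for (const p of phases) {
  slide.addShape(pptx.ShapeType.roundRect, {
    x: px, y: 4.2, w: p.w, h: 0.6, rectRadius: 0.08,
    fill: { color: THEME.rust }, line: { color: THEME.border, width: 1 },
  });
  slide.addText(p.label, {
    x: px, y: 4.2, w: p.w, h: 0.6, align: "center", valign: "middle",
    fontFace: "Arial", fontSize: 12, color: THEME.inverse,
  });
  px += p.w + 0.12; // small gap keeps the segments readable
}
```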

5. **Market / landscape overview**
   - Use a horizontal bar chart titled `Illustrative manual build effort by route family` (sketched below)
   - Categories and values in effort points:
     - Dashboard: 86
     - Data story: 79
     - Maproom: 74
     - Website: 68
     - Physics: 61
     - PowerPoint: 54
   - Add a note that the chart is a modeled effort comparison for benchmark framing
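
A sketch of this chart with the values above (same assumed bindings; `barDir: "bar"` is what makes PptxGenJS render horizontal bars):

```js
slide.addChart(pptx.ChartType.bar, [{
  name: "Effort points",
  labels: ["Dashboard", "Data story", "Maproom", "Website", "Physics", "PowerPoint"],
  values: [86, 79, 74, 68, 61, 54],
}], {
  x: 0.75, y: 1.5, w: 8.5, h: 4.6,
  barDir: "bar", // horizontal bars
  chartColors: [THEME.rust],
  showLegend: false,
  showTitle: true,
  title: "Illustrative manual build effort by route family",
  catAxisLabelColor: THEME.text,
  valAxisLabelColor: THEME.muted,
});
slide.addText("Modeled effort comparison for benchmark framing.", {
  x: 0.75, y: 6.3, w: 8.5, h: 0.4,
  fontFace: "Arial", fontSize: 10, italic: true, color: THEME.muted,
});
```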

6. **Breakdown / categories**
   - Doughnut chart titled `What the catalog actually tests` (sketched below)
   - Segments:
     - Product sense: 28
     - Interaction craft: 22
     - Visual hierarchy: 18
     - Tool reliability: 17
     - Output discipline: 15
   - Add center label `100%`
   - Note that the mix is an illustrative evaluation lens
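
PptxGenJS has no built-in center label for doughnuts, so the `100%` is a text box layered over the hole. A sketch with the same assumed bindings; the overlay position is approximate and should be tuned after a render:

```js
slide.addChart(pptx.ChartType.doughnut, [{
  name: "Evaluation mix",
  labels: ["Product sense", "Interaction craft", "Visual hierarchy",
           "Tool reliability", "Output discipline"],
  values: [28, 22, 18, 17, 15],
}], {
  x: 0.75, y: 1.4, w: 5.5, h: 5.0,
  holeSize: 60,
  chartColors: [THEME.rust, THEME.gold, THEME.sage, THEME.secondary, THEME.border],
  showLegend: true, legendPos: "r",
  showTitle: true, title: "What the catalog actually tests",
});
// Overlay the center label on the doughnut hole.
slide.addText("100%", {
  x: 2.1, y: 3.4, w: 1.6, h: 1.0, align: "center", valign: "middle",
  fontFace: "Arial Black", fontSize: 24, color: THEME.text,
});
```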

7. **Timeline / history**
   - Build a visual left-to-right timeline with six milestones (sketched below):
     - 2023: Prompt demos dominate
     - 2024: Agent tooling expands
     - Early 2025: Single-artifact tests plateau
     - Late 2025: Route isolation becomes the benchmark unit
     - Early 2026: Canonical artifact contracts lock in
     - Next: Multi-route benchmark catalogs become standard
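
A sketch of the timeline spine (same assumed bindings; the milestones are spaced evenly for illustration):

```js
const milestones = [
  ["2023", "Prompt demos dominate"],
  ["2024", "Agent tooling expands"],
  ["Early 2025", "Single-artifact tests plateau"],
  ["Late 2025", "Route isolation becomes the benchmark unit"],
  ["Early 2026", "Canonical artifact contracts lock in"],
  ["Next", "Multi-route benchmark catalogs become standard"],
];
// Horizontal spine across the slide.
slide.addShape(pptx.ShapeType.line, {
  x: 1.4, y: 3.75, w: 10.8, h: 0,
  line: { color: THEME.secondary, width: 2 },
});
milestones.forEach(([when, what], i) => {
  const cx = 1.4 + i * (10.8 / 5); // even spacing along the spine
  slide.addShape(pptx.ShapeType.ellipse, {
    x: cx - 0.09, y: 3.66, w: 0.18, h: 0.18,
    fill: { color: i === milestones.length - 1 ? THEME.gold : THEME.rust },
  });
  slide.addText(when, {
    x: cx - 0.9, y: 3.05, w: 1.8, h: 0.4, align: "center",
    fontFace: "Arial Black", fontSize: 12, color: THEME.text,
  });
  slide.addText(what, {
    x: cx - 0.9, y: 4.05, w: 1.8, h: 1.2, align: "center",
    fontFace: "Arial", fontSize: 10, color: THEME.muted,
  });
});
```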

8. **Comparison table**
   - Compare four evaluation formats:
     - Ad hoc demo
     - Static benchmark
     - Retry-heavy workflow
     - One-shot catalog
   - Compare across five attributes:
     - What it shows
     - Strength
     - Blind spot
     - Artifact proof
     - Fairness risk
   - Visually emphasize the `One-shot catalog` column (see the table sketch below)
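
A sketch of the grid structure (same assumed bindings). The per-format cell copy is left open, and the last column takes the accent fill so the one-shot catalog reads as the emphasized option:

```js
const formats = ["Ad hoc demo", "Static benchmark", "Retry-heavy workflow", "One-shot catalog"];
const header = ["", ...formats].map((text, i) => ({
  text,
  options: {
    bold: true, color: THEME.inverse,
    // Highlight the final ("One-shot catalog") column.
    fill: { color: i === formats.length ? THEME.rust : THEME.secondary },
  },
}));
const attributes = ["What it shows", "Strength", "Blind spot", "Artifact proof", "Fairness risk"];
const rows = [
  header,
  ...attributes.map((attr) => [
    { text: attr, options: { bold: true } },
    ...formats.map(() => ({ text: "" })), // per-format copy goes here
  ]),
];
slide.addTable(rows, {
  x: 0.5, y: 1.5, w: 12.3, colW: [2.3, 2.5, 2.5, 2.5, 2.5],
  fontFace: "Arial", fontSize: 12, color: THEME.text,
  border: { type: "solid", pt: 0.5, color: THEME.border },
  valign: "middle",
});
```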

9. **Trend analysis**
   - Line chart titled `Modeled benchmark maturity over four releases` (sketched below)
   - X-axis: R1, R2, R3, R4
   - Series:
     - Artifact validity: 62, 74, 86, 93
     - Design distinctiveness: 48, 57, 71, 82
     - Route coverage depth: 41, 55, 67, 79
   - Add annotation: `Modeled internal maturity curve`
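
A sketch of this chart (same assumed bindings; each series repeats the shared category labels):

```js
const releases = ["R1", "R2", "R3", "R4"];
slide.addChart(pptx.ChartType.line, [
  { name: "Artifact validity", labels: releases, values: [62, 74, 86, 93] },
  { name: "Design distinctiveness", labels: releases, values: [48, 57, 71, 82] },
  { name: "Route coverage depth", labels: releases, values: [41, 55, 67, 79] },
], {
  x: 0.75, y: 1.5, w: 9.0, h: 4.6,
  chartColors: [THEME.rust, THEME.gold, THEME.sage],
  lineSize: 3,
  showLegend: true, legendPos: "b",
  showTitle: true, title: "Modeled benchmark maturity over four releases",
});
slide.addText("Modeled internal maturity curve", {
  x: 9.9, y: 1.5, w: 2.7, h: 0.8,
  fontFace: "Arial", fontSize: 10, italic: true, color: THEME.muted,
});
```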

10. **Case study / example**
   - Frame this slide as `Composite review cycle`
   - Context bullets:
     - 12 public routes
     - Mixed HTML and download-first artifacts
     - Strict file-contract validation
     - No manual post-run polish
   - Result cards:
     - Valid outputs: `11/12`
     - Contract clean: `92%`
     - Standout routes: `4`
   - Add a callout that the scenario is composite and used for benchmark illustration

11. **Challenges & risks**
   - Create a 2x2 visual risk board or matrix (sketched below)
   - Risks:
     - Hidden retries
     - Visual sameness
     - Dependency drift
     - Subjective judging
   - Make hidden retries and subjective judging the highest-severity cells
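
A sketch of the risk board (same assumed bindings); the severity mapping follows the plan above, with the hottest cells filled in rust:

```js
const risks = [
  { label: "Hidden retries", severe: true },
  { label: "Visual sameness", severe: false },
  { label: "Dependency drift", severe: false },
  { label: "Subjective judging", severe: true },
];
risks.forEach((r, i) => {
  const col = i % 2;
  const row = Math.floor(i / 2);
  const x = 0.75 + col * 6.1;
  const y = 1.6 + row * 2.7;
  slide.addShape(pptx.ShapeType.roundRect, {
    x, y, w: 5.8, h: 2.4, rectRadius: 0.1,
    // Highest-severity cells get the rust fill; the rest stay on panel.
    fill: { color: r.severe ? THEME.rust : THEME.panel },
    line: { color: THEME.border, width: 1 },
  });
  slide.addText(r.label, {
    x: x + 0.3, y: y + 0.3, w: 5.2, h: 0.6,
    fontFace: "Arial Black", fontSize: 16,
    color: r.severe ? THEME.inverse : THEME.text,
  });
});
```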

12. **Opportunities / solutions**
   - Build four rounded opportunity cards with icon-like number dots
   - Cards:
     - Route isolation
     - Canonical artifacts
     - Prompt transparency
     - Validator coverage
   - Each card needs a one-line description and a measurable outcome label

13. **Future outlook**
   - Title: `The benchmark frontier moves from scoreboards to artifact systems`
   - Use a two-series line chart with years 2026 to 2030
   - Series 1: Route families covered = 12, 16, 20, 24, 28
   - Series 2: Automated validators = 5, 8, 12, 16, 21
   - Mark clearly as `Modeled forecast`
   - Add a side note that benchmark trust rises when the output contract gets stricter

14. **Key takeaways**
   - Four numbered takeaway cards:
     - Isolation beats narrative spin
     - Visible artifacts beat abstract scores
     - Design quality is part of capability
     - Contracts make benchmarks reusable

15. **Thank you / Q&A**
   - Dark closing slide echoing the title slide treatment
   - Title: `Questions, edge cases, and next routes`
   - Three short prompts:
     - What should count as a pass?
     - Which routes need stronger validators?
     - Where does design judgment break first?
   - Add `reading.sh` as the contact placeholder

### Layout rules

- Use `LAYOUT_WIDE`
- Use warm light content slides and dark bookend slides
- Keep 0.5" minimum margins
- Use strong asymmetry, card stacks, tall sidebars, and bold stat panels
- Never use underlined titles
- Use rounded rectangles for cards and callouts
- Use one repeated motif across the deck: narrow vertical rails plus numbered markers
- Keep copy short; the deck should scan quickly
- Use 30-34pt titles, 18-22pt subtitles, 12-16pt body, 9-10pt footnotes

### Speaker notes

Every slide must include speaker notes (pattern sketched after this list) with:

- 2 to 3 talking points
- One transition sentence to the next slide
- Extra context not fully visible on the slide
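
A sketch of the notes pattern, assuming a small hypothetical helper (`addStructuredNotes` is not a PptxGenJS API) so every slide carries the same three-part structure:

```js
// Hypothetical helper: joins the talking points, one transition
// sentence, and extra context into a single notes string per slide.
function addStructuredNotes(slide, { points, transition, context }) {
  slide.addNotes(
    [...points, `Transition: ${transition}`, `Context: ${context}`].join("\n")
  );
}

// Example for slide 4 (key data point):
addStructuredNotes(slide, {
  points: [
    "Every route runs exactly once in an isolated context.",
    "Zero retries is a fairness rule, not a technical limit.",
  ],
  transition: "Next: how much manual effort each route family represents.",
  context: "The flow bar widths are relative, not measured durations.",
});
```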

### Output

- Write `presentation.pptx`
- Ensure `node generate.mjs` produces the deck cleanly
- Keep the generator self-contained and reproducible

## Notes

- Topic is benchmark design, not a generic AI-industry explainer
- Use programmatic visuals only; no images of interfaces or screenshots
- How to run: `node generate.mjs`
- Output filename: `presentation.pptx`
