# Manual Multimodal Test Client Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Add a standalone Python test client that can run one manual smoke-test request at a time for LLM, transcription, audio generation, video generation, video doubt, and music audio doubt with built-in defaults and explicit overrides.
**Architecture:** Implement a single focused CLI script under `tools/` using `argparse`, a mode registry, shared request/response helpers, and predictable local artifact saving. Keep the script independent from server modules, and cover it with unit tests that validate argument resolution, request construction, artifact handling, interactive fallback behavior, and the constrained “doubt” mode payloads using mocked HTTP responses.
-[]**Step 1: Write the failing parser/mode tests**
Create `tests/test_manual_multimodal_test_client.py` with initial tests for direct mode parsing, interactive fallback selection, and global override parsing:
assertspec["json"]["prompt"]=="Generate a short test clip"
assertspec["json"]["response_format"]=="url"
```
-[]**Step 2: Run the focused tests to verify they fail**
Run: `/storage/coderai/venv_all/bin/python -m pytest tests/test_manual_multimodal_test_client.py -k 'build_request_spec and (llm or transcription or audio_generation or video_generation)'`
Expected: FAIL because `build_request_spec` does not exist yet.
-[]**Step 3: Implement the request builders and required-file checks**
Add `build_request_spec()` and a helper for reading required local files into `tools/manual_multimodal_test_client.py`:
raiseValueError(f"Unsupported mode for this task: {mode}")
```
-[]**Step 4: Run the focused tests to verify they pass**
Run: `/storage/coderai/venv_all/bin/python -m pytest tests/test_manual_multimodal_test_client.py -k 'build_request_spec and (llm or transcription or audio_generation or video_generation)'`
Expected: PASS.
-[]**Step 5: Commit the direct endpoint builders**
-[]**Step 1: Write the failing doubt-mode payload tests**
Add tests that lock the first-version behavior to a proven text-endpoint-compatible contract using prompt plus file-path context text rather than inventing unsupported binary chat transport:
assert"Describe the music."inspec["json"]["messages"][0]["content"]
```
-[]**Step 2: Run the focused tests to verify they fail**
Run: `/storage/coderai/venv_all/bin/python -m pytest tests/test_manual_multimodal_test_client.py -k 'video_doubt or music_audio_doubt'`
Expected: FAIL because those modes are not implemented yet.
-[]**Step 3: Implement the constrained doubt-mode builders**
Extend `build_request_spec()` so the two doubt modes use the chat completions endpoint with explicit file-path context text that matches the approved constraint of using only proven text-compatible request shapes:
git commit -m"feat: add manual multimodal test client"
```
## Self-Review
- Spec coverage: Task 1 covers direct plus interactive invocation, Task 2 covers built-in defaults and override resolution, Tasks 3 and 4 cover all required mode request builders including the constrained doubt modes, Task 5 covers artifact saving and stdout text behavior, and Task 6 wires execution and error handling. Task 7 verifies the full new test file and confirms no regression to the whisper-server test file.
- Placeholder scan: all file paths, commands, test names, code snippets, and commit messages are explicit.
- Type consistency: the plan consistently uses `parse_args`, `choose_mode_interactively`, `resolve_mode_config`, `build_request_spec`, `handle_response_payload`, `execute_request`, `execute_mode`, and `main` across all tasks.