## Architecture rules (non-negotiable)
- Typed contracts are non-negotiable. Never widen a contract field to `dict[str, Any]` or `Any`. Every module output is a typed Pydantic `BaseModel`.
- No fallback evaluators. Unmapped criteria surface as `UNMAPPED` in `CriterionEvaluation.result`. Do not add a default/catch-all evaluator.
- Each review-data field is written from exactly one source. The assembler maps one module → one field. No merging, no fan-in.
- Evaluator technique is private. Rule, LLM, or hybrid: all implement the same `BaseEvaluator.evaluate()` interface. The tree evaluator does not care how the decision was made.
- `required_fields` is a contract. List the field path(s) on `BaseReviewData` that your evaluator needs. The tree evaluator enforces non-None before calling `evaluate()`. Do not add None-guards inside evaluators for required fields (a minimal sketch follows this list).
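A minimal sketch of these contracts, with a stubbed `BaseEvaluator` and hypothetical names (`AuthorAffiliations`, `IndustryAffiliationEvaluator`, the `author_affiliations` field); the repo's real base classes may differ in detail:

```python
from abc import ABC, abstractmethod
from typing import ClassVar

from pydantic import BaseModel


class AuthorAffiliations(BaseModel):
    """Hypothetical module output: every field concretely typed, no dict[str, Any]."""
    institution_names: list[str]
    has_industry_affiliation: bool


class BaseEvaluator(ABC):
    # Stub standing in for the repo's real base class.
    required_fields: ClassVar[list[str]] = []

    @abstractmethod
    def evaluate(self, data: BaseModel) -> object: ...


class IndustryAffiliationEvaluator(BaseEvaluator):
    # The contract: the tree evaluator verifies each dotted path is non-None
    # on the review data before evaluate() is ever called.
    required_fields: ClassVar[list[str]] = ["author_affiliations.has_industry_affiliation"]

    def evaluate(self, data: BaseModel) -> object:
        # No None-guards here: required_fields already guarantees this path is populated.
        return data.author_affiliations.has_industry_affiliation  # type: ignore[attr-defined]
```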
## Key architectural boundaries
- `BaseExtractionModule[TOutput]`: typed generic; `output_schema` is resolved at class-definition time via `__init_subclass__`. Always parametrize with a concrete `BaseModel` subclass (see the sketch after this list).
- `@register_evaluator(*criterion_codes)`: exact-match registry; one evaluator per criterion code; raises on duplicate. Registration triggers at Django app-ready time via `evaluation/evaluators/__init__.py`.
- `required_fields` contract: declared as `ClassVar[list[str]]` on `BaseEvaluator`. The tree evaluator checks each dotted path against `data` before calling the evaluator. `None` at any point in the path = `INSUFFICIENT_INFO`.
- No `dict[str, Any]` anywhere: enforced across module contracts, review data, the assembler, and evaluator inputs. Use `model_dump()`/`model_validate()` at DB boundaries only.
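To make the `__init_subclass__` and registry mechanics concrete, here is a hand-rolled approximation. The real implementations live in the repo; this sketch only illustrates the pattern, and `CitationExtract`, `CitationModule`, and the criterion code `"C-12"` are invented for the demo:

```python
from typing import Any, ClassVar, Generic, TypeVar, get_args

from pydantic import BaseModel

TOutput = TypeVar("TOutput", bound=BaseModel)

_EVALUATOR_REGISTRY: dict[str, type] = {}


def register_evaluator(*criterion_codes: str):
    """Exact-match registry: one evaluator per criterion code, raising on duplicates."""
    def decorator(cls: type) -> type:
        for code in criterion_codes:
            if code in _EVALUATOR_REGISTRY:
                raise ValueError(f"Duplicate evaluator for criterion code {code!r}")
            _EVALUATOR_REGISTRY[code] = cls
        return cls
    return decorator


class BaseExtractionModule(Generic[TOutput]):
    output_schema: ClassVar[type[BaseModel]]

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        # Capture the concrete TOutput argument at class-definition time, so a
        # module can never exist without a concrete BaseModel output schema.
        for base in getattr(cls, "__orig_bases__", ()):
            args = get_args(base)
            if args and isinstance(args[0], type) and issubclass(args[0], BaseModel):
                cls.output_schema = args[0]


class CitationExtract(BaseModel):
    cited_dois: list[str]


class CitationModule(BaseExtractionModule[CitationExtract]):
    pass


@register_evaluator("C-12")  # hypothetical criterion code
class DemoEvaluator:
    pass


assert CitationModule.output_schema is CitationExtract
```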
## LLM determinism
`temperature=0` helps but does not guarantee identical output across API versions. Cached fixtures in `fixtures/cached_llm_responses/` paper over this for tests. Do not delete or hand-edit those files — regenerate them via `scripts/generate_fixtures.py`.
The provider is env-driven (`LLM_PROVIDER`): anthropic (default), gemini, openai, groq, openrouter. Prompts must produce strict JSON that parses cleanly across providers — the contract is the Pydantic schema, not any one provider's quirks. All provider paths pass `temperature=0`.
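As an illustration of that contract, a hedged sketch of provider resolution and strict-JSON parsing; the function names here are assumptions, not the repo's actual client API:

```python
import os

from pydantic import BaseModel, ValidationError

SUPPORTED_PROVIDERS = ("anthropic", "gemini", "openai", "groq", "openrouter")


def resolve_provider() -> str:
    # Env-driven selection; anthropic is the documented default.
    provider = os.environ.get("LLM_PROVIDER", "anthropic")
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"Unknown LLM_PROVIDER: {provider!r}")
    return provider


def parse_llm_response(raw_text: str, schema: type[BaseModel]) -> BaseModel:
    # The contract is the Pydantic schema, not any provider's quirks: whichever
    # provider produced raw_text, it must be strict JSON that validates here.
    try:
        return schema.model_validate_json(raw_text)
    except ValidationError as exc:
        raise ValueError(f"Provider output violated the schema contract: {exc}") from exc
```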
## Cache key lifecycle
Cache keys are derived from the *data*, not the prompt. The extraction cache key now also includes a prompt hash, so prompt edits miss the cache cleanly; evaluator cache keys, however, do NOT include the evaluator version. If you change an evaluator's system prompt, bump the `cache_key_prefix`, delete the cached file by hand, or re-run `scripts/generate_fixtures.py`.
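A sketch of that asymmetry; the exact key format and hash choices below are assumptions, not the repo's real derivation:

```python
import hashlib


def extraction_cache_key(document_text: str, prompt: str) -> str:
    # Data hash plus prompt hash: editing the extraction prompt misses the cache.
    data_hash = hashlib.sha256(document_text.encode()).hexdigest()[:16]
    prompt_hash = hashlib.sha256(prompt.encode()).hexdigest()[:8]
    return f"extract_{data_hash}_{prompt_hash}"


def evaluator_cache_key(cache_key_prefix: str, review_data_json: str) -> str:
    # No prompt or version hash here: editing the evaluator's system prompt
    # silently hits stale entries unless cache_key_prefix is bumped.
    data_hash = hashlib.sha256(review_data_json.encode()).hexdigest()[:16]
    return f"{cache_key_prefix}_{data_hash}"
```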
### Provider-aware cache layout
Cache files live under `fixtures/cached_llm_responses/`. When `LLM_PROVIDER` is set, the client scopes reads/writes into a provider subdirectory (path resolution is sketched after this list):
- Read order (`LLM_MODE=cache`): `fixtures/cached_llm_responses/<provider>/<key>.json`, falling back to `fixtures/cached_llm_responses/<key>.json` (the shared baseline shipped in the repo). A total miss raises `LLMCacheMiss` naming both paths.
- Write location (`LLM_MODE=record`): `fixtures/cached_llm_responses/<provider>/<key>.json` when `LLM_PROVIDER` is set, else the shared top-level path. This keeps provider-specific recordings from clobbering the shared baseline.
- If `LLM_PROVIDER` is unset, behavior matches the pre-existing shared layout exactly.
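A sketch of that path resolution. Only `LLMCacheMiss` and the directory layout come from the text above; the function names and `pathlib`-style lookups are illustrative:

```python
import os
from pathlib import Path

CACHE_ROOT = Path("fixtures/cached_llm_responses")


class LLMCacheMiss(FileNotFoundError):
    pass


def cache_read_path(key: str) -> Path:
    # Provider subdirectory first, shared baseline second.
    provider = os.environ.get("LLM_PROVIDER")
    candidates = []
    if provider:
        candidates.append(CACHE_ROOT / provider / f"{key}.json")
    candidates.append(CACHE_ROOT / f"{key}.json")
    for path in candidates:
        if path.exists():
            return path
    raise LLMCacheMiss(f"No cached response for {key!r}; looked in: {candidates}")


def cache_write_path(key: str) -> Path:
    # Recordings go to the provider subdirectory when LLM_PROVIDER is set,
    # so they never clobber the shared baseline.
    provider = os.environ.get("LLM_PROVIDER")
    base = CACHE_ROOT / provider if provider else CACHE_ROOT
    return base / f"{key}.json"
```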
## Running tests
```bash
make test
```

Tests run with `LLM_MODE=cache` by default (set in `tests/conftest.py`). They never hit the network.
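For reference, a plausible `tests/conftest.py` snippet enforcing this; the real fixture may be wired differently, and the autouse monkeypatch below is an assumption:

```python
import pytest


@pytest.fixture(autouse=True)
def _force_llm_cache_mode(monkeypatch: pytest.MonkeyPatch) -> None:
    # Every test reads from fixtures/cached_llm_responses/; a total cache
    # miss raises LLMCacheMiss instead of hitting the network.
    monkeypatch.setenv("LLM_MODE", "cache")
```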