Two-Sided Recsys Feature Engineering

Version 0.1.0
Marketplace Engineering
April 2026

Note:
This document is written mainly for agents and LLMs to follow when maintaining,
generating, or refactoring codebases. Humans may also find it useful,
but the guidance is optimized for automation and consistency in AI-assisted workflows.

Abstract

A first-principles guide to deriving usable recommender features from the raw assets of a two-sided trust marketplace (listing photos, owner-entered listing metadata, and sitter wizard responses) for item-to-item (i2i), user-to-item (u2i), and user-to-user (u2u) solutions. It contains 44 rules across 8 categories, ordered by cascade impact on the feature-engineering lifecycle: asset auditing and first-principles decomposition; vision, text, and wizard extraction; multi-modal composition into i2i/u2i/u2u scores; feature-store governance and training-serving parity; and incremental online value proof. It includes one playbook that composes the rules into an end-to-end feature discovery workflow, and it serves as the upstream precursor to the companion marketplace-personalisation, marketplace-search-recsys-planning, and marketplace-pre-member-personalisation skills.

Table of Contents

1. Asset Audit and Inventory — CRITICAL
- 1.1 Measure Coverage Before Declaring a Field a Feature — CRITICAL (prevents modelling features that only exist for 10% of items)
- 1.2 Quantify Freshness Per Asset Type — CRITICAL (prevents stale assets from poisoning similarity and affinity scores)
- 1.3 Sample Every Asset Type End-to-End Before Planning Features — CRITICAL (prevents silent garbage inputs to extraction pipelines)
- 1.4 Separate Raw Assets from Derived Features — CRITICAL (prevents one-way data loss that blocks re-extraction with better models)
- 1.5 Verify Rights and Privacy Before Running Extraction — CRITICAL (prevents irreversible privacy and ToS violations)
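
As an illustration of rule 1.1, the sketch below measures how often a field is actually populated before it is treated as a feature. The record shape and the `garden_size_m2` field are hypothetical, and the definition of "empty" (None, empty string, empty list, empty dict) is an assumption that would need to match the real data.

```python
from typing import Any, Iterable, Mapping


def field_coverage(records: Iterable[Mapping[str, Any]], field: str) -> float:
    """Return the fraction of records where `field` is present and non-empty."""
    total = 0
    filled = 0
    for record in records:
        total += 1
        value = record.get(field)
        # Assumption: None, "", [] and {} all count as "missing".
        if value not in (None, "", [], {}):
            filled += 1
    return filled / total if total else 0.0


# Hypothetical listing records; only two of four carry the field.
listings = [
    {"id": 1, "garden_size_m2": 40},
    {"id": 2, "garden_size_m2": None},
    {"id": 3},
    {"id": 4, "garden_size_m2": 12},
]
coverage = field_coverage(listings, "garden_size_m2")
```

A field whose coverage falls below whatever threshold the team agrees on would be held back from modelling until the gap is understood.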

2. First-Principles Feature Decomposition — CRITICAL
- 2.1 Ask What Signal a Human Uses to Make the Same Decision — CRITICAL (prevents guessing — surfaces 5-15 evidence-backed candidates per interview round)
- 2.2 Kill Features a Popularity Baseline Already Captures — CRITICAL (prevents redundant features inflating the portfolio)
- 2.3 Prefer Directly Observed Features over Learned Features at Launch — CRITICAL (delivers 80% of the lift at 10% of the system complexity)
- 2.4 Reject Features You Cannot Compute at Inference Time — CRITICAL (prevents the #1 cause of training-serving skew)
- 2.5 Start from the Decision, Not the Algorithm — CRITICAL (eliminates 60-80% of features that add cost without moving the outcome)
- 2.6 Tie Every Feature to a Specific Solution and Metric — CRITICAL (prevents orphan features that cost maintenance without lift)

3. Image Feature Extraction — HIGH
- 3.1 Apply Domain Fine-Tuning Only When Zero-Shot CLIP Plateaus — HIGH (closes 10-30% of the i2i relevance gap on domain taxonomies)
- 3.2 Detect Room Type Before Detecting Amenities — HIGH (makes amenity counts per-room, cutting false positives by 50%)
- 3.3 Extract Per-Object Counts, Not Just Presence — HIGH (prevents conflating a studio with a 6-bedroom villa)
- 3.4 Pool Embeddings Across a Listing's Photo Set — HIGH (reduces i2i variance by 2-4x versus single-photo features)
- 3.5 Quantify Image Quality Separately from Content — HIGH (prevents low-quality photos from flattening content embeddings)
- 3.6 Use CLIP for Zero-Shot Listing Embeddings Before Fine-Tuning — HIGH (ships the vision pipeline 10-15x faster than training from scratch)
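
Rule 3.4 can be sketched as follows, assuming per-photo embeddings have already been produced by some image encoder. Mean pooling of L2-normalized vectors is one common choice and is an assumption here; the rule itself only asks that the photo set be pooled. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def pool_listing_embedding(photo_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Mean-pool the normalized photo embeddings of one listing into one vector."""
    if not photo_embeddings:
        raise ValueError("listing has no photo embeddings to pool")
    dim = len(photo_embeddings[0])
    if any(len(e) != dim for e in photo_embeddings):
        raise ValueError("all photo embeddings must share one dimension")
    # Normalize first so that no single photo dominates the mean.
    normalized = [l2_normalize(e) for e in photo_embeddings]
    mean = [sum(e[i] for e in normalized) / len(normalized) for i in range(dim)]
    # Re-normalize so cosine similarity reduces to a dot product downstream.
    return l2_normalize(mean)


# Toy two-dimensional embeddings for a listing with two photos.
pooled = pool_listing_embedding([[1.0, 0.0], [0.0, 1.0]])
```

Other pooling schemes (for example, weighting by an image-quality score as in rule 3.5) would fit the same interface.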

4. Listing Text and Metadata Extraction — HIGH
- 4.1 Declare Categorical Fields for Bounded Vocabularies — HIGH (enables per-value learned features instead of text-bag processing)
- 4.2 Embed Description Text with a Pretrained Sentence Encoder — HIGH (prevents TF-IDF sparsity and synonym drift with 0 training cost)
- 4.3 Encode Amenity Lists as Multi-Hot Vectors, Not Free-Text Strings — HIGH (prevents string-tokenization drift across training and serving)
- 4.4 Encode Pet Requirements as Structured Triples — HIGH (enables per-axis matching that free text cannot)
- 4.5 Extract Stay Duration Shape, Not Just Length — HIGH (unlocks 3-5 sitter preference segments over a single integer)
- 4.6 Hash Geo to Hierarchies, Not Raw Lat/Lon — HIGH (prevents the model from treating geo as an arbitrary 2D plane)
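
A minimal sketch of rule 4.3, assuming a small fixed amenity vocabulary. The vocabulary entries and the normalization step (trim, lowercase, spaces to underscores) are hypothetical; the point is that training and serving share one vocabulary and one encoder rather than tokenizing free text separately.

```python
from typing import Iterable, List, Tuple

# Hypothetical, versioned vocabulary shared by training and serving.
AMENITY_VOCAB = ("wifi", "garden", "parking", "washing_machine")
_AMENITY_INDEX = {name: i for i, name in enumerate(AMENITY_VOCAB)}


def encode_amenities(amenities: Iterable[str]) -> Tuple[List[int], List[str]]:
    """Return a multi-hot vector plus any amenity strings not in the vocabulary."""
    vector = [0] * len(AMENITY_VOCAB)
    unknown: List[str] = []
    for raw in amenities:
        # Assumed normalization: trim, lowercase, spaces become underscores.
        key = raw.strip().lower().replace(" ", "_")
        index = _AMENITY_INDEX.get(key)
        if index is None:
            unknown.append(raw)  # surfaced for review rather than silently dropped
        else:
            vector[index] = 1
    return vector, unknown


vector, unknown = encode_amenities(["WiFi", "Washing Machine", "sauna"])
```

Returning the unrecognized values separately makes vocabulary gaps visible, which ties in with the coverage checks in rules 1.1 and 7.2.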

5. Sitter Wizard and Profile Extraction — HIGH
- 5.1 Capture Experience as Counts and Dates, Not Adjectives — HIGH (prevents aspirational self-rating that flattens the feature)
- 5.2 Make Optional Questions Genuinely Skippable and Log the Skip — HIGH (preserves the "did not answer" signal instead of destroying it)
- 5.3 Order Wizard Questions by Information Gain — HIGH (2-3x feature usefulness per completed wizard question)
- 5.4 Prefer Multiple-Choice over Free Text in the Wizard — HIGH (prevents downstream NLP cost and training-serving drift)
- 5.5 Separate Hard Constraints from Soft Preferences in the Wizard — HIGH (prevents 30-50% of requests ending in owner rejection)

6. Derived Similarity and Affinity — MEDIUM-HIGH
- 6.1 Cache the User Embedding with a Short TTL, Not Per-Request — MEDIUM-HIGH (drops u2i latency from 80ms to 5ms per request)
- 6.2 Decompose Affinity into Interpretable Subscores — MEDIUM-HIGH (cuts rank-debug investigation time by 3-5x)
- 6.3 Fuse Modalities Before Computing Item Similarity — MEDIUM-HIGH (multi-modal i2i beats any single modality alone)
- 6.4 Precompute Item-to-Item Nearest Neighbours Offline — MEDIUM-HIGH (turns i2i from 500ms per request to 5ms)
- 6.5 Score User-to-User Compatibility as Symmetric Mutual Fit — MEDIUM-HIGH (prevents the 30-50% of requests that end in owner rejection)
- 6.6 Use a Two-Tower Model for User-to-Item Affinity — MEDIUM-HIGH (learned u2i affinity beats hand-crafted scoring 2-5x on NDCG)
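
One way to read rule 6.5 is that a u2u score should be high only when both directions of fit are high. The sketch below combines two directional scores with a geometric mean; that aggregation is an assumption chosen for illustration (a harmonic mean or a minimum would behave similarly), and the function and parameter names are hypothetical.

```python
import math


def mutual_fit(sitter_to_owner: float, owner_to_sitter: float) -> float:
    """Combine two directional fit scores in [0, 1] into one symmetric score."""
    for score in (sitter_to_owner, owner_to_sitter):
        if not 0.0 <= score <= 1.0:
            raise ValueError("directional scores must lie in [0, 1]")
    # Geometric mean: symmetric in its arguments, and zero if either side is zero.
    return math.sqrt(sitter_to_owner * owner_to_sitter)


# A sitter who strongly prefers a listing whose owner is lukewarm on the sitter.
score = mutual_fit(0.9, 0.4)
```

Unlike an arithmetic mean, this keeps a one-sided match from ranking highly, which is the failure mode the rule associates with owner rejections.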

7. Feature Quality and Governance — MEDIUM-HIGH
- 7.1 Freeze Feature Schemas per Model Version — MEDIUM-HIGH (prevents mid-flight schema drift from silently retraining the wrong model)
- 7.2 Gate Every Feature on Coverage and Drift Alarms — MEDIUM-HIGH (catches coverage collapse 10-20x earlier than metric drift)
- 7.3 Scrub PII Before Features Leave the Secure Zone — MEDIUM-HIGH (prevents GDPR exposure through embedding leaks)
- 7.4 Serve Training and Inference Features from One Store — MEDIUM-HIGH (eliminates the #1 cause of silent model regression)
- 7.5 Version Feature Definitions in a Single Registry — MEDIUM-HIGH (prevents two models silently computing the same feature differently)

8. Incremental Rollout and Value Proof — MEDIUM
- 8.1 Dedicate a Random Exploration Slice to New Features — MEDIUM (prevents offline-metric overfitting from blocking good features)
- 8.2 Kill Features That Do Not Earn Their Maintenance Cost — MEDIUM (removes 20-40% of features over the first year of portfolio maturity)
- 8.3 Measure Lift Against a Feature-Ablated Variant, Not the Old Model — MEDIUM (prevents attribution confounds from hyperparameter or data changes)
- 8.4 Retain a Feature-Free Baseline Permanently — MEDIUM (prevents silent ML-vs-baseline gap collapse)
- 8.5 Ship One Feature at a Time in the First Year — MEDIUM (prevents bundled-release attribution confounds)

References

Source Files

This document was compiled from individual reference files. For detailed editing or extension:

File	Description
references/_sections.md	Category definitions and impact ordering
assets/templates/_template.md	Template for creating new rules
SKILL.md	Quick reference entry point
metadata.json	Version and reference URLs
