name: performant-ai
description: Strategies for high-performance AI/LLM systems (Context Management, Prompt Engineering, RAG, Inference Tuning).
triggers: [ai, llm, performance, context window, tokens, prompt engineering, rag, inference, latency]
tags: [coding, ai, architecture]
context_cost: medium
Performant AI Skill
Goal
Optimize the interaction speed and cost-effectiveness of LLM-based systems by mastering context management and inference strategies.
Capabilities
1. Context Window Engineering
- Context Pruning: Implement logic to remove irrelevant or redundant tokens from the prompt to fit within limits and reduce cost (see the pruning sketch after this list).
- Summarization Chains: Use "recursive summarization" for long conversations or documents.
- Observation Masking: Hide older or less critical data to keep the model's attention on the immediate task.
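A minimal context-pruning sketch in Python, assuming chat-style message dicts. The `count_tokens` helper is a crude stand-in for a real tokenizer (e.g., tiktoken), and the drop-oldest policy is illustrative:

```python
from typing import Dict, List

Message = Dict[str, str]  # {"role": ..., "content": ...}

def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: roughly 4 characters per token."""
    return max(1, len(text) // 4)

def prune_context(messages: List[Message], budget: int) -> List[Message]:
    """Drop the oldest non-system turns until the conversation fits the budget."""
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]

    def total(msgs: List[Message]) -> int:
        return sum(count_tokens(m["content"]) for m in msgs)

    # Remove from the front (oldest) so the most recent context survives.
    while turns and total(system + turns) > budget:
        turns.pop(0)
    return system + turns
```

The system message is always preserved; only conversational turns are evicted, oldest first.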
2. Efficient Prompting (Latency & Cost)
- Few-Shot Optimization: Minimize the number of examples to the bare minimum needed for accuracy.
- Output Structuring: Use JSON mode or structured outputs to reduce parsing errors and retry loops (see the sketch after this list).
- Prompt Compression: Use tools or manual techniques to shorten instructions without losing semantic meaning.
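A sketch of structured output via the OpenAI Chat Completions API's JSON mode; the model name is illustrative, and note that JSON mode requires the word "JSON" to appear somewhere in the prompt:

```python
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def extract_fields(ticket: str) -> dict:
    """Ask for JSON directly so the reply parses without regex or retry loops."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative; any JSON-mode-capable model works
        messages=[
            # JSON mode requires "JSON" to appear in the prompt text.
            {"role": "system", "content": "Reply in JSON with keys: summary, priority."},
            {"role": "user", "content": ticket},
        ],
        response_format={"type": "json_object"},
    )
    return json.loads(resp.choices[0].message.content)
```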
3. RAG Optimization (Retrieval-Augmented Generation)
- Chunking Strategy: Optimize chunk sizes and overlap for the specific domain (e.g., small chunks for semantic search, large for summaries).
- Hybrid Search: Combine vector search (semantic) with keyword search (BM25) for higher precision (a rank-fusion sketch follows this list).
- Re-ranking: Use a secondary, smaller model to re-rank the top-K results before sending them to the expensive LLM.
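One common way to fuse the two result sets is reciprocal rank fusion (RRF). This sketch assumes you already have the vector and BM25 results as ordered lists of document IDs:

```python
from collections import defaultdict
from typing import Dict, List

def reciprocal_rank_fusion(
    vector_hits: List[str], keyword_hits: List[str], k: int = 60
) -> List[str]:
    """Fuse two ranked doc-ID lists; k=60 is the commonly cited RRF constant."""
    scores: Dict[str, float] = defaultdict(float)
    for hits in (vector_hits, keyword_hits):
        for rank, doc_id in enumerate(hits):
            scores[doc_id] += 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Doc "c" ranks high in both lists, so it wins the fused ranking: c, b, a, d.
print(reciprocal_rank_fusion(["c", "a", "b"], ["c", "b", "d"]))
```

RRF needs no score normalization across the two retrievers, which is why it is a popular default before re-ranking.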
4. Inference & Routing Strategies
- Brain Mode Routing: Arbitrate between "Local" models (faster and cheaper) and "Remote" models (more capable but slower) based on task difficulty.
- Speculative Decoding: Where supported, use a smaller model to draft tokens that the larger model verifies, speeding up generation.
- Cache Hits: Implement semantic caching (e.g., in Redis) to reuse LLM responses for similar queries (see the sketch after this list).
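A minimal in-memory sketch of semantic caching. The `embed` stand-in is deliberately crude (swap in a real embedding model), the 0.92 threshold is illustrative, and in production the entries would live in a store like Redis with a vector index:

```python
import re
from typing import List, Optional, Tuple

def embed(text: str) -> List[float]:
    """Crude bag-of-words stand-in; replace with a real embedding model."""
    vec = [0.0] * 64
    for word in re.findall(r"[a-z]+", text.lower()):
        vec[hash(word) % 64] += 1.0
    return vec

def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    """Reuse a cached LLM response when a new query is 'close enough'."""

    def __init__(self, threshold: float = 0.92):
        self.threshold = threshold
        self.entries: List[Tuple[List[float], str]] = []

    def get(self, query: str) -> Optional[str]:
        q = embed(query)
        for vec, response in self.entries:
            if cosine(q, vec) >= self.threshold:
                return response  # cache hit: skip the expensive LLM call
        return None

    def put(self, query: str, response: str) -> None:
        self.entries.append((embed(query), response))

cache = SemanticCache()
cache.put("How do I reset my password?", "Go to Settings > Security > Reset.")
print(cache.get("how do I reset my password"))  # near-duplicate -> cache hit
```

Tuning the threshold is the key design choice: too low and unrelated queries share answers; too high and the cache never hits.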
5. Architectural Patterns
- Self-Correction Loops: Build reflection phases into the agent flow to catch errors early.
- Asynchronous Agents: Run independent research or tool calls in parallel to reduce perceived latency (Loki Mode); see the sketch below.
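A sketch of the parallel pattern with `asyncio.gather`; the two tool calls are stand-ins, with `asyncio.sleep` simulating I/O latency:

```python
import asyncio

async def fetch_docs(query: str) -> str:
    await asyncio.sleep(1.0)  # stand-in for a real retrieval call
    return f"docs for {query!r}"

async def run_code_search(query: str) -> str:
    await asyncio.sleep(1.0)  # stand-in for a real tool call
    return f"code hits for {query!r}"

async def research(query: str) -> list[str]:
    # Independent calls run concurrently: ~1s total instead of ~2s sequentially.
    return await asyncio.gather(fetch_docs(query), run_code_search(query))

print(asyncio.run(research("vector index tuning")))
```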
Steps
- Token Audit: Trace the token count of typical requests to find "bloat" in system prompts (see the audit sketch after this list).
- Latency Mapping: Break down Time-to-First-Token (TTFT) and Total Generation Time.
- Retrieval Benchmark: Measure the Hit Rate and Recall of the RAG pipeline.
- Cost Projection: Estimate monthly burn based on different model providers and context sizes.
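A token-audit sketch using tiktoken; an assumption here is that the `cl100k_base` encoding matches the target model family (OpenAI-style), while other providers ship their own tokenizers:

```python
import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # other providers ship their own tokenizers

def audit(messages: list[dict]) -> None:
    """Print per-message token counts to spot bloat in system prompts."""
    total = 0
    for m in messages:
        n = len(enc.encode(m["content"]))
        total += n
        print(f"{m['role']:>9}: {n:5d} tokens | {m['content'][:40]!r}")
    print(f"{'TOTAL':>9}: {total:5d} tokens")

audit([
    {"role": "system", "content": "You are a helpful assistant. " * 50},  # likely bloat
    {"role": "user", "content": "Summarize this ticket."},
])
```

Running this against real traffic usually reveals that the static system prompt, not the user input, dominates per-request cost.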
Deliverables
- COST_OPTIMIZATION_REPORT_TEMPLATE.md: Analysis of prompt efficiency and LLM token usage.
- ARCHITECTURE_REVIEW_TEMPLATE.md: Configuration for vector DB, chunking, and search weights.
- SCALABILITY_ANALYSIS_TEMPLATE.md: Logic table for local vs. remote model selection and context scaling.
Security & Guardrails
1. Data Privacy
- PII Masking: Ensure no personally identifiable information is sent to remote LLM providers without redaction or encryption (see the redaction sketch after this list).
- Data Leakage: Verify that RAG sources do not inadvertently expose unauthorized documents to the user.
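A minimal redaction sketch; the regex patterns are illustrative only, and a production system should use a vetted PII-detection library or service:

```python
import re

# Illustrative patterns only: real PII detection needs a vetted library or service.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b(?:\+?\d{1,2}[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace matches with typed placeholders before the text leaves your network."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact("Reach Jane at jane.doe@example.com or 555-123-4567."))
```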
2. Reliability
- Hallucination Checks: Mandatory verification step for critical facts generated by the LLM.
- Fallback Logic: Always have a "conservative" fallback if the primary LLM fails or hits rate limits (see the sketch after this list).
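A fallback sketch with exponential backoff; `call_primary`, `call_fallback`, and `RateLimitError` are stand-ins for your provider's client and exception types:

```python
import time

class RateLimitError(Exception):
    """Stand-in for your provider's rate-limit exception."""

def call_primary(prompt: str) -> str:
    raise RateLimitError  # simulate the primary model being throttled

def call_fallback(prompt: str) -> str:
    return f"[conservative fallback answer for {prompt!r}]"

def complete(prompt: str, retries: int = 2, backoff: float = 1.0) -> str:
    """Try the primary model with backoff, then degrade to a cheaper fallback."""
    for attempt in range(retries):
        try:
            return call_primary(prompt)
        except RateLimitError:
            time.sleep(backoff * (2 ** attempt))  # exponential backoff
    return call_fallback(prompt)

print(complete("Explain TTFT.", backoff=0.1))
```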
3. Agent Guardrails
- No Infinite Loops: Implement strict limits on agent reflection or self-healing cycles (Max 5 attempts).
- Cost Ceiling: Set token or dollar limits per session to prevent runaway autonomous spending (both guardrails are sketched below).
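A sketch combining both guardrails; the attempt cap, token ceiling, per-call charge, and verification check are illustrative stand-ins:

```python
class BudgetExceeded(Exception):
    pass

class GuardedAgent:
    """Caps reflection cycles and per-session token spend."""

    MAX_ATTEMPTS = 5         # hard stop on self-correction loops
    TOKEN_CEILING = 200_000  # per-session budget; tune to your cost tolerance

    def __init__(self):
        self.tokens_used = 0

    def charge(self, tokens: int) -> None:
        self.tokens_used += tokens
        if self.tokens_used > self.TOKEN_CEILING:
            raise BudgetExceeded(f"session spent {self.tokens_used} tokens")

    def solve(self, task: str) -> str:
        for attempt in range(1, self.MAX_ATTEMPTS + 1):
            self.charge(1_000)  # stand-in for real usage reported by the API
            draft = f"attempt {attempt} at {task!r}"
            if self.is_good_enough(draft):
                return draft
        return "escalate to a human"  # never loop forever

    def is_good_enough(self, draft: str) -> bool:
        return "attempt 3" in draft  # stand-in for a real verification step

print(GuardedAgent().solve("fix the failing test"))
```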