name: image version: "1.0" description: Extract text from images using a vision LLM entry: script: scripts/main.py class: ImageSkill triggers: extensions: - ".png" - ".jpg" - ".jpeg" - ".webp" - ".gif" - ".tiff" intents: - "image" - "screenshot" - "diagram" - "photo" requires: [] author: axoviq.com license: AGPL-3.0-or-later

Image Skill

Base64-encodes the image and returns it in metadata for the IngestAgent to process via a vision-capable LLM. The text field is left empty at extract time and filled in by the agent.

When this skill is used

Source path ends with .png, .jpg, .jpeg, .webp, .gif, or .tiff
User intent contains: image, screenshot, diagram, photo

ナビゲーション

Skillsとは？

リンク

image

Image Skill

When this skill is used

関連スキル(📊 データ・分析)