---
name: dist-op-dev
description: Execution-oriented workflow for HyperParallel distributed operator development. Analyzes the operator, implements or updates code and tests.
---
# HyperParallel Distributed Operator Development Workflow
✅ **Unified Entry**: When developing HyperParallel distributed operators, just call this SKILL; I will automatically handle the entire process, including operator analysis, implementation, and testing.
## When to Use This Workflow
Use this workflow when developers need to add distributed operator support for the HyperParallel framework or optimize sharding strategy inference for existing operators.
## How to Use
Call this SKILL directly, providing the MindSpore mint interface name or PyTorch operator name, along with source code paths:
```
# Develop distributed support for MindSpore mint interface
/dist-op-dev I want to develop distributed support for MindSpore mint interface mint.matmul. MindSpore source code is at /root/workspace/mindspore, PyTorch source code is at /root/workspace/pytorch.

# Develop distributed support for PyTorch operator
/dist-op-dev I want to develop distributed support for PyTorch operator torch.nn.functional.linear. MindSpore source code is at /root/workspace/mindspore, PyTorch source code is at /root/workspace/pytorch.
```
Source code paths are required — the dist-op-analysis SKILL needs them to locate interface definitions, Primitive mappings, and distributed strategy references.
## Execution Flow Overview
Distributed operator development follows a six-step process, from operator analysis to code push. The diagram below shows the five development steps; Step 6 (commit and PR creation) follows them:
```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│ 1. Operator     │ ──▶ │ 2. Python       │ ──▶ │ 3. YAML         │
│    Analysis     │     │    Implement    │     │    Registration │
│ Call SKILL      │     │ Inherit/Custom  │     │ Configure map   │
│ 🔴Output report │     │ infer_layout    │     │ Select suffix   │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                                                         │
        ┌────────────────────────────────────────────────┘
        ▼
┌─────────────────┐     ┌─────────────────┐
│ 4. Unit Test    │ ──▶ │ 5. Integration  │
│    (UT)         │     │    Test (ST)    │
│ Verify inference│     │ 8-card verify   │
│ Cover DP/MP     │     │ Compare output  │
└─────────────────┘     └─────────────────┘
```
## Workflow Execution Checklist
When using this SKILL to develop distributed operators, create a TODOLIST, then execute the following workflows in order:
- **Step 1: Operator Analysis**
  - Must: The operator analysis process must follow the procedure described in workflows/01-operator-analysis.md. Execute each step in order.
  - Goal: Obtain the operator interface definition, distributed implementation plan, and implementation references
  - Input: MindSpore mint interface, PyTorch interface, MindSpore source code path, PyTorch source code path
  - Output: `.claude/skills/dist-op-dev/analysis-results/{OpName}-analysis.md` (🔴 required)
- **Step 2: Python Implementation**
  - Must: The Python implementation process must follow the procedure described in workflows/02-python-implementation.md. Execute each step in order.
  - Goal: Create the distributed operator implementation class; implement infer_layout and get_expand_impl
  - Input: Analysis report from Step 1
  - Output: `hyper_parallel/core/shard/ops/parallel_*.py` file (a hedged Python sketch follows this checklist)
- **Step 3: YAML Registration**
  - Must: The YAML registration process must follow the procedure described in workflows/03-yaml-registration.md. Execute each step in order.
  - Goal: Register the operator in a YAML config file and configure infer_layout_suffix
  - Input: Analysis report from Step 1, Python implementation class info from Step 2
  - Output: `hyper_parallel/core/shard/ops/yaml/*.yaml` entry (a hypothetical example follows this checklist)
- **Step 4: Unit Testing (UT)**
  - Must: The test generation process must follow the procedure described in workflows/04-unit-testing.md. Execute each step in order.
  - Goal: Verify the correctness of infer_layout and get_expand_impl logic, covering supported and unsupported scenarios
  - Input: Python implementation class from Step 2, analysis report from Step 1
  - Output: `tests/ut/core/shard/ops/test_parallel_*.py` (a UT sketch follows this checklist)
- **Step 5: Integration Testing (ST)**
  - Must: The test generation process must follow the procedure described in workflows/05-integration-testing.md. Execute each step in order.
  - Goal: Verify end-to-end distributed execution correctness in an 8-card environment
  - Input: YAML config from Step 3, Python implementation from Step 2, analysis report from Step 1
  - Output: `tests/mindspore/st/shard/ops/test_ops_*.py` + `*_shard_in_python.py`, or `tests/torch/shard/ops/test_parallel_op_*.py` + `parallel_op_*.py` (an ST skeleton follows this checklist)
- **Step 6: Git Commit and PR Creation**
  - Goal: Create a feature branch, then call autogit to complete the lint check, commit, push, and PR creation (if needed)
  - Input: All modified code, operator name
  - Output: Feature branch `feat/{OpName}-distributed-support`, commit pushed, PR created (if needed)
## Key Decision Points
| Decision Point | Criteria | Options | Impact |
|---|---|---|---|
| Operator Category | Semantic match of the operator | ElementWise / MatMul / Reduce / Reshape / Gather | Determines base class and YAML file |
| Implementation Method | Whether custom logic is needed | Scenario 0 / Scenario 1 / Scenario 2 | Code volume and UT coverage |
| Broadcast Support | Whether the operator broadcasts | No suffix / WithShape | YAML config and test scenarios |
| Partial Support | Whether partial input state must be handled | _allow_partial_inputs=True/False | get_expand_impl implementation |
Detailed decision reference: See Implementation Decisions
## Quick Reference

### File Location Quick Reference
| Task | File Location | Key Notes |
|---|---|---|
| Python Implementation | hyper_parallel/core/shard/ops/parallel_*.py | Inherit from DistributedOp or one of its subclasses |
| YAML Registration | hyper_parallel/core/shard/ops/yaml/*.yaml | Maps the operator to its distributed implementation class |
| Unit Test (UT) | tests/ut/core/shard/ops/ | Platform-agnostic; verifies infer_layout and get_expand_impl logic |
| Integration Test (ST) | tests/mindspore/st/shard/ops/, tests/torch/shard/ops/ | Verifies distributed execution in an 8-card environment |
Detailed quick reference: See references/quick-reference.md
## Platform Differences
| Item | MindSpore | PyTorch |
|---|---|---|
| Interface Name Style | mint.matmul, mint.nn.functional.relu | torch.matmul, torch.nn.functional.linear |
| YAML Files | element_wise_ops.yaml, matmul_ops.yaml, etc. | torch_*.yaml |
| UT Test Directory | tests/ut/core/shard/ops/ (shared) | tests/ut/core/shard/ops/ (shared) |
| ST Test Directories | tests/mindspore/st/shard/ops/ | tests/torch/shard/ops/ |
**Important Note**: If a MindSpore operator and a PyTorch operator have the same semantics, they can reuse the same distributed operator implementation class.
## Related SKILLs
| SKILL | Purpose | When Called |
|---|---|---|
| autogit | Git workflow automation (commit, PR, status, etc.) | Step 6, to complete code commit and PR creation |
| dist-op-analysis | Internal operator analysis (read-only) | Step 1, provides interface specs, distributed strategies, and HyperParallel implementation guidance |
## Reference Document Paths
- Workflow detailed steps: `workflows/` directory
- Knowledge reference documents: `references/` directory
- Template files: `templates/operator-analysis-template.md`