name: running_gradle_tests
description: >
Executes and diagnoses Gradle tests with high-precision --tests filtering, surgical per-test failure isolation, and full stack traces;
ALWAYS use instead of ./gradlew test for test execution, failure investigation, and post-mortem analysis.
Do NOT use for general build lifecycle tasks (use running_gradle_builds) or dependency auditing.
license: Apache-2.0
metadata:
author: https://github.com/rnett/gradle-mcp
version: "3.3"
Authoritative Gradle Test Execution & Diagnostics
Executes tests with deep diagnostic tools to isolate and fix failures fast, ensuring maximum code quality and build reliability.
Constitution
- ALWAYS use the
gradletool instead of./gradlewvia shell. - ALWAYS use the
--testsflag for surgical test selection to minimize feedback loops. - ALWAYS provide absolute paths for
projectRoot. - ALWAYS prefer foreground execution (default) unless the test suite is extremely long-running (>2 minutes) or you explicitly intend to perform independent research while it proceeds.
- ONLY use
background: truefor managed background orchestration when context isolation and non-blocking exploration are required. - STRONGLY PREFERRED: Use
query_buildfor all test diagnostics. It provides isolated output and full stack traces that are often truncated in the main console. - ALWAYS use
query_buildwithkind="TESTS"andquery="FullTestName"to access full test output and stack traces. - NEVER use
taskPathorcaptureTaskOutputto investigate specific test failures; these provide the overall task log which is often truncated and lacks per-test isolation.
Surgical Test Inspection with query_build
When tests fail, query_build is your most powerful diagnostic tool. It provides isolated output and full stack traces that are often truncated in the main console.
1. List All Failed Tests (Summary)
Use outcome="FAILED" to quickly see which tests failed without being overwhelmed by logs.
- Example:
query_build(buildId="ID", kind="TESTS", outcome="FAILED")
2. Get Full Test Details (Details)
CRITICAL: ALWAYS use kind="TESTS" and query to see the complete stdout, stderr, and stack trace for a specific test.
- Unique Prefix Support: You can provide a unique prefix of the test name (e.g.,
query="com.example.MyTest"instead of the full FQN). If the prefix is unique, the tool will automatically select the test. If ambiguous, it will return a list of matching names for refinement. - Example:
query_build(buildId="ID", kind="TESTS", query="com.example.MyTest.myMethod")
3. Filter by Name (Summary)
Use query with the default summary view to see all executions of a test (e.g., across different projects or iterations).
- Example:
query_build(buildId="ID", kind="TESTS", query="MyTest")
4. Full Export
Use outputFile="path/to/results.txt" to write the entire result (e.g., all test outcomes, full console output) to a file. This bypasses pagination limits and is much more token-efficient for large results.
5. Progress Monitoring
Use timeout, waitFor, or waitForTask to block until a condition is met in a background test run.
- Example:
wait_build(buildId="ID", timeout=60, waitForTask=":app:test") - Wait for completion: If
timeoutis set without a wait condition, the tool waits for the build to finish.
Directives
- ALWAYS use foreground for authoritative tests: If you intend to wait for results, ALWAYS use foreground execution. It provides superior progressive disclosure and simpler control flow than starting a background build only to
immediately call
wait_build(timeout=...). - Background ONLY for long test suites: Use
background: trueONLY for test suites that take a long time to run and you explicitly intend to perform independent research while they proceed. - Foreground tests are safe: Do not fear running high-output test suites in the foreground. The
gradletool uses progressive disclosure to provide concise summaries and structured results, keeping session history clean and efficient. - Monitor with
query_buildandwait_build: Usequery_buildto check the status of background test runs or to retrieve structured output and stack traces for failed tests. - Check for environment failures: If a test run fails with a general error, use
query_build(kind="FAILURES")to check for compilation or configuration issues. - Investigate specifically: Use the
queryoption inquery_buildwithkind="TESTS"to isolate specific failure details. For detailed diagnostic workflows, see thetest_diagnostics.mdreference.
Authoritative Test Selection Patterns
The --tests flag supports powerful, high-precision filtering. Use these patterns to minimize execution time and context noise.
1. Simple Filters
- Exact Class:
--tests com.example.MyTest - Exact Method:
--tests com.example.MyTest.myTestMethod - Wildcard Method:
--tests com.example.MyTest.test*(Runs all methods starting with 'test')
2. Wildcard Filters (* and ?)
- Package Filter:
--tests com.example.service.*(Runs all tests in the 'service' package) - Class Prefix:
--tests *IntegrationTest(Runs all classes ending in 'IntegrationTest') - Character Wildcard:
--tests com.example.Test?(Matches Test1, TestA, etc.)
3. Syntax Rules
- No Class Path: Patterns match against the fully qualified name of the test class or method.
- Multi-Filter: You can provide multiple
--testsflags to run a specific selection of tests.gradle(commandLine=["test", "--tests", "ClassA", "--tests", "ClassB"])
Authoritative Task Path Syntax
Understanding how to target tests in a multi-project build is critical to avoid running more tests than necessary.
1. Task Selectors (Recursive)
Providing test without a leading colon executes the test task in every project (root and all subprojects) that has one.
- Example:
gradle(commandLine=["test", "--tests", "MyTest"])-> Searches for and runs 'MyTest' in all projects.
2. Absolute Task Paths (Targeted)
Providing a path with a leading colon targets a single specific project.
- Root Project Only:
gradle(commandLine=[":test", "--tests", "MyTest"]) - Subproject Only:
gradle(commandLine=[":app:test", "--tests", "MyTest"])
When to Use
- Targeted Test Execution: When you need to run specific tests or suites using precise filters (like
--tests) to minimize feedback loops. - Rapid Failure Isolation: When a build has failed and you need high-resolution diagnostics, including stdout/stderr and detailed stack traces.
- Large-Scale Suite Management: When running extensive test suites that benefit from managed background execution and real-time progress monitoring.
Workflows
Running Specific Tests
- Identify the project path (e.g.,
:app) and the test filter (e.g.,com.example.MyTestClass*). - Call
gradlewithcommandLineincluding--tests. - If the tool reports failures, review the included console output.
Investigating Failures
- Identify the
BuildIdfrom the result. - Use
query_build(buildId=ID, kind="TESTS", outcome="FAILED")to list all failed tests. - CRITICAL: Use
query_build(buildId=ID, kind="TESTS", query=TNAME)to see the full output and stack trace for a specific test.
- DO NOT use
taskPathorcaptureTaskOutputfor this. - Per-test output is authoritative, isolated, and contains full stack traces that are often omitted from the task console.
Examples
Run a single test class in a specific subproject
{
"commandLine": [":module-a:test", "--tests", "com.example.service.MyServiceTest"]
}
// Reasoning: Using an absolute task path and exact class filter for the fastest possible feedback loop.
List all failed tests in a build
{
"buildId": "build_20240301_130000_def456",
"outcome": "FAILED"
}
// Reasoning: Using query_build to isolate only the failures from a large test suite.
### Look up details for a specific failed test
```json
{
"buildId": "build_20240301_130000_def456",
"kind": "TESTS",
"query": "com.example.a.MyTest.shouldFail"
}
// Reasoning: Retrieving the full stack trace and isolated stdout/stderr for a specific failure.
## Troubleshooting
- **Missing environment variables**: Set `invocationArguments: { envSource: "SHELL" }` if Gradle cannot find expected env vars (e.g., `JAVA_HOME`).
## Resources
- [Test Diagnostics](./references/test_diagnostics.md)