---
name: fraud-investigation-data-consolidator
description: Investigates a suspicious transaction by gathering all related data from a BigQuery analytics dataset, consolidating it into a structured JSON format, and triggering alert workflows.
---
## Instructions

Execute the following steps to investigate a suspicious transaction. The primary goal is to create a comprehensive data snapshot for a given `transaction_id` and trigger the required alerting and archiving workflows.
### 1. Initial Setup & Discovery
- Input: You will be given a specific `transaction_id` (e.g., `T8492XJ3`).
- First, confirm that the target dataset and storage buckets exist.
- Use `google-cloud-bigquery_get_dataset_info` to verify that the `transactions_analytics` dataset is accessible.
- Use `google-cloud-storage_list_buckets` to locate the archive bucket (name prefixed by `mcp-fraud-investigation-archive-`) and the log bucket (name prefixed by `Trading_Logging-`). Note their exact names (see the sketch below).
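These steps are expressed as MCP tool calls; purely as a non-authoritative illustration, the same discovery logic could be written against the `google-cloud-bigquery` and `google-cloud-storage` Python clients directly (default credentials and project are assumed, and the bucket prefixes are the ones named above):

```python
from google.cloud import bigquery, storage

# Verify the analytics dataset is reachable (raises NotFound if it is not).
bq = bigquery.Client()
bq.get_dataset("transactions_analytics")

# Locate the dynamically named buckets by their documented prefixes.
gcs = storage.Client()
archive_bucket = log_bucket = None
for bucket in gcs.list_buckets():
    if bucket.name.startswith("mcp-fraud-investigation-archive-"):
        archive_bucket = bucket.name
    elif bucket.name.startswith("Trading_Logging-"):
        log_bucket = bucket.name

if not (archive_bucket and log_bucket):
    raise RuntimeError("Archive or logging bucket not found; aborting investigation.")
```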
### 2. Schema Exploration & Data Querying
- Discover all tables in the `transactions_analytics` dataset using `google-cloud-bigquery_run_query` on the `INFORMATION_SCHEMA.TABLES` view.
- For the target transaction, query data from two primary sources:
  - `live_transactions` table: Get the full record for the given `transaction_id`. This record contains key foreign IDs (`user_id`, `account_id`, `merchant_id`, `card_id`, `device_id`, `location_id`).
  - `fraud_alerts` table: Check for any existing alerts for this `transaction_id`.
- Using the IDs from the `live_transactions` record, query all related dimension tables:
  - `users` (by `user_id`)
  - `accounts` (by `account_id`)
  - `merchants` (by `merchant_id`)
  - `cards` (by `card_id`)
  - `devices` (by `device_id`)
  - `locations` (by `location_id`)
  - `risk_scores` (by `user_id`)
  - `blacklist`: Check whether any of the IDs `user_id`, `account_id`, `card_id`, `device_id`, or `merchant_id` appears in the `value` column. Note: the `blacklist` table schema uses `entity_id`, `entity_type`, `value`.
- Find related transactions: Query the `live_transactions` table for all other transactions by the same `user_id`, excluding the target `transaction_id`. Order by `timestamp DESC` and limit results appropriately (e.g., 1000). Representative queries are sketched after this list.
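A minimal sketch of the core queries, assuming the BigQuery Python client and parameterized SQL; the `query_rows` helper and the hard-coded example ID are illustrative, while the table and column names come from this document:

```python
from google.cloud import bigquery

bq = bigquery.Client()
DATASET = "transactions_analytics"

def query_rows(sql: str, **params) -> list[dict]:
    """Run a parameterized query and return the rows as plain dicts."""
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter(name, "STRING", value)
            for name, value in params.items()
        ]
    )
    return [dict(row) for row in bq.query(sql, job_config=job_config).result()]

# Schema discovery via INFORMATION_SCHEMA.
tables = query_rows(f"SELECT table_name FROM `{DATASET}.INFORMATION_SCHEMA.TABLES`")

# Primary record (assumed to exist) plus any pre-existing alert.
txn = query_rows(
    f"SELECT * FROM `{DATASET}.live_transactions` WHERE transaction_id = @tid",
    tid="T8492XJ3",
)[0]
alerts = query_rows(
    f"SELECT * FROM `{DATASET}.fraud_alerts` WHERE transaction_id = @tid",
    tid="T8492XJ3",
)

# Blacklist check: does any foreign ID appear in the `value` column?
ids = [txn[k] for k in ("user_id", "account_id", "card_id", "device_id", "merchant_id")]
blacklist = [
    row
    for row in query_rows(f"SELECT entity_id, entity_type, value FROM `{DATASET}.blacklist`")
    if row["value"] in ids
]

# Other transactions by the same user, newest first, capped at 1000.
related = query_rows(
    f"""SELECT * FROM `{DATASET}.live_transactions`
        WHERE user_id = @uid AND transaction_id != @tid
        ORDER BY timestamp DESC LIMIT 1000""",
    uid=txn["user_id"],
    tid="T8492XJ3",
)
```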
### 3. Data Consolidation & JSON Creation
- Structure the final JSON object with the following keys, converting Python objects (such as `datetime`, `list`, `dict`) to JSON-serializable values (ISO format for dates, proper JSON arrays/objects):
  - `live_transactions`: The main transaction record.
  - `fraud_alerts`: The associated alert record (if any).
  - `users`, `accounts`, `merchants`, `cards`, `devices`, `locations`, `risk_scores`, `blacklist`: The related dimension data.
  - `related_transactions`: An array containing the other transactions for the user.
- Save the JSON file locally to the workspace using `filesystem-write_file`. Name the file `<transaction_id>.json` (e.g., `T8492XJ3.json`). A serialization sketch follows.
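One way to handle the serialization, as a sketch: the `json_default` helper and the placeholder `results` dict below are hypothetical stand-ins for the step 2 query output, not part of the toolset:

```python
import datetime
import decimal
import json

def json_default(value):
    """Fallback encoder for types json.dumps cannot handle natively."""
    if isinstance(value, (datetime.date, datetime.datetime)):
        return value.isoformat()  # ISO format for dates/timestamps
    if isinstance(value, decimal.Decimal):
        return float(value)       # BigQuery NUMERIC columns arrive as Decimal
    return str(value)             # last-resort stringification

# `results` stands in for the step 2 query output: one dict (or list of
# dicts) per key, named exactly as the final JSON expects.
results = {
    "live_transactions": {"transaction_id": "T8492XJ3",
                          "timestamp": datetime.datetime(2024, 1, 5, 12, 30)},
    "fraud_alerts": {},            # empty object when no alert exists
    "related_transactions": [],
    # ... users, accounts, merchants, cards, devices, locations,
    #     risk_scores, blacklist filled in the same way.
}

with open("T8492XJ3.json", "w", encoding="utf-8") as f:
    json.dump(results, f, indent=2, default=json_default)
```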
### 4. Archiving & Alerting
- Upload the JSON file to the identified archive storage bucket using `google-cloud-storage_upload_file`. Use the `transaction_id` as the blob name.
- Write a CRITICAL log entry to the identified logging bucket using `google-cloud-logging_write_log`:
  - Log Name: Use the full name of the log bucket (e.g., `Trading_Logging-e877351c7447`).
  - Severity: `CRITICAL`
  - Message/Payload: A JSON string with the exact structure:
    `{"alert_type": "Fraud", "transaction_id": "<TRANSACTION_ID>", "status": "Pending_Investigation"}` (see the sketch below).
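As a sketch against the `google-cloud-storage` and `google-cloud-logging` Python clients (the archive bucket name is a placeholder for the exact name discovered in step 1; the log name is the example above):

```python
import json
from google.cloud import logging as cloud_logging
from google.cloud import storage

transaction_id = "T8492XJ3"
archive_bucket = "mcp-fraud-investigation-archive-placeholder"  # exact name from step 1
log_name = "Trading_Logging-e877351c7447"                       # exact name from step 1

# Archive first; the log entry is written only after a successful upload.
blob = storage.Client().bucket(archive_bucket).blob(transaction_id)
blob.upload_from_filename(f"{transaction_id}.json")

# Then emit the CRITICAL alert as a JSON string payload.
payload = json.dumps({"alert_type": "Fraud",
                      "transaction_id": transaction_id,
                      "status": "Pending_Investigation"})
cloud_logging.Client().logger(log_name).log_text(payload, severity="CRITICAL")
```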
## Key Considerations
- Error Handling: If a query for a specific table returns no data, include an empty object `{}` for that key in the final JSON.
- Data Types: Pay special attention to parsing complex fields (`flags`, `velocity_checks`, etc.) from the `live_transactions` table from stringified JSON into proper JSON objects/arrays in the output (see the sketch after this list).
- Bucket Names: The archive and log bucket names are dynamic (with unique suffixes). Always list buckets first to confirm their exact names.
- Logging: The log write must happen only after successful archiving. The log entry is new and independent of any existing logs in the bucket.
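A small sketch of that normalization, assuming `flags` and `velocity_checks` arrive as JSON-encoded strings; the example values are made up for illustration:

```python
import json

def parse_stringified_fields(record: dict,
                             fields=("flags", "velocity_checks")) -> dict:
    """Replace JSON-encoded string columns with real objects/arrays in place."""
    for field in fields:
        value = record.get(field)
        if isinstance(value, str):
            try:
                record[field] = json.loads(value)
            except json.JSONDecodeError:
                pass  # leave malformed values untouched rather than guessing
    return record

txn = parse_stringified_fields({"flags": '["velocity", "geo_mismatch"]',
                                "velocity_checks": '{"1h": 4, "24h": 12}'})
print(txn["flags"][0])  # -> velocity
```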