Evidence-Verified Trace-Level Reuse and Selective Regeneration for LLM-Based Cloud Incident Remediation

Alejandro Serrano; Yi Guo


Published: 2025-11-20

Alejandro Serrano

University of São Paulo

Yi Guo

University of São Paulo

Abstract

Large language model (LLM) agents are increasingly used to interpret operational evidence, diagnose failures, and recommend remediation actions in cloud-native systems. Incident requests, however, often share a stable reasoning structure while differing in localized evidence, service names, or root-cause conditions. Whole-response semantic caching is brittle in this setting, while full regeneration repeats the same evidence organization, localization, and action-selection work. This paper presents TracePatch, a backend-agnostic reuse layer for LLM-based cloud incident remediation. TracePatch stores prior agent outputs as ordered trace blocks, retrieves a similar incident request, verifies each block against the new log evidence, reuses the blocks that pass verification, and selectively regenerates only the failing suffix or the structured action block. The design combines evidence-aware trace verification, conservative skip-reuse for semantic drift, and final structured-output validation of the root-cause and remediation fields. We evaluate TracePatch on a reproducible controlled benchmark built from the public LogHub HDFS dataset. Across 720 replayed evaluation requests over three random seeds, TracePatch reduces the mean latency proxy from 1.684 s to 0.939 s, reduces token usage from 118.5k to 93.3k tokens, and raises the final-check pass rate from 88.8% to 94.9%. The reuse-only path handles 54.2% of requests, 21.7% require selective patching, and 24.2% trigger skip-reuse under stronger evidence or root-cause changes. These results indicate that trace-level reuse can reduce LLM serving cost for operational agents while preserving evidence-grounded correctness under localized perturbations.
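The verify, reuse, and regenerate loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `TraceBlock`, `block_is_supported`, `patch_trace`, and the `regenerate` callback are hypothetical, the substring-based verifier is a stand-in for the paper's evidence-aware verification, and retrieval of the similar incident is omitted. Mapping "no block verifies" to skip-reuse is also an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class TraceBlock:
    kind: str                        # hypothetical label, e.g. "evidence" or "action"
    text: str                        # block content from an earlier agent run
    required_terms: Tuple[str, ...]  # evidence terms the block depends on (assumed)


def block_is_supported(block: TraceBlock, log_evidence: str) -> bool:
    """Toy verifier: the block passes if every term it relies on is in the new logs."""
    return all(term in log_evidence for term in block.required_terms)


def patch_trace(
    cached: List[TraceBlock],
    log_evidence: str,
    regenerate: Callable[[List[TraceBlock], str], List[TraceBlock]],
) -> Tuple[List[TraceBlock], str]:
    """Reuse the longest verified prefix of a cached trace; regenerate the rest.

    `regenerate(prefix, log_evidence)` stands in for the LLM backend and returns
    the blocks that should follow `prefix`.
    """
    reused: List[TraceBlock] = []
    for block in cached:
        if not block_is_supported(block, log_evidence):
            break  # the first failing block starts the suffix to regenerate
        reused.append(block)

    if not reused:
        # Assumed skip-reuse rule: nothing verified, so generate from scratch.
        return regenerate([], log_evidence), "skip-reuse"
    if len(reused) == len(cached):
        return reused, "reuse-only"
    return reused + regenerate(reused, log_evidence), "selective-patch"
```

The second element of the returned tuple names the path taken, mirroring the three request categories reported in the abstract. A real system would replace the substring check with a stronger verifier and would validate the final structured output before returning it.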

Issue

Vol. 1 No. 2 (2025)

Section

Articles
