Doc: document firewall-audit-logs artifact name for downstream consumers (#25684)

Copilot · lpcox · web-flow · commit cc5664276950 · 2026-04-10T11:57:16.000-07:00
* Initial plan

* doc: document firewall-audit-logs artifact name for downstream consumers

Add firewall-audit-logs artifact documentation to help downstream repositories that write custom token analysis workflows using gh run download directly.

Changes:
- Create docs/src/content/docs/reference/artifacts.md with complete artifact naming reference including directory structures
- Update scratchpad/artifact-naming-compatibility.md to add multi-file artifacts section with firewall-audit-logs details
- Update docs/src/content/docs/reference/compilation-process.md to include firewall-audit-logs in the Artifacts Created table

Closes #1494

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/28093af6-8cb7-4db7-a198-a7f06c6f20e3

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* doc: address review feedback on artifact documentation

- Fix directory structure to match CLI expectations: token usage at api-proxy-logs/ (not logs/api-proxy-logs/), audit.jsonl and policy-manifest.json at root (not audit/ subdirectory)
- Move detection to single-file artifacts table (it only contains detection.log)
- Clarify multi-file artifacts may be post-processed by CLI
- Mark safe-output and agent-output as legacy/back-compat (now merged into unified agent artifact)
- Fix grammar: "uploads" → "upload"
- Fix firewall-logs path to /tmp/gh-aw/sandbox/firewall/logs/

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/fbe8b2c5-5e0c-4a3d-8e42-a6bd64a1cbba

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
diff --git a/docs/src/content/docs/reference/artifacts.md b/docs/src/content/docs/reference/artifacts.md
@@ -0,0 +1,173 @@
+---
+title: Artifacts
+description: Complete reference for artifact names, directory structures, and download patterns used by GitHub Agentic Workflows.
+sidebar:
+  order: 298
+---
+
+GitHub Agentic Workflows upload several artifacts during workflow execution. This reference documents every artifact name, its contents, and how to access the data — especially for downstream workflows that use `gh run download` directly instead of `gh aw logs`.
+
+## Quick Reference
+
+| Artifact Name | Constant | Type | Description |
+|---------------|----------|------|-------------|
+| `agent` | `constants.AgentArtifactName` | Multi-file | Unified agent job outputs (logs, safe outputs, token usage summary) |
+| `activation` | `constants.ActivationArtifactName` | Multi-file | Activation job output (`aw_info.json`, `prompt.txt`, rate limits) |
+| `firewall-audit-logs` | `constants.FirewallAuditArtifactName` | Multi-file | AWF firewall audit/observability logs (token usage, network policy, audit trail) |
+| `detection` | `constants.DetectionArtifactName` | Single-file | Threat detection log (`detection.log`) |
+| `safe-output` | `constants.SafeOutputArtifactName` | Legacy/back-compat | Historical standalone safe output artifact (`safe_output.jsonl`); in current compiled workflows this content is included in the unified `agent` artifact instead |
+| `agent-output` | `constants.AgentOutputArtifactName` | Legacy/back-compat | Historical standalone agent output artifact (`agent_output.json`); in current compiled workflows this content is included in the unified `agent` artifact instead |
+| `aw-info` | — | Single-file | Engine configuration (`aw_info.json`) |
+| `prompt` | — | Single-file | Generated prompt (`prompt.txt`) |
+| `safe-outputs-items` | `constants.SafeOutputItemsArtifactName` | Single-file | Safe output items manifest |
+| `code-scanning-sarif` | `constants.SarifArtifactName` | Single-file | SARIF file for code scanning results |
+
+## Artifact Sets
+
+The `gh aw logs` and `gh aw audit` commands support `--artifacts` to download only specific artifact groups:
+
+| Set Name | Artifacts Downloaded | Use Case |
+|----------|---------------------|----------|
+| `all` | Everything | Full analysis (default) |
+| `agent` | `agent` | Agent logs and outputs |
+| `activation` | `activation` | Activation data (`aw_info.json`, `prompt.txt`) |
+| `firewall` | `firewall-audit-logs` | Network policy and firewall audit data |
+| `mcp` | `firewall-audit-logs` | MCP gateway traffic logs |
+| `detection` | `detection` | Threat detection output |
+| `github-api` | `activation`, `agent` | GitHub API rate limit logs |
+
+```bash
+# Download only firewall artifacts
+gh aw logs <run-id> --artifacts firewall
+
+# Download agent and firewall artifacts
+gh aw logs <run-id> --artifacts agent --artifacts firewall
+
+# Download everything (default)
+gh aw logs <run-id>
+```
+
+## `firewall-audit-logs`
+
+The `firewall-audit-logs` artifact is uploaded by **all firewall-enabled workflows**. It contains AWF (Agent Workflow Firewall) structured audit and observability logs.
+
+> **⚠️ Important:** This artifact is **separate** from the `agent` artifact. Token usage data (`token-usage.jsonl`) lives here, not in the `agent` artifact.
+
+### Directory Structure
+
+```
+firewall-audit-logs/
+├── api-proxy-logs/
+│   └── token-usage.jsonl        ← Token usage data (input/output/cache tokens per API request)
+├── squid-logs/
+│   └── access.log               ← Network policy log (domain allow/deny decisions)
+├── audit.jsonl                  ← Firewall audit trail (policy matches, rule evaluations)
+└── policy-manifest.json         ← Policy configuration snapshot
+```
+
+### Accessing Token Usage Data
+
+**Recommended: Use `gh aw logs`**
+
+```bash
+# Download and analyze firewall data
+gh aw logs <run-id> --artifacts firewall
+
+# Output as JSON for scripting
+gh aw logs <run-id> --artifacts firewall --json
+```
+
+**Direct download with `gh run download`:**
+
+```bash
+# Download the firewall-audit-logs artifact
+gh run download <run-id> -n firewall-audit-logs
+
+# Token usage data is at:
+cat firewall-audit-logs/api-proxy-logs/token-usage.jsonl
+
+# Network access log is at:
+cat firewall-audit-logs/squid-logs/access.log
+
+# Audit trail is at:
+cat firewall-audit-logs/audit.jsonl
+
+# Policy manifest is at:
+cat firewall-audit-logs/policy-manifest.json
+```
+
+### Common Mistake
+
+Downstream workflows sometimes download `agent-artifacts` or `agent` expecting to find `token-usage.jsonl`. The file is not there, so a workflow that tolerates missing files ends up with no token data and no obvious error. The token usage file is only in the `firewall-audit-logs` artifact.
+
+```bash
+# ❌ WRONG — token-usage.jsonl is NOT in the agent artifact
+gh run download <run-id> -n agent
+cat agent/token-usage.jsonl  # File not found!
+
+# ✅ CORRECT — download from firewall-audit-logs
+gh run download <run-id> -n firewall-audit-logs
+cat firewall-audit-logs/api-proxy-logs/token-usage.jsonl
+```
+
+## `agent`
+
+The unified `agent` artifact contains all agent job outputs.
+
+### Contents
+
+- Agent execution logs
+- Safe output data (`agent_output.json`)
+- GitHub API rate limit logs (`github_rate_limits.jsonl`)
+- Token usage summary (`agent_usage.json`) — aggregated totals only; per-request data is in `firewall-audit-logs`
+
+## `activation`
+
+The `activation` artifact contains activation job outputs.
+
+### Contents
+
+- `aw_info.json` — Engine configuration and workflow metadata
+- `prompt.txt` — The generated prompt sent to the AI agent
+- `github_rate_limits.jsonl` — Rate limit data from the activation job
+
+## `detection`
+
+The `detection` artifact contains threat detection output.
+
+### Contents
+
+- `detection.log` — Threat detection analysis results
+
+Legacy artifact name: `threat-detection.log` (still supported for backward compatibility).
+
+## Naming Compatibility
+
+Artifact names changed between upload-artifact v4 and v5. The `gh aw logs` and `gh aw audit` commands handle both naming schemes transparently:
+
+| Old Name (pre-v5) | New Name (v5+) | File Inside |
+|--------------------|----------------|-------------|
+| `aw_info.json` | `aw-info` | `aw_info.json` |
+| `safe_output.jsonl` | `safe-output` | `safe_output.jsonl` |
+| `agent_output.json` | `agent-output` | `agent_output.json` |
+| `prompt.txt` | `prompt` | `prompt.txt` |
+| `threat-detection.log` | `detection` | `detection.log` |
+
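+Single-file artifacts are automatically flattened to root level regardless of their artifact directory name. Multi-file artifacts (`firewall-audit-logs`, `agent`, `activation`) are downloaded as directory trees: `firewall-audit-logs` retains its directory structure, while `gh aw logs` and `gh aw audit` may post-process `agent` and `activation` to move expected files into the layout the CLI uses.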
+
+## Workflow Call Prefixes
+
+When workflows are invoked via `workflow_call`, GitHub Actions prepends a short hash to artifact names (e.g., `abc123-firewall-audit-logs`). The CLI handles this automatically by matching artifact names that end with `-{base-name}`.
+
+```bash
+# Both of these are recognized as the firewall artifact:
+# - firewall-audit-logs           (direct invocation)
+# - abc123-firewall-audit-logs    (workflow_call invocation)
+```
+
+## Related Documentation
+
+- [Audit Commands](/gh-aw/reference/audit/) — Download and analyze workflow run artifacts
+- [Cost Management](/gh-aw/reference/cost-management/) — Track token usage and inference spend
+- [Network](/gh-aw/reference/network/) — Firewall and domain allow/deny configuration
+- [Compilation Process](/gh-aw/reference/compilation-process/) — How workflows are compiled including artifact upload steps
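
The suffix rule in the "Workflow Call Prefixes" section of `artifacts.md` above can be sketched in shell for downstream scripts that call `gh run download` themselves. This is a minimal illustration, not the CLI's own code: `matches_artifact` and `filter_artifacts` are hypothetical helper names, and the only rule assumed is the one the section states (a name matches when it equals the base name or ends with `-{base-name}`).

```shell
# Hypothetical helper (not part of gh-aw): succeeds when the artifact name is
# the base name itself or ends with "-<base>", per the documented suffix rule.
matches_artifact() {
  # $1 = artifact name as uploaded, $2 = base name to look for
  case "$1" in
    "$2" | *"-$2") return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: print every name from the argument list that resolves
# to the given base name, one per line.
filter_artifacts() {
  base="$1"
  shift
  for name in "$@"; do
    if matches_artifact "$name" "$base"; then
      printf '%s\n' "$name"
    fi
  done
}
```

For example, `filter_artifacts firewall-audit-logs agent abc123-firewall-audit-logs` prints only the prefixed firewall artifact name. When downloading, the `--pattern` (`-p`) option of `gh run download` accepts a glob, so a pattern such as `'*firewall-audit-logs'` should cover both the direct and the `workflow_call` form.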
diff --git a/docs/src/content/docs/reference/compilation-process.md b/docs/src/content/docs/reference/compilation-process.md
@@ -252,10 +252,27 @@ Workflows generate several artifacts during execution:
 | **agent_output.json** | `/tmp/gh-aw/safeoutputs/` | AI agent output with structured safe output data (create_issue, add_comment, etc.) | Uploaded by agent job, downloaded by safe output jobs, auto-deleted after 90 days |
 | **agent_usage.json** | `/tmp/gh-aw/` | Aggregated token counts: `{"input_tokens":…,"output_tokens":…,"cache_read_tokens":…,"cache_write_tokens":…}` | Bundled in the unified agent artifact when the firewall is enabled; accessible to third-party tools without parsing step summaries |
 | **prompt.txt** | `/tmp/gh-aw/aw-prompts/` | Generated prompt sent to AI agent (includes markdown instructions, imports, context variables) | Retained for debugging and reproduction |
-| **firewall-logs/** | `/tmp/gh-aw/firewall-logs/` | Network access logs in Squid format (when `network.firewall:` enabled) | Analyzed by `gh aw logs` command |
+| **firewall-audit-logs** | See structure below | Dedicated artifact for AWF audit/observability logs (token usage, network policy, audit trail) | Uploaded by all firewall-enabled workflows; analyzed by `gh aw logs --artifacts firewall` |
+| **firewall-logs/** | `/tmp/gh-aw/sandbox/firewall/logs/` | Network access logs in Squid format (when `network.firewall:` enabled) | Analyzed by `gh aw logs` command |
 | **cache-memory/** | `/tmp/gh-aw/cache-memory/` | Persistent agent memory across runs (when `tools.cache-memory:` configured) | Restored at start, saved at end via GitHub Actions cache |
 | **patches/**, **sarif/**, **metadata/** | Various | Safe output data (git patches, SARIF files, metadata JSON) | Temporary, cleaned after processing |
 
+### `firewall-audit-logs` Artifact Structure
+
+The `firewall-audit-logs` artifact is a dedicated multi-file artifact uploaded by all firewall-enabled workflows. It is **separate** from the unified `agent` artifact. Downstream workflows that need token usage data or firewall audit logs must download this artifact specifically.
+
+```
+firewall-audit-logs/
+├── api-proxy-logs/
+│   └── token-usage.jsonl        ← Token usage data per request
+├── squid-logs/
+│   └── access.log               ← Network policy log (allow/deny)
+├── audit.jsonl                  ← Firewall audit trail
+└── policy-manifest.json         ← Policy configuration snapshot
+```
+
+> **Tip:** Use `gh aw logs <run-id> --artifacts firewall` to download and analyze firewall data instead of `gh run download` directly. The CLI handles artifact naming and backward compatibility automatically. See the [Artifacts reference](/gh-aw/reference/artifacts/) for the complete artifact naming guide.
+
 ## MCP Server Integration
 
 Model Context Protocol (MCP) servers provide tools to AI agents. Compilation generates `mcp-config.json` from workflow configuration.
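
A downstream workflow that downloads `firewall-audit-logs` directly may want to confirm the layout before parsing anything. The sketch below checks for the four paths listed in the directory structure added by this diff. `check_firewall_audit_layout` is a hypothetical name, and whether every run produces all four files is an assumption the source does not confirm.

```shell
# Hypothetical helper (not part of gh-aw): report which of the documented
# firewall-audit-logs files are absent under the given directory.
# Returns 0 when all four are present, 1 otherwise.
check_firewall_audit_layout() {
  root="$1"
  missing=0
  # Relative paths copied from the documented directory structure.
  for rel in \
    api-proxy-logs/token-usage.jsonl \
    squid-logs/access.log \
    audit.jsonl \
    policy-manifest.json
  do
    if [ ! -f "$root/$rel" ]; then
      printf 'missing: %s\n' "$rel"
      missing=1
    fi
  done
  return "$missing"
}
```

Typical use would be `check_firewall_audit_layout firewall-audit-logs || exit 1` right after the download step, so a wrong artifact name fails loudly instead of yielding an empty analysis.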
diff --git a/scratchpad/artifact-naming-compatibility.md b/scratchpad/artifact-naming-compatibility.md
@@ -31,12 +31,71 @@ The `gh aw logs` and `gh aw audit` commands maintain full backward and forward c
 
 ## Compatibility Matrix
 
+### Single-File Artifacts
+
+These artifacts contain exactly one file and are flattened to the root directory by `flattenSingleFileArtifacts()`:
+
 | Artifact Name (Old) | Artifact Name (New) | File in Artifact | After Flattening | CLI Expects |
 |---------------------|---------------------|------------------|------------------|-------------|
 | `aw_info.json` | `aw-info` | `aw_info.json` | `aw_info.json` | ✅ |
 | `safe_output.jsonl` | `safe-output` | `safe_output.jsonl` | `safe_output.jsonl` | ✅ |
 | `agent_output.json` | `agent-output` | `agent_output.json` | `agent_output.json` | ✅ |
 | `prompt.txt` | `prompt` | `prompt.txt` | `prompt.txt` | ✅ |
+| `threat-detection.log` | `detection` | `detection.log` | `detection.log` | ✅ |
+
+### Multi-File Artifacts
+
+These artifacts are downloaded by `gh run download` as directory trees that retain their internal structure. For some of them (notably `agent` and `activation`), `gh aw logs` / `gh aw audit` then perform additional post-processing to move expected files into the final layout used by the CLI.
+
+| Artifact Name | Constant | Contents | Notes |
+|---------------|----------|----------|-------|
+| `firewall-audit-logs` | `constants.FirewallAuditArtifactName` | AWF structured audit/observability logs | Uploaded by all firewall-enabled workflows; retains directory structure after download |
+| `agent` | `constants.AgentArtifactName` | Unified agent job outputs (logs, safe outputs, token usage summary) | Downloaded as a directory tree, then post-processed by CLI flattening/reorganization helpers |
+| `activation` | `constants.ActivationArtifactName` | Activation job output (`aw_info.json`, `prompt.txt`) | Downloaded as a directory tree, then post-processed by CLI flattening helpers for downstream use |
+
+#### `firewall-audit-logs` Directory Structure
+
+The `firewall-audit-logs` artifact (constant: `constants.FirewallAuditArtifactName`) is uploaded by all firewall-enabled agentic workflows. It is **separate** from the `agent` artifact and must be downloaded independently.
+
+```
+firewall-audit-logs/
+├── api-proxy-logs/
+│   └── token-usage.jsonl        ← Token usage data (input/output/cache tokens per request)
+├── squid-logs/
+│   └── access.log               ← Network policy log (domain allow/deny decisions)
+├── audit.jsonl                  ← Firewall audit trail (policy matches, rule evaluations)
+└── policy-manifest.json         ← Policy configuration snapshot
+```
+
+**Downloading firewall audit logs with `gh run download`:**
+
+```bash
+# Download only the firewall-audit-logs artifact
+gh run download <run-id> -n firewall-audit-logs
+
+# The data is then at:
+#   firewall-audit-logs/api-proxy-logs/token-usage.jsonl
+#   firewall-audit-logs/squid-logs/access.log
+#   firewall-audit-logs/audit.jsonl
+#   firewall-audit-logs/policy-manifest.json
+```
+
+**Recommended: Use `gh aw logs` instead of `gh run download`:**
+
+The `gh aw logs` command knows the correct artifact names and handles backward compatibility automatically:
+
+```bash
+# Download and analyze all logs (including firewall data)
+gh aw logs <run-id>
+
+# Download only firewall artifacts
+gh aw logs <run-id> --artifacts firewall
+
+# Output as JSON for programmatic use
+gh aw logs <run-id> --artifacts firewall --json
+```
+
+> **⚠️ Common mistake:** Downloading `agent-artifacts` or `agent` and expecting to find `token-usage.jsonl` there. Token usage data lives in the `firewall-audit-logs` artifact, not in the agent artifact.
 
 ## Testing
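
The single-file flattening described in the compatibility matrix above can be approximated in shell for scripts that work from a raw `gh run download` tree. This is only a sketch of the idea, not the behavior of `flattenSingleFileArtifacts()` itself: the function name `flatten_single_file_artifacts` is hypothetical, and it assumes "single-file" means a directory holding exactly one regular file and nothing else.

```shell
# Hypothetical sketch (not the CLI's implementation): for each artifact
# directory directly under the download root, move its file up to the root
# when the directory contains exactly one regular file and no subdirectories.
flatten_single_file_artifacts() {
  root="$1"
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue
    dir="${dir%/}"
    # find lists the directory itself plus every entry below it, so a
    # single-file artifact yields exactly two lines in total.
    total=$(find "$dir" | wc -l)
    files=$(find "$dir" -type f | wc -l)
    if [ "$total" -eq 2 ] && [ "$files" -eq 1 ]; then
      file=$(find "$dir" -type f)
      target="$root/$(basename "$file")"
      # Do not overwrite a file that already exists at the root.
      if [ ! -e "$target" ]; then
        mv "$file" "$target"
        rmdir "$dir"
      fi
    fi
  done
  return 0
}
```

With this rule, `aw-info/aw_info.json` becomes `aw_info.json` at the root, while a multi-file tree such as `firewall-audit-logs/` is left untouched.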