Commit cc56642

Copilot and lpcox authored
Doc: document firewall-audit-logs artifact name for downstream consumers (#25684)
* Initial plan

* doc: document firewall-audit-logs artifact name for downstream consumers

  Add firewall-audit-logs artifact documentation to help downstream repositories that write custom token analysis workflows using gh run download directly.

  Changes:
  - Create docs/src/content/docs/reference/artifacts.md with complete artifact naming reference including directory structures
  - Update scratchpad/artifact-naming-compatibility.md to add multi-file artifacts section with firewall-audit-logs details
  - Update docs/src/content/docs/reference/compilation-process.md to include firewall-audit-logs in the Artifacts Created table

  Closes #1494

  Agent-Logs-Url: https://github.com/github/gh-aw/sessions/28093af6-8cb7-4db7-a198-a7f06c6f20e3

  Co-authored-by: lpcox <[email protected]>

* doc: address review feedback on artifact documentation

  - Fix directory structure to match CLI expectations: token usage at api-proxy-logs/ (not logs/api-proxy-logs/), audit.jsonl and policy-manifest.json at root (not audit/ subdirectory)
  - Move detection to single-file artifacts table (it only contains detection.log)
  - Clarify multi-file artifacts may be post-processed by CLI
  - Mark safe-output and agent-output as legacy/back-compat (now merged into unified agent artifact)
  - Fix grammar: "uploads" → "upload"
  - Fix firewall-logs path to /tmp/gh-aw/sandbox/firewall/logs/

  Agent-Logs-Url: https://github.com/github/gh-aw/sessions/fbe8b2c5-5e0c-4a3d-8e42-a6bd64a1cbba

  Co-authored-by: lpcox <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: lpcox <[email protected]>
1 parent 5b9e980 commit cc56642

3 files changed: +250 -1 lines changed

docs/src/content/docs/reference/artifacts.md

Lines changed: 173 additions & 0 deletions
@@ -0,0 +1,173 @@
---
title: Artifacts
description: Complete reference for artifact names, directory structures, and download patterns used by GitHub Agentic Workflows.
sidebar:
  order: 298
---

GitHub Agentic Workflows upload several artifacts during workflow execution. This reference documents every artifact name, its contents, and how to access the data, especially for downstream workflows that use `gh run download` directly instead of `gh aw logs`.

## Quick Reference

| Artifact Name | Constant | Type | Description |
|---------------|----------|------|-------------|
| `agent` | `constants.AgentArtifactName` | Multi-file | Unified agent job outputs (logs, safe outputs, token usage summary) |
| `activation` | `constants.ActivationArtifactName` | Multi-file | Activation job output (`aw_info.json`, `prompt.txt`, rate limits) |
| `firewall-audit-logs` | `constants.FirewallAuditArtifactName` | Multi-file | AWF firewall audit/observability logs (token usage, network policy, audit trail) |
| `detection` | `constants.DetectionArtifactName` | Single-file | Threat detection log (`detection.log`) |
| `safe-output` | `constants.SafeOutputArtifactName` | Legacy/back-compat | Historical standalone safe output artifact (`safe_output.jsonl`); in current compiled workflows this content is included in the unified `agent` artifact instead |
| `agent-output` | `constants.AgentOutputArtifactName` | Legacy/back-compat | Historical standalone agent output artifact (`agent_output.json`); in current compiled workflows this content is included in the unified `agent` artifact instead |
| `aw-info` | | Single-file | Engine configuration (`aw_info.json`) |
| `prompt` | | Single-file | Generated prompt (`prompt.txt`) |
| `safe-outputs-items` | `constants.SafeOutputItemsArtifactName` | Single-file | Safe output items manifest |
| `code-scanning-sarif` | `constants.SarifArtifactName` | Single-file | SARIF file for code scanning results |

## Artifact Sets

The `gh aw logs` and `gh aw audit` commands support `--artifacts` to download only specific artifact groups:

| Set Name | Artifacts Downloaded | Use Case |
|----------|---------------------|----------|
| `all` | Everything | Full analysis (default) |
| `agent` | `agent` | Agent logs and outputs |
| `activation` | `activation` | Activation data (`aw_info.json`, `prompt.txt`) |
| `firewall` | `firewall-audit-logs` | Network policy and firewall audit data |
| `mcp` | `firewall-audit-logs` | MCP gateway traffic logs |
| `detection` | `detection` | Threat detection output |
| `github-api` | `activation`, `agent` | GitHub API rate limit logs |

```bash
# Download only firewall artifacts
gh aw logs <run-id> --artifacts firewall

# Download agent and firewall artifacts
gh aw logs <run-id> --artifacts agent --artifacts firewall

# Download everything (default)
gh aw logs <run-id>
```

## `firewall-audit-logs`

The `firewall-audit-logs` artifact is uploaded by **all firewall-enabled workflows**. It contains AWF (Agent Workflow Firewall) structured audit and observability logs.

> **⚠️ Important:** This artifact is **separate** from the `agent` artifact. Token usage data (`token-usage.jsonl`) lives here, not in the `agent` artifact.

### Directory Structure

```
firewall-audit-logs/
├── api-proxy-logs/
│   └── token-usage.jsonl    ← Token usage data (input/output/cache tokens per API request)
├── squid-logs/
│   └── access.log           ← Network policy log (domain allow/deny decisions)
├── audit.jsonl              ← Firewall audit trail (policy matches, rule evaluations)
└── policy-manifest.json     ← Policy configuration snapshot
```

### Accessing Token Usage Data

**Recommended: Use `gh aw logs`**

```bash
# Download and analyze firewall data
gh aw logs <run-id> --artifacts firewall

# Output as JSON for scripting
gh aw logs <run-id> --artifacts firewall --json
```

**Direct download with `gh run download`:**

```bash
# Download the artifact; with a single -n, gh extracts into the current directory unless --dir is given
gh run download <run-id> -n firewall-audit-logs --dir firewall-audit-logs

# Token usage data is at:
cat firewall-audit-logs/api-proxy-logs/token-usage.jsonl

# Network access log is at:
cat firewall-audit-logs/squid-logs/access.log

# Audit trail is at:
cat firewall-audit-logs/audit.jsonl

# Policy manifest is at:
cat firewall-audit-logs/policy-manifest.json
```

### Common Mistake

Downstream workflows sometimes download `agent-artifacts` or `agent` expecting to find `token-usage.jsonl`. This silently returns no data, because the token usage file exists only in the `firewall-audit-logs` artifact.

```bash
# ❌ WRONG: token-usage.jsonl is NOT in the agent artifact
gh run download <run-id> -n agent --dir agent
cat agent/token-usage.jsonl  # File not found!

# ✅ CORRECT: download from firewall-audit-logs
gh run download <run-id> -n firewall-audit-logs --dir firewall-audit-logs
cat firewall-audit-logs/api-proxy-logs/token-usage.jsonl
```
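
One way for a downstream script to fail loudly instead of silently is to check the extracted layout before reading any file. The sketch below is only illustrative: `check_firewall_artifact` is a hypothetical helper name (it is not part of `gh` or `gh aw`), and the file list is copied from the directory structure documented above.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of gh or gh-aw.
# Verifies that a directory holding the extracted firewall-audit-logs
# artifact contains the four files listed in the documented structure.
# Prints each missing path to stderr and returns 1 if any is absent.
check_firewall_artifact() {
  local root="${1:-firewall-audit-logs}"
  local missing=0
  local rel
  for rel in \
    "api-proxy-logs/token-usage.jsonl" \
    "squid-logs/access.log" \
    "audit.jsonl" \
    "policy-manifest.json"; do
    if [ ! -f "$root/$rel" ]; then
      echo "missing: $root/$rel" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A workflow step could call `check_firewall_artifact <dir> || exit 1` right after the download, passing whichever directory the artifact was extracted into.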

## `agent`

The unified `agent` artifact contains all agent job outputs.

### Contents

- Agent execution logs
- Safe output data (`agent_output.json`)
- GitHub API rate limit logs (`github_rate_limits.jsonl`)
- Token usage summary (`agent_usage.json`): aggregated totals only; per-request data is in `firewall-audit-logs`

## `activation`

The `activation` artifact contains activation job outputs.

### Contents

- `aw_info.json`: Engine configuration and workflow metadata
- `prompt.txt`: The generated prompt sent to the AI agent
- `github_rate_limits.jsonl`: Rate limit data from the activation job

## `detection`

The `detection` artifact contains threat detection output.

### Contents

- `detection.log`: Threat detection analysis results

Legacy name: `threat-detection.log` (still supported for backward compatibility).

## Naming Compatibility

Artifact names changed between upload-artifact v4 and v5. The `gh aw logs` and `gh aw audit` commands handle both naming schemes transparently:

| Old Name (pre-v5) | New Name (v5+) | File Inside |
|--------------------|----------------|-------------|
| `aw_info.json` | `aw-info` | `aw_info.json` |
| `safe_output.jsonl` | `safe-output` | `safe_output.jsonl` |
| `agent_output.json` | `agent-output` | `agent_output.json` |
| `prompt.txt` | `prompt` | `prompt.txt` |
| `threat-detection.log` | `detection` | `detection.log` |

Single-file artifacts are automatically flattened to root level regardless of their artifact directory name. Multi-file artifacts (`firewall-audit-logs`, `agent`, `activation`) retain their directory structure.
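
Scripts that bypass the CLI and call `gh run download` themselves need to handle both naming schemes on their own. As a rough sketch of the mapping in the table above (the function name `normalize_artifact_name` is made up for illustration, and this is not a description of how the CLI implements it):

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of gh or gh-aw.
# Maps a pre-v5 artifact name to its v5+ equivalent, following the
# Naming Compatibility table. Names not in the table pass through unchanged.
normalize_artifact_name() {
  case "$1" in
    aw_info.json)         echo "aw-info" ;;
    safe_output.jsonl)    echo "safe-output" ;;
    agent_output.json)    echo "agent-output" ;;
    prompt.txt)           echo "prompt" ;;
    threat-detection.log) echo "detection" ;;
    *)                    echo "$1" ;;
  esac
}
```

Normalizing names up front lets the rest of a script compare against the v5+ names only.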

## Workflow Call Prefixes

When workflows are invoked via `workflow_call`, GitHub Actions prepends a short hash to artifact names (e.g., `abc123-firewall-audit-logs`). The CLI handles this automatically by matching artifact names that end with `-{base-name}`.

```bash
# Both of these are recognized as the firewall artifact:
# - firewall-audit-logs (direct invocation)
# - abc123-firewall-audit-logs (workflow_call invocation)
```
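
The suffix rule described above can be approximated in a downstream script with a shell `case` pattern. This is a minimal sketch under that stated rule only; `matches_artifact` is a hypothetical name, and the CLI's actual matching logic may differ in detail.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of gh or gh-aw.
# Succeeds if the artifact name equals the base name or ends with
# "-<base-name>", which covers the workflow_call hash prefix.
matches_artifact() {
  local name="$1" base="$2"
  case "$name" in
    "$base" | *-"$base") return 0 ;;
    *) return 1 ;;
  esac
}
```

For simple cases, the glob support in `gh run download --pattern` (for example `-p '*firewall-audit-logs'`) may be enough without a custom helper.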

## Related Documentation

- [Audit Commands](/gh-aw/reference/audit/): Download and analyze workflow run artifacts
- [Cost Management](/gh-aw/reference/cost-management/): Track token usage and inference spend
- [Network](/gh-aw/reference/network/): Firewall and domain allow/deny configuration
- [Compilation Process](/gh-aw/reference/compilation-process/): How workflows are compiled including artifact upload steps

docs/src/content/docs/reference/compilation-process.md

Lines changed: 18 additions & 1 deletion
@@ -252,10 +252,27 @@ Workflows generate several artifacts during execution:
 | **agent_output.json** | `/tmp/gh-aw/safeoutputs/` | AI agent output with structured safe output data (create_issue, add_comment, etc.) | Uploaded by agent job, downloaded by safe output jobs, auto-deleted after 90 days |
 | **agent_usage.json** | `/tmp/gh-aw/` | Aggregated token counts: `{"input_tokens":…,"output_tokens":…,"cache_read_tokens":…,"cache_write_tokens":…}` | Bundled in the unified agent artifact when the firewall is enabled; accessible to third-party tools without parsing step summaries |
 | **prompt.txt** | `/tmp/gh-aw/aw-prompts/` | Generated prompt sent to AI agent (includes markdown instructions, imports, context variables) | Retained for debugging and reproduction |
-| **firewall-logs/** | `/tmp/gh-aw/firewall-logs/` | Network access logs in Squid format (when `network.firewall:` enabled) | Analyzed by `gh aw logs` command |
+| **firewall-audit-logs** | See structure below | Dedicated artifact for AWF audit/observability logs (token usage, network policy, audit trail) | Uploaded by all firewall-enabled workflows; analyzed by `gh aw logs --artifacts firewall` |
+| **firewall-logs/** | `/tmp/gh-aw/sandbox/firewall/logs/` | Network access logs in Squid format (when `network.firewall:` enabled) | Analyzed by `gh aw logs` command |
 | **cache-memory/** | `/tmp/gh-aw/cache-memory/` | Persistent agent memory across runs (when `tools.cache-memory:` configured) | Restored at start, saved at end via GitHub Actions cache |
 | **patches/**, **sarif/**, **metadata/** | Various | Safe output data (git patches, SARIF files, metadata JSON) | Temporary, cleaned after processing |
 
+### `firewall-audit-logs` Artifact Structure
+
+The `firewall-audit-logs` artifact is a dedicated multi-file artifact uploaded by all firewall-enabled workflows. It is **separate** from the unified `agent` artifact. Downstream workflows that need token usage data or firewall audit logs must download this artifact specifically.
+
+```
+firewall-audit-logs/
+├── api-proxy-logs/
+│   └── token-usage.jsonl    ← Token usage data per request
+├── squid-logs/
+│   └── access.log           ← Network policy log (allow/deny)
+├── audit.jsonl              ← Firewall audit trail
+└── policy-manifest.json     ← Policy configuration snapshot
+```
+
+> **Tip:** Use `gh aw logs <run-id> --artifacts firewall` to download and analyze firewall data instead of `gh run download` directly. The CLI handles artifact naming and backward compatibility automatically. See the [Artifacts reference](/gh-aw/reference/artifacts/) for the complete artifact naming guide.
+
 ## MCP Server Integration
 
 Model Context Protocol (MCP) servers provide tools to AI agents. Compilation generates `mcp-config.json` from workflow configuration.

scratchpad/artifact-naming-compatibility.md

Lines changed: 59 additions & 0 deletions
@@ -31,12 +31,71 @@ The `gh aw logs` and `gh aw audit` commands maintain full backward and forward c
 
 ## Compatibility Matrix
 
+### Single-File Artifacts
+
+These artifacts contain exactly one file and are flattened to the root directory by `flattenSingleFileArtifacts()`:
+
 | Artifact Name (Old) | Artifact Name (New) | File in Artifact | After Flattening | CLI Expects |
 |---------------------|---------------------|------------------|------------------|-------------|
 | `aw_info.json` | `aw-info` | `aw_info.json` | `aw_info.json` ||
 | `safe_output.jsonl` | `safe-output` | `safe_output.jsonl` | `safe_output.jsonl` ||
 | `agent_output.json` | `agent-output` | `agent_output.json` | `agent_output.json` ||
 | `prompt.txt` | `prompt` | `prompt.txt` | `prompt.txt` ||
+| `threat-detection.log` | `detection` | `detection.log` | `detection.log` ||
+
+### Multi-File Artifacts
+
+These artifacts are initially downloaded by `gh run download` as directory trees that retain their internal structure. However, unlike the single-file artifact handling above, `gh aw logs` / `gh aw audit` may perform additional post-processing for some multi-file artifacts (notably `agent` and `activation`) to move expected files into the final layout used by the CLI.
+
+| Artifact Name | Constant | Contents | Notes |
+|---------------|----------|----------|-------|
+| `firewall-audit-logs` | `constants.FirewallAuditArtifactName` | AWF structured audit/observability logs | Uploaded by all firewall-enabled workflows; retains directory structure after download |
+| `agent` | `constants.AgentArtifactName` | Unified agent job outputs (logs, safe outputs, token usage) | Downloaded as a directory tree, then post-processed by CLI flattening/reorganization helpers |
+| `activation` | `constants.ActivationArtifactName` | Activation job output (`aw_info.json`, `prompt.txt`) | Downloaded as a directory tree, then post-processed by CLI flattening helpers for downstream use |
+
+#### `firewall-audit-logs` Directory Structure
+
+The `firewall-audit-logs` artifact (constant: `constants.FirewallAuditArtifactName`) is uploaded by all firewall-enabled agentic workflows. It is **separate** from the `agent` artifact and must be downloaded independently.
+
+```
+firewall-audit-logs/
+├── api-proxy-logs/
+│   └── token-usage.jsonl    ← Token usage data (input/output/cache tokens per request)
+├── squid-logs/
+│   └── access.log           ← Network policy log (domain allow/deny decisions)
+├── audit.jsonl              ← Firewall audit trail (policy matches, rule evaluations)
+└── policy-manifest.json     ← Policy configuration snapshot
+```
+
+**Downloading firewall audit logs with `gh run download`:**
+
+```bash
+# Download only the firewall-audit-logs artifact (--dir keeps the paths below valid)
+gh run download <run-id> -n firewall-audit-logs --dir firewall-audit-logs
+
+# The data is then at:
+# firewall-audit-logs/api-proxy-logs/token-usage.jsonl
+# firewall-audit-logs/squid-logs/access.log
+# firewall-audit-logs/audit.jsonl
+# firewall-audit-logs/policy-manifest.json
+```
+
+**Recommended: Use `gh aw logs` instead of `gh run download`:**
+
+The `gh aw logs` command knows the correct artifact names and handles backward compatibility automatically:
+
+```bash
+# Download and analyze all logs (including firewall data)
+gh aw logs <run-id>
+
+# Download only firewall artifacts
+gh aw logs <run-id> --artifacts firewall
+
+# Output as JSON for programmatic use
+gh aw logs <run-id> --artifacts firewall --json
+```
+
+> **⚠️ Common mistake:** Downloading `agent-artifacts` or `agent` and expecting to find `token-usage.jsonl` there. Token usage data lives in the `firewall-audit-logs` artifact, not in the agent artifact.
 
 ## Testing
 