Skip to main content
Use a browser agent to confirm a public report is reachable and capture evidence for downstream processing.

Evidence goal

  • Public file verification without private credentials.
  • Download or file evidence that downstream code can process.
  • Artifact hygiene: store IDs and source URLs, not bulky file contents in the answer.

Setup

Use the CLI with a configured model profile. The example starts from a public govinfo.gov PDF URL.

Run it from the CLI

web agent \
  --url https://www.govinfo.gov/content/pkg/BUDGET-2025-BUD/pdf/BUDGET-2025-BUD.pdf \
  --allow-domain govinfo.gov \
  "Confirm whether the public PDF is reachable. Return the visible document title or metadata, final URL, and whether a downloadable artifact was captured. Do not navigate away from govinfo.gov."

Expected output

Public PDF check

Title or metadata: ...
Final URL: https://www.govinfo.gov/content/pkg/BUDGET-2025-BUD/pdf/BUDGET-2025-BUD.pdf
Artifact captured: yes/no

Build it into an app

const result = await agent.run({
  startUrl: "https://www.govinfo.gov/content/pkg/BUDGET-2025-BUD/pdf/BUDGET-2025-BUD.pdf",
  goal:
    "Confirm whether the public PDF is reachable. Return document metadata, final URL, and whether a downloadable artifact was captured.",
});

await saveReportEvidence({
  status: result.status,
  text: result.text,
  artifacts: result.artifacts,
});

Production notes

  • Domain policy: govinfo.gov.
  • Treat downloaded files as untrusted input.
  • Store artifact IDs and source URLs with the job.
  • Do not put bulky file contents in a final answer.

Inspect

  • Artifacts for captured file evidence.
  • Recording for replay.
  • Status if the PDF fails to load.
  • Debug UI for visible document state.

Cleanup

The CLI agent closes browsers it creates. If you saved downloaded files locally, delete or archive them according to your retention policy.

Common failures

  • If the PDF viewer opens but no download is captured, use the final URL and visible metadata as evidence.
  • If the file is large or slow, keep file processing outside the agent response and hand the artifact to deterministic code.