Issue 18 | Terminal-based Automated UI Testing

Updated on 4/5/2026

🎯 Learning Objectives

After completing this issue, you will master:

  1. How to use Claude Code to drive Playwright / Cypress for automated UI testing
  2. How PMs / QAs can describe test cases in natural language for agents to execute
  3. Practical methods for visual regression testing
  4. How to build an automated closed loop from test results to bug reports

📖 Core Concepts Explained

18.1 Why should PMs / QAs care about automated testing?

Traditional QA Process:
  PM writes acceptance criteria → QA clicks through manually → records screenshots →
  finds a bug → writes a bug report → waits for the dev fix
  Time per round: 2-4 hours

Claude Code-Assisted Process:
  PM describes user story → Claude generates test script →
  runs it automatically → compares screenshots → generates a report
  Time per round: 5-10 minutes

How the PM's Role Changes:
  ❌ No need to test manually
  ❌ No need to write test code
  ✅ Must describe acceptance criteria clearly
  ✅ Must review the automatically generated test reports

18.2 Playwright Test Generation Flow

PM writes acceptance criteria:
  "User clicks the register button on the homepage,
   fills out the form, and should see a success message."
          │
          ▼
Claude generates Playwright test:
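  import { test, expect } from '@playwright/test';

  test('user registration flow', async ({ page }) => {
    await page.goto('/');  // relative URL; assumes baseURL is set in the Playwright config
    await page.click('text=Register');
    await page.fill('[name=email]', 'user@example.com');  // placeholder address
    await page.fill('[name=password]', '123456');
    await page.click('button[type=submit]');
    await expect(page.locator('.success')).toBeVisible();
  });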
          │
          ▼
Automatically executes and captures keyframes:
  🖼️ screenshot-1: Homepage loaded
  🖼️ screenshot-2: Registration form
  🖼️ screenshot-3: Success message
          │
          ▼
Generates test report
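
One way to picture the middle of this flow is as a small structured plan that is rendered into Playwright code. The sketch below is illustrative only: `PlanStep`, `TestPlan`, `renderStep`, and `renderTest` are hypothetical names, not part of Playwright or Claude Code, and Claude does not necessarily work this way internally. The email address is a placeholder.

```typescript
// Hypothetical intermediate form: one acceptance criterion broken into UI steps.
type PlanStep =
  | { kind: "goto"; url: string }
  | { kind: "click"; selector: string }
  | { kind: "fill"; selector: string; value: string }
  | { kind: "expectVisible"; selector: string };

interface TestPlan {
  title: string;
  steps: PlanStep[];
}

// Render one step as a line of Playwright code.
// JSON.stringify quotes and escapes the string arguments.
function renderStep(step: PlanStep): string {
  switch (step.kind) {
    case "goto":
      return `await page.goto(${JSON.stringify(step.url)});`;
    case "click":
      return `await page.click(${JSON.stringify(step.selector)});`;
    case "fill":
      return `await page.fill(${JSON.stringify(step.selector)}, ${JSON.stringify(step.value)});`;
    case "expectVisible":
      return `await expect(page.locator(${JSON.stringify(step.selector)})).toBeVisible();`;
  }
}

// Wrap the rendered steps in a Playwright test() call.
function renderTest(plan: TestPlan): string {
  const body = plan.steps.map((step) => "  " + renderStep(step)).join("\n");
  return `test(${JSON.stringify(plan.title)}, async ({ page }) => {\n${body}\n});`;
}

// The registration story from the diagram, written as a plan.
const registrationPlan: TestPlan = {
  title: "user registration flow",
  steps: [
    { kind: "goto", url: "/" },
    { kind: "click", selector: "text=Register" },
    { kind: "fill", selector: "[name=email]", value: "user@example.com" },
    { kind: "fill", selector: "[name=password]", value: "123456" },
    { kind: "click", selector: "button[type=submit]" },
    { kind: "expectVisible", selector: ".success" },
  ],
};

console.log(renderTest(registrationPlan));
```

The point for a PM is that each clause of the acceptance criterion ("clicks", "fills out", "should see") maps to one step, so vague criteria produce vague or missing steps.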

💻 Simulated Claude TUI Interaction

Scenario: Describing Test Cases in Natural Language

> Please test the following user stories:
  1. Unauthenticated users visiting the homepage should see a "Login" button
  2. Clicking login, entering correct credentials, should redirect to the dashboard
  3. The dashboard should display the username and recent activity
  4. Clicking logout should return to the homepage

╭─ Claude ───────────────────────────────────────────────────╮
│                                                            │
│  I will generate and execute these E2E tests for you.      │
│                                                            │
│  ⚙️ Tool: Bash                                             │
│  $ npm install -D @playwright/test                         │
│  $ npx playwright install chromium                         │
│  ✅ Playwright installation complete                       │
│                                                            │
│  ⚙️ Tool: Write → tests/e2e/login-flow.spec.ts             │
│  Created 4 test cases matching your user stories           │
│                                                            │
│  ⚙️ Tool: Bash                                             │
│  $ npx playwright test tests/e2e/login-flow.spec.ts        │
│                                                            │
│  📋 Test Results:                                          │
│  ╭───────────────────────────────────────────────────────╮ │
│  │ Test Case                         │ Status │ Duration │ │
│  │───────────────────────────────────│────────│──────────│ │
│  │ ✅ Homepage displays login button │ PASS   │ 1.2s     │ │
│  │ ✅ Login redirects to dashboard   │ PASS   │ 2.3s     │ │
│  │ ❌ Dashboard displays user info   │ FAIL   │ 3.1s     │ │
│  │ ✅ Logout returns to homepage     │ PASS   │ 1.8s     │ │
│  ╰───────────────────────────────────────────────────────╯ │
│                                                            │
│  ❌ Test 3 Failure Details:                                │
│  Expected: Page contains username 'Test User'              │
│  Actual: Username area is empty                            │
│  📸 Screenshot saved: test-results/screenshot-fail.png     │
│                                                            │
│  🔍 Tool: Read → src/pages/Dashboard.tsx                   │
│  Found the cause: the useUser() hook returns null during   │
│  SSR and only has a value during client-side rendering.    │
│                                                            │
│  ⚙️ Tool: Edit → src/pages/Dashboard.tsx                   │
│  Added loading state check: if (!user) return <Loading/>   │
│                                                            │
│  ⚙️ Tool: Bash → $ npx playwright test (rerun)             │
│  ✅ ALL 4 TESTS PASSED                                     │
│                                                            │
│  🎉 All tests passed! Automatically fixed the dashboard's  │
│     SSR issue.                                             │
│                                                            │
╰────────────────────────────────────────────────────────────╯
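
The step from a failed test to a bug report can be sketched as a small formatting function. This is a minimal illustration, not Playwright's reporter format: the `CaseResult` shape and the `formatBugReport` and `bugReportsFor` functions are assumptions made for this example. The sample data mirrors the four results in the session above.

```typescript
// Hypothetical shape of one test outcome (not Playwright's own JSON schema).
interface CaseResult {
  title: string;
  status: "passed" | "failed";
  durationMs: number;
  expected?: string;
  actual?: string;
  screenshotPath?: string;
}

// Turn one failed result into a plain-text bug report.
function formatBugReport(result: CaseResult): string {
  const lines = [
    `Title: [E2E] ${result.title}`,
    `Duration: ${(result.durationMs / 1000).toFixed(1)}s`,
    `Expected: ${result.expected ?? "(not recorded)"}`,
    `Actual: ${result.actual ?? "(not recorded)"}`,
  ];
  if (result.screenshotPath) {
    lines.push(`Screenshot: ${result.screenshotPath}`);
  }
  return lines.join("\n");
}

// Only failed cases produce a report; passing cases are skipped.
function bugReportsFor(results: CaseResult[]): string[] {
  return results.filter((r) => r.status === "failed").map(formatBugReport);
}

// The four results from the simulated session.
const sessionResults: CaseResult[] = [
  { title: "Homepage displays login button", status: "passed", durationMs: 1200 },
  { title: "Login redirects to dashboard", status: "passed", durationMs: 2300 },
  {
    title: "Dashboard displays user info",
    status: "failed",
    durationMs: 3100,
    expected: "Page contains username 'Test User'",
    actual: "Username area is empty",
    screenshotPath: "test-results/screenshot-fail.png",
  },
  { title: "Logout returns to homepage", status: "passed", durationMs: 1800 },
];

console.log(bugReportsFor(sessionResults).join("\n\n"));
```

Keeping the expected and actual values in the report is what lets a reviewer judge whether the failure is a real bug or a wrong acceptance criterion.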

💻 Code Demonstration

# ✅ Generate and execute tests from acceptance criteria
claude "Generate Playwright tests based on the following acceptance criteria:
  - Product search should return relevant results
  - Empty search should display a prompt message
  - Search results should be paginated
  Run tests and report results"

# ✅ Visual Regression Testing
claude "Take full-page screenshots of the following pages:
  /dashboard, /settings, /profile
  Compare them with the baselines in the screenshots/ directory
  and report any visual differences"

# ✅ Cross-Browser Testing
claude "Run the login flow tests on Chromium, Firefox, and WebKit,
  and compare the results across browsers"

# ✅ Performance Testing
claude "Use Playwright to measure the homepage's first load time
  and record the LCP, INP, and CLS metrics"
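
The visual regression prompt above comes down to comparing a fresh screenshot with a stored baseline. In a real Playwright suite the built-in `expect(page).toHaveScreenshot()` assertion typically handles this. The sketch below only illustrates the underlying idea on raw RGBA pixel data. `RgbaImage`, `diffRatio`, `hasVisualRegression`, and the 1% default threshold are hypothetical names and values chosen for illustration, and decoding the PNG files is left out.

```typescript
// Raw RGBA image: 4 bytes per pixel, row-major order.
interface RgbaImage {
  width: number;
  height: number;
  data: Uint8Array;
}

// Fraction of pixels where any channel differs by more than `tolerance` (0-255).
function diffRatio(baseline: RgbaImage, current: RgbaImage, tolerance = 0): number {
  if (baseline.width !== current.width || baseline.height !== current.height) {
    return 1; // a size change is treated as a full mismatch
  }
  const pixelCount = baseline.width * baseline.height;
  if (pixelCount === 0) {
    return 0;
  }
  let changed = 0;
  for (let p = 0; p < pixelCount; p++) {
    for (let c = 0; c < 4; c++) {
      const i = p * 4 + c;
      if (Math.abs(baseline.data[i] - current.data[i]) > tolerance) {
        changed++;
        break; // count each pixel at most once
      }
    }
  }
  return changed / pixelCount;
}

// Flag a regression when more than `maxRatio` of the pixels changed.
function hasVisualRegression(
  baseline: RgbaImage,
  current: RgbaImage,
  maxRatio = 0.01,
): boolean {
  return diffRatio(baseline, current) > maxRatio;
}

// Tiny 2x2 example: one of four pixels changes from white to a darker color.
const baselineShot: RgbaImage = { width: 2, height: 2, data: new Uint8Array(16).fill(255) };
const currentShot: RgbaImage = { width: 2, height: 2, data: new Uint8Array(16).fill(255) };
currentShot.data[0] = 0; // red channel of the first pixel

console.log(diffRatio(baselineShot, currentShot));
console.log(hasVisualRegression(baselineShot, currentShot));
```

A non-zero tolerance and ratio threshold are the usual way to keep anti-aliasing and font-rendering noise from being reported as regressions.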

🔧 Tools Involved

Tool    Testing Phase        Purpose
Bash    Environment Setup    Installs Playwright and browsers
Write   Script Generation    Creates test files
Bash    Test Execution       Runs playwright test
Read    Bug Localization     Reads relevant source files for failures
Edit    Bug Fix              Modifies code and reruns tests
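
Taken together, the last three rows form a run, fix, rerun loop. The sketch below shows one way to bound that loop so it cannot retry forever. `RunSummary`, `LoopOutcome`, `fixAndRerun`, and the default of three runs are all assumptions for illustration. The callbacks are synchronous stand-ins for "run the test suite" and "edit the source", which in practice would be asynchronous.

```typescript
// Hypothetical summary of one test run.
interface RunSummary {
  passed: number;
  failedTitles: string[];
}

interface LoopOutcome {
  green: boolean; // true when the last run had no failures
  runs: number; // how many times the suite was executed
  last: RunSummary;
}

// Run the suite, attempt a fix while there are failures, and stop after maxRuns runs.
function fixAndRerun(
  runTests: () => RunSummary,
  attemptFix: (failedTitles: string[]) => void,
  maxRuns = 3,
): LoopOutcome {
  let last = runTests();
  let runs = 1;
  while (last.failedTitles.length > 0 && runs < maxRuns) {
    attemptFix(last.failedTitles);
    last = runTests();
    runs++;
  }
  return { green: last.failedTitles.length === 0, runs, last };
}

// Fake suite that fails until the "fix" flag is set, mimicking the session above.
let dashboardFixed = false;
const outcome = fixAndRerun(
  () =>
    dashboardFixed
      ? { passed: 4, failedTitles: [] }
      : { passed: 3, failedTitles: ["Dashboard displays user info"] },
  () => {
    dashboardFixed = true;
  },
);

console.log(outcome.green, outcome.runs);
```

The run limit matters for review: if the loop ends without going green, the remaining failures need a human decision rather than another automatic attempt.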

📝 Key Takeaways from this Issue

  1. PMs / QAs describe acceptance criteria in natural language, and Claude generates the test code
  2. When a test fails, Claude can locate the relevant source code, apply a fix, and rerun the tests
  3. Visual regression testing compares screenshots against baselines to catch unintended UI changes
  4. Playwright supports multiple browsers, and Claude can automate cross-browser testing
  5. PMs and QAs who are unfamiliar with coding can use this workflow to verify functionality quickly
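
For takeaway 4, "comparing results across browsers" usually means finding the tests whose outcome is not the same everywhere. The sketch below assumes the per-browser results have already been collected into a simple lookup. `BrowserMatrix`, `findDivergentTests`, and the sample data are invented for illustration and are not Playwright output.

```typescript
type Outcome = "passed" | "failed";

// Hypothetical shape: browser name -> test title -> outcome.
type BrowserMatrix = Record<string, Record<string, Outcome>>;

// Return the titles whose outcome differs between browsers,
// including tests that are missing from some browser's results.
function findDivergentTests(matrix: BrowserMatrix): string[] {
  const perBrowser = Object.values(matrix);
  const titles = new Set<string>();
  for (const results of perBrowser) {
    for (const title of Object.keys(results)) {
      titles.add(title);
    }
  }
  return Array.from(titles)
    .filter((title) => {
      const seen = new Set(perBrowser.map((results) => results[title] ?? "missing"));
      return seen.size > 1;
    })
    .sort();
}

// Made-up example data: one test fails only on WebKit.
const exampleMatrix: BrowserMatrix = {
  chromium: { "Login redirects to dashboard": "passed", "Logout returns to homepage": "passed" },
  firefox: { "Login redirects to dashboard": "passed", "Logout returns to homepage": "passed" },
  webkit: { "Login redirects to dashboard": "passed", "Logout returns to homepage": "failed" },
};

console.log(findDivergentTests(exampleMatrix));
```

A test that fails in every browser is an ordinary bug, while a test that appears in this list points to a browser-specific issue.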

🔗 References