Implement Sprint 1 stories: collections, RBAC, audit logging, load testing

Complete 6 Sprint 1 stories for Epic 1 web migration infrastructure.

Portfolio Collection:
- Add 7 fields: title, slug, url, image, description, websiteType, tags
- Configure R2 storage and authenticated access control

Categories Collection:
- Add nameEn, order, textColor, backgroundColor fields
- Add color picker UI configuration

Posts Collection:
- Add excerpt with 200 char limit and ogImage for social sharing
- Add showInFooter checkbox and status select (draft/review/published)

Role-Based Access Control:
- Add role field to Users collection (admin/editor)
- Create adminOnly and authenticated access functions
- Apply access rules to Portfolio, Categories, Posts, Users collections

Audit Logging System (NFR9):
- Create Audit collection with timestamps for 90-day retention
- Add auditLogger utility for login/logout/content change tracking
- Add auditChange and auditGlobalChange hooks to all collections and globals
- Add cleanupAuditLogs job with 90-day retention policy

Load Testing Framework (NFR4):
- Add k6 load testing with 3 scripts: public-browsing, admin-operations, api-performance
- Configure targets: p95 < 500ms, error rate < 1%, 100 concurrent users
- Add verification script and comprehensive documentation

Other Changes:
- Remove unused Form blocks
- Add Header/Footer audit hooks
- Regenerate Payload TypeScript types
2026-01-31 17:20:35 +08:00
parent 0846318d6e
commit 7fd73e0e3d
48 changed files with 19497 additions and 5261 deletions
--- a/apps/backend/tests/k6/TESTING-GUIDE.md
+++ b/apps/backend/tests/k6/TESTING-GUIDE.md
@@ -0,0 +1,364 @@
+# Load Testing Execution Guide
+
+## Test Execution Checklist
+
+### Pre-Test Requirements
+
+- [ ] Backend server is running (`pnpm dev`)
+- [ ] Database is accessible
+- [ ] k6 is installed (`k6 version`)
+- [ ] Environment variables are configured
+
+### Verification Test
+
+Run the verification script first to ensure everything is set up correctly:
+
+```bash
+k6 run tests/k6/verify-setup.js
+```
+
+**Expected Output:**
+```
+=== K6 Setup Verification ===
+Target: http://localhost:3000
+Home page status: 200
+API status: 200
+Pages API response time: 123ms
+✓ Server is reachable
+✓ Home page responds
+✓ API endpoint responds
+```
+
+## Test Scenarios
+
+### 1. Public Browsing Test
+
+**Purpose:** Verify public pages can handle 100 concurrent users
+
+**Prerequisites:**
+- Backend running
+- Public pages accessible
+
+**Execution:**
+```bash
+# Local testing
+pnpm test:load
+
+# With custom URL
+k6 run --env BASE_URL=http://localhost:3000 tests/k6/public-browsing.js
+
+# Staging
+k6 run --env BASE_URL=https://staging.enchun.tw tests/k6/public-browsing.js
+```
+
+**Success Criteria:**
+- p95 response time < 500ms
+- Error rate < 1%
+- 100 concurrent users sustained for 2 minutes
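The success criteria above map naturally onto k6 `stages` and `thresholds`. The following is a minimal sketch, not the contents of `public-browsing.js`: the helper name `buildPublicBrowsingOptions` is hypothetical and the 30-second ramp durations are assumptions, while the 2-minute hold and the two threshold values come from the list above.

```javascript
// Hypothetical helper: builds a k6 options object for the public browsing criteria.
// The 30s ramp durations are assumptions; the 2m hold and thresholds come from this guide.
function buildPublicBrowsingOptions(targetVus = 100) {
  return {
    stages: [
      { duration: '30s', target: targetVus }, // ramp up (assumed duration)
      { duration: '2m', target: targetVus },  // hold at the target load
      { duration: '30s', target: 0 },         // ramp down (assumed duration)
    ],
    thresholds: {
      http_req_duration: ['p(95)<500'], // p95 response time under 500 ms
      http_req_failed: ['rate<0.01'],   // error rate under 1%
    },
  };
}

// In a k6 script this would be exposed as:
//   export const options = buildPublicBrowsingOptions();
```

When a threshold is crossed, k6 marks the run as failed and exits with a non-zero code, which is what lets a CI job gate on these numbers.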
+
+**What It Tests:**
+- Homepage rendering
+- Page navigation
+- Static content delivery
+- Database read operations
+
+---
+
+### 2. API Performance Test
+
+**Purpose:** Verify API endpoints meet performance targets
+
+**Prerequisites:**
+- Backend running
+- API endpoints accessible
+
+**Execution:**
+```bash
+# Local testing
+pnpm test:load:api
+
+# With custom URL
+k6 run --env BASE_URL=http://localhost:3000 tests/k6/api-performance.js
+
+# Staging
+k6 run --env BASE_URL=https://staging.enchun.tw tests/k6/api-performance.js
+```
+
+**Success Criteria:**
+- p95 response time < 300ms
+- Error rate < 0.5%
+- Throughput > 100 req/s
+
+**What It Tests:**
+- REST API endpoints
+- GraphQL queries
+- Authentication endpoints
+- Concurrent API requests
+
+---
+
+### 3. Admin Operations Test
+
+**Purpose:** Verify admin panel can handle 20 concurrent users
+
+**Prerequisites:**
+- Backend running
+- Valid admin credentials
+
+**Execution:**
+```bash
+# Local testing
+k6 run \
+  --env ADMIN_EMAIL=admin@enchun.tw \
+  --env ADMIN_PASSWORD=yourpassword \
+  tests/k6/admin-operations.js
+
+# Or use the pnpm script
+ADMIN_EMAIL=admin@enchun.tw ADMIN_PASSWORD=yourpassword \
+  pnpm test:load:admin
+```
+
+**Success Criteria:**
+- p95 response time < 700ms
+- Error rate < 1%
+- 20 concurrent users sustained for 3 minutes
+
+**What It Tests:**
+- Login/authentication
+- CRUD operations
+- Admin panel performance
+- Database write operations
+
+**Warning:** This test creates draft posts in the database. Clean up manually after testing.
+
+---
+
+## Test Execution Strategy
+
+### Phase 1: Development Testing
+
+Run tests locally during development with low load:
+
+```bash
+# Quick smoke test (10 users)
+k6 run --env STAGED_USERS=10 tests/k6/public-browsing.js
+
+# API smoke test (5 users)
+k6 run --env STAGED_USERS=5 tests/k6/api-performance.js
+```
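For `STAGED_USERS` to take effect, the script has to read it. One way that could look is sketched below; `resolveTargetVus` is a hypothetical helper and the default of 100 is an assumption based on the public browsing target, so the actual scripts may handle this differently. In k6, values passed with `--env` are available on the global `__ENV` object.

```javascript
// Hypothetical helper: picks the VU count from an env-like object, with a fallback.
// In a k6 script the argument would be the global __ENV object.
function resolveTargetVus(env, fallback = 100) {
  const parsed = Number.parseInt(env.STAGED_USERS, 10);
  // Ignore missing, non-numeric, or non-positive values and use the fallback instead
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// In a k6 script: const targetVus = resolveTargetVus(__ENV);
```

Falling back instead of failing keeps the full-load default intact when the variable is omitted, as in the staging commands.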
+
+### Phase 2: Pre-Deployment Testing
+
+Run full test suite against staging:
+
+```bash
+# Run all tests
+pnpm test:load:all
+
+# Or individual tests with full load
+k6 run --env BASE_URL=https://staging.enchun.tw tests/k6/public-browsing.js
+k6 run --env BASE_URL=https://staging.enchun.tw tests/k6/api-performance.js
+k6 run --env BASE_URL=https://staging.enchun.tw \
+  --env ADMIN_EMAIL=$STAGING_ADMIN_EMAIL \
+  --env ADMIN_PASSWORD=$STAGING_ADMIN_PASSWORD \
+  tests/k6/admin-operations.js
+```
+
+### Phase 3: Production Monitoring
+
+Schedule automated tests (via GitHub Actions or cron):
+
+- Daily: Public browsing and API tests
+- Weekly: Full test suite including admin operations
+- On-demand: Before major releases
+
+---
+
+## Result Analysis
+
+### Key Metrics to Check
+
+1. **p95 Response Time**
+   - Public pages: < 500ms ✅
+   - API endpoints: < 300ms ✅
+   - Admin operations: < 700ms ✅
+
+2. **Error Rate**
+   - Public: < 1% ✅
+   - API: < 0.5% ✅
+   - Admin: < 1% ✅
+
+3. **Throughput**
+   - API: > 100 req/s ✅
+   - Pages: > 50 req/s ✅
+
+4. **Virtual Users (VUs)**
+   - Should sustain target VUs for full duration
+   - No drops or connection errors
+
+### Sample Output Analysis
+
+```
+✓ http_req_duration..............: avg=185ms p(95)=420ms
+✓ http_req_failed................: 0.00% ✓ 0      ✗ 12000
+✓ checks.........................: 100.0% ✓ 12000 ✗ 0
+iterations.....................: 2400   20/s
+vus............................: 100    min=100    max=100
+```
+
+**This result shows:**
+- ✅ p95 = 420ms (< 500ms threshold)
+- ✅ Error rate = 0% (< 1% threshold)
+- ✅ All checks passed
+- ✅ Sustained 100 VUs
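The same reading can be done mechanically. The sketch below checks a run against the public browsing criteria; `evaluateRun` and its field names are hypothetical. The 120-second duration is derived from the sample (2400 iterations at 20/s), so 12000 requests works out to 100 req/s.

```javascript
// Hypothetical helper: checks one run against the success criteria listed in this guide.
function evaluateRun(run, criteria) {
  const errorRate = run.failedRequests / run.totalRequests;
  const throughput = run.totalRequests / run.durationSeconds; // requests per second
  return {
    errorRate,
    throughput,
    p95Ok: run.p95Ms < criteria.maxP95Ms,
    errorRateOk: errorRate < criteria.maxErrorRate,
    throughputOk: throughput > criteria.minThroughput,
  };
}

// Public browsing criteria from this guide
const publicCriteria = { maxP95Ms: 500, maxErrorRate: 0.01, minThroughput: 50 };

// Figures from the sample output: 12000 requests, none failed, 120 s (2400 iterations at 20/s)
const sample = { p95Ms: 420, failedRequests: 0, totalRequests: 12000, durationSeconds: 120 };

console.log(evaluateRun(sample, publicCriteria));
```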
+
+---
+
+## Troubleshooting Guide
+
+### Issue: Connection Refused
+
+**Symptoms:**
+```
+ERRO[0000] GoError: dial tcp 127.0.0.1:3000: connect: connection refused
+```
+
+**Solutions:**
+1. Ensure backend is running: `pnpm dev`
+2. Check port is correct: `--env BASE_URL=http://localhost:3000`
+3. Verify firewall isn't blocking connections
+
+---
+
+### Issue: High Error Rate (> 1%)
+
+**Symptoms:**
+```
+✗ http_req_failed................: 2.50% ✓ 300    ✗ 11700
+```
+
+**Common Causes:**
+1. Server overloaded → Reduce VUs
+2. Database connection issues → Check DB logs
+3. Missing authentication → Check credentials
+4. Invalid URLs → Verify BASE_URL
+
+**Solutions:**
+```bash
+# Reduce load to diagnose
+k6 run --env STAGED_USERS=10 tests/k6/public-browsing.js
+
+# Check server logs in parallel
+tail -f logs/backend.log
+```
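To narrow down which of the causes above applies, it can help to see which HTTP statuses the failures carry. The sketch below tallies requests per status from a file written with `--out json=results.json`, which contains one JSON object per line. `countByStatus` is a hypothetical helper; it assumes the built-in `http_reqs` data points carry the default `status` tag.

```javascript
// Hypothetical helper: counts http_reqs samples per HTTP status in k6 JSON output.
// `ndjson` is the text of the results file, one JSON object per line.
function countByStatus(ndjson) {
  const counts = {};
  for (const line of ndjson.split('\n')) {
    if (line.trim() === '') continue; // skip blank lines
    const entry = JSON.parse(line);
    // Only data points of the http_reqs counter describe individual requests
    if (entry.type !== 'Point' || entry.metric !== 'http_reqs') continue;
    const status = entry.data.tags?.status ?? 'unknown';
    counts[status] = (counts[status] || 0) + 1;
  }
  return counts;
}
```

A spike of 401 or 403 responses points at credentials, while 5xx responses or status 0 (no response received) point at server load or connectivity.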
+
+---
+
+### Issue: Slow Response Times (p95 >= 500ms)
+
+**Symptoms:**
+```
+http_req_duration..............: avg=450ms p(95)=850ms
+```
+
+**Common Causes:**
+1. Unoptimized database queries
+2. Missing database indexes
+3. Large payload sizes
+4. Server resource constraints
+
+**Solutions:**
+1. Check database query performance
+2. Review database indexes
+3. Optimize images/assets
+4. Scale server resources
+5. Enable caching
+
+---
+
+### Issue: Login Failed (Admin Test)
+
+**Symptoms:**
+```
+✗ login successful
+status: 401
+```
+
+**Solutions:**
+1. Verify admin credentials are correct
+2. Check admin user exists in database
+3. Ensure admin user has correct role
+4. Try logging in via admin panel first
+
+---
+
+## Performance Optimization Tips
+
+Based on test results, you may need to:
+
+1. **Add Database Indexes**
+   ```javascript
+   // Example: index frequently queried fields (assumes `Posts` is a MongoDB collection handle)
+   Posts.createIndex({ title: 1 });
+   Posts.createIndex({ status: 1, createdAt: -1 });
+   ```
+
+2. **Enable Caching**
+   - Cache global API responses
+   - Cache public pages
+   - Use CDN for static assets
+
+3. **Optimize Images**
+   - Use WebP format
+   - Implement lazy loading
+   - Serve responsive images
+
+4. **Database Optimization**
+   - Limit query depth where possible
+   - Use projection to reduce payload size
+   - Implement pagination
+
+5. **Scale Resources**
+   - Increase server memory/CPU
+   - Use database connection pooling
+   - Implement load balancing
+
+---
+
+## Reporting
+
+### Generate Reports
+
+```bash
+# JSON report
+k6 run --out json=results.json tests/k6/public-browsing.js
+
+# HTML report
+npm install -g k6-reporter
+k6-reporter results.json --output results.html
+open results.html  # macOS; on Linux use xdg-open
+```
+
+### Share Results
+
+Include in your PR/daily standup:
+- p95 response time
+- Error rate
+- Throughput
+- Any issues found
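The first three items can be pulled straight from the object k6 passes to `handleSummary()`. The sketch below formats them as a short text block; `summarizeForPr` is a hypothetical helper, not something the test scripts are known to include.

```javascript
// Hypothetical helper: turns the data object k6 passes to handleSummary() into a short text block.
function summarizeForPr(data) {
  const p95 = data.metrics.http_req_duration.values['p(95)'];
  const errorRate = data.metrics.http_req_failed.values.rate;
  const throughput = data.metrics.http_reqs.values.rate;
  return [
    `p95 response time: ${p95.toFixed(0)} ms`,
    `Error rate: ${(errorRate * 100).toFixed(2)}%`,
    `Throughput: ${throughput.toFixed(1)} req/s`,
  ].join('\n');
}

// In a k6 script this could be wired up as:
//   export function handleSummary(data) {
//     return { stdout: summarizeForPr(data) + '\n' };
//   }
```

Defining `handleSummary` replaces k6's default end-of-test summary, so this wiring suits runs where only the short block is wanted.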
+
+---
+
+## Best Practices
+
+1. **Run tests regularly** - Catch regressions early
+2. **Test on staging first** - Avoid breaking production
+3. **Monitor trends** - Track performance over time
+4. **Share results** - Keep team informed
+5. **Update baselines** - Adjust as system evolves
+6. **Don't ignore failures** - Investigate all issues
+
+---
+
+**Last Updated:** 2026-01-31
+**Owner:** QA Team