Non-Functional Requirements
Status: Final
Version: 1.0
Purpose
Define performance, scalability, observability, backup, disaster recovery, and operational requirements for the ESG platform.
Performance SLAs
| Metric | Target | Measurement |
|---|---|---|
| API Response Time (p95) | < 500ms | GET requests, excluding report generation |
| API Response Time (p99) | < 1s | GET requests |
| Submission Upload | < 2s | POST /api/v1/collector/submissions |
| Evidence Upload (10MB) | < 5s | POST evidence endpoint |
| Report Generation | < 5min | Async job, full GRI report (PDF) |
| Page Load Time | < 2s | Web dashboard (admin UI) |
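The percentile targets above can be checked against measured latencies. The following is a minimal sketch in plain PHP, assuming latency samples in milliseconds are already collected and using the nearest-rank percentile definition; the function names percentile and meetsApiLatencySla are hypothetical and not part of the platform.

```php
<?php
// Nearest-rank percentile: the smallest sample such that at least $p percent
// of all samples are less than or equal to it.
function percentile(array $samplesMs, float $p): float
{
    if ($samplesMs === []) {
        throw new InvalidArgumentException('At least one sample is required.');
    }
    sort($samplesMs);
    $rank = (int) ceil($p * count($samplesMs) / 100);
    return (float) $samplesMs[max($rank, 1) - 1];
}

// Hypothetical helper: true when p95 < 500 ms and p99 < 1000 ms,
// the two API response time targets from the table.
function meetsApiLatencySla(array $samplesMs): bool
{
    return percentile($samplesMs, 95) < 500.0
        && percentile($samplesMs, 99) < 1000.0;
}
```

Production monitoring would normally derive these percentiles from histogram buckets rather than raw samples, so this is mainly useful for load-test reports.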
Scalability Assumptions
| Dimension | v1 Assumptions | Scaling Strategy |
|---|---|---|
| Concurrent Users | 100 simultaneous collectors | Horizontal scaling (load balancer + multiple app servers) |
| Submissions/Day | 10,000 submissions | Queue workers scale horizontally |
| Data Volume | 100 GB/year (submissions + evidence) | S3 scales automatically; database scales vertically |
| Tenants | 50 tenants | Multi-tenant DB with tenant_id sharding (vNext) |
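To put the v1 assumptions in perspective, the sketch below converts them into average rates. It assumes decimal gigabytes and a 365-day year; the function and constant names are hypothetical.

```php
<?php
// Figures taken from the v1 assumptions table.
const SUBMISSIONS_PER_DAY = 10000;
const DATA_VOLUME_GB_PER_YEAR = 100;

// Average arrival rate in submissions per second (86,400 seconds per day).
function averageSubmissionsPerSecond(int $perDay): float
{
    return $perDay / 86400;
}

// Average stored bytes per submission (decimal GB, 365-day year).
function averageBytesPerSubmission(int $gbPerYear, int $submissionsPerDay): float
{
    return ($gbPerYear * 1e9) / ($submissionsPerDay * 365);
}

printf("%.3f submissions/s on average\n", averageSubmissionsPerSecond(SUBMISSIONS_PER_DAY));
printf(
    "%.1f KB per submission on average\n",
    averageBytesPerSubmission(DATA_VOLUME_GB_PER_YEAR, SUBMISSIONS_PER_DAY) / 1000
);
```

These are averages only. Traffic close to a data collection deadline is likely to be well above the mean, so worker sizing should apply a peak factor agreed separately.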
Queue Configuration
| Queue Name | Worker Count | Memory Limit | Retry Limit | Timeout |
|---|---|---|---|---|
| validations | 5 workers | 512 MB | 3 | 60s |
| processing | 3 workers | 512 MB | 3 | 120s |
| reporting | 2 workers | 1 GB | 2 | 600s (10 min) |
| default | 2 workers | 256 MB | 3 | 90s |
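Laravel Queue Worker Commands (one per queue, so each worker pool gets its own limits from the table above; --memory is in MB):
php artisan queue:work redis --queue=validations --tries=3 --timeout=60 --memory=512
php artisan queue:work redis --queue=processing --tries=3 --timeout=120 --memory=512
php artisan queue:work redis --queue=reporting --tries=2 --timeout=600 --memory=1024
php artisan queue:work redis --queue=default --tries=3 --timeout=90 --memory=256
Run each command with the number of processes given in the Worker Count column (for example, via Supervisor's numprocs setting). The retry_after value of the redis connection in config/queue.php should be greater than the longest timeout (600s); otherwise a long-running job may be released and processed a second time.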
Background Jobs
| Job | Schedule | Purpose |
|---|---|---|
| EscalateOverdueReviewsJob | Hourly | Flag reviews overdue by 3+ days |
| MarkEvidenceForDeletionJob | Daily (midnight) | Mark expired evidence (7+ years) |
| GeneratePeriodicReportsJob | Weekly (Sunday) | Auto-generate draft reports for open periods |
| PruneAuditLogsJob | Monthly | Archive audit logs older than 2 years to cold storage |
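The schedules in the table can be expressed as cron expressions. The sketch below is plain PHP, not the platform's scheduler configuration. The expressions are the ones Laravel's hourly(), daily(), weekly(), and monthly() helpers generate; the midnight run time for the weekly and monthly jobs is an assumption, since the table does not state a time. JOB_SCHEDULES, isDue, and jobsDueAt are hypothetical names, and the matcher only handles numeric fields and `*`, which is enough for these four expressions.

```php
<?php
// Cron expressions for the scheduled jobs (minute hour day-of-month month day-of-week).
const JOB_SCHEDULES = [
    'EscalateOverdueReviewsJob'  => '0 * * * *',  // hourly, on the hour
    'MarkEvidenceForDeletionJob' => '0 0 * * *',  // daily at midnight
    'GeneratePeriodicReportsJob' => '0 0 * * 0',  // weekly, Sunday at midnight (time assumed)
    'PruneAuditLogsJob'          => '0 0 1 * *',  // monthly, first day at midnight (assumed)
];

// A field matches when it is the wildcard or equals the current value.
function cronFieldMatches(string $field, int $value): bool
{
    return $field === '*' || (int) $field === $value;
}

// Simplified matcher: no ranges, lists, or steps.
function isDue(string $expression, DateTimeImmutable $at): bool
{
    [$minute, $hour, $dayOfMonth, $month, $dayOfWeek] = explode(' ', $expression);

    return cronFieldMatches($minute, (int) $at->format('i'))
        && cronFieldMatches($hour, (int) $at->format('G'))
        && cronFieldMatches($dayOfMonth, (int) $at->format('j'))
        && cronFieldMatches($month, (int) $at->format('n'))
        && cronFieldMatches($dayOfWeek, (int) $at->format('w'));
}

// Names of all jobs that should start at the given minute.
function jobsDueAt(DateTimeImmutable $at): array
{
    return array_keys(array_filter(
        JOB_SCHEDULES,
        fn (string $cron): bool => isDue($cron, $at)
    ));
}
```

A check like this can back the "cron verified" acceptance criterion by asserting which jobs fire at a few known timestamps.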
Observability
Logging
Laravel Log Channels:
- daily: Application logs (7-day rotation)
- errorlog: PHP errors, exceptions
- syslog: Critical errors sent to external SIEM
Structured Logging:
Log::info('Submission processed', [
    'submission_id' => $submission->id,
    'tenant_id' => $submission->tenant_id,
    'state' => $submission->state,
    'processing_time_ms' => $processingTime,
]);
Metrics
Laravel Telescope (Development/Staging):
- Request/response logging
- Query performance tracking
- Job/queue monitoring
Production Monitoring (Prometheus + Grafana):
- API response times (histogram)
- Queue depth (gauge)
- Submission throughput (counter)
- Error rates (counter)
Key Metrics:
esg_submissions_total{state="approved"} 5000
esg_api_response_time_seconds{endpoint="/submissions",quantile="0.95"} 0.45
esg_queue_depth{queue="validations"} 12
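The sample lines above follow the Prometheus text exposition format. As an illustration, the sketch below renders such lines in plain PHP. The function renderMetricLine is hypothetical; a real deployment would more likely use a Prometheus client library than hand-built strings.

```php
<?php
// Renders one sample as: metric_name{label="value",...} value
function renderMetricLine(string $name, array $labels, float|int $value): string
{
    $pairs = [];
    foreach ($labels as $label => $labelValue) {
        // Label values escape backslash, double quote, and newline.
        $escaped = str_replace(
            ['\\', '"', "\n"],
            ['\\\\', '\\"', '\\n'],
            (string) $labelValue
        );
        $pairs[] = sprintf('%s="%s"', $label, $escaped);
    }

    $labelPart = $pairs === [] ? '' : '{' . implode(',', $pairs) . '}';

    return $name . $labelPart . ' ' . $value;
}

echo renderMetricLine('esg_queue_depth', ['queue' => 'validations'], 12), "\n";
```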
Alerting
Alerts (PagerDuty/Slack):
- API p95 latency > 1s for 5 minutes
- Queue depth > 100 for 10 minutes
- Error rate > 5% for 5 minutes
- Disk usage > 80%
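The "for N minutes" clauses mean an alert should fire only when the threshold has been breached continuously for that long, not on a single spike. A minimal sketch of that rule, assuming samples arrive as time-ordered [timestamp, value] pairs; isAlertFiring is a hypothetical name, and in production this logic would normally live in the alerting system's own rule configuration.

```php
<?php
// $samples: list of [unixTimestamp, value] pairs, sorted by time ascending.
// Fires when every sample since some point at least $forSeconds ago exceeds $threshold.
function isAlertFiring(array $samples, float $threshold, int $forSeconds, int $now): bool
{
    $breachStart = null;

    foreach ($samples as [$timestamp, $value]) {
        if ($value > $threshold) {
            // Remember where the current unbroken breach started.
            $breachStart ??= $timestamp;
        } else {
            // Any sample at or below the threshold resets the window.
            $breachStart = null;
        }
    }

    return $breachStart !== null && ($now - $breachStart) >= $forSeconds;
}

// Example: p95 latency in seconds sampled once a minute, rule "> 1s for 5 minutes".
$latency = [[0, 1.2], [60, 1.3], [120, 1.1], [180, 1.4], [240, 1.2], [300, 1.5]];
var_dump(isAlertFiring($latency, 1.0, 300, 300));
```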
Backups
| Resource | Frequency | Retention | RTO | RPO |
|---|---|---|---|---|
| PostgreSQL | Continuous (WAL archiving) | 30 days | 1 hour | 5 minutes |
| S3 Evidence Bucket | Versioning enabled | 7 years | N/A (immutable) | Real-time |
| Application Code | Git (on push) | Indefinite | 15 minutes | N/A |
| Secrets | AWS Secrets Manager backup | 90 days | 30 minutes | N/A |
Database Backup:
# Automated via AWS RDS automated backups
# Snapshot retention: 30 days
# Point-in-time recovery (PITR): 5-minute granularity
Disaster Recovery
RTO & RPO Targets
- RTO (Recovery Time Objective): 4 hours for the platform as a whole (most critical around data collection deadlines)
- RPO (Recovery Point Objective): 1 hour (maximum acceptable data loss)
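One way to keep these targets consistent with the per-resource figures in the Backups table is to check that no resource has a looser RTO or RPO than the DR targets. The sketch below assumes the DR targets act as ceilings for every resource, treats "N/A" as null, and models the S3 "Real-time" RPO as 0 seconds; all names are hypothetical.

```php
<?php
// DR targets from this section, in seconds.
const DR_RTO_SECONDS = 4 * 3600;
const DR_RPO_SECONDS = 1 * 3600;

// Per-resource figures from the Backups table; null stands for "N/A".
const RESOURCE_TARGETS = [
    'PostgreSQL'         => ['rto' => 3600, 'rpo' => 300],
    'S3 Evidence Bucket' => ['rto' => null, 'rpo' => 0],    // "Real-time" modelled as 0 (assumption)
    'Application Code'   => ['rto' => 900,  'rpo' => null],
    'Secrets'            => ['rto' => 1800, 'rpo' => null],
];

// Returns the names of resources whose RTO or RPO exceeds the DR targets.
function resourcesExceedingDrTargets(array $resources): array
{
    $violations = [];

    foreach ($resources as $name => $target) {
        $rtoTooSlow = $target['rto'] !== null && $target['rto'] > DR_RTO_SECONDS;
        $rpoTooLoose = $target['rpo'] !== null && $target['rpo'] > DR_RPO_SECONDS;

        if ($rtoTooSlow || $rpoTooLoose) {
            $violations[] = $name;
        }
    }

    return $violations;
}
```

With the figures as documented, the check returns an empty list: every per-resource target is tighter than the 4-hour RTO and 1-hour RPO.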
DR Strategy
Multi-AZ Deployment (AWS):
- RDS: Multi-AZ automatic failover (< 2 min)
- Application servers: Auto Scaling Group across 2+ AZs
- S3: Cross-region replication (CRR) to DR region
Failover Procedure:
1. Detect outage (CloudWatch alarms)
2. Promote RDS standby (automatic)
3. Route 53 DNS failover to DR region load balancer
4. Notify team via PagerDuty
5. Verify application health checks
Data Residency
GDPR Requirement: EU customer data must remain in EU regions.
Implementation:
- Tenant metadata includes data_region (e.g., eu-west-1, us-east-1)
- S3 buckets per region: evidence-eu-west-1, evidence-us-east-1
- Database: Regional RDS instances (future multi-region support)
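The bucket naming rule above can be sketched as a small resolver that maps a tenant's data_region to its evidence bucket and rejects unknown regions. This assumes only the two regions listed are configured and that EU regions can be recognised by the "eu-" prefix of the AWS region name; evidenceBucketFor and isEuRegion are hypothetical helpers.

```php
<?php
// Regions with an evidence bucket, as listed in this section.
const EVIDENCE_BUCKET_REGIONS = ['eu-west-1', 'us-east-1'];

// Maps a tenant's data_region to the bucket name evidence-<region>.
function evidenceBucketFor(string $dataRegion): string
{
    if (!in_array($dataRegion, EVIDENCE_BUCKET_REGIONS, true)) {
        throw new InvalidArgumentException(
            "No evidence bucket configured for region: {$dataRegion}"
        );
    }

    return 'evidence-' . $dataRegion;
}

// Residency guard (assumption: EU regions are exactly those prefixed "eu-").
function isEuRegion(string $region): bool
{
    return str_starts_with($region, 'eu-');
}

echo evidenceBucketFor('eu-west-1'), "\n";
```

Failing loudly on an unknown region is a deliberate choice here: silently falling back to a default bucket could write EU tenant data outside the EU.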
Acceptance Criteria
- API p95 latency < 500ms under normal load (100 users)
- Queue workers configured with correct retry/timeout limits
- PostgreSQL automated backups enabled (30-day retention)
- S3 versioning enabled for evidence bucket
- CloudWatch alarms configured for latency, queue depth, errors
- RDS Multi-AZ enabled for high availability
- Scheduled jobs run as expected (cron verified)
Cross-References
Change Log
| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0 | 2026-01-03 | Senior Product Architect | Initial NFR specification |