
Non-Functional Requirements

Status: Final
Version: 1.0


Purpose

Define performance, scalability, observability, backup, disaster recovery, and operational requirements for the ESG platform.


Performance SLAs

| Metric | Target | Measurement |
| --- | --- | --- |
| API Response Time (p95) | < 500ms | GET requests, non-report generation |
| API Response Time (p99) | < 1s | GET requests |
| Submission Upload | < 2s | POST /api/v1/collector/submissions |
| Evidence Upload (10MB) | < 5s | POST evidence endpoint |
| Report Generation | < 5min | Async job, full GRI report (PDF) |
| Page Load Time | < 2s | Web dashboard (admin UI) |
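To make the p95/p99 targets concrete, the following is a minimal sketch of how a load-test script might check a batch of GET latencies against them. The function names and the nearest-rank percentile method are assumptions for illustration; a production monitoring stack would typically estimate quantiles from histogram buckets instead.

```php
<?php
declare(strict_types=1);

/**
 * Nearest-rank percentile of latency samples (milliseconds).
 * Hypothetical helper, not part of the platform code.
 */
function percentile(array $samplesMs, float $p): float
{
    if ($samplesMs === []) {
        throw new InvalidArgumentException('No samples provided.');
    }
    sort($samplesMs);
    // Smallest value with at least p% of the samples at or below it.
    $rank = (int) ceil($p * count($samplesMs) / 100);

    return (float) $samplesMs[max($rank, 1) - 1];
}

/** True when both API latency targets from the SLA table hold. */
function meetsApiLatencySla(array $samplesMs): bool
{
    return percentile($samplesMs, 95) < 500.0
        && percentile($samplesMs, 99) < 1000.0;
}

// Example batch: 96 fast requests, 3 slower ones, 1 outlier.
$samples = array_merge(array_fill(0, 96, 100), array_fill(0, 3, 400), [900]);
var_dump(meetsApiLatencySla($samples));
```

The single 900 ms outlier does not break either target here because it sits above the 99th percentile rank; ten requests at 600 ms out of 100 would fail the p95 check.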

Scalability Assumptions

| Dimension | v1 Assumptions | Scaling Strategy |
| --- | --- | --- |
| Concurrent Users | 100 simultaneous collectors | Horizontal scaling (load balancer + multiple app servers) |
| Submissions/Day | 10,000 submissions | Queue workers scale horizontally |
| Data Volume | 100 GB/year (submissions + evidence) | S3 auto-scales, database vertical scaling |
| Tenants | 50 tenants | Multi-tenant DB with tenant_id sharding (vNext) |

Queue Configuration

| Queue Name | Worker Count | Memory Limit | Retry Limit | Timeout |
| --- | --- | --- | --- | --- |
| validations | 5 workers | 512 MB | 3 | 60s |
| processing | 3 workers | 512 MB | 3 | 120s |
| reporting | 2 workers | 1 GB | 2 | 600s (10 min) |
| default | 2 workers | 256 MB | 3 | 90s |

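Laravel Queue Worker Command (the flags are worker-wide defaults; applying the per-queue limits from the table requires a separate worker per queue or job-level $tries/$timeout overrides):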

php artisan queue:work redis --queue=validations,processing,reporting,default --tries=3 --timeout=90 --memory=512
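A single worker process applies one set of --tries, --timeout, and --memory flags to every queue it serves, so honoring the per-queue limits in the Queue Configuration table means running one command per queue. The sketch below derives those commands from the table values; the constant and function names are hypothetical, and how the processes are supervised (Supervisor, systemd, containers) is left open.

```php
<?php
declare(strict_types=1);

// Per-queue limits copied from the Queue Configuration table.
const QUEUE_LIMITS = [
    'validations' => ['workers' => 5, 'memory_mb' => 512,  'tries' => 3, 'timeout_s' => 60],
    'processing'  => ['workers' => 3, 'memory_mb' => 512,  'tries' => 3, 'timeout_s' => 120],
    'reporting'   => ['workers' => 2, 'memory_mb' => 1024, 'tries' => 2, 'timeout_s' => 600],
    'default'     => ['workers' => 2, 'memory_mb' => 256,  'tries' => 3, 'timeout_s' => 90],
];

/** Build the queue:work command for one queue (hypothetical helper). */
function workerCommand(string $queue, string $connection = 'redis'): string
{
    if (!array_key_exists($queue, QUEUE_LIMITS)) {
        throw new InvalidArgumentException("Unknown queue: {$queue}");
    }
    $limits = QUEUE_LIMITS[$queue];

    return sprintf(
        'php artisan queue:work %s --queue=%s --tries=%d --timeout=%d --memory=%d',
        $connection,
        $queue,
        $limits['tries'],
        $limits['timeout_s'],
        $limits['memory_mb']
    );
}

// Print each command with the number of processes to run for it.
foreach (QUEUE_LIMITS as $queue => $limits) {
    echo $limits['workers'], ' x ', workerCommand($queue), PHP_EOL;
}
```

Keeping the limits in one array means the process configuration and the table above can be compared line by line during review.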


Background Jobs

| Job | Schedule | Purpose |
| --- | --- | --- |
| EscalateOverdueReviewsJob | Hourly | Flag reviews overdue by 3+ days |
| MarkEvidenceForDeletionJob | Daily (midnight) | Mark expired evidence (7+ years) |
| GeneratePeriodicReportsJob | Weekly (Sunday) | Auto-generate draft reports for open periods |
| PruneAuditLogsJob | Monthly | Archive audit logs older than 2 years to cold storage |
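One way the schedule above could be registered with Laravel's scheduler is sketched here. It assumes the Laravel 10 (or earlier) layout with app/Console/Kernel.php, that the job classes live in the App\Jobs namespace, and that their constructors take no arguments; none of these details are confirmed by this specification.

```php
<?php

namespace App\Console;

// Assumption: job classes are in App\Jobs with no-argument constructors.
use App\Jobs\EscalateOverdueReviewsJob;
use App\Jobs\GeneratePeriodicReportsJob;
use App\Jobs\MarkEvidenceForDeletionJob;
use App\Jobs\PruneAuditLogsJob;
use Illuminate\Console\Scheduling\Schedule;
use Illuminate\Foundation\Console\Kernel as ConsoleKernel;

class Kernel extends ConsoleKernel
{
    protected function schedule(Schedule $schedule): void
    {
        // Every hour, on the hour.
        $schedule->job(new EscalateOverdueReviewsJob)->hourly();

        // Every day at 00:00.
        $schedule->job(new MarkEvidenceForDeletionJob)->daily();

        // Every Sunday at 00:00.
        $schedule->job(new GeneratePeriodicReportsJob)->weekly();

        // First day of each month at 00:00.
        $schedule->job(new PruneAuditLogsJob)->monthly();
    }
}
```

The scheduler only fires if the server runs `php artisan schedule:run` every minute from cron, which is what the "cron verified" acceptance criterion refers to.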

Observability

Logging

Laravel Log Channels:

  • daily: Application logs (7-day rotation)
  • errorlog: PHP errors, exceptions
  • syslog: Critical errors sent to external SIEM
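The channels listed above could be declared in config/logging.php roughly as follows. This is a sketch: the `stack` channel and the per-channel `level` values are assumptions, and forwarding syslog entries to the SIEM happens outside Laravel (for example in the host's syslog daemon).

```php
<?php
// config/logging.php (fragment, illustrative)

return [
    'default' => env('LOG_CHANNEL', 'stack'),

    'channels' => [
        // Assumption: a stack fans each record out to the three channels.
        'stack' => [
            'driver' => 'stack',
            'channels' => ['daily', 'errorlog', 'syslog'],
            'ignore_exceptions' => false,
        ],

        // Application logs, one file per day, kept for 7 days.
        'daily' => [
            'driver' => 'daily',
            'path' => storage_path('logs/laravel.log'),
            'level' => 'debug', // assumption
            'days' => 7,
        ],

        // PHP errors and exceptions via PHP's error_log().
        'errorlog' => [
            'driver' => 'errorlog',
            'level' => 'error', // assumption
        ],

        // Only critical and above reach syslog (and from there the SIEM).
        'syslog' => [
            'driver' => 'syslog',
            'level' => 'critical', // assumption
        ],
    ],
];
```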

Structured Logging:

Log::info('Submission processed', [
    'submission_id' => $submission->id,
    'tenant_id' => $submission->tenant_id,
    'state' => $submission->state,
    'processing_time_ms' => $processingTime,
]);

Metrics

Laravel Telescope (Development/Staging):

  • Request/response logging
  • Query performance tracking
  • Job/queue monitoring

Production Monitoring (Prometheus + Grafana):

  • API response times (histogram)
  • Queue depth (gauge)
  • Submission throughput (counter)
  • Error rates (counter)

Key Metrics:

esg_submissions_total{state="approved"} 5000
esg_api_response_time_seconds{endpoint="/submissions",quantile="0.95"} 0.45
esg_queue_depth{queue="validations"} 12
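The sample lines above follow the Prometheus text exposition format. As a rough illustration of how such a line is assembled, the helper below renders one sample and escapes label values; the function names are hypothetical, PHP 8 is assumed for the union type, and a real deployment would more likely rely on an existing Prometheus client library than hand-rolled formatting.

```php
<?php
declare(strict_types=1);

/** Escape a label value: backslash, double quote, and newline. */
function escapeLabelValue(string $value): string
{
    // Backslash must be replaced first so later escapes are not doubled.
    return str_replace(['\\', '"', "\n"], ['\\\\', '\\"', '\\n'], $value);
}

/** Render one sample line, e.g. name{label="value"} 12 (hypothetical helper). */
function formatSample(string $name, array $labels, int|float $value): string
{
    $pairs = [];
    foreach ($labels as $key => $labelValue) {
        $pairs[] = sprintf('%s="%s"', $key, escapeLabelValue((string) $labelValue));
    }
    $labelPart = $pairs === [] ? '' : '{' . implode(',', $pairs) . '}';

    return sprintf('%s%s %s', $name, $labelPart, $value);
}

echo formatSample('esg_queue_depth', ['queue' => 'validations'], 12), PHP_EOL;
echo formatSample('esg_submissions_total', ['state' => 'approved'], 5000), PHP_EOL;
```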

Alerting

Alerts (PagerDuty/Slack):

  • API p95 latency > 1s for 5 minutes
  • Queue depth > 100 for 10 minutes
  • Error rate > 5% for 5 minutes
  • Disk usage > 80%
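The "for N minutes" wording means an alert should fire only when the condition is sustained, not on a single spike. The sketch below shows that idea as a plain PHP check over timestamped readings; the function name and the sample layout (Unix timestamp => value) are assumptions, and in practice this logic would normally live in the alerting system's own rules rather than in application code.

```php
<?php
declare(strict_types=1);

/**
 * True when every reading in the trailing window exceeds the threshold.
 *
 * @param array<int, float> $samples Unix timestamp => reading (hypothetical layout)
 */
function breachedFor(array $samples, float $threshold, int $windowSeconds, int $now): bool
{
    // Keep only readings inside the trailing window (now - window, now].
    $inWindow = array_filter(
        $samples,
        fn (int $ts): bool => $ts > $now - $windowSeconds && $ts <= $now,
        ARRAY_FILTER_USE_KEY
    );

    if ($inWindow === []) {
        return false; // No data in the window: do not fire.
    }

    foreach ($inWindow as $value) {
        if ($value <= $threshold) {
            return false; // One healthy reading resets the condition.
        }
    }

    return true;
}

// p95 latency in seconds, one reading per minute, all above 1s.
$now = 1_700_000_300;
$latency = [];
for ($ts = $now - 240; $ts <= $now; $ts += 60) {
    $latency[$ts] = 1.2;
}
var_dump(breachedFor($latency, 1.0, 300, $now));
```

This simplified check does not verify that readings cover the whole window, so a gap in scraping could let it fire early; a stricter version would also require a reading near the start of the window.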


Backups

| Resource | Frequency | Retention | RTO | RPO |
| --- | --- | --- | --- | --- |
| PostgreSQL | Continuous (WAL archiving) | 30 days | 1 hour | 5 minutes |
| S3 Evidence Bucket | Versioning enabled | 7 years | N/A (immutable) | Real-time |
| Application Code | Git (on push) | Indefinite | 15 minutes | N/A |
| Secrets | AWS Secrets Manager backup | 90 days | 30 minutes | N/A |

Database Backup:

# Automated via AWS RDS automated backups
# Snapshot retention: 30 days
# Point-in-time recovery (PITR): 5-minute granularity


Disaster Recovery

RTO & RPO Targets

  • RTO (Recovery Time Objective): 4 hours for the platform as a whole (most critical around data collection deadlines)
  • RPO (Recovery Point Objective): 1 hour (maximum acceptable data loss)

DR Strategy

Multi-AZ Deployment (AWS):

  • RDS: Multi-AZ automatic failover (< 2 min)
  • Application servers: Auto Scaling Group across 2+ AZs
  • S3: Cross-region replication (CRR) to DR region

Failover Procedure:

  1. Detect outage (CloudWatch alarms)
  2. Promote RDS standby (automatic)
  3. Route 53 DNS failover to DR region load balancer
  4. Notify team via PagerDuty
  5. Verify application health checks


Data Residency

GDPR Requirement: EU customer data must remain in EU regions.

Implementation:

  • Tenant metadata includes data_region (e.g., eu-west-1, us-east-1)
  • S3 buckets per region: evidence-eu-west-1, evidence-us-east-1
  • Database: Regional RDS instances (future multi-region support)

$evidenceDisk = "evidence-{$tenant->data_region}";
Storage::disk($evidenceDisk)->put($path, $file);
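Because the disk name is built from tenant metadata, an unexpected data_region value could silently route evidence to the wrong bucket or fail late at write time. A minimal guard, assuming only the two regions named in this section are provisioned, might look like the following; the constant and function names are hypothetical.

```php
<?php
declare(strict_types=1);

// Assumption: only the regions named in this section have evidence buckets.
const EVIDENCE_REGIONS = ['eu-west-1', 'us-east-1'];

/** Resolve the storage disk for a tenant region, rejecting unknown regions. */
function evidenceDiskFor(string $dataRegion): string
{
    if (!in_array($dataRegion, EVIDENCE_REGIONS, true)) {
        throw new InvalidArgumentException("Unsupported data_region: {$dataRegion}");
    }

    return "evidence-{$dataRegion}";
}

echo evidenceDiskFor('eu-west-1'), PHP_EOL;
```

In the upload path this would replace the string interpolation, for example `Storage::disk(evidenceDiskFor($tenant->data_region))->put($path, $file);`, so that an EU tenant's evidence can only ever resolve to the EU disk.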

Acceptance Criteria

  • API p95 latency < 500ms under normal load (100 users)
  • Queue workers configured with correct retry/timeout limits
  • PostgreSQL automated backups enabled (30-day retention)
  • S3 versioning enabled for evidence bucket
  • CloudWatch alarms configured for latency, queue depth, errors
  • RDS Multi-AZ enabled for high availability
  • Scheduled jobs run as expected (cron verified)

Cross-References


Change Log

| Version | Date | Author | Changes |
| --- | --- | --- | --- |
| 1.0 | 2026-01-03 | Senior Product Architect | Initial NFR specification |