Ingestion Pipeline: Idempotent Data Ingestion
Status: Final
Version: 1.0
Purpose
Define the backend pipeline for ingesting submissions from collectors with idempotency guarantees, deduplication, retry handling, and error recovery.
Pipeline Stages
HTTP Request → Idempotency Check → Store Raw → Queue Validation → Success Response
                      ↓ (duplicate)
               Return Existing ← Cache Hit
Idempotency Strategy
Key: submission_uuid (UUID generated by mobile app)
Rule: Same UUID = Same submission (never create duplicates)
Laravel Implementation
public function submit(SubmissionRequest $request)
{
    // Return the closure's response; without this the action returns null
    return DB::transaction(function () use ($request) {
        // Check for existing submission (locks the row if it already exists)
        $existing = MetricSubmission::where('submission_uuid', $request->submission_uuid)
            ->lockForUpdate()
            ->first();

        if ($existing) {
            // Idempotent response
            Log::info("Duplicate submission attempt: {$request->submission_uuid}");
            return response()->json($existing, 200);
        }

        // Create new submission
        $submission = MetricSubmission::create([
            'tenant_id' => auth()->user()->tenant_id,
            'submission_uuid' => $request->submission_uuid,
            'reporting_period_id' => $request->reporting_period_id,
            'site_id' => $request->site_id,
            'metric_definition_id' => $this->resolveMetricId($request->metric_id),
            'raw_data' => $request->all(),
            'state' => 'RECEIVED',
            'submitted_by_user_id' => auth()->id(),
            'submitted_at' => now(),
        ]);

        // Queue validation only after the transaction commits, so the worker
        // can see the new row (afterCommit() needs the Queueable trait on the job)
        dispatch(new ValidateSubmissionJob($submission->id))
            ->onQueue('validations')
            ->afterCommit();

        return response()->json($submission, 201);
    });
}
Deduplication Logic
Database Constraint: unique constraint on submission_uuid
Race Condition Handling:
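- SELECT ... FOR UPDATE lock in transaction (serializes requests for a UUID whose row already exists)
- Unique constraint as fallback (returns 409 if violated); it covers concurrent first-time requests, where there may be no row yet for FOR UPDATE to lock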
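The two layers above can be sketched without a database. This is a minimal illustration in plain PHP, not the production code: `SubmissionStore`, `DuplicateKeyException`, and `ingest()` are hypothetical names, and the in-memory array stands in for a table with a unique key on submission_uuid.

```php
<?php
declare(strict_types=1);

/** Hypothetical stand-in for the database's unique-violation error. */
class DuplicateKeyException extends RuntimeException {}

/** In-memory stand-in for a table with a unique key on submission_uuid. */
class SubmissionStore
{
    /** @var array<string, array> rows keyed by submission_uuid */
    private array $rows = [];

    public function find(string $uuid): ?array
    {
        return $this->rows[$uuid] ?? null;
    }

    public function insert(string $uuid, array $data): array
    {
        // Mimics the unique constraint: a second insert for the same UUID fails
        if (isset($this->rows[$uuid])) {
            throw new DuplicateKeyException("duplicate submission_uuid: $uuid");
        }
        return $this->rows[$uuid] = ['submission_uuid' => $uuid] + $data;
    }
}

/**
 * Check first, insert second, and treat a constraint violation as a lost race.
 *
 * @return array{0:int, 1:?array} HTTP status and the stored row
 */
function ingest(SubmissionStore $store, string $uuid, array $data): array
{
    $existing = $store->find($uuid);
    if ($existing !== null) {
        return [200, $existing]; // idempotent replay of an earlier request
    }

    try {
        return [201, $store->insert($uuid, $data)];
    } catch (DuplicateKeyException $e) {
        // Another request inserted the row between find() and insert()
        return [409, $store->find($uuid)];
    }
}

$store = new SubmissionStore();
[$firstStatus]  = ingest($store, 'uuid-1', ['site_id' => 7]); // new row
[$secondStatus] = ingest($store, 'uuid-1', ['site_id' => 7]); // replay
```

In this sketch a retried request takes the 200 path, while the 409 path is reached only when the existence check and the insert are interleaved with another request for the same UUID.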
Retry Behavior
Client Retry (Mobile App)
- Exponential Backoff: 2s, 4s, 8s, 16s, 32s
- Max Retries: 5 attempts
- Idempotency Key: Same UUID ensures no duplicates
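The backoff schedule above is a doubling sequence starting at 2 seconds. The sketch below shows the arithmetic in PHP to match the server-side examples; the mobile app would implement the equivalent in its own language. The function names `retryDelaySeconds()` and `retrySchedule()` are illustrative, not part of the specification.

```php
<?php
declare(strict_types=1);

/**
 * Delay in seconds before retry number $attempt (1-based):
 * base * 2^(attempt - 1), i.e. 2, 4, 8, ... for a 2-second base.
 */
function retryDelaySeconds(int $attempt, int $baseSeconds = 2): int
{
    if ($attempt < 1) {
        throw new InvalidArgumentException('attempt must be >= 1');
    }
    return $baseSeconds * (2 ** ($attempt - 1));
}

/**
 * Full list of delays for up to $maxRetries retries.
 *
 * @return int[]
 */
function retrySchedule(int $maxRetries = 5): array
{
    return array_map(
        fn (int $attempt): int => retryDelaySeconds($attempt),
        range(1, $maxRetries)
    );
}

$schedule = retrySchedule(); // [2, 4, 8, 16, 32]
```

Every retry reuses the original submission UUID, so the total wait of 62 seconds across five retries cannot produce a second submission on the server.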
Server Retry (Queue Jobs)
- ValidateSubmissionJob: 3 tries, 60s delay
- ProcessSubmissionJob: 3 tries, 120s delay
- Dead Letter Queue: Jobs that still fail after max retries are recorded in the failed_jobs table
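use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;

class ValidateSubmissionJob implements ShouldQueue
{
    use Queueable; // provides onQueue()

    public $tries = 3;    // total attempts, including the first
    public $backoff = 60; // seconds

    public function __construct(public $submissionId) {}

    // handle() omitted; failed() runs once all attempts are exhausted
    public function failed(Throwable $exception)
    {
        // find() returns null if the row is gone; ?-> then skips the update (PHP 8+)
        MetricSubmission::find($this->submissionId)?->update([
            'state' => 'FAILED',
            'error_message' => $exception->getMessage(),
        ]);

        // Notify admin of failure
        Notification::send(Admin::all(), new SubmissionProcessingFailed($this->submissionId));
    }
}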
Error Recovery
Scenario 1: Network Timeout During Upload
- Client: Retry with same UUID
- Server: Idempotency check returns existing submission if already created
- Outcome: No duplicate
Scenario 2: Validation Job Fails
- Server: Attempt the job up to 3 times (the tries limit counts the first attempt)
- After Max Retries: Move to FAILED state, notify admin
- Manual Recovery: Admin can trigger re-validation
Scenario 3: Database Deadlock
- Server: Transaction rolled back, return 503 Service Unavailable
- Client: Retry after delay
- Outcome: Succeeds on retry
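Scenario 3 can be sketched as a wrapper that turns a transient database error into a 503. This is an illustration under stated assumptions rather than the actual handler: `respondWithDeadlockHandling()` and `SimulatedDbError` are hypothetical, the SQLSTATE values are PostgreSQL's deadlock_detected (40P01) and serialization_failure (40001), and the Retry-After header is an addition the specification does not mention.

```php
<?php
declare(strict_types=1);

/** SQLSTATE codes treated as transient (PostgreSQL values, assumed here). */
const TRANSIENT_SQLSTATES = ['40P01', '40001'];

/** Hypothetical stand-in for a driver exception whose code is a SQLSTATE string. */
class SimulatedDbError extends RuntimeException
{
    public function __construct(string $message, string $sqlState)
    {
        parent::__construct($message);
        $this->code = $sqlState; // database exceptions commonly expose SQLSTATE via getCode()
    }
}

/**
 * Runs $work (the transactional insert). A transient error becomes a 503;
 * any other error is rethrown unchanged.
 *
 * @return array{status:int, headers:array<string,string>, body:mixed}
 */
function respondWithDeadlockHandling(callable $work, int $retryAfterSeconds = 2): array
{
    try {
        return ['status' => 201, 'headers' => [], 'body' => $work()];
    } catch (Throwable $e) {
        if (in_array((string) $e->getCode(), TRANSIENT_SQLSTATES, true)) {
            // The transaction has already been rolled back; ask the client to retry
            return [
                'status'  => 503,
                'headers' => ['Retry-After' => (string) $retryAfterSeconds],
                'body'    => null,
            ];
        }
        throw $e;
    }
}

$response = respondWithDeadlockHandling(function () {
    throw new SimulatedDbError('deadlock detected', '40P01');
});
// $response['status'] is 503
```

Because the client retries with the same UUID, answering 503 here is safe: the rolled-back transaction left no row behind, and a later attempt either creates it or finds it.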
Laravel Queue Configuration
// config/queue.php
'connections' => [
    'redis' => [
        'driver' => 'redis',
        'connection' => 'default',
        'queue' => env('REDIS_QUEUE', 'default'),
        'retry_after' => 180,
        'block_for' => null,
    ],
],
'failed' => [
    'driver' => 'database-uuids',
    'database' => env('DB_CONNECTION', 'pgsql'),
    'table' => 'failed_jobs',
],
Queue Names:
- validations: High priority, validation jobs
- processing: Medium priority, data processing
- reporting: Low priority, report generation
Acceptance Criteria
- Same UUID never creates duplicate submissions
- Idempotency works across retries and concurrent requests
- Failed validation jobs are attempted up to 3 times before being marked FAILED
- Dead letter queue captures permanently failed jobs
- Admin dashboard shows failed jobs with error details
Cross-References
Change Log
| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0 | 2026-01-03 | Senior Product Architect | Initial ingestion pipeline specification |