Skip to content

Ingestion Pipeline: Idempotent Data Ingestion

Status: Final Version: 1.0


Purpose

Define the backend pipeline for ingesting submissions from collectors with idempotency guarantees, deduplication, retry handling, and error recovery.


Pipeline Stages

HTTP Request → Idempotency Check → Store Raw → Queue Validation → Success Response
                     ↓ (duplicate)
              Return Existing ← Cache Hit

Idempotency Strategy

Key: submission_uuid (UUID generated by mobile app)

Rule: Same UUID = Same submission (never create duplicates)

Laravel Implementation

public function submit(SubmissionRequest $request)
{
    DB::transaction(function () use ($request) {
        // Check for existing submission
        $existing = MetricSubmission::where('submission_uuid', $request->submission_uuid)
            ->lockForUpdate()
            ->first();

        if ($existing) {
            // Idempotent response
            Log::info("Duplicate submission attempt: {$request->submission_uuid}");
            return response()->json($existing, 200);
        }

        // Create new submission
        $submission = MetricSubmission::create([
            'tenant_id' => auth()->user()->tenant_id,
            'submission_uuid' => $request->submission_uuid,
            'reporting_period_id' => $request->reporting_period_id,
            'site_id' => $request->site_id,
            'metric_definition_id' => $this->resolveMetricId($request->metric_id),
            'raw_data' => $request->all(),
            'state' => 'RECEIVED',
            'submitted_by_user_id' => auth()->id(),
            'submitted_at' => now(),
        ]);

        // Queue validation
        dispatch(new ValidateSubmissionJob($submission->id))->onQueue('validations');

        return response()->json($submission, 201);
    });
}

Deduplication Logic

Database Constraint:

ALTER TABLE metric_submissions ADD UNIQUE (tenant_id, submission_uuid);

Race Condition Handling: - SELECT ... FOR UPDATE lock in transaction - Unique constraint as fallback (returns 409 if violated)


Retry Behavior

Client Retry (Mobile App)

  • Exponential Backoff: 2s, 4s, 8s, 16s, 32s
  • Max Retries: 5 attempts
  • Idempotency Key: Same UUID ensures no duplicates

Server Retry (Queue Jobs)

  • ValidateSubmissionJob: 3 tries, 60s delay
  • ProcessSubmissionJob: 3 tries, 120s delay
  • Dead Letter Queue: Failed jobs after max retries
class ValidateSubmissionJob implements ShouldQueue
{
    public $tries = 3;
    public $backoff = 60; // seconds

    public function failed(Throwable $exception)
    {
        MetricSubmission::find($this->submissionId)->update([
            'state' => 'FAILED',
            'error_message' => $exception->getMessage(),
        ]);

        // Notify admin of failure
        Notification::send(Admin::all(), new SubmissionProcessingFailed($this->submissionId));
    }
}

Error Recovery

Scenario 1: Network Timeout During Upload

  • Client: Retry with same UUID
  • Server: Idempotency check returns existing submission if already created
  • Outcome: No duplicate

Scenario 2: Validation Job Fails

  • Server: Retry job 3 times
  • After Max Retries: Move to FAILED state, notify admin
  • Manual Recovery: Admin can trigger re-validation

Scenario 3: Database Deadlock

  • Server: Transaction rolled back, return 503 Service Unavailable
  • Client: Retry after delay
  • Outcome: Succeeds on retry

Laravel Queue Configuration

// config/queue.php
'connections' => [
    'redis' => [
        'driver' => 'redis',
        'connection' => 'default',
        'queue' => env('REDIS_QUEUE', 'default'),
        'retry_after' => 180,
        'block_for' => null,
    ],
],

'failed' => [
    'driver' => 'database-uuids',
    'database' => env('DB_CONNECTION', 'pgsql'),
    'table' => 'failed_jobs',
],

Queue Names: - validations: High priority, validation jobs - processing: Medium priority, data processing - reporting: Low priority, report generation


Acceptance Criteria

  • Same UUID never creates duplicate submissions
  • Idempotency works across retries and concurrent requests
  • Failed validation jobs retried up to 3 times
  • Dead letter queue captures permanently failed jobs
  • Admin dashboard shows failed jobs with error details

Cross-References


Change Log

Version Date Author Changes
1.0 2026-01-03 Senior Product Architect Initial ingestion pipeline specification