PLANNING · DIAGRAMS

Visual planning surface

Every diagram here is the single source of truth for one decision. When the diagram changes, the entity doc changes, and a changelog entry follows. This page is the GUI for project planning — schema, workflows, and operations live next to the prose that explains them.

Heads-up: these diagrams reflect the v0.2 entity model. Not all of v0.2 should survive to v0.3 — see the prior-art critique for which locks should be reopened (message store, carrier abstraction, multi-tenancy mechanism, JSONB validation, GDPR deletion, entity resolution).

Identity Graph

The centerpiece. Identities, phones, emails, and properties are nodes joined through M:N tables that carry provenance, confidence, and source. Re-imports collide on hashes — duplicates do not accumulate.

erDiagram
    IDENTITIES ||--o{ IDENTITY_PHONES : has
    IDENTITIES ||--o{ IDENTITY_EMAILS : has
    PHONE_NUMBERS ||--o{ IDENTITY_PHONES : "linked to"
    EMAIL_ADDRESSES ||--o{ IDENTITY_EMAILS : "linked to"
    PROPERTIES ||--o{ PROPERTY_OWNERS : "owned via"
    IDENTITIES ||--o{ PROPERTY_OWNERS : owns
    IDENTITIES {
      bigserial id PK
      text identity_hash UK
      text first_name
      text last_name
      text entity_type
      bool is_corporate
      jsonb metadata
    }
    PHONE_NUMBERS {
      bigserial id PK
      text phone_e164 UK
      text line_type
      text carrier_name
      bool is_litigator
      numeric validation_confidence
      jsonb raw_validation_data
    }
    EMAIL_ADDRESSES {
      bigserial id PK
      text email_address UK
      text email_hash
      bool is_disposable
      timestamptz validated_at
    }
    IDENTITY_PHONES {
      bigint identity_id PK,FK
      bigint phone_id PK,FK
      text relationship_type
      numeric confidence_score
      text source
      bool is_active
      timestamptz verified_at
    }
    IDENTITY_EMAILS {
      bigint identity_id PK,FK
      bigint email_id PK,FK
      text relationship_type
      numeric confidence_score
      text source
    }
    PROPERTIES {
      bigserial id PK
      text property_hash UK
      text full_address
      numeric lat
      numeric lng
      int year_built
      bool is_owner_occupied
    }
    PROPERTY_OWNERS {
      bigint property_id PK,FK
      bigint identity_id PK,FK
      text role
      timestamptz started_at
      timestamptz ended_at
    }

Why M:N with provenance? Real contact data has two phones that belong to the same person, two people in the same household, the same phone reassigned across owners, and the same address with three identities on title. Every join row remembers which file, integration, or operator put it there. Reconciliation queries become possible instead of catastrophic.

Leads as a Junction

A lead is not a row — it's an intersection of person × property × vertical. The same identity can be a lead in three verticals without three duplicate contact rows. Campaigns enroll leads; templates send to leads; conversations key off the underlying phone.

erDiagram
    IDENTITIES ||--o{ LEADS : "is a"
    PROPERTIES ||--o{ LEADS : "located at"
    VERTICALS ||--o{ LEADS : "in"
    LEADS ||--o{ CAMPAIGN_ENROLLMENTS : "enrolled in"
    CAMPAIGNS ||--o{ CAMPAIGN_ENROLLMENTS : "has"
    VERTICALS {
      smallserial id PK
      text slug UK
      smallint same_vertical_cooldown_days
      smallint cross_vertical_cooldown_days
      bool respect_global_dnc
      bool allow_optin_override
    }
    LEADS {
      bigserial id PK
      bigint identity_id FK
      bigint property_id FK
      smallint vertical_id FK
      text source
      text status
      jsonb tags
      jsonb custom_fields
      text environment
    }
    CAMPAIGNS {
      bigserial id PK
      text name
      smallint vertical_id FK
      bigint twilio_account_id FK
      text status
      text environment
      int throttle_per_min
      int throttle_per_day
    }
    CAMPAIGN_ENROLLMENTS {
      bigserial id PK
      bigint campaign_id PK,FK
      bigint lead_id PK,FK
      text status
      timestamptz enrolled_at
    }

Locked decision: UNIQUE(identity_id, vertical_id, property_id) on leads. Cooldown policy lives on verticals, not in code, not in per-campaign config — change a vertical's defaults and every campaign in it inherits the new policy on next send.

Messaging Schema

The send queue gates on timezone and dedup. messages is partitioned monthly by created_at so the hot partition stays small and old partitions can be detached cheaply. Conversations are keyed by (twilio_account, from, to).

erDiagram
    MESSAGE_QUEUE ||--o| MESSAGES : "produces"
    CONVERSATIONS ||--o{ MESSAGES : "groups"
    TEMPLATE_VERSIONS ||--o{ MESSAGE_QUEUE : "renders"
    TWILIO_ACCOUNTS ||--o{ TWILIO_PHONE_NUMBERS : "owns"
    TWILIO_PHONE_NUMBERS ||--o{ MESSAGE_QUEUE : "sends from"
    INBOUND_MESSAGES }o--|| CONVERSATIONS : "lands in"
    MESSAGE_QUEUE {
      bigserial id PK
      bigint campaign_id FK
      bigint lead_id FK
      bigint phone_id FK
      bigint template_version_id FK
      bigint twilio_phone_id FK
      text customer_timezone
      bool send_window_ok
      timestamptz scheduled_at
      text status
      smallint attempts
      text blocked_reason
    }
    MESSAGES {
      bigserial id
      bigint message_queue_id
      bigint conversation_id
      text direction
      text from_phone_e164
      text to_phone_e164
      text body
      text twilio_message_sid
      text status
      int segments
      jsonb raw_webhook
      timestamptz created_at PK
    }
    CONVERSATIONS {
      bigserial id PK
      bigint twilio_account_id FK
      text from_phone_e164
      text to_phone_e164
      bigint lead_id FK
      timestamptz last_message_at
      bool is_opted_out
    }
    INBOUND_MESSAGES {
      bigserial id PK
      text twilio_message_sid UK
      bigint twilio_account_id FK
      text from_phone_e164
      text to_phone_e164
      bool is_opt_out
      bool processed
      jsonb raw_webhook
    }
    TWILIO_ACCOUNTS {
      bigserial id PK
      text slug UK
      text account_sid UK
      text auth_token_encrypted
      text environment
      bool is_internal
    }
    TWILIO_PHONE_NUMBERS {
      bigserial id PK
      bigint twilio_account_id FK
      bigint phone_number_id FK
      text twilio_sid UK
      int daily_limit
      int daily_usage
      bool is_soft_disabled
    }
    TEMPLATE_VERSIONS {
      bigserial id PK
      bigint template_id FK
      text body
      text sms_or_mms
      bigint created_by
      timestamptz retired_at
    }

Partitioning: messages PARTITION BY RANGE(created_at), monthly. Partitions named messages_YYYY_MM, pre-created via numbered migrations through 2027 and rolling. The PK is (id, created_at) because Postgres requires the partition key in the PK.

Ingestion Workflow

A vendor file lands in files; raw rows go into file_rows.payload jsonb verbatim. A parser extracts identity, phones, properties, and emails into the graph. Re-uploading the same file is a no-op (sha256 unique). Re-ingesting is supported.

flowchart TD
    A[Operator uploads CSV] --> B{sha256 seen before?}
    B -- yes --> Z[Return existing file row · no-op]
    B -- no --> C[Insert files row · status=parsing]
    C --> D[Stream rows · INSERT file_rows payload jsonb]
    D --> E[Pick parser by header signature]
    E --> F[Parser extracts: identity_hash · phones · emails · property_hash]
    F --> G[UPSERT identities · phone_numbers · email_addresses · properties]
    G --> H[INSERT identity_phones · identity_emails · property_owners with source=file_id]
    H --> I[INSERT leads on identity × vertical × property]
    I --> J[ingestion_runs · rows_inserted · rows_skipped_dup · rows_failed]
    J --> K[files.status = ready]
    K --> L[Operator can now enroll leads into campaigns]

Anti-pattern avoided: the legacy property-centric ingest discarded contact-centric exports and welded denormalized phone columns onto the lead row. Here the parser is metadata + code (registry-driven), the raw row is preserved, and the same downstream graph absorbs every vendor format.

Send Pipeline

Workers claim ready rows with FOR UPDATE SKIP LOCKED. Every send is checked against the dedup service before Twilio is called. Multi-account routing picks the right twilio_phone_numbers row using area-code preference and daily-usage headroom.

sequenceDiagram
    autonumber
    participant Sched as Scheduler
    participant Q as message_queue
    participant W as Worker (FOR UPDATE SKIP LOCKED)
    participant DD as dedup-service
    participant TW as Twilio API
    participant DB as messages (partitioned)

    Sched->>Q: INSERT status=pending · scheduled_at
    Note over Q: Timezone gate → status=ready when window OK
    W->>Q: SELECT ... FOR UPDATE SKIP LOCKED LIMIT N
    W->>W: Pick twilio_phone_id (area-code preference + daily-usage)
    W->>DD: POST /api/check {from, to, campaign_id, vertical_id}
    alt BLOCK
        DD-->>W: BLOCK · reason · layer
        W->>Q: status=blocked · blocked_reason
    else ALLOW
        DD-->>W: ALLOW
        W->>TW: messages.create
        TW-->>W: SID · status
        W->>DB: INSERT messages (direction=outbound)
        W->>Q: status=sent · attempts++
        W->>DD: POST /api/record {from, to, sent_at}
    end

Sandbox parity: the worker calls a SMSSender interface — real TwilioClient or SandboxClient, picked by twilio_accounts.environment. Sandbox returns plausible SIDs, randomized failure codes, scripted responses. No "if test then return early" branches in the hot path.

Dedup Cascade

Three layers, cheapest first. Hot box prevents racing workers. FROM-TO cache encodes the FOREVER rule (this number has texted this number before). Postgres is authoritative. Optional fourth layer: 72-hour global cooldown on TO.

flowchart TD
    A[Worker requests /api/check] --> B{Hot box · Redis · TTL 5m}
    B -- hit --> X1[BLOCK · HOTBOX]
    B -- miss --> C{FROM→TO cache · Redis · TTL 1h}
    C -- hit --> X2[BLOCK · FOREVER_RULE]
    C -- miss --> D{global_from_to_history · Postgres}
    D -- found --> E[Back-cache FROM→TO · BLOCK · FOREVER_RULE]
    D -- not found --> F{Global cooldown enabled?}
    F -- no --> G[ALLOW]
    F -- yes --> H{TO sent within 72h by anyone?}
    H -- yes --> X3[BLOCK · GLOBAL_COOLDOWN]
    H -- no --> G
    G --> I[Worker calls Twilio · record on success]
    I --> J[INSERT or UPDATE global_from_to_history · increment contact_count]
    J --> K[Set hot box · set FROM→TO cache]

The FOREVER rule makes multi-account Twilio safe. A number that leaves one account and reappears on another cannot accidentally re-text someone — the cache is keyed by phone string, not Twilio SID. source column on global_from_to_history lets legacy systems and esd.quest share the same table during migration.

Inbound + Opt-out Cascade

Twilio webhook → idempotent insert into inbound_messages (unique on SID) → classify → append consent_events if STOP/START/HELP matched → derived DNC state updates on next read. Nothing flips a boolean; everything is an event.

sequenceDiagram
    autonumber
    participant TW as Twilio
    participant API as inbound-api /twilio/sms/inbound
    participant IM as inbound_messages (UNIQUE sid)
    participant CL as Classifier
    participant CE as consent_events (append-only)
    participant CV as conversations
    participant Q as outbound message_queue

    TW->>API: POST · X-Twilio-Signature
    API->>API: Verify signature · 401 if bad
    API->>IM: INSERT ... ON CONFLICT (twilio_message_sid) DO NOTHING
    alt new row inserted
        API->>CL: classify(body)
        CL-->>API: {is_opt_out, is_opt_in, classification, confidence}
        alt confidence ≥ threshold
            API->>CE: INSERT consent_events · trigger_matched=STOP/START
        else low confidence
            API->>CE: INSERT consent_events · scope=review · routed to optin_review_queue
        end
        API->>CV: UPSERT conversations · last_inbound_at · is_opted_out (derived)
        API->>Q: Cancel pending queue rows for this (account, from→to) if opt-out
    end
    API-->>TW: 200 empty TwiML

Idempotency is the whole game. Twilio retries webhooks. The unique constraint on twilio_message_sid turns reprocessing into a no-op. Classifier output is recorded as an event with its confidence, so a future review can override a low-confidence STOP without rewriting history.

Multiplayer Flow

Operators see each other on the live grid. Presence is Redis-only (not durable). Activity is written through Postgres and tailed by the presence service; ephemeral message-drops live in Redis with a flush job for any that need to persist.

flowchart LR
    subgraph Browsers
      U1[Operator A · WebSocket]
      U2[Operator B · WebSocket]
      U3[Operator C · WebSocket]
    end
    subgraph Server
      WS[presence-server · WebSocket]
      R0[(Redis db 0 · presence + drops)]
      PG[(Postgres · useract firehose)]
      AT[useract tailer]
    end
    U1 <--> WS
    U2 <--> WS
    U3 <--> WS
    WS <--> R0
    AT --> PG
    AT --> WS
    WS -. "broadcast · fan-out by room" .-> U1
    WS -. "broadcast · fan-out by room" .-> U2
    WS -. "broadcast · fan-out by room" .-> U3

Why Redis-only presence: if the box reboots and "who's online" goes blank for 5 seconds, no one cares. If useract writes fail, that's a real bug. Two stores, two failure modes, both right-sized.

Secrets Envelope

Operator pastes an API key in the UI. The plaintext never lands on disk and never appears in a log. A short-lived DEK encrypts the secret; a long-lived KEK (held by a KMS or, in early dev, a filesystem-protected master key) encrypts the DEK. Decryption is at-use only.

sequenceDiagram
    autonumber
    participant UI as Browser
    participant API as integrations API
    participant KMS as KMS (or master key)
    participant DB as integration_secrets

    UI->>API: POST /integrations/{id}/secrets {name, plaintext}
    API->>API: Generate DEK (random 256-bit)
    API->>API: ciphertext = AES-GCM(DEK, plaintext)
    API->>KMS: encrypt(DEK) → wrapped_dek
    API->>DB: INSERT {integration_id, name, ciphertext, wrapped_dek, dek_id}
    API->>API: zeroize plaintext, DEK in memory
    API-->>UI: 201 (no secret in response)

    Note over API,DB: ── at use ──
    API->>DB: SELECT ciphertext, wrapped_dek, dek_id
    API->>KMS: decrypt(wrapped_dek) → DEK
    API->>API: plaintext = AES-GCM⁻¹(DEK, ciphertext)
    API->>API: USE plaintext (one call), then zeroize

Public-repo safe: nothing in the codebase reveals a secret, even if the repo goes public. The master key lives outside the repo. Rotation is per-DEK (re-encrypt one secret) or per-KEK (re-wrap all DEKs) — both supported by the schema.

Deployment Topology

One binary, one Postgres, one Redis. Sidecars bind loopback only; Apache (or Caddy) reverse-proxies under a single TLS vhost. Same image deploys to vendor demo and to client tenants with config-only changes.

flowchart TB
    subgraph Internet
      Client[Operator Browser]
    end
    subgraph VPS
      subgraph TLS[Reverse Proxy · TLS · single vhost]
        RP[Apache / Caddy]
      end
      subgraph App[Loopback-only sidecars]
        API[esd-quest api]
        W[send worker]
        DD[dedup service]
        PR[presence service]
      end
      subgraph Store
        PG[(Postgres · partitioned)]
        RD[(Redis · presence + dedup cache)]
        FS[(Object store · file_rows overflow · optional)]
      end
    end
    subgraph External
      TW[Twilio]
      KMS[KMS · secret KEK]
    end
    Client -- HTTPS --> RP
    RP -- HTTP --> API
    RP -- WS --> PR
    API <--> PG
    W <--> PG
    DD <--> RD
    DD <--> PG
    PR <--> RD
    PR --> PG
    API <--> KMS
    W --> TW
    TW -- webhook --> RP

White-label promise: the only per-tenant inputs are the vhost name, the database name, and the secret keys. Skin tokens (logo, palette, copy) live in the DB. New tenant ≈ one Apache vhost + one createdb + one secret import.