# Deployment Guide

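This guide covers deploying the analytics platform to production with Docker Compose or Kubernetes, along with database setup, Nginx configuration, monitoring, and scaling.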

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────────┐
│                        Load Balancer                            │
└─────────────────────────────────────────────────────────────────┘
                              │
        ┌─────────────────────┼─────────────────────┐
        │                     │                     │
        ▼                     ▼                     ▼
┌───────────────┐   ┌───────────────┐   ┌───────────────┐
│   Collector   │   │   Collector   │   │   Collector   │
│   Service     │   │   Service     │   │   Service     │
└───────────────┘   └───────────────┘   └───────────────┘
        │                     │                     │
        └─────────────────────┼─────────────────────┘
                              │
                              ▼
                    ┌───────────────┐
                    │     Redis     │
                    │   (BullMQ)    │
                    └───────────────┘
                              │
        ┌─────────────────────┼─────────────────────┐
        │                     │                     │
        ▼                     ▼                     ▼
┌───────────────┐   ┌───────────────┐   ┌───────────────┐
│   Processor   │   │   Processor   │   │   Processor   │
│   Worker      │   │   Worker      │   │   Worker      │
└───────────────┘   └───────────────┘   └───────────────┘
        │                     │                     │
        └─────────────────────┼─────────────────────┘
                              │
                              ▼
                    ┌───────────────┐
                    │  PostgreSQL   │
                    │  (TimescaleDB)│
                    └───────────────┘
                              │
                              ▼
                    ┌───────────────┐
                    │  API Service  │
                    └───────────────┘
```

## Services

| Service | Port | Description |
|---------|------|-------------|
| Collector | 4001 | Event ingestion |
| Processor | - | Queue worker (no HTTP) |
| API | 4002 | Query endpoints |
| Realtime | 4003 | WebSocket server |

## Docker Deployment

### docker-compose.yml

```yaml
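# The `deploy` sections below assume Docker Swarm (`docker stack deploy`).
# Under plain `docker compose up`, replicas > 1 conflict on the fixed host ports.
version: '3.8'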

services:
  collector:
    image: analytics/collector:latest
    ports:
      - "4001:4001"
    environment:
      - NODE_ENV=production
      - REDIS_URL=redis://redis:6379
      - LOG_LEVEL=info
    depends_on:
      - redis
    deploy:
      replicas: 3
      resources:
        limits:
          memory: 512M
          cpus: '0.5'

  processor:
    image: analytics/processor:latest
    environment:
      - NODE_ENV=production
      - REDIS_URL=redis://redis:6379
      - DATABASE_URL=postgresql://postgres:password@postgres:5432/analytics
      - CONCURRENCY=10
    depends_on:
      - redis
      - postgres
    deploy:
      replicas: 2
      resources:
        limits:
          memory: 1G
          cpus: '1'

  api:
    image: analytics/api:latest
    ports:
      - "4002:4002"
    environment:
      - NODE_ENV=production
      - DATABASE_URL=postgresql://postgres:password@postgres:5432/analytics
      - REDIS_URL=redis://redis:6379
    depends_on:
      - postgres
      - redis
    deploy:
      replicas: 2
      resources:
        limits:
          memory: 512M
          cpus: '0.5'

  realtime:
    image: analytics/realtime:latest
    ports:
      - "4003:4003"
    environment:
      - NODE_ENV=production
      - REDIS_URL=redis://redis:6379
    depends_on:
      - redis
    deploy:
      replicas: 2

  redis:
    image: redis:7-alpine
    volumes:
      - redis_data:/data
    command: redis-server --appendonly yes

  postgres:
    image: timescale/timescaledb:latest-pg15
    environment:
      - POSTGRES_DB=analytics
      - POSTGRES_USER=postgres
      # Placeholder credential: change it (and the matching DATABASE_URL values) before deploying
      - POSTGRES_PASSWORD=password
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  redis_data:
  postgres_data:
```

## Kubernetes Deployment

### Collector Deployment

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: analytics-collector
spec:
  replicas: 3
  selector:
    matchLabels:
      app: analytics-collector
  template:
    metadata:
      labels:
        app: analytics-collector
    spec:
      containers:
        - name: collector
          image: analytics/collector:latest
          ports:
            - containerPort: 4001
          env:
            - name: NODE_ENV
              value: production
            - name: REDIS_URL
              valueFrom:
                secretKeyRef:
                  name: analytics-secrets
                  key: redis-url
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          readinessProbe:
            httpGet:
              path: /health
              port: 4001
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /health
              port: 4001
            initialDelaySeconds: 15
            periodSeconds: 20
---
apiVersion: v1
kind: Service
metadata:
  name: analytics-collector
spec:
  selector:
    app: analytics-collector
  ports:
    - port: 4001
      targetPort: 4001
  type: ClusterIP
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: analytics-collector-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: analytics-collector
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
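
To make the autoscaler settings concrete, the sketch below applies the scaling rule from the Kubernetes HPA documentation, `desired = ceil(currentReplicas * currentUtilization / targetUtilization)`, to the values in the manifest above. The names `HpaSpec` and `desiredReplicas` are illustrative, and the sketch ignores stabilization windows and scaling policies, so treat it as an approximation of the controller's behavior rather than a reimplementation.

```typescript
// Simplified model of the HPA replica calculation (names are illustrative).
interface HpaSpec {
  minReplicas: number;
  maxReplicas: number;
  targetCpuUtilization: number; // percent of requested CPU
}

// Values copied from the analytics-collector-hpa manifest above.
const collectorHpa: HpaSpec = {
  minReplicas: 3,
  maxReplicas: 10,
  targetCpuUtilization: 70,
};

function desiredReplicas(
  spec: HpaSpec,
  currentReplicas: number,
  currentCpuUtilization: number,
  tolerance: number = 0.1 // Kubernetes default: skip scaling when the ratio is within 10% of 1.0
): number {
  const ratio = currentCpuUtilization / spec.targetCpuUtilization;
  const raw =
    Math.abs(ratio - 1) <= tolerance
      ? currentReplicas
      : Math.ceil(currentReplicas * ratio);
  // Clamp to the configured bounds.
  return Math.min(spec.maxReplicas, Math.max(spec.minReplicas, raw));
}
```

With three replicas averaging 140% CPU utilization, `desiredReplicas(collectorHpa, 3, 140)` yields six replicas; at 72% the ratio falls inside the tolerance band and the count stays at three.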

## Environment Variables

### Collector Service

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `NODE_ENV` | Yes | - | Environment (production/development) |
| `PORT` | No | 4001 | HTTP port |
| `REDIS_URL` | Yes | - | Redis connection URL |
| `LOG_LEVEL` | No | info | Logging level |
| `CORS_ORIGINS` | No | * | Allowed CORS origins |

### Processor Service

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `NODE_ENV` | Yes | - | Environment |
| `REDIS_URL` | Yes | - | Redis connection URL |
| `DATABASE_URL` | Yes | - | PostgreSQL connection URL |
| `CONCURRENCY` | No | 5 | Worker concurrency |
| `BATCH_SIZE` | No | 100 | Events per batch |

### API Service

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `NODE_ENV` | Yes | - | Environment |
| `PORT` | No | 4002 | HTTP port |
| `DATABASE_URL` | Yes | - | PostgreSQL connection URL |
| `REDIS_URL` | Yes | - | Redis for caching |
| `API_KEYS` | Yes | - | Comma-separated API keys |
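
Failing fast on missing variables at startup tends to be easier to debug than a crash on first use. The sketch below shows one way to validate the Collector variables from the table above. `loadCollectorConfig` and `CollectorConfig` are hypothetical names, not part of the platform's code, and the comma-separated format for `CORS_ORIGINS` is an assumption.

```typescript
// Hypothetical startup validation for the Collector's environment variables.
interface CollectorConfig {
  nodeEnv: string;
  port: number;
  redisUrl: string;
  logLevel: string;
  corsOrigins: string[];
}

type Env = Record<string, string | undefined>;

function loadCollectorConfig(env: Env): CollectorConfig {
  // Required variables per the table: NODE_ENV and REDIS_URL.
  const missing = ["NODE_ENV", "REDIS_URL"].filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error("Missing required environment variables: " + missing.join(", "));
  }

  // Optional variables fall back to the documented defaults.
  const port = Number(env["PORT"] ?? "4001");
  if (!(port >= 1 && port <= 65535) || port % 1 !== 0) {
    throw new Error("PORT must be an integer between 1 and 65535");
  }

  return {
    nodeEnv: env["NODE_ENV"] as string,
    port,
    redisUrl: env["REDIS_URL"] as string,
    logLevel: env["LOG_LEVEL"] ?? "info",
    // Assumption: multiple origins are comma-separated.
    corsOrigins: (env["CORS_ORIGINS"] ?? "*").split(",").map((o) => o.trim()),
  };
}
```

In a Node.js service this would typically be called once at boot as `loadCollectorConfig(process.env)`.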

## Database Setup

### PostgreSQL with TimescaleDB

```sql
-- Create database
CREATE DATABASE analytics;

-- Connect to it (psql meta-command); extensions and tables are per-database
\c analytics

-- Enable TimescaleDB
CREATE EXTENSION IF NOT EXISTS timescaledb;

-- Create tables
CREATE TABLE raw_events (
  id UUID NOT NULL DEFAULT gen_random_uuid(),
  session_id VARCHAR(64) NOT NULL,
  user_id VARCHAR(255),
  event_type VARCHAR(100) NOT NULL,
  event_action VARCHAR(255) NOT NULL,
  metadata JSONB DEFAULT '{}',
  timestamp TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  -- Hypertables require unique constraints to include the partitioning column
  PRIMARY KEY (id, timestamp)
);

-- Convert to hypertable for time-series optimization
SELECT create_hypertable('raw_events', 'timestamp');

-- Create indexes
CREATE INDEX idx_raw_events_session ON raw_events(session_id);
CREATE INDEX idx_raw_events_user ON raw_events(user_id);
CREATE INDEX idx_raw_events_type ON raw_events(event_type);
CREATE INDEX idx_raw_events_metadata ON raw_events USING GIN(metadata);

-- Aggregated tables
CREATE TABLE daily_metrics (
  date DATE NOT NULL,
  metric_name VARCHAR(100) NOT NULL,
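  -- Primary-key columns cannot be NULL; use '' for metrics without a dimension
  dimension_key VARCHAR(255) NOT NULL DEFAULT '',
  dimension_value VARCHAR(255) NOT NULL DEFAULT '',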
  value BIGINT NOT NULL DEFAULT 0,
  PRIMARY KEY (date, metric_name, dimension_key, dimension_value)
);

-- Retention policy: keep raw events for 90 days
SELECT add_retention_policy('raw_events', INTERVAL '90 days');
```
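
The composite primary key on `daily_metrics` makes it suitable for counter-style upserts. The sketch below builds one such statement. It assumes the Processor increments aggregates with `INSERT ... ON CONFLICT` and passes `{ text, values }` query objects of the kind node-postgres accepts; this guide does not describe how aggregates are actually written, and `buildDailyMetricUpsert` is a hypothetical helper.

```typescript
// Hypothetical helper: builds a parameterized upsert for one daily_metrics row.
interface DailyMetricIncrement {
  date: string; // ISO date, e.g. "2024-01-31"
  metricName: string;
  dimensionKey: string; // "" when the metric has no dimension
  dimensionValue: string;
  delta: number; // amount to add to the stored value
}

interface ParameterizedQuery {
  text: string;
  values: Array<string | number>;
}

function buildDailyMetricUpsert(m: DailyMetricIncrement): ParameterizedQuery {
  const text = [
    "INSERT INTO daily_metrics (date, metric_name, dimension_key, dimension_value, value)",
    "VALUES ($1, $2, $3, $4, $5)",
    // The conflict target matches the table's primary key.
    "ON CONFLICT (date, metric_name, dimension_key, dimension_value)",
    "DO UPDATE SET value = daily_metrics.value + EXCLUDED.value",
  ].join("\n");
  return {
    text,
    values: [m.date, m.metricName, m.dimensionKey, m.dimensionValue, m.delta],
  };
}
```

Keeping user-supplied values in `values` rather than interpolating them into `text` avoids SQL injection through metric or dimension names.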

## Nginx Configuration

```nginx
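# Cache zone referenced by `proxy_cache api_cache` below (http context).
# Path and sizes are example values; adjust them for your hosts.
proxy_cache_path /var/cache/nginx/api levels=1:2 keys_zone=api_cache:10m max_size=1g inactive=10m;

upstream collector {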
    least_conn;
    server collector-1:4001;
    server collector-2:4001;
    server collector-3:4001;
}

upstream api {
    server api-1:4002;
    server api-2:4002;
}

upstream realtime {
    ip_hash;  # Sticky sessions for WebSocket
    server realtime-1:4003;
    server realtime-2:4003;
}

server {
    listen 443 ssl http2;
    server_name analytics.example.com;

    ssl_certificate /etc/ssl/certs/analytics.crt;
    ssl_certificate_key /etc/ssl/private/analytics.key;

    # Collector - high throughput
    location /collect {
        proxy_pass http://collector;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        # Don't buffer - fast response
        proxy_buffering off;

        # Cap batch payloads at 1 MB (also the nginx default); raise this for larger batches
        client_max_body_size 1m;
    }

    # API - standard REST
    location /api {
        proxy_pass http://api;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;

        # Cache GET requests
        proxy_cache api_cache;
        proxy_cache_valid 200 1m;
        proxy_cache_key "$request_method$request_uri";
        add_header X-Cache-Status $upstream_cache_status;
    }

    # WebSocket - realtime
    location /realtime {
        proxy_pass http://realtime;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;

        # Long-lived connections
        proxy_read_timeout 86400s;
        proxy_send_timeout 86400s;
    }
}
```

## Monitoring

### Health Checks

Every service with an HTTP port exposes a `/health` endpoint that returns a JSON status document:

```json
{
  "status": "healthy",
  "version": "1.0.0",
  "uptime": 86400,
  "checks": {
    "redis": "ok",
    "database": "ok"
  }
}
```
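
A handler producing this payload could be sketched as follows. Only the `"healthy"` and `"ok"` values appear in this guide; the `"unhealthy"` and `"error"` values, and the name `buildHealthReport`, are assumptions made for illustration.

```typescript
// Illustrative builder for the /health response body shown above.
type CheckStatus = "ok" | "error"; // "error" is an assumed failure value

interface HealthReport {
  status: "healthy" | "unhealthy"; // "unhealthy" is an assumed failure value
  version: string;
  uptime: number; // seconds since the process started
  checks: Record<string, CheckStatus>;
}

function buildHealthReport(
  version: string,
  startedAtMs: number,
  nowMs: number,
  checks: Record<string, CheckStatus>
): HealthReport {
  // The service is healthy only if every dependency check passed.
  const allOk = Object.keys(checks).every((name) => checks[name] === "ok");
  return {
    status: allOk ? "healthy" : "unhealthy",
    version,
    uptime: Math.floor((nowMs - startedAtMs) / 1000),
    checks,
  };
}
```

For the Kubernetes probes to act on a failed dependency, the handler would also need to return a non-2xx status code when `status` is not `"healthy"`.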

### Metrics (Prometheus)

Services expose a `/metrics` endpoint in the Prometheus text format:

```
# Collector metrics
analytics_events_received_total{type="engagement"} 1234567
analytics_events_queued_total 1234500
analytics_batch_size_histogram_bucket{le="10"} 50000

# Processor metrics
analytics_events_processed_total 1234000
analytics_processing_duration_seconds_bucket{le="0.1"} 1200000
analytics_queue_depth 500

# API metrics
analytics_api_requests_total{endpoint="/trends",status="200"} 50000
analytics_api_latency_seconds_bucket{le="0.5"} 49000
```

### Grafana Dashboards

Import pre-built dashboards from `/dashboards/`:
- `collector-metrics.json` - Ingestion throughput
- `processor-metrics.json` - Processing performance
- `api-metrics.json` - Query latency and errors
- `business-metrics.json` - Analytics KPIs

## Scaling Guidelines

### Collector Service

- Scale horizontally based on incoming event rate
- Target: <100ms p99 response time
- Rule of thumb: 1 replica per 10,000 events/minute

### Processor Service

- Scale based on queue depth
- Target: Queue depth < 1000
- Increase `CONCURRENCY` before adding replicas
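
The ordering of these steps can be expressed as a small decision function. This is a sketch only: the concurrency ceiling of 20 and the doubling step are hypothetical values chosen for illustration, and `nextProcessorAction` is not part of the platform.

```typescript
// Sketch of the Processor scaling order: concurrency first, replicas second.
interface ProcessorState {
  queueDepth: number;
  concurrency: number; // current CONCURRENCY per worker
  replicas: number;
}

type ScalingAction =
  | { kind: "none" }
  | { kind: "raise-concurrency"; to: number }
  | { kind: "add-replica"; to: number };

const QUEUE_DEPTH_TARGET = 1000; // target from this guide
const MAX_CONCURRENCY = 20; // hypothetical ceiling; tune per workload

function nextProcessorAction(state: ProcessorState): ScalingAction {
  if (state.queueDepth < QUEUE_DEPTH_TARGET) {
    return { kind: "none" };
  }
  if (state.concurrency < MAX_CONCURRENCY) {
    // Prefer raising concurrency on existing workers.
    return {
      kind: "raise-concurrency",
      to: Math.min(MAX_CONCURRENCY, state.concurrency * 2),
    };
  }
  // Concurrency is exhausted, so add a worker.
  return { kind: "add-replica", to: state.replicas + 1 };
}
```

Raising concurrency first is cheaper because it reuses running workers, but each worker's database connections and memory limit bound how far it can go.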

### API Service

- Scale based on query latency
- Target: <500ms p95 for complex queries
- Add read replicas to PostgreSQL for heavy read load
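
To check a set of latency samples against the p95 target, a nearest-rank percentile is enough. The sketch below works on raw samples in milliseconds; in production the same figure would more likely be derived from the `analytics_api_latency_seconds_bucket` histogram. The function names are illustrative.

```typescript
// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) {
    throw new Error("percentile requires at least one sample");
  }
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

const API_P95_TARGET_MS = 500; // target from this guide

function meetsApiLatencyTarget(samplesMs: number[]): boolean {
  return percentile(samplesMs, 95) < API_P95_TARGET_MS;
}
```

A sustained `false` result is the signal to add API replicas or PostgreSQL read replicas, depending on where the time is spent.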

### Database

- Use TimescaleDB compression for historical data
- Partition by month for large deployments
- Consider ClickHouse for >1B events/day