# Analytics Platform
A self-hosted, privacy-first, consent-free analytics platform for web applications.

[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)

## Why This Exists
Modern analytics platforms have problems:
- **Third-party tracking** sends your data to external services
- **Cookie consent banners** destroy user experience
- **Vendor lock-in** makes migration painful
- **Privacy concerns** create GDPR compliance headaches
This platform solves all of these:
- **Self-hosted** - Your data stays on your servers
- **Consent-free** - No cookies, localStorage, or sessionStorage (see [Legal Basis](#legal-basis))
- **Privacy-first** - IPs hashed, sessions ephemeral, Do Not Track (DNT) respected
- **Open source** - No lock-in, full control
## Features
### Core Analytics
- Page views with automatic device/browser detection
- Click and scroll tracking
- Session-based metrics (SPA lifecycle)
- UTM attribution (first-touch)
- Custom event tracking
### Advanced Features
- Funnel analysis
- Cohort analysis
- Real-time dashboard (WebSocket)
- A/B test integration
- GDPR compliance (export, deletion)
### Developer Experience
- TypeScript-first
- React hooks and context
- NestJS decorators and interceptors
- Server-side tracking for SSR/Node.js
## Architecture
```
┌─────────────────────────────────────────────────────────────────────┐
│                         Client Applications                         │
│          ┌─────────────┐  ┌─────────────┐  ┌─────────────┐          │
│          │    React    │  │   Next.js   │  │   Node.js   │          │
│          │     SPA     │  │   SSR/SSG   │  │   Backend   │          │
│          └──────┬──────┘  └──────┬──────┘  └──────┬──────┘          │
│                 │                │                │                 │
│                 └────────────────┴────────────────┘                 │
│                                  │                                  │
│                        @analytics/client SDK                        │
└──────────────────────────────────┼──────────────────────────────────┘
                                   │ HTTP POST
┌─────────────────────────────────────────────────────────────────────┐
│                          Analytics Backend                          │
│                                                                     │
│        ┌─────────────┐    ┌─────────────┐    ┌─────────────┐        │
│        │  Collector  │───▶│  Processor  │───▶│     API     │        │
│        │    :4001    │    │    :4002    │    │    :4003    │        │
│        │             │    │  (BullMQ)   │    │             │        │
│        └──────┬──────┘    └──────┬──────┘    └──────┬──────┘        │
│               │                  │                  │               │
│               │                  │                  │               │
│               ▼                  ▼                  ▼               │
│        ┌───────────────────────────────────────────────────┐        │
│        │                TimescaleDB + Redis                │        │
│        └───────────────────────────────────────────────────┘        │
│                                                                     │
│        ┌─────────────┐                                              │
│        │  Realtime   │  WebSocket live metrics                      │
│        │    :4004    │                                              │
│        └─────────────┘                                              │
└─────────────────────────────────────────────────────────────────────┘
```
## Quick Start
### 1. Start the Backend (Docker)
```bash
# Clone the repository
git clone https://github.com/your-org/analytics.git
cd analytics
# Start all services
docker compose up -d
# Services available at:
# - Collector: http://localhost:4001
# - API: http://localhost:4003
# - Realtime: ws://localhost:4004
```
### 2. Install the Client SDK
```bash
npm install @analytics/client
# or
pnpm add @analytics/client
# or
bun add @analytics/client
```
### 3. Basic Usage (Browser)
```typescript
import { AnalyticsClient } from '@analytics/client';

const analytics = new AnalyticsClient({
  apiBaseUrl: 'https://analytics.yoursite.com',
  appName: 'my-app',
});

// Track page view
analytics.trackView({
  pageUrl: window.location.href,
  pageTitle: document.title,
});

// Track custom event
analytics.trackInteraction({
  type: 'click',
  data: {
    elementId: 'signup-button',
    pageUrl: window.location.href,
  },
});
```
### 4. React Integration
```tsx
import { AnalyticsProvider, useAnalytics } from '@analytics/client/react';

// Wrap your app
function App() {
  return (
    <AnalyticsProvider
      config={{
        apiBaseUrl: 'https://analytics.yoursite.com',
        appName: 'my-app',
        scrollTracking: { enabled: true },
      }}
    >
      <YourApp />
    </AnalyticsProvider>
  );
}

// Use in components
function SignupButton() {
  const { trackInteraction } = useAnalytics();

  return (
    <button
      onClick={() => {
        trackInteraction({
          type: 'click',
          data: { elementId: 'signup-button', pageUrl: location.href },
        });
      }}
    >
      Sign Up
    </button>
  );
}
```
## Legal Basis
### Why No Consent Banner Is Needed
This platform uses **in-memory session tracking only**. ePrivacy Directive Article 5(3) requires consent for:

> "the storing of information, or the gaining of access to information already stored, in the **terminal equipment** of a subscriber or user"

The project's reading is that JavaScript variables held in browser memory are not stored on terminal equipment in this sense: they exist only in the browser process and are destroyed when the page unloads or the tab closes.

**What this means:**
- ✅ Session IDs generated fresh on each page load (in memory)
- ✅ Attribution data held in module-level variables
- ✅ No localStorage, sessionStorage, or cookies
- ✅ No fingerprinting or persistent identifiers
- ❌ No cross-session tracking (by design)
- ❌ No "returning visitor" metrics
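
The points above can be illustrated with a minimal sketch of module-level session state. This is not the SDK's actual source: `getSession`, `parseUtm`, and the injected `newId` generator are hypothetical names chosen for illustration, and the sketch assumes first-touch attribution means "captured once per page load and never overwritten".

```typescript
// Sketch of session state that lives only in module-level variables.
// Nothing here touches cookies, localStorage, or sessionStorage.

type Attribution = { [utmKey: string]: string };

interface SessionState {
  sessionId: string;
  attribution: Attribution; // first-touch: captured once, never overwritten
}

// Module-level state: created after page load, gone when the page unloads.
let state: SessionState | null = null;

// Extracts utm_* parameters from a query string such as '?utm_source=x'.
function parseUtm(search: string): Attribution {
  const result: Attribution = {};
  const query = search.replace(/^\?/, '');
  if (query === '') return result;
  for (const pair of query.split('&')) {
    const eq = pair.indexOf('=');
    const key = decodeURIComponent(eq === -1 ? pair : pair.slice(0, eq));
    const value = eq === -1 ? '' : pair.slice(eq + 1);
    if (key.indexOf('utm_') === 0) {
      result[key] = decodeURIComponent(value.replace(/\+/g, ' '));
    }
  }
  return result;
}

// Returns the current session, creating it on first use.
// `newId` is injected (hypothetical parameter) so the sketch runs anywhere.
function getSession(search: string, newId: () => string): SessionState {
  if (state === null) {
    state = { sessionId: newId(), attribution: parseUtm(search) };
  }
  return state;
}

// The second call reuses the existing session and keeps the first-touch data.
const first = getSession('?utm_source=newsletter&utm_medium=email', () => 'session-1');
const later = getSession('?utm_source=ads', () => 'session-2');
```

In a browser, the call might look like `getSession(window.location.search, () => crypto.randomUUID())`. A hard navigation reloads the module, which is why a fresh session ID appears on every page load.
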
### Metrics Available
| Metric | Status | Notes |
|--------|--------|-------|
| Page views | ✅ Full | Per-visit, with device/browser |
| Traffic sources | ✅ Full | UTM params, referrer |
| Device/browser/OS | ✅ Full | User agent parsing |
| Geography | ✅ Full | IP geolocation (hashed storage) |
| Scroll depth | ✅ Full | Percentage thresholds |
| Click tracking | ✅ Full | Element-level |
| Session duration | ⚠️ SPA-only | Resets on hard navigation |
| Funnel conversion | ⚠️ SPA-only | Single-session funnels |
| Bounce rate | ⚠️ SPA-only | Approximated |
| New vs returning | ❌ None | No persistent identifiers |
| Cross-visit attribution | ❌ None | Privacy by design |
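
To make the "single-session funnels" row concrete, here is a rough sketch of how conversion could be counted when every funnel step must occur inside one session. The `TrackedEvent` shape, the event names, and `funnelCounts` are assumptions for illustration only; the query API's real implementation is not shown in this README.

```typescript
interface TrackedEvent {
  sessionId: string;
  name: string;      // hypothetical event name, e.g. 'view_pricing'
  timestamp: number; // milliseconds since epoch
}

// For each funnel step, counts the sessions that reached it after
// completing every earlier step in order, all within that one session.
function funnelCounts(events: TrackedEvent[], steps: string[]): number[] {
  // Group events by session; no identifier links one session to another.
  const bySession: { [sessionId: string]: TrackedEvent[] } = {};
  for (const event of events) {
    (bySession[event.sessionId] = bySession[event.sessionId] || []).push(event);
  }

  const counts = steps.map(() => 0);
  for (const sessionId of Object.keys(bySession)) {
    const ordered = bySession[sessionId]
      .slice()
      .sort((a, b) => a.timestamp - b.timestamp);
    let nextStep = 0; // index of the step this session must hit next
    for (const event of ordered) {
      if (nextStep < steps.length && event.name === steps[nextStep]) {
        counts[nextStep] += 1;
        nextStep += 1;
      }
    }
  }
  return counts;
}

const events: TrackedEvent[] = [
  { sessionId: 'a', name: 'view_pricing', timestamp: 1 },
  { sessionId: 'a', name: 'start_signup', timestamp: 2 },
  { sessionId: 'a', name: 'complete_signup', timestamp: 3 },
  { sessionId: 'b', name: 'view_pricing', timestamp: 1 },
  { sessionId: 'b', name: 'complete_signup', timestamp: 2 }, // skipped a step
  { sessionId: 'c', name: 'start_signup', timestamp: 1 },    // never entered the funnel
];

const counts = funnelCounts(events, ['view_pricing', 'start_signup', 'complete_signup']);
// counts is [2, 1, 1]
```

A visitor who views pricing, reloads the page, and then signs up would appear as two sessions here, so the second session never enters the funnel. That is the trade-off behind the SPA-only status in the table.
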
### GDPR Compliance
The platform includes built-in GDPR features:
- **Data export** (Article 15) - Export all data for a user ID
- **Data deletion** (Article 17) - Purge all user data
- **Retention policies** - Automatic data expiration
- **Audit logging** - Track all data access
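
As a rough illustration of the retention bullet, the sketch below splits records into those inside a retention window and those past it. The `StoredEvent` shape, the `recordedAt` field, and the 90-day window are assumptions; the platform may well enforce expiration at the database level instead.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical record shape, for illustration only.
interface StoredEvent {
  id: string;
  recordedAt: number; // milliseconds since epoch
}

// Splits records into those inside the retention window and those past it.
function applyRetention(
  records: StoredEvent[],
  now: number,
  retentionDays: number,
): { kept: StoredEvent[]; expired: StoredEvent[] } {
  const cutoff = now - retentionDays * DAY_MS;
  return {
    kept: records.filter((r) => r.recordedAt >= cutoff),
    expired: records.filter((r) => r.recordedAt < cutoff),
  };
}

// With a 90-day window evaluated on 30 June 2024, the cutoff is 1 April 2024.
const retention = applyRetention(
  [
    { id: 'old', recordedAt: Date.UTC(2024, 0, 1) },    // 1 January 2024
    { id: 'recent', recordedAt: Date.UTC(2024, 5, 1) }, // 1 June 2024
  ],
  Date.UTC(2024, 5, 30),
  90,
);
```
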
## Documentation
- [Client SDK Reference](./docs/client-sdk.md)
- [React Integration](./docs/react-integration.md)
- [NestJS Integration](./docs/nestjs-integration.md)
- [Backend API Reference](./docs/api-reference.md)
- [Deployment Guide](./docs/deployment.md)
- [Examples](./examples/)
## Examples
See the [`examples/`](./examples/) directory for complete implementations:
- [`examples/react-spa/`](./examples/react-spa/) - Single-page React app
- [`examples/nextjs/`](./examples/nextjs/) - Next.js with SSR
- [`examples/nestjs-backend/`](./examples/nestjs-backend/) - Server-side tracking
- [`examples/funnel-tracking/`](./examples/funnel-tracking/) - Multi-step conversion funnel
- [`examples/ecommerce/`](./examples/ecommerce/) - Product/cart analytics patterns
## Development
### Prerequisites
- Node.js 20+
- Docker & Docker Compose
- pnpm or bun
### Setup
```bash
# Install dependencies
pnpm install
# Start infrastructure (TimescaleDB, Redis)
docker compose up -d postgres redis
# Run services in development
pnpm dev:collector
pnpm dev:processor
pnpm dev:api
pnpm dev:realtime
# Build all packages
pnpm build
```
### Project Structure
```
@analytics/
├── packages/           # Shared libraries
│   └── analytics/      # Core types and utilities
├── services/           # Backend microservices
│   ├── collector/      # Event ingestion (POST /track/*)
│   ├── processor/      # Aggregation workers (BullMQ)
│   ├── api/            # Query API (trends, funnels, cohorts)
│   └── realtime/       # WebSocket gateway
├── examples/           # Integration examples
├── docs/               # Documentation
└── infrastructure/     # Docker configs
```
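
When working on the collector locally, it can help to post a single event by hand. The sketch below only builds the request and does not send it. The `/track/view` path and the payload field names are assumptions extrapolated from the `POST /track/*` route and the SDK's `trackView` options, so check them against the Backend API Reference before relying on them.

```typescript
// Hypothetical payload shape; field names mirror the SDK's trackView options.
interface ViewEvent {
  appName: string;
  pageUrl: string;
  pageTitle: string;
}

interface TrackRequest {
  url: string;
  method: 'POST';
  headers: { 'Content-Type': string };
  body: string;
}

// Builds (but does not send) a request for the Collector.
// ASSUMPTION: '/track/view' is an illustrative path under POST /track/*.
function buildViewRequest(collectorBaseUrl: string, event: ViewEvent): TrackRequest {
  const base = collectorBaseUrl.replace(/\/+$/, ''); // drop trailing slashes
  return {
    url: base + '/track/view',
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(event),
  };
}

const request = buildViewRequest('http://localhost:4001/', {
  appName: 'my-app',
  pageUrl: 'https://example.com/pricing',
  pageTitle: 'Pricing',
});
```

Passing `request.url` together with the `method`, `headers`, and `body` fields to `fetch` would send the event to a collector running on port 4001.
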
## Contributing
Contributions are welcome! Please read [CONTRIBUTING.md](CONTRIBUTING.md) first.
## License
MIT License - see [LICENSE](LICENSE) for details.

---

**Note**: The client SDK (`@analytics/client`) is published separately. See [analytics-client](https://github.com/your-org/analytics-client) for the browser/Node.js SDK source.