Diffstat (limited to 'telemetry/README.md')
| -rw-r--r-- | telemetry/README.md | 351 |
1 files changed, 351 insertions, 0 deletions
diff --git a/telemetry/README.md b/telemetry/README.md
new file mode 100644
index 000000000..aee050943
--- /dev/null
+++ b/telemetry/README.md
@@ -0,0 +1,351 @@
+# SeaweedFS Telemetry System
+
+A privacy-respecting telemetry system for SeaweedFS that collects cluster-level usage statistics and provides visualization through Prometheus and Grafana.
+
+## Features
+
+- **Privacy-First Design**: Uses in-memory cluster IDs (regenerated on restart), no personal data collection
+- **Prometheus Integration**: Native Prometheus metrics for monitoring and alerting
+- **Grafana Dashboards**: Pre-built dashboards for data visualization
+- **Protocol Buffers**: Efficient binary data transmission for optimal performance
+- **Opt-in Only**: Disabled by default, requires explicit configuration
+- **Docker Compose**: Complete monitoring stack deployment
+- **Automatic Cleanup**: Configurable data retention policies
+
+## Architecture
+
+```
+SeaweedFS Cluster → Telemetry Client → Telemetry Server → Prometheus → Grafana
+                       (protobuf)        (metrics)      (queries)
+```
+
+## Data Transmission
+
+The telemetry system uses **Protocol Buffers exclusively** for efficient binary data transmission:
+
+- **Compact Format**: 30-50% smaller than JSON
+- **Fast Serialization**: Better performance than text-based formats
+- **Type Safety**: Strong typing with generated Go structs
+- **Schema Evolution**: Built-in versioning support
+
+### Protobuf Schema
+
+```protobuf
+message TelemetryData {
+  string cluster_id = 1;             // In-memory generated UUID
+  string version = 2;                // SeaweedFS version
+  string os = 3;                     // Operating system
+  repeated string features = 4;      // Enabled features
+  string deployment = 5;             // Deployment type
+  int32 volume_server_count = 6;     // Number of volume servers
+  uint64 total_disk_bytes = 7;       // Total disk usage
+  int32 total_volume_count = 8;      // Total volume count
+  int64 timestamp = 9;               // Collection timestamp
+}
+```
+
+## Privacy Approach
+
+- **No Personal Data**: No hostnames, IP addresses, or user information
+- **In-Memory IDs**: Cluster IDs are generated in-memory and change on restart
+- **Aggregated Data**: Only cluster-level statistics, no individual file/user data
+- **Opt-in Only**: Telemetry is disabled by default
+- **Transparent**: Open source implementation, clear data collection policy
+
+## Collected Data
+
+| Field | Description | Example |
+|-------|-------------|---------|
+| `cluster_id` | In-memory UUID (changes on restart) | `a1b2c3d4-...` |
+| `version` | SeaweedFS version | `3.45` |
+| `os` | Operating system and architecture | `linux/amd64` |
+| `features` | Enabled components | `["filer", "s3api"]` |
+| `deployment` | Deployment type | `cluster` |
+| `volume_server_count` | Number of volume servers | `5` |
+| `total_disk_bytes` | Total disk usage across cluster | `1073741824` |
+| `total_volume_count` | Total number of volumes | `120` |
+| `timestamp` | When data was collected | `1640995200` |
+
+## Quick Start
+
+### 1. Deploy Telemetry Server
+
+```bash
+# Clone and start the complete monitoring stack
+git clone https://github.com/seaweedfs/seaweedfs.git
+cd seaweedfs/telemetry
+docker-compose up -d
+
+# Or run the server directly
+cd server
+go run . -port=8080 -dashboard=true
+```
+
+### 2. Configure SeaweedFS
+
+```bash
+# Enable telemetry in SeaweedFS master (uses default telemetry.seaweedfs.com:3091)
+weed master -telemetry=true
+
+# Or in server mode
+weed server -telemetry=true
+
+# Or specify custom telemetry server
+weed master -telemetry=true -telemetry.url=http://localhost:8080/api/collect
+```
+
+### 3. Access Dashboards
+
+- **Telemetry Server**: http://localhost:8080
+- **Prometheus**: http://localhost:9090
+- **Grafana**: http://localhost:3000 (admin/admin)
+
+## Configuration
+
+### SeaweedFS Master/Server
+
+```bash
+# Enable telemetry
+-telemetry=true
+
+# Set custom telemetry server URL (optional, defaults to telemetry.seaweedfs.com:3091)
+-telemetry.url=http://your-telemetry-server:8080/api/collect
+```
+
+### Telemetry Server
+
+```bash
+# Server configuration
+-port=8080         # Server port
+-dashboard=true    # Enable built-in dashboard
+-cleanup=24h       # Cleanup interval
+-max-age=720h      # Maximum data retention (30 days)
+
+# Example
+./telemetry-server -port=8080 -dashboard=true -cleanup=24h -max-age=720h
+```
+
+## Prometheus Metrics
+
+The telemetry server exposes these Prometheus metrics:
+
+### Cluster Metrics
+- `seaweedfs_telemetry_total_clusters`: Total unique clusters (30 days)
+- `seaweedfs_telemetry_active_clusters`: Active clusters (7 days)
+
+### Per-Cluster Metrics
+- `seaweedfs_telemetry_volume_servers{cluster_id, version, os, deployment}`: Volume servers per cluster
+- `seaweedfs_telemetry_disk_bytes{cluster_id, version, os, deployment}`: Disk usage per cluster
+- `seaweedfs_telemetry_volume_count{cluster_id, version, os, deployment}`: Volume count per cluster
+- `seaweedfs_telemetry_filer_count{cluster_id, version, os, deployment}`: Filer servers per cluster
+- `seaweedfs_telemetry_broker_count{cluster_id, version, os, deployment}`: Broker servers per cluster
+- `seaweedfs_telemetry_cluster_info{cluster_id, version, os, deployment, features}`: Cluster metadata
+
+### Server Metrics
+- `seaweedfs_telemetry_reports_received_total`: Total telemetry reports received
+
+## API Endpoints
+
+### Data Collection
+```bash
+# Submit telemetry data (protobuf only)
+POST /api/collect
+Content-Type: application/x-protobuf
+[TelemetryRequest protobuf data]
+```
+
+### Statistics (JSON for dashboard/debugging)
+```bash
+# Get aggregated statistics
+GET /api/stats
+
+# Get recent cluster instances
+GET /api/instances?limit=100
+
+# Get metrics over time
+GET /api/metrics?days=30
+```
+
+### Monitoring
+```bash
+# Prometheus metrics
+GET /metrics
+```
+
+## Docker Deployment
+
+### Complete Stack (Recommended)
+
+```yaml
+# docker-compose.yml
+version: '3.8'
+services:
+  telemetry-server:
+    build: ./server
+    ports:
+      - "8080:8080"
+    command: ["-port=8080", "-dashboard=true", "-cleanup=24h"]
+
+  prometheus:
+    image: prom/prometheus:latest
+    ports:
+      - "9090:9090"
+    volumes:
+      - ./prometheus.yml:/etc/prometheus/prometheus.yml
+
+  grafana:
+    image: grafana/grafana:latest
+    ports:
+      - "3000:3000"
+    environment:
+      - GF_SECURITY_ADMIN_PASSWORD=admin
+    volumes:
+      - ./grafana-provisioning:/etc/grafana/provisioning
+      - ./grafana-dashboard.json:/var/lib/grafana/dashboards/seaweedfs.json
+```
+
+```bash
+# Deploy the stack
+docker-compose up -d
+
+# Scale telemetry server if needed
+docker-compose up -d --scale telemetry-server=3
+```
+
+### Server Only
+
+```bash
+# Build and run telemetry server
+cd server
+docker build -t seaweedfs-telemetry .
+docker run -p 8080:8080 seaweedfs-telemetry -port=8080 -dashboard=true
+```
+
+## Development
+
+### Protocol Buffer Development
+
+```bash
+# Generate protobuf code
+cd telemetry
+protoc --go_out=. --go_opt=paths=source_relative proto/telemetry.proto
+
+# The generated code is already included in the repository
+```
+
+### Build from Source
+
+```bash
+# Build telemetry server
+cd telemetry/server
+go build -o telemetry-server .
+
+# Build SeaweedFS with telemetry support
+cd ../..
+go build -o weed ./weed
+```
+
+### Testing
+
+```bash
+# Test telemetry server
+cd telemetry/server
+go test ./...
+
+# Test protobuf communication (requires protobuf tools)
+# See telemetry client code for examples
+```
+
+## Grafana Dashboard
+
+The included Grafana dashboard provides:
+
+- **Overview**: Total and active clusters, version distribution
+- **Resource Usage**: Volume servers and disk usage over time
+- **Deployments**: Deployment type and OS distribution
+- **Growth Trends**: Historical growth patterns
+
+### Custom Queries
+
+```promql
+# Total active clusters
+seaweedfs_telemetry_active_clusters
+
+# Disk usage by version
+sum by (version) (seaweedfs_telemetry_disk_bytes)
+
+# Volume servers by deployment type
+sum by (deployment) (seaweedfs_telemetry_volume_servers)
+
+# Filer servers by version
+sum by (version) (seaweedfs_telemetry_filer_count)
+
+# Broker servers across all clusters
+sum(seaweedfs_telemetry_broker_count)
+
+# Growth rate (weekly)
+increase(seaweedfs_telemetry_total_clusters[7d])
+```
+
+## Security Considerations
+
+- **Network Security**: Use HTTPS in production environments
+- **Access Control**: Implement authentication for Grafana and Prometheus
+- **Data Retention**: Configure appropriate retention policies
+- **Monitoring**: Monitor the telemetry infrastructure itself
+
+## Troubleshooting
+
+### Common Issues
+
+**SeaweedFS not sending data:**
+```bash
+# Check telemetry configuration
+weed master -h | grep telemetry
+
+# Verify connectivity
+curl -v http://your-telemetry-server:8080/api/collect
+```
+
+**Server not receiving data:**
+```bash
+# Check server logs
+docker-compose logs telemetry-server
+
+# Verify metrics endpoint
+curl http://localhost:8080/metrics
+```
+
+**Prometheus not scraping:**
+```bash
+# Check Prometheus targets
+curl http://localhost:9090/api/v1/targets
+
+# Verify configuration
+docker-compose logs prometheus
+```
+
+### Debugging
+
+```bash
+# Enable verbose logging in SeaweedFS
+weed master -v=2 -telemetry=true
+
+# Check telemetry server metrics
+curl http://localhost:8080/metrics | grep seaweedfs_telemetry
+
+# Test data flow
+curl http://localhost:8080/api/stats
+```
+
+## Contributing
+
+1. Fork the repository
+2. Create a feature branch
+3. Make your changes
+4. Add tests if applicable
+5. Submit a pull request
+
+## License
+
+This telemetry system is part of SeaweedFS and follows the same Apache 2.0 license.
\ No newline at end of file
