Architecture overview

Icelake is a fully managed European observability data lakehouse. Under the hood it runs as a single Rust service that handles the full pipeline from ingestion through query. We operate it for you; you consume it as SaaS.

Key components

Ingestion layer

Multi-protocol HTTP endpoints for Prometheus, Loki, and OpenTelemetry data, plus a native MQTT broker for IoT telemetry.

Storage engine

S3-native Parquet storage in the open Apache Iceberg table format for ACID-compliant snapshots. Your data is isolated per tenant and portable to any Iceberg-aware engine.

Query engine

DuckDB-powered SQL via pgwire, plus LogQL and PromQL for Grafana compatibility over the same data.

Compaction service

Background compaction merges small Parquet files into optimized segments so your queries stay fast. Runs transparently on our side — nothing to configure.

Data flow

Every signal follows the same pipeline:

Ingestion — Data arrives via Prometheus remote write, Loki push API, OTLP HTTP, or MQTT
Parsing — Protocol-specific parsers normalise the payload into a common internal format
Storage — Normalised data is written as Parquet files on Icelake’s S3 with tenant-scoped prefixes
Compaction — Background scheduler merges small files into larger, query-optimal segments
Query — DuckDB reads Parquet files directly from S3 for SQL, PromQL, and LogQL queries
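
The parsing step above can be pictured as mapping each protocol's payload onto one record shape. The following is a minimal Python sketch for illustration only: `Record`, `from_prometheus_sample`, `from_loki_entry`, and all field names are hypothetical and do not describe Icelake's actual internal format (the service itself is written in Rust).

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Record:
    """Hypothetical common record; not Icelake's real internal format."""
    signal: str                        # "metric" or "log"
    timestamp_ns: int                  # nanoseconds since the Unix epoch
    labels: Dict[str, str] = field(default_factory=dict)
    value: Optional[float] = None      # set for metrics
    message: Optional[str] = None      # set for logs


def from_prometheus_sample(labels: Dict[str, str], timestamp_ms: int, value: float) -> Record:
    # Prometheus remote write carries millisecond timestamps.
    return Record("metric", timestamp_ms * 1_000_000, dict(labels), value=value)


def from_loki_entry(stream_labels: Dict[str, str], timestamp_ns: str, line: str) -> Record:
    # The Loki push API's JSON form sends nanosecond timestamps as strings.
    return Record("log", int(timestamp_ns), dict(stream_labels), message=line)
```

Once both signals share one shape, the storage and query stages no longer need to know which protocol a record arrived on.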

Multi-tenant isolation

Every Icelake account is a tenant. All data, API keys, and queries are scoped to the tenant boundary:

Tenant-scoped storage — each tenant’s data is stored in separate S3 prefixes
API key authentication — keys are hashed with SHA-256 and scoped to a tenant
Query isolation — you can only query data your tenant owns
Rate limiting — per-tenant ingestion rate limits applied at the edge
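
Per-tenant rate limits are often implemented with a token bucket. This page does not say which algorithm Icelake uses, so treat the sketch below as an assumption. The `TokenBucket` class, the `allow_request` helper, and the rate and capacity numbers are all hypothetical.

```python
import time
from typing import Dict, Optional


class TokenBucket:
    """Refills `rate` tokens per second, holding at most `capacity` tokens."""

    def __init__(self, rate: float, capacity: float, now: Optional[float] = None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so short bursts are accepted
        self.updated = time.monotonic() if now is None else now

    def allow(self, cost: float = 1.0, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# One bucket per tenant, so one tenant's burst cannot consume another's budget.
buckets: Dict[str, TokenBucket] = {}


def allow_request(tenant_id: str, now: Optional[float] = None) -> bool:
    bucket = buckets.get(tenant_id)
    if bucket is None:
        # Illustrative limits only: 100 requests/s with bursts of up to 200.
        bucket = buckets[tenant_id] = TokenBucket(rate=100.0, capacity=200.0, now=now)
    return bucket.allow(now=now)
```

A request that returns `False` here would typically be rejected with an HTTP 429 so the sender can back off and retry.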

Security note: API keys are hashed before storage. The plaintext ilk_… secret is shown once at creation and never persisted on our side.
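
The hash-then-store scheme described above can be sketched with Python's standard library. This is an illustration under assumptions, not Icelake's implementation: the function names are hypothetical, and the length and encoding of the random part after `ilk_` are invented for the example.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def hash_key(plaintext: str) -> str:
    # SHA-256 digest as a hex string; this is the only value that would be stored.
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()


def new_api_key() -> Tuple[str, str]:
    """Return (plaintext, digest). The plaintext is shown once, then discarded."""
    # Assumption: the random suffix format is made up for this sketch.
    plaintext = "ilk_" + secrets.token_urlsafe(32)
    return plaintext, hash_key(plaintext)


def verify_key(presented: str, stored_digest: str) -> bool:
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(hash_key(presented), stored_digest)
```

Because only the digest is stored, a lost key cannot be recovered from the server side; it has to be rotated.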

pgwire SQL interface

Icelake exposes a PostgreSQL-compatible interface at sql.icelake.eu:5432, so any PostgreSQL client can query your data directly. The username is your API key’s client ID and the password is the ilk_… secret.

# Connect with any PostgreSQL client (run this in a shell; the queries below run inside the session)
psql "postgresql://your-client-id:ilk_your-api-key-here@sql.icelake.eu:5432/icelake?sslmode=require"

-- Query metrics
SELECT time, value, labels
FROM metrics
WHERE __name__ = 'http_requests_total'
  AND time > now() - interval '1 hour'
ORDER BY time DESC;

-- Query logs
SELECT timestamp, message, labels
FROM logs
WHERE labels->>'service' = 'api-gateway'
  AND timestamp > now() - interval '30 minutes';
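
When you connect from a script rather than from psql, credentials with characters such as `/` or `@` must be percent-encoded inside the connection URI. The helper below is a small sketch using only the standard library. `icelake_dsn` is a hypothetical name; the host, port, database, and `sslmode` come from the psql example above.

```python
from urllib.parse import quote


def icelake_dsn(client_id: str, secret: str,
                host: str = "sql.icelake.eu", port: int = 5432,
                database: str = "icelake") -> str:
    # Percent-encode both credentials so reserved characters cannot break the URI.
    user = quote(client_id, safe="")
    password = quote(secret, safe="")
    return f"postgresql://{user}:{password}@{host}:{port}/{database}?sslmode=require"
```

The resulting string can be passed to a PostgreSQL driver that accepts libpq-style URIs, for example `psycopg.connect(dsn)` if you have psycopg installed.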

For a notebook-first experience, see Jupyter Notebooks.

Background compaction

Compaction runs on our side on a continuous schedule:

Merges small Parquet files into larger segments (target: 256 MB)
Applies partition-pruning metadata for faster queries
Removes tombstoned or expired data based on your retention policy
Runs concurrently with ingestion and queries (snapshot isolation: readers keep seeing a consistent snapshot while files are rewritten)

You don’t configure or operate compaction. It just happens.
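
To make the file-merging step concrete, here is a simplified Python sketch of how small files might be grouped into batches near the 256 MB target. It uses a plain greedy strategy chosen for illustration. The real scheduler's policy is not documented here, `plan_compaction` is a hypothetical name, and reading "256 MB" as binary megabytes is an assumption.

```python
from typing import List

# 256 MB target from the list above, assumed here to mean 256 * 1024 * 1024 bytes.
TARGET_BYTES = 256 * 1024 * 1024


def plan_compaction(file_sizes: List[int], target: int = TARGET_BYTES) -> List[List[int]]:
    """Group file sizes into batches whose combined size stays within `target`.

    Files are taken largest first; a batch is closed as soon as the next file
    would push it over the target. A file larger than the target is placed in
    a batch of its own.
    """
    batches: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for size in sorted(file_sizes, reverse=True):
        if current and current_size + size > target:
            batches.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

Each returned batch would then be rewritten as one larger Parquet file and committed as a new table snapshot, which is what lets queries keep running against the previous snapshot in the meantime.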

What you control

API keys — create, rotate, and revoke per-purpose keys from the admin UI
Data sources — MQTT brokers, uploaded CSVs, and the endpoints you point Prometheus/Loki/OTEL at
Dashboards and visualisations — composed in the admin UI or in your own Grafana
Retention — configurable per tenant (contact us if you need a longer window)

What we handle

Storage, compaction, replication, and backups on S3
pgwire SQL engine, PromQL, LogQL, OTLP parsing
High availability, rate limiting, and DDoS protection
Keycloak-based authentication and PKCE flow for the admin UI