
Overview

Icelake is a sovereign European, AI-driven data research workbench. Public datasets (weather, air quality, sovereign maps without tracking, geo, and IoT) are already loaded and kept refreshed for you, alongside any data you ingest yourself via Prometheus, Loki, OpenTelemetry, or MQTT. Everything lands as Parquet in the open Apache Iceberg table format on S3 and is queryable with DuckDB through SQL, LogQL, and PromQL, or in natural language through AI MasterMind. Icelake is built on open standards, so your data stays portable and independent of proprietary providers.

Icelake follows a pipeline architecture designed for high-throughput, low-latency observability:

  • Multi-Protocol Ingestion — Prometheus remote write, Loki push API, and OTLP protobuf
  • S3-Native Storage — All data stored as Parquet files on S3-compatible object storage
  • Apache Iceberg — Open table format with snapshot isolation and ACID transactions
  • Query Engine — DuckDB-powered SQL via pgwire, LogQL for logs, PromQL for metrics
  • Background Compaction — Automatic merging of small files for optimal query performance
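To make the ingestion step above more concrete, here is a minimal sketch of a request body in the JSON format used by the Loki push API. The helper name `build_loki_push_payload`, the labels, and the log lines are hypothetical. The payload shape follows upstream Loki; whether Icelake accepts it under the same path and headers is an assumption to verify in the Loki & LogQL guide.

```python
import json
import time


def build_loki_push_payload(labels, lines, now_ns=None):
    """Build a request body in the Loki push API JSON format.

    labels: dict of stream labels, e.g. {"job": "demo"}
    lines:  iterable of log line strings
    now_ns: base timestamp in nanoseconds since the Unix epoch
            (defaults to the current time)
    """
    if now_ns is None:
        now_ns = time.time_ns()
    # Loki expects each entry as [<timestamp in ns, as a string>, <log line>].
    # Adding the index keeps timestamps within one batch strictly increasing.
    values = [[str(now_ns + i), line] for i, line in enumerate(lines)]
    return {"streams": [{"stream": dict(labels), "values": values}]}


# Hypothetical example data; a fixed timestamp keeps the output reproducible.
payload = build_loki_push_payload(
    {"job": "demo", "host": "sensor-1"},
    ["boot ok", "temp=21.5"],
    now_ns=1_700_000_000_000_000_000,
)
body = json.dumps(payload)
```

The resulting `body` would be sent as an HTTP POST with `Content-Type: application/json`. In upstream Loki the path is `/loki/api/v1/push`; the Icelake host, path, and any tenant or authentication headers are not stated here and are left out on purpose.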
| Technology     | Purpose           | Benefits                                         |
|----------------|-------------------|--------------------------------------------------|
| Rust           | Core Runtime      | Memory safety, zero-cost abstractions, speed     |
| Apache Iceberg | Open Table Format | Snapshot isolation, ACID, portable to any engine |
| DuckDB         | Query Engine      | Sub-millisecond analytical queries               |
| PostgreSQL     | Catalog Backend   | Reliable, transactional Iceberg catalog          |
| S3             | Object Storage    | Cost-efficient, durable, scalable storage        |
| Parquet        | Data Format       | Columnar, compressed, query-optimized            |
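The snapshot isolation that Iceberg provides, and why background compaction does not disturb running queries, can be illustrated with a toy model. This sketch is not Iceberg's metadata format or Icelake's implementation; `Snapshot`, `ToyTable`, and the file names are hypothetical. It only shows the core idea: every commit produces a new immutable snapshot, and a reader keeps the snapshot it started with.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """An immutable view of the table: an id plus the data files it contains."""
    snapshot_id: int
    files: tuple


class ToyTable:
    """Toy model in which each commit appends a new snapshot to the history."""

    def __init__(self):
        self._snapshots = [Snapshot(0, ())]

    def current(self):
        # Readers pin whatever snapshot is current when they start.
        return self._snapshots[-1]

    def append(self, *new_files):
        # Ingestion-style commit: add files on top of the current snapshot.
        base = self.current()
        snap = Snapshot(base.snapshot_id + 1, base.files + tuple(new_files))
        self._snapshots.append(snap)
        return snap

    def replace(self, old_files, new_file):
        # Compaction-style commit: swap several small files for one merged file.
        base = self.current()
        kept = tuple(f for f in base.files if f not in old_files)
        snap = Snapshot(base.snapshot_id + 1, kept + (new_file,))
        self._snapshots.append(snap)
        return snap


table = ToyTable()
table.append("a.parquet", "b.parquet")
reader_view = table.current()  # a query starts and pins snapshot 1
table.replace({"a.parquet", "b.parquet"}, "ab-merged.parquet")  # compaction commits
```

After the compaction commit, `reader_view.files` still lists the two original files, while `table.current().files` lists only the merged file. In a real Iceberg table the commit is an atomic swap of the current-metadata pointer in the catalog, and old files are removed only when their snapshots expire.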
  1. Architecture — Understand the data flow and multi-tenant model
  2. Prometheus — Configure Prometheus remote write ingestion
  3. Loki & LogQL — Set up log ingestion and LogQL queries
  4. OpenTelemetry — Ingest via OTLP protobuf endpoints
  1. Geo & Public Datasets — DWD weather warnings, OpenAQ air quality, and Overture places joinable against your own data
  2. MQTT & IoT — Configure MQTT ingestion with TTN LoRaWAN auto-parsing
  3. AI MasterMind — Natural language chat analytics with DuckDB
  4. Admin Dashboard — Manage data sources, teams, and analytics
  5. Home Assistant — HACS integration for smart home metrics
  6. Query Interfaces — Grafana, pgwire SQL, REST API, and Loki Query API
  • GitHub — Contribute to the project and report issues
  • Discord — Join our community for discussions and support
  • Documentation — Comprehensive guides and API references