# Jupyter Notebooks
New here? Start with the Quickstart so you have a running Icelake before wiring up the notebook.
Data scientists don’t want to leave their notebook. This guide connects a VS Code Jupyter notebook to Icelake via the JupySQL magic, runs SQL straight against your S3-backed lakehouse, returns results as pandas DataFrames, and plots them with matplotlib.
## What you’ll build

A single notebook cell pattern that takes you from raw data on S3 to a matplotlib chart — no leaving VS Code, no separate BI tool, no export/import dance.

## Prerequisites

- An Icelake account — sign up at app.icelake.eu or follow the Quickstart to get one.
- Sample data in Icelake — upload the `example-metrics.csv` from CSV Upload so there’s a concrete dataset the notebook can query.
- Python 3.10+ and VS Code with the Python and Jupyter extensions installed.
## 1. Create API credentials

Open the Icelake admin UI and go to Data In → API Keys → Create key. You’ll get a client ID and a secret starting with `ilk_`. The secret is shown exactly once — copy it now.

Export them in your shell so the notebook can pick them up without hardcoding anything:

```sh
export ICELAKE_CLIENT_ID="your-client-id"
export ICELAKE_API_KEY="ilk_..."
```

Launch VS Code from the same shell so the variables carry into the Jupyter kernel.
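To confirm the variables actually reached the kernel, you can run a quick check in the first notebook cell. This is a sketch; `check_icelake_env` is a helper name invented for this guide, not part of any Icelake tooling:

```python
import os

def check_icelake_env() -> list[str]:
    """Return the names of required Icelake variables that are missing or empty."""
    required = ("ICELAKE_CLIENT_ID", "ICELAKE_API_KEY")
    return [name for name in required if not os.environ.get(name)]

missing = check_icelake_env()
if missing:
    # The kernel did not inherit the shell exports; restart VS Code from
    # the shell where you ran the `export` commands.
    print("Missing environment variables:", ", ".join(missing))
```

An empty result means the connection cell below can safely read `os.environ`.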
## 2. Set up the notebook environment

Install the packages you need:

```sh
pip install jupysql sqlalchemy 'psycopg[binary]' pandas matplotlib
```

Why these:

- JupySQL — the `%sql` and `%%sql` notebook magics. Actively maintained successor to `ipython-sql`.
- SQLAlchemy + `psycopg[binary]` — the driver JupySQL uses to talk to Icelake’s pgwire endpoint (PostgreSQL v3 client, pre-compiled wheels, no toolchain required).
- pandas + matplotlib — result DataFrames and plotting.
Create a new .ipynb file in VS Code (File → New File → Jupyter Notebook), pick your Python kernel, and run:

```python
%load_ext sql

import os

ICELAKE_URL = (
    "postgresql+psycopg://"
    f"{os.environ['ICELAKE_CLIENT_ID']}:{os.environ['ICELAKE_API_KEY']}"
    "@sql.icelake.eu:5432/icelake?sslmode=require"
)

%sql $ICELAKE_URL
```

Tip: keep credentials out of version control by putting them in a `.env` file next to the notebook and loading it with `from dotenv import load_dotenv; load_dotenv()` before reading `os.environ`. JupySQL’s connection string docs cover other secret-management patterns.
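If you’d rather not add python-dotenv as a dependency, a minimal stdlib stand-in might look like the sketch below. It assumes plain `KEY=value` lines and skips the quoting and interpolation edge cases the real library handles:

```python
import os
from pathlib import Path

def load_env_file(path: str = ".env") -> None:
    """Minimal stand-in for python-dotenv's load_dotenv: reads simple
    KEY=value lines into os.environ; existing variables are not overwritten."""
    env_file = Path(path)
    if not env_file.exists():
        return
    for raw in env_file.read_text().splitlines():
        line = raw.strip()
        # Skip blanks, comments, and malformed lines.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        os.environ.setdefault(key.strip(), value.strip().strip('"'))
```

Call `load_env_file()` in the first cell, before building `ICELAKE_URL`.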
## 3. Query the sensor data

With the connection active, any cell prefixed with `%%sql` runs against Icelake and renders the result as a table.

```sql
%%sql
SELECT
    timestamp,
    temperature_celsius,
    humidity_percent,
    energy_kwh
FROM metrics
WHERE sensor_id = 'sensor-001'
ORDER BY timestamp
LIMIT 20;
```
## 4. Plot with matplotlib

Capture the last query result into a pandas DataFrame with JupySQL’s `<<` operator, then plot with pandas’ built-in matplotlib integration:

```sql
%%sql result <<
SELECT timestamp, temperature_celsius, humidity_percent
FROM metrics
WHERE sensor_id = 'sensor-001'
ORDER BY timestamp;
```

```python
import matplotlib.pyplot as plt

df = result.DataFrame()
df['timestamp'] = df['timestamp'].astype('datetime64[ns]')

ax = df.plot(
    x='timestamp',
    y=['temperature_celsius', 'humidity_percent'],
    figsize=(10, 4),
    title='Sensor 001 — temperature and humidity',
)
ax.set_ylabel('value')
plt.tight_layout()
plt.show()
```

## Working with pandas
Once you have a DataFrame, the rest is ordinary Python — filter, resample, merge with other notebooks’ data, export to Parquet, feed into scikit-learn. A couple of patterns:
```python
# Resample to hourly averages
hourly = (
    df.set_index('timestamp')
      .resample('1H')
      .mean()
      .reset_index()
)
```
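The “merge with other notebooks’ data” pattern might look like this sketch, joining query results against a small lookup table. The `site` column and location values are illustrative, not part of the Icelake schema:

```python
import pandas as pd

# Sensor readings as they might come back from a %%sql cell.
readings = pd.DataFrame({
    "sensor_id": ["sensor-001", "sensor-002"],
    "temperature_celsius": [21.4, 19.8],
})

# Metadata maintained elsewhere, e.g. exported from another notebook.
locations = pd.DataFrame({
    "sensor_id": ["sensor-001", "sensor-002"],
    "site": ["roof", "basement"],
})

# Left join keeps every reading, adding the site where one is known.
enriched = readings.merge(locations, on="sensor_id", how="left")
```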
```python
# Correlate temperature and energy production
%sql select corr(temperature_celsius, energy_kwh) as t_vs_energy \
    from metrics where sensor_id = 'sensor-001';
```

## Troubleshooting
| Symptom | What to check |
|---|---|
| `could not connect to server` / `connection timed out` | Verify your network reaches `sql.icelake.eu:5432`. Corporate firewalls often block outbound 5432 — ask your network team to allowlist it. |
| `password authentication failed for user "…"` | Username is the client ID, password is the `ilk_…` secret — the same credentials Grafana, Prometheus, and every other Icelake endpoint use. Check `echo $ICELAKE_CLIENT_ID` isn’t empty. |
| `SSL connection required` | The managed endpoint enforces TLS — keep `?sslmode=require` on the URL. |
| `%sql` magic not recognised | Re-run `%load_ext sql` in the current kernel session. If you restarted VS Code, the kernel restarted too. |
| Slow query on a big scan | Narrow the `WHERE` clause by timestamp and sensor/metric name — Icelake partitions Parquet by date on S3, so bounded ranges read far fewer files. |
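As an illustration of the last row, a time-bounded variant of the earlier query might look like this (the date range is an illustrative placeholder):

```sql
%%sql
SELECT timestamp, temperature_celsius
FROM metrics
WHERE sensor_id = 'sensor-001'
  AND timestamp >= '2024-01-01'
  AND timestamp <  '2024-01-08'
ORDER BY timestamp;
```

Keeping both bounds on `timestamp` lets the engine skip date partitions outside the window instead of scanning the whole table.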
## Related docs

- Quickstart — end-to-end Icelake + Grafana setup in 5 minutes
- CSV Upload — the `example-metrics.csv` this tutorial queries
- Query Interfaces — Grafana, pgwire SQL, REST, and the Loki Query API
- AI MasterMind — natural-language analytics over the same datasets