Create a Dataset
A dataset maps to a single table or view in your data source. It defines which columns can be used as metric values, time references, filters, and segment dimensions.
Creating a dataset
Navigate to /data-sets and click Create. You'll be asked to:
1. Select a data source: the warehouse connection you created in the previous step.
2. Select a table or view: Lighthouse reads your schema to show available tables. You can search by name.
3. Configure columns: assign roles to each column (see below). At minimum, mark one datetime column.
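As a mental model, the three choices above can be pictured as a small configuration object. The sketch below is illustrative only: `DatasetConfig`, `Role`, and the names `analytics_warehouse` and `orders` are hypothetical and are not part of any Lighthouse API. It only shows what information a dataset holds and the "at least one datetime column" rule.

```python
from dataclasses import dataclass, field
from enum import Flag, auto


class Role(Flag):
    """Hypothetical role flags; a column may combine several."""
    METRIC = auto()
    FILTER = auto()
    SEGMENT = auto()
    DATETIME = auto()


@dataclass
class DatasetConfig:
    """Hypothetical stand-in for what the creation form collects."""
    data_source: str                      # step 1: warehouse connection
    table: str                            # step 2: table or view name
    columns: dict[str, Role] = field(default_factory=dict)  # step 3: roles

    def validate(self) -> None:
        # Mirrors the rule that at least one column must be a datetime column.
        if not any(Role.DATETIME in roles for roles in self.columns.values()):
            raise ValueError("mark at least one datetime column")


config = DatasetConfig(
    data_source="analytics_warehouse",
    table="orders",
    columns={
        "user_id": Role.METRIC,
        "status": Role.FILTER | Role.SEGMENT,
        "created_at": Role.DATETIME,
    },
)
config.validate()  # passes: created_at carries the datetime role
```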
Column roles
Each column in your dataset can be assigned one or more roles. Roles control how the column can be used when building metrics.
| Role | What it enables |
|---|---|
| Metric column | The column can be used as the value to aggregate — e.g. COUNT DISTINCT of user_id, SUM of revenue. Numeric columns and ID columns are typical candidates. |
| Filter column | The column can be used in WHERE conditions when defining a metric — e.g. filter to status = 'completed'. |
| Segment column | The column can be used to break a metric down by dimension — e.g. segment by country or user_type to get per-segment alerts. |
| Datetime column | The column can be used to scope a metric to a time window — e.g. created_at, event_time. Must be a date or timestamp type. |
A single column can hold multiple roles — for example, a status column might be both a filter column and a segment column.
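To make the four roles concrete, here is a rough sketch of the kind of query a metric built on these roles could correspond to. It is an assumption about the general query shape, not Lighthouse's actual generated SQL. An in-memory SQLite database stands in for the warehouse, and the `orders` table and its rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (user_id TEXT, status TEXT, country TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        ("u1", "completed", "DE", "2024-01-02"),
        ("u1", "completed", "DE", "2024-01-03"),
        ("u2", "completed", "US", "2024-01-03"),
        ("u3", "pending", "US", "2024-01-04"),
        ("u4", "completed", "US", "2023-12-30"),
    ],
)

query = """
    SELECT country, COUNT(DISTINCT user_id) AS buyers  -- segment column, metric column
    FROM orders
    WHERE status = 'completed'                         -- filter column
      AND created_at >= ? AND created_at < ?           -- datetime column
    GROUP BY country
    ORDER BY country
"""
rows = conn.execute(query, ("2024-01-01", "2024-02-01")).fetchall()
print(rows)  # [('DE', 1), ('US', 1)]
```

Here `status` is used only as a filter, but because a column can hold several roles, it could equally appear in the `GROUP BY` as a segment.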
Default datetime column
You must select a default datetime column for the dataset. This column is used as the default time reference when creating business metrics. You can override it per-metric if your dataset has multiple timestamp columns.
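The fallback behaviour described here can be sketched in a few lines. The `Dataset` and `Metric` classes and the `time_column` helper are hypothetical names used only to illustrate "use the per-metric override if present, otherwise the dataset default"; they do not reflect Lighthouse internals.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dataset:
    table: str
    datetime_columns: list[str]
    default_datetime: str

    def __post_init__(self) -> None:
        # The default must itself be one of the dataset's datetime columns.
        if self.default_datetime not in self.datetime_columns:
            raise ValueError("default datetime must be one of the datetime columns")


@dataclass
class Metric:
    name: str
    datetime_override: Optional[str] = None  # None means "use the dataset default"


def time_column(dataset: Dataset, metric: Metric) -> str:
    """Return the metric's override if set, else the dataset default."""
    chosen = metric.datetime_override or dataset.default_datetime
    if chosen not in dataset.datetime_columns:
        raise ValueError(f"{chosen} is not a datetime column of {dataset.table}")
    return chosen


orders = Dataset("orders", ["created_at", "shipped_at"], default_datetime="created_at")
placed = time_column(orders, Metric("orders_placed"))
shipped = time_column(orders, Metric("orders_shipped", "shipped_at"))
print(placed, shipped)  # created_at shipped_at
```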
After creation
As soon as you save a new dataset, Lighthouse automatically runs an AI metric suggestions job in the background. It reads your column structure — names, types, and assigned roles — and proposes a list of metrics your team is likely to care about.
Suggestions appear on the dataset detail page. You can click Create Metric on any suggestion to open the metric creation form pre-filled with that recommendation. You can also trigger a fresh suggestion run at any time.
Working with multiple tables
Each dataset maps to exactly one table or view. If you need to monitor metrics from multiple tables, create a separate dataset for each one — they can all point to the same data source.
If you need to join data across tables before monitoring, create a view in your warehouse that joins the tables, then create a dataset pointing at that view.
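As an illustration of the view approach, the sketch below creates two tables and a joining view, then queries the view as if it were a single table. SQLite is used here only so the example runs anywhere; your warehouse's `CREATE VIEW` syntax and permissions may differ, and the table, column, and view names are invented for the example.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, user_id TEXT, revenue REAL, created_at TEXT);
    CREATE TABLE users  (user_id TEXT, country TEXT);

    INSERT INTO orders VALUES (1, 'u1', 20.0, '2024-01-02'), (2, 'u2', 35.5, '2024-01-03');
    INSERT INTO users  VALUES ('u1', 'DE'), ('u2', 'US');

    -- The view joins both tables so a single dataset can point at it.
    CREATE VIEW orders_with_country AS
    SELECT o.order_id, o.user_id, o.revenue, o.created_at, u.country
    FROM orders AS o
    JOIN users AS u ON u.user_id = o.user_id;
""")

# The view can now be read like any single table.
rows = conn.execute(
    "SELECT order_id, country, revenue FROM orders_with_country ORDER BY order_id"
).fetchall()
print(rows)  # [(1, 'DE', 20.0), (2, 'US', 35.5)]
```

A dataset pointing at `orders_with_country` could then treat `revenue` as a metric column, `country` as a segment column, and `created_at` as the datetime column.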