Kyuubi
Kyuubi runs on each env-dataops data-platform EKS cluster and exposes Spark SQL through HiveServer2-compatible endpoints. The current service root is terraform/kyuubi/env/<env>.
Current Shape
Each environment has three warehouses:
| Warehouse | Namespace | Main Use |
|---|---|---|
| default | kyuubi | dbt, Airflow ETL, regular service jobs |
| maintenance | kyuubi-maintenance | Snowpack, Iceberg table health, maintenance jobs |
| interactive | kyuubi-interactive | human and BI-style ad hoc SQL |
The endpoint pattern is:
kyuubi[-maintenance|-interactive].data-platform.us-east-1.<env>-dataops.fetchrewards.com:10009
Catalogs
Kyuubi configures the same-account Glue catalog as lakehouse_<env>. It also configures remote catalogs for cross-env reads. Remote catalog names intentionally use the same lakehouse_<env> pattern instead of a _ro suffix; Lake Formation and S3 permissions enforce read-only behavior for remote catalogs except for the explicit dev-write exception on interactive Kyuubi.
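As a rough illustration, the endpoint pattern and the lakehouse_<env> catalog naming can be sketched in Python. This is not taken from the Terraform roots: the helper names are hypothetical, the `jdbc:hive2://` scheme is assumed only because the endpoints are HiveServer2-compatible, and authentication and transport parameters are omitted because this page does not specify them.

```python
# Hostname suffix per warehouse, following the warehouse table and endpoint pattern.
WAREHOUSE_SUFFIX = {
    "default": "",
    "maintenance": "-maintenance",
    "interactive": "-interactive",
}
KYUUBI_PORT = 10009


def kyuubi_host(warehouse: str, env: str) -> str:
    """Build the hostname for a warehouse in an environment (hypothetical helper)."""
    suffix = WAREHOUSE_SUFFIX[warehouse]  # raises KeyError for an unknown warehouse
    return f"kyuubi{suffix}.data-platform.us-east-1.{env}-dataops.fetchrewards.com"


def jdbc_url(warehouse: str, env: str) -> str:
    """Assumed HiveServer2-style JDBC URL; auth and transport options are left out."""
    return f"jdbc:hive2://{kyuubi_host(warehouse, env)}:{KYUUBI_PORT}/"


def catalog_name(env: str) -> str:
    """Same-account and remote catalogs share the lakehouse_<env> pattern."""
    return f"lakehouse_{env}"


def qualified_table(env: str, schema: str, table: str) -> str:
    """Fully qualified table name; schema and table are placeholders here."""
    return f"{catalog_name(env)}.{schema}.{table}"


if __name__ == "__main__":
    print(jdbc_url("interactive", "dev"))
    # A cross-env read from a dev session would name the remote catalog directly.
    print(qualified_table("prod", "example_schema", "example_table"))
```

Because a remote catalog has the same name shape as the local one, nothing in the query text marks a cross-env read as read-only. Writes to a remote catalog are expected to fail at the Lake Formation or S3 permission layer rather than at name resolution.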
Cost And Observability
Kyuubi writes Spark event logs into the environment lakehouse bucket. The spark-cost-tracking job reads those event logs and populates data_platform_usage tables used by Spark Query History and Grafana dashboards.
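For orientation, Spark event logging is normally switched on with the standard `spark.eventLog.enabled` and `spark.eventLog.dir` settings. The sketch below shows only that general shape. The bucket path is a made-up placeholder and the helper name is hypothetical; the real values live in the Terraform roots and are not reproduced here.

```python
def event_log_conf(event_log_dir: str) -> dict[str, str]:
    """Return the standard Spark settings that enable event logging.

    event_log_dir is whatever location the environment uses; this page
    does not state the actual bucket or prefix.
    """
    return {
        "spark.eventLog.enabled": "true",
        "spark.eventLog.dir": event_log_dir,
    }


if __name__ == "__main__":
    # Placeholder path for illustration only, not a real bucket.
    conf = event_log_conf("s3a://example-lakehouse-bucket/spark-event-logs/")
    for key, value in conf.items():
        print(f"{key}={value}")
```

A downstream reader such as the spark-cost-tracking job needs to be pointed at the same directory, so it helps to define the location once and share it.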
Legacy Reference
The older Kyuubi clusters reference has useful warehouse tuning context, but many of its URLs and account references date from the test-dataops era. Prefer this page and the live Terraform roots for env-dataops deployment details.
Checked Against
- terraform/kyuubi/env/dev/main.tf, stage, preprod, and prod on origin/main.
- terraform/config/services.yaml.
- implementations/2026-05-21-dl-419-prod-dataops-runtime-progress.md.
- implementations/2026-05-22-dl-419-dev-catalog-interactive-write-exception-progress.md.