Spark Query History
Spark Query History runs in each <env>-dataops cluster and is deployed from terraform/spark-query-history/env/<env>.
Current Shape
The app runs in the kyuubi namespace and uses an IRSA role named:
data-platform-<env>-spark-query-history
The host pattern is:
spark-query-history.data-platform.us-east-1.<env>-dataops.fetchrewards.com
Data Sources
Spark Query History queries the environment’s data_platform_usage database through Athena and Glue. It depends on the Kyuubi Spark event log pipeline and the spark-cost-tracking aggregation job.
Core tables include:
data_platform_usage.spark_query_costs
data_platform_usage.spark_query_operations
data_platform_usage.iceberg_table_health
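As a quick way to confirm the core tables hold data, the sketch below generates one row-count query per table that can be pasted into the Athena console. The function name row_count_queries is hypothetical, and the sketch assumes nothing about the tables' columns or partitions.

```python
# Core tables listed above; nothing about their schemas is assumed here.
CORE_TABLES = [
    "data_platform_usage.spark_query_costs",
    "data_platform_usage.spark_query_operations",
    "data_platform_usage.iceberg_table_health",
]


def row_count_queries(tables=CORE_TABLES):
    """Return a mapping of table name to a simple row-count SQL statement.

    Hypothetical helper: it only builds SQL text and does not call Athena.
    """
    return {t: f"SELECT COUNT(*) AS row_count FROM {t}" for t in tables}


if __name__ == "__main__":
    for table, sql in row_count_queries().items():
        print(sql + ";")
```

A zero row count for a table suggests the upstream pipeline or job has not populated it in that environment.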
Validation
Use /api/health, /queries?limit=5, and /costs?days=7 for basic runtime checks. If data is missing, verify Kyuubi produced Spark event logs and the cost-tracking job has populated the Iceberg tables before debugging the web app.
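The runtime checks above could be scripted roughly as follows. This is a minimal sketch with several assumptions: the service is reachable over https from where the script runs, the endpoints need no authentication, and an HTTP 200 is treated as a pass. The names check_urls and run_checks are hypothetical; only the host pattern and the three paths come from this page.

```python
import urllib.request
from urllib.parse import urlencode, urlunsplit

# Host pattern from the "Current Shape" section; {env} is e.g. dev, stage, preprod, prod.
HOST_PATTERN = "spark-query-history.data-platform.us-east-1.{env}-dataops.fetchrewards.com"

# The three basic runtime checks, as (path, query parameters).
CHECKS = [
    ("/api/health", {}),
    ("/queries", {"limit": 5}),
    ("/costs", {"days": 7}),
]


def check_urls(env, scheme="https"):
    """Build the full check URLs for one environment (scheme is an assumption)."""
    host = HOST_PATTERN.format(env=env)
    return [
        urlunsplit((scheme, host, path, urlencode(params), ""))
        for path, params in CHECKS
    ]


def run_checks(env, timeout=10):
    """Request each check URL and record the HTTP status or the error raised.

    Assumes unauthenticated access; adjust if the app sits behind SSO or a VPN.
    """
    results = {}
    for url in check_urls(env):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                results[url] = resp.status
        except Exception as exc:  # network, TLS, and HTTP errors all land here
            results[url] = repr(exc)
    return results
```

Calling run_checks("dev") returns a mapping of URL to status. If /api/health passes but /queries or /costs come back empty, check the event logs and the cost-tracking job before debugging the web app.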
Related Reference
Checked Against
terraform/spark-query-history/env/dev/main.tf, stage, preprod, and prod on origin/main.
terraform/modules/spark-cost-tracking.
implementations/2026-05-28-dl-474-iceberg-table-health-all-envs-progress.md.