
Auth And Data Flow

This page is the concise docs-site entry point for how IAM, Lake Formation, CI/CD, local deployment, and the query flow relate to one another.

The detailed standalone visual map remains available in the repository at docs/data-platform-auth-and-data-flow.html. Keep that map linked until it is fully converted into MDX and reviewed against current code.

Deployment Flow

sequenceDiagram
  participant Dev as Engineer
  participant SSO as SSO Profile
  participant TF as Terraform
  participant Prov as Env Provisioner
  participant AWS as AWS APIs

  Dev->>SSO: authenticate to env-dataops or env-dataops-admin
  SSO->>TF: run plan/apply from a service root
  TF->>Prov: backend/provider assume role
  Prov->>AWS: manage state and resources
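
The engineer-side half of the deployment flow above can be sketched as a small dry-run helper. This is a minimal sketch under assumptions, not the repository's actual tooling: it treats `env-dataops` and `env-dataops-admin` as literal AWS CLI profile names (they may be placeholders for per-environment names), and `services/example` is a hypothetical service root. The role assumption itself is not scripted here, because the diagram places it in the Terraform backend/provider configuration.

```python
import os
import shlex

# Profile names are taken from the diagram; they may be placeholders.
ALLOWED_PROFILES = ("env-dataops", "env-dataops-admin")


def deployment_commands(profile, service_root, action="plan"):
    """Return the shell commands for one Terraform run, without executing them."""
    if profile not in ALLOWED_PROFILES:
        raise ValueError(f"unknown SSO profile: {profile}")
    if action not in ("plan", "apply"):
        raise ValueError(f"unsupported Terraform action: {action}")
    return [
        # Step 1: authenticate the engineer against the SSO profile.
        ["aws", "sso", "login", "--profile", profile],
        # Step 2: run Terraform from the service root. The backend and provider
        # blocks in that root are what assume the provisioner role.
        ["terraform", f"-chdir={service_root}", action],
    ]


def deployment_env(profile, base_env=None):
    """Copy the environment and point the AWS CLI and Terraform at the profile."""
    env = dict(os.environ if base_env is None else base_env)
    env["AWS_PROFILE"] = profile
    return env


if __name__ == "__main__":
    # "services/example" is a hypothetical path, not a real service root.
    for argv in deployment_commands("env-dataops", "services/example"):
        print(shlex.join(argv))
```

Returning the commands instead of running them keeps the sketch safe to execute anywhere; a wrapper could pass each list to `subprocess.run` together with `deployment_env(profile)` once the real profile names and service roots are confirmed.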

Runtime Query Flow

sequenceDiagram
  participant Client as Client or DAG
  participant Kyuubi as Kyuubi
  participant Spark as Spark Engine
  participant Catalog as Glue or Polaris
  participant LF as Lake Formation
  participant S3 as Iceberg S3

  Client->>Kyuubi: submit SQL
  Kyuubi->>Spark: launch Spark engine with IRSA
  Spark->>Catalog: resolve namespace/table metadata
  Catalog->>LF: authorize metadata/data access
  Spark->>S3: read or write Iceberg files
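
From the client's side, only the first hop of the runtime flow above is visible: the client submits SQL to Kyuubi, and the Spark engine launch, catalog lookup, Lake Formation authorization, and S3 access all happen server-side. The following is a hedged sketch of that first hop, assuming Kyuubi is reachable over its HiveServer2-compatible Thrift interface and that the third-party PyHive package is installed. The host name, port, database, and authentication settings are assumptions; this page does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KyuubiEndpoint:
    host: str
    port: int = 10009  # Kyuubi's default Thrift binary port; confirm for this deployment
    database: str = "default"

    def jdbc_url(self) -> str:
        """HiveServer2-style JDBC URL, for JDBC clients pointed at the same endpoint."""
        return f"jdbc:hive2://{self.host}:{self.port}/{self.database}"


def submit_sql(endpoint, sql, username):
    """Submit one statement to Kyuubi and return all rows.

    Everything after this call (engine launch, metadata resolution,
    authorization, Iceberg reads and writes) is handled behind Kyuubi.
    """
    # Imported lazily so the rest of the sketch works without PyHive installed.
    from pyhive import hive

    conn = hive.connect(
        host=endpoint.host,
        port=endpoint.port,
        username=username,
        database=endpoint.database,
    )
    try:
        cursor = conn.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Hypothetical host name, used only to show the URL shape.
    print(KyuubiEndpoint("kyuubi.example.internal").jdbc_url())
```

A DAG task would call `submit_sql` the same way an interactive client does; the authorization decisions in the diagram depend on the identity the Spark engine runs under, not on anything in this client code.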

Checked Against