HeatSafe Server Deployment on GCP
This document is the production runbook for the HeatSafe sync server.
Target shape
- Cloud Run service for the API and dashboard
- Cloud Run job for database migrations
- Cloud SQL for PostgreSQL
- Secret Manager for DATABASE_URL
- Direct Cloud Run IAP for staff dashboard access
This keeps the first production version small. It avoids a load balancer unless the project later needs custom domains or multi-region routing.
Prerequisites
- A Google Cloud project with billing enabled
- gcloud authenticated to the target project
- Artifact Registry, Cloud Run, Cloud SQL Admin, Secret Manager, and IAP APIs enabled
- A regional choice shared by Cloud Run and Cloud SQL. Default: us-west1
Required environment values
- PROJECT_ID
- PROJECT_NUMBER
- REGION
- REPOSITORY for Artifact Registry, example heatsafe
- IMAGE full Artifact Registry image URL
- SERVICE_NAME, example heatsafe-server
- MIGRATION_JOB, example heatsafe-server-migrate
- INSTANCE_NAME, example heatsafe-pg
- DATABASE_NAME, example heatsafe
- DATABASE_USER, example heatsafe_app
- SERVER_SURFACE for split deployments: api or dashboard
Example shell session:
export PROJECT_ID="your-project-id"
export PROJECT_NUMBER="$(gcloud projects describe "$PROJECT_ID" --format='value(projectNumber)')"
export REGION="us-west1"
export REPOSITORY="heatsafe"
export SERVICE_NAME="heatsafe-server"
export MIGRATION_JOB="heatsafe-server-migrate"
export INSTANCE_NAME="heatsafe-pg"
export DATABASE_NAME="heatsafe"
export DATABASE_USER="heatsafe_app"
export IMAGE="${REGION}-docker.pkg.dev/${PROJECT_ID}/${REPOSITORY}/${SERVICE_NAME}:latest"
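Every later command depends on these variables, so it may help to fail fast when one is missing. The sketch below shows one way to do that. `require_vars` is a hypothetical helper, not part of the HeatSafe repository.

```shell
# Hypothetical helper (not part of the HeatSafe repo): report every
# variable that is unset or empty, and return non-zero if any is missing.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup via eval keeps this compatible with POSIX sh.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example call, left commented out so sourcing this file has no side effects:
# require_vars PROJECT_ID PROJECT_NUMBER REGION REPOSITORY IMAGE SERVICE_NAME || exit 1
# require_vars MIGRATION_JOB INSTANCE_NAME DATABASE_NAME DATABASE_USER || exit 1
```

Running the check at the top of a session catches a failed PROJECT_NUMBER lookup before it silently produces a malformed service-agent address in step 9.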
1. Enable services
gcloud services enable \
artifactregistry.googleapis.com \
run.googleapis.com \
sqladmin.googleapis.com \
secretmanager.googleapis.com \
iap.googleapis.com \
cloudbuild.googleapis.com \
monitoring.googleapis.com \
logging.googleapis.com
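To confirm the APIs are actually enabled, the list of enabled services can be compared against the ones this runbook needs. This is a minimal sketch. `missing_services` is a hypothetical helper name, and it assumes `gcloud` is already pointed at the target project.

```shell
# Hypothetical helper: print each requested API that is not yet enabled
# in the active gcloud project. Prints nothing when all are enabled.
missing_services() {
  enabled="$(gcloud services list --enabled --format='value(config.name)')" || return 2
  for svc in "$@"; do
    # -F: literal match (dots are not wildcards), -x: whole line, -q: quiet.
    printf '%s\n' "$enabled" | grep -Fqx "$svc" || echo "$svc"
  done
}

# Example call, left commented out:
# missing_services run.googleapis.com sqladmin.googleapis.com iap.googleapis.com
```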
2. Create Artifact Registry
gcloud artifacts repositories create "$REPOSITORY" \
--repository-format=docker \
--location="$REGION" \
--description="HeatSafe server images"
If the repository already exists, this command can be skipped.
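The skip-if-exists rule can be scripted so the step is safe to re-run. This is a sketch, not the project's tooling. `ensure_repository` is a hypothetical name, and it treats any failure of `describe` as "repository not found".

```shell
# Hypothetical helper: create the Artifact Registry repository only when
# `describe` fails (assumed here to mean the repository does not exist).
ensure_repository() {
  # $1 = repository name, $2 = location
  if gcloud artifacts repositories describe "$1" --location="$2" >/dev/null 2>&1; then
    echo "repository $1 already exists, skipping create"
  else
    gcloud artifacts repositories create "$1" \
      --repository-format=docker \
      --location="$2" \
      --description="HeatSafe server images"
  fi
}

# Example call, left commented out:
# ensure_repository "$REPOSITORY" "$REGION"
```

Note that a permissions error also makes `describe` fail, in which case the subsequent `create` will surface the real problem.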
3. Create Cloud SQL for PostgreSQL
gcloud sql instances create "$INSTANCE_NAME" \
--database-version=POSTGRES_16 \
--cpu=2 \
--memory=8GiB \
--region="$REGION" \
--storage-size=20GB \
--availability-type=zonal \
--edition=enterprise
gcloud sql databases create "$DATABASE_NAME" \
--instance="$INSTANCE_NAME"
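DB_PASSWORD="$(openssl rand -hex 24)"
gcloud sql users create "$DATABASE_USER" \
--instance="$INSTANCE_NAME" \
--password="$DB_PASSWORD"
The generated password exists only in the DB_PASSWORD shell variable, so store it in a password manager before closing the session. Hex output is used because it contains no characters that need escaping in a connection URL.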
4. Create Secret Manager secrets
The service currently reads a single DATABASE_URL environment variable. For Cloud Run plus Cloud SQL public-IP socket connectivity, store the final connection string as one secret.
Example connection string:
postgres://USER:PASSWORD@/DATABASE?host=/cloudsql/PROJECT:REGION:INSTANCE
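Create the secret, substituting the database user's password from step 3 for DB_PASSWORD. Percent-encode the password first if it contains characters such as / or +: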
printf '%s' "postgres://${DATABASE_USER}:DB_PASSWORD@/${DATABASE_NAME}?host=/cloudsql/${PROJECT_ID}:${REGION}:${INSTANCE_NAME}" \
| gcloud secrets create heatsafe-database-url \
--replication-policy=automatic \
--data-file=-
Grant the runtime service account access after the service account exists.
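The connection string format above can be assembled with the password percent-encoded, which matters when the password contains URL-reserved characters. This is a minimal sketch. `urlencode` and `build_database_url` are hypothetical helpers, the encoder handles ASCII input only, and it assumes the server's PostgreSQL client decodes percent-encoded credentials.

```shell
# Hypothetical helper: percent-encode every character outside the
# RFC 3986 unreserved set. Intended for ASCII input only.
urlencode() {
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    c=${s%"$rest"}       # the first character itself
    case $c in
      [A-Za-z0-9._~-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;  # "'c" yields the character code
    esac
    s=$rest
  done
  printf '%s' "$out"
}

# Hypothetical helper: user, password, database, Cloud SQL connection name.
build_database_url() {
  printf 'postgres://%s:%s@/%s?host=/cloudsql/%s' \
    "$1" "$(urlencode "$2")" "$3" "$4"
}

# Example call, left commented out (DB_PASSWORD is assumed to hold the password):
# build_database_url "$DATABASE_USER" "$DB_PASSWORD" "$DATABASE_NAME" \
#   "${PROJECT_ID}:${REGION}:${INSTANCE_NAME}"
```

The output can be piped straight into `gcloud secrets create ... --data-file=-` so the password never appears in a file on disk.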
5. Build and push the server image
Run this from the repository root:
gcloud builds submit ./server --tag "$IMAGE"
6. Create the Cloud Run runtime service account
gcloud iam service-accounts create heatsafe-server-sa \
--display-name="HeatSafe Server Runtime"
Grant required roles:
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
--member="serviceAccount:heatsafe-server-sa@${PROJECT_ID}.iam.gserviceaccount.com" \
--role="roles/cloudsql.client"
gcloud secrets add-iam-policy-binding heatsafe-database-url \
--member="serviceAccount:heatsafe-server-sa@${PROJECT_ID}.iam.gserviceaccount.com" \
--role="roles/secretmanager.secretAccessor"
7. Deploy the migration job
Use the same container image for migrations. The compiled migration entrypoint is dist/src/db/migrate.js.
gcloud run jobs create "$MIGRATION_JOB" \
--image="$IMAGE" \
--region="$REGION" \
--service-account="heatsafe-server-sa@${PROJECT_ID}.iam.gserviceaccount.com" \
--set-secrets="DATABASE_URL=heatsafe-database-url:latest" \
--set-env-vars="NODE_ENV=production" \
--add-cloudsql-instances="${PROJECT_ID}:${REGION}:${INSTANCE_NAME}" \
--command="node" \
--args="dist/src/db/migrate.js"
Run it before the first deploy and before each schema rollout:
gcloud run jobs execute "$MIGRATION_JOB" \
--region="$REGION" \
--wait
8. Deploy the Cloud Run service
Google’s current guidance allows enabling IAP directly on Cloud Run. That is the simplest fit for this server because the dashboard already expects the Google-authenticated email header.
For production pilot deployments, prefer two Cloud Run services from the same image:
- public API service with SERVER_SURFACE=api
- IAP-protected dashboard service with SERVER_SURFACE=dashboard
The following command deploys the IAP-protected dashboard service:
gcloud run deploy "$SERVICE_NAME" \
--image="$IMAGE" \
--region="$REGION" \
--service-account="heatsafe-server-sa@${PROJECT_ID}.iam.gserviceaccount.com" \
--no-allow-unauthenticated \
--iap \
--add-cloudsql-instances="${PROJECT_ID}:${REGION}:${INSTANCE_NAME}" \
--set-secrets="DATABASE_URL=heatsafe-database-url:latest" \
--set-env-vars="NODE_ENV=production,SERVER_SURFACE=dashboard" \
--min-instances=0 \
--max-instances=10 \
--cpu=1 \
--memory=512Mi
If this is the first time IAP is enabled in a project without an organization, Google may require the initial enablement to be done once in the Cloud Run console.
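For the split deployment described above, the public API service could be deployed from the same image with a different surface setting. The sketch below is an assumption about how that might look. The `-api` service-name suffix is hypothetical, and `deploy_api_service` is not part of the repository. It reuses the variables from the example shell session.

```shell
# Hypothetical sketch: deploy the public API surface as its own service.
# The "-api" name suffix is an assumption, not a project convention.
deploy_api_service() {
  gcloud run deploy "${SERVICE_NAME}-api" \
    --image="$IMAGE" \
    --region="$REGION" \
    --service-account="heatsafe-server-sa@${PROJECT_ID}.iam.gserviceaccount.com" \
    --allow-unauthenticated \
    --add-cloudsql-instances="${PROJECT_ID}:${REGION}:${INSTANCE_NAME}" \
    --set-secrets="DATABASE_URL=heatsafe-database-url:latest" \
    --set-env-vars="NODE_ENV=production,SERVER_SURFACE=api"
}

# Example call, left commented out:
# deploy_api_service
```

Unlike the dashboard service, this one omits the IAP flag and allows unauthenticated requests, so it relies on the device-token checks in the application for protection.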
9. Grant IAP access
Grant the IAP service agent permission to invoke the service:
gcloud run services add-iam-policy-binding "$SERVICE_NAME" \
--region="$REGION" \
--member="serviceAccount:service-${PROJECT_NUMBER}@gcp-sa-iap.iam.gserviceaccount.com" \
--role="roles/run.invoker"
Grant staff access through IAP:
gcloud iap web add-iam-policy-binding \
--region="$REGION" \
--resource-type=cloud-run \
--service="$SERVICE_NAME" \
--member="user:staff1@example.com" \
--role="roles/iap.httpsResourceAccessor"
Repeat the final command for each user or Google Group that needs dashboard access.
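When several principals need access, the repetition can be wrapped in a loop. This is a sketch that reuses the command above unchanged. `grant_dashboard_access` is a hypothetical helper, and the group address in the example call is a placeholder.

```shell
# Hypothetical helper: grant IAP dashboard access to each member given.
# Members use IAM syntax, for example user:... or group:...
grant_dashboard_access() {
  for member in "$@"; do
    gcloud iap web add-iam-policy-binding \
      --region="$REGION" \
      --resource-type=cloud-run \
      --service="$SERVICE_NAME" \
      --member="$member" \
      --role="roles/iap.httpsResourceAccessor" || return 1
  done
}

# Example call, left commented out (the group address is a placeholder):
# grant_dashboard_access "user:staff1@example.com" "group:heatsafe-staff@example.com"
```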
10. Verify the deployment
Check the service configuration:
gcloud run services describe "$SERVICE_NAME" \
--region="$REGION"
Expected checks:
- Iap Enabled: true
- Cloud SQL connection name is attached
- service account is heatsafe-server-sa@PROJECT_ID.iam.gserviceaccount.com
Operational checks:
- Confirm GET /health returns ok: true
- Open /dashboard in a browser while signed in as a Google user or group member with roles/iap.httpsResourceAccessor
- Run a registration request and a sync batch against the service URL from a trusted client
- Confirm POST /v1/sync/batches returns serverAckedAt and receiptStatus
- Confirm POST /v1/sync/acks moves the batch to client_acked
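The health check can be scripted for a service that is reachable without IAP, such as a public API service. This is a sketch under assumptions. `health_ok` is a hypothetical helper, and it assumes the endpoint returns a JSON body containing an `ok` field set to `true`, since the exact response shape is not shown in this runbook.

```shell
# Hypothetical helper: succeed only if GET <base-url>/health returns a
# body containing "ok": true (assumed JSON shape).
health_ok() {
  # -f: fail on HTTP errors, -sS: silent but still print errors.
  body="$(curl -fsS "$1/health")" || return 1
  printf '%s' "$body" | grep -Eq '"ok"[[:space:]]*:[[:space:]]*true'
}

# Example call, left commented out:
# SERVICE_URL="$(gcloud run services describe "$SERVICE_NAME" \
#   --region="$REGION" --format='value(status.url)')"
# health_ok "$SERVICE_URL" && echo "health check passed"
```

An IAP-protected service will reject or redirect this unauthenticated request, so the dashboard check still needs a signed-in browser.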
11. Rollout sequence for future releases
- Build and push the new image.
- Execute the migration job with the new image if the schema changed.
- Deploy the service with the new image.
- Verify /dashboard, registration, batch upload, and final ACK.
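The first three steps of this sequence could be combined into a single function so they always run in the same order. This is a sketch rather than the project's release tooling. `rollout` is a hypothetical name, and the `jobs update` call is an assumption added so the job picks up the newly pushed image before it runs.

```shell
# Hypothetical helper: build, optionally migrate, then deploy.
# Pass "yes" as the first argument when the schema changed.
rollout() {
  schema_changed="$1"

  # 1. Build and push the new image.
  gcloud builds submit ./server --tag "$IMAGE" || return 1

  # 2. Point the job at the new image and run migrations, only if needed.
  if [ "$schema_changed" = "yes" ]; then
    gcloud run jobs update "$MIGRATION_JOB" --image="$IMAGE" --region="$REGION" || return 1
    gcloud run jobs execute "$MIGRATION_JOB" --region="$REGION" --wait || return 1
  fi

  # 3. Deploy the service with the new image; other settings are kept.
  gcloud run deploy "$SERVICE_NAME" --image="$IMAGE" --region="$REGION" || return 1
}

# Example call, left commented out:
# rollout yes
```

Each step stops the rollout on failure, so a failed migration never leads to a deploy against an outdated schema. Verification stays a manual step.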
12. Monitoring and alerts
Set these alerts in Cloud Monitoring before production use:
- Cloud Run 5xx rate above baseline for the service
- Cloud Run request latency spikes for /v1/sync/batches and /v1/sync/acks
- Cloud SQL instance CPU, memory, and connection saturation
- Log-based alert on repeated sync ingest failures or database connection failures
- Daily check for ingest_batches.receipt_status = 'awaiting_client_ack' older than 24 hours
- Daily check for rising receipt_status = 'expired' counts
Useful logging filters. The first three lines match service errors and the last three match migration job errors:
resource.type="cloud_run_revision"
resource.labels.service_name="heatsafe-server"
severity>=ERROR
resource.type="cloud_run_job"
resource.labels.job_name="heatsafe-server-migrate"
severity>=ERROR
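These filters can also be run from the command line with `gcloud logging read`. The sketch below builds the service filter with the clauses joined by explicit AND so it fits in one argument. `service_error_filter` and `recent_service_errors` are hypothetical helper names, and the one-hour window and 20-entry limit are arbitrary choices.

```shell
# Hypothetical helper: build the service error filter for a given
# Cloud Run service name.
service_error_filter() {
  printf 'resource.type="cloud_run_revision" AND resource.labels.service_name="%s" AND severity>=ERROR' "$1"
}

# Hypothetical helper: show up to 20 matching entries from the last hour.
recent_service_errors() {
  gcloud logging read "$(service_error_filter "$1")" --freshness=1h --limit=20
}

# Example call, left commented out:
# recent_service_errors "$SERVICE_NAME"
```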
13. Notes and limitations
- The dashboard trust model assumes Google IAP injects the authenticated user email header.
- Do not expose dashboard routes on a public unauthenticated Cloud Run service. If the mobile API must be public, deploy the API and dashboard as separate Cloud Run services or add a production route-surface guard so public traffic cannot reach the dashboard.
- The server keeps unconfirmed batch receipts for seven days.
- The client should only delete local outbox rows after POST /v1/sync/acks succeeds.
- If the client retries the same batch before the final ACK, the server returns the stored receipt state instead of duplicating data.
14. Future production hardening
The current sync protection is device-token based:
- POST /v1/sync/batches, POST /v1/sync/acks, and POST /v1/opt-in/disable require a valid bearer device token.
- Device tokens are generated by the server and only token hashes are stored in PostgreSQL.
- Upload payloads are schema-validated, request bodies are capped, and duplicate event IDs or batch retries are handled idempotently.
The current weak point is anonymous registration:
- POST /v1/opt-in/register intentionally allows a new install to register without prior staff approval.
- Anyone who discovers the endpoint could register a fake device, receive a valid token, and then submit schema-valid fake data.
Before a broader production launch, add one or more enrollment controls:
- staff-issued participant or study enrollment codes
- one-time registration invite tokens
- mobile app integrity checks such as Play Integrity, App Attest, or Firebase App Check
- rate limiting on registration and sync endpoints
- abuse monitoring for unusual registration volume, upload volume, locations, or repeated invalid payloads
Dashboard access should be managed at the IAP layer:
- Cloud Run IAP should authenticate and authorize staff before traffic reaches dashboard routes.
- Manage dashboard access through Cloud Run IAP IAM, preferably by granting roles/iap.httpsResourceAccessor to a Google Group.
- ALLOW_INSECURE_DEV_AUTH must stay disabled in production.