Wake Up EC2 Active

Enables authorized users to activate a stopped EC2 instance on demand.

Overview

EC2 is a cost-effective host for services that run intermittently, such as analytics dashboards, ETL pipelines, and development environments, but only if the instance is not left running 24/7. This pattern keeps the instance stopped until it is needed, starts it on demand from a static web page, and stops it automatically when idle.

The architecture requires no always-on backend. A static page hosted on S3 and served via CloudFront calls a Lambda function that authenticates the user, starts the EC2 instance, and reports progress on each poll; the page redirects the user once the application is ready. For scheduled workloads, EventBridge triggers a second Lambda that starts the instance and launches the job via SSM Run Command; the job itself stops the instance when it finishes.

Two complementary stop mechanisms keep costs in check: a scheduled job stops the instance when the pipeline completes, provided no users are active, and a CloudWatch alarm stops it after 20 minutes of network inactivity. If a user is on the dashboard when the pipeline finishes, the script skips the stop and leaves it to the CloudWatch alarm rather than dropping the session.

A concrete example of this architecture is kdparser, which serves a Metabase analytics dashboard backed by PostgreSQL. It processes a weekly Firebase export on a schedule and serves the dashboard on demand, so the instance runs only when needed.

If this architecture could help solve a problem you're working on, or if you'd like to discuss the approach, we'd love to hear from you. Use the contact form at the bottom of this page to get in touch.

Component Diagram

[Component diagram]

Key Components

CloudFront + S3

Serves the static wake-up page. No server required — the page is a plain HTML file uploaded to S3 and distributed via CloudFront.

Lambda — Wake-Up

The core of the pattern. Authenticates the user, calls the EC2 StartInstances API, polls DescribeInstances until the instance is running, updates DNS, and probes the application health endpoint. Each call returns a status so the page can show progress. In kdparser, this is implemented in Python.
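A minimal sketch of one poll of that flow, assuming the EC2 client is passed in (for example `boto3.client("ec2")`) and that DNS updates and health probes are supplied as callables. The function name `wake_step` and the status strings are illustrative, not taken from kdparser.

```python
from typing import Any, Callable, Optional


def wake_step(
    ec2: Any,
    instance_id: str,
    update_dns: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> dict:
    """Run one poll of the wake-up flow and report the current stage."""
    reservations = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"]
    instance = reservations[0]["Instances"][0]
    state = instance["State"]["Name"]

    if state == "stopped":
        # First call: request the start and let the page poll again.
        ec2.start_instances(InstanceIds=[instance_id])
        return {"status": "starting"}
    if state != "running":
        # pending, stopping, and similar states: nothing to do but wait.
        return {"status": state}

    public_ip: Optional[str] = instance.get("PublicIpAddress")
    if public_ip is None:
        return {"status": "waiting_for_ip"}

    # The instance is running with an address: refresh DNS, then probe the app.
    update_dns(public_ip)
    if not is_healthy(public_ip):
        return {"status": "booting"}
    return {"status": "ready"}
```

Because each call performs a single step and returns, the Lambda stays short-lived and the page drives the loop.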

SSM Parameter Store

Stores credentials for authorized users. The Lambda retrieves and verifies them on every request — credentials are never cached or returned to the client. Any authentication mechanism can be plugged in here: Cognito, an external identity provider, or API keys. SSM Parameter Store was chosen for kdparser because it is a personal project with a defined set of users — it is simple, serverless, and requires no additional infrastructure.
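The per-request check might look like the sketch below. The storage format is an assumption: a single SecureString parameter holding JSON that maps each username to a salt and a PBKDF2 hash. The parameter layout, the work factor, and the function names are hypothetical; only `get_parameter` with `WithDecryption` is the real SSM call.

```python
import hashlib
import hmac
import json
from typing import Any

PBKDF2_ROUNDS = 200_000  # assumed work factor, not taken from kdparser


def hash_password(password: str, salt: bytes) -> str:
    """Derive a hex digest from the password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, PBKDF2_ROUNDS).hex()


def verify_user(ssm: Any, parameter_name: str, username: str, password: str) -> bool:
    """Fetch the stored records on every call and compare in constant time."""
    response = ssm.get_parameter(Name=parameter_name, WithDecryption=True)
    users = json.loads(response["Parameter"]["Value"])
    record = users.get(username)
    if record is None:
        return False
    actual = hash_password(password, bytes.fromhex(record["salt"]))
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(actual, record["hash"])
```

Nothing is kept between invocations, which matches the "never cached" behaviour described above.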

EC2 Instance

Runs the actual service. Stopped when idle, started on demand. The instance type and storage are sized for the workload, not for always-on availability. In kdparser, the instance runs Docker Compose with PostgreSQL and Metabase.

Route 53

A dynamic A record updated by the wake-up Lambda on each poll once the instance has a public IP (TTL 60s). Eliminates the need for an Elastic IP and its associated idle charges.

EventBridge

Triggers the scheduler Lambda on a cron schedule for automated workloads. In kdparser, this runs on a weekly schedule.
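As an illustration, the arguments for a weekly rule could be assembled as follows. The rule name, target id, and the Monday 06:00 UTC expression are assumptions; the source only says the schedule is weekly. The two dictionaries are shaped for the EventBridge `put_rule` and `put_targets` calls.

```python
from typing import Tuple


def weekly_rule(
    name: str,
    lambda_arn: str,
    expression: str = "cron(0 6 ? * MON *)",  # assumed: Mondays at 06:00 UTC
) -> Tuple[dict, dict]:
    """Build keyword arguments for put_rule and put_targets."""
    rule = {"Name": name, "ScheduleExpression": expression, "State": "ENABLED"}
    targets = {
        "Rule": name,
        "Targets": [{"Id": "scheduler-lambda", "Arn": lambda_arn}],
    }
    return rule, targets


# Usage with a real client would be roughly:
#   events.put_rule(**rule); events.put_targets(**targets)
```

The Lambda also needs a resource-based permission that lets events.amazonaws.com invoke it; that step is omitted here.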

Lambda — Scheduler

Starts the EC2 instance, waits for it to be running, then issues an SSM Run Command to execute the workload script. Returns immediately — does not wait for the script to finish. In kdparser, this is implemented in Python.
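A sketch of that handler body, assuming boto3-style EC2 and SSM clients are passed in. The function name and the `command` argument are placeholders; `AWS-RunShellScript` is the standard SSM document for shell commands.

```python
from typing import Any


def run_scheduled_job(ec2: Any, ssm: Any, instance_id: str, command: str) -> str:
    """Start the instance, wait for 'running', then dispatch the job via SSM."""
    ec2.start_instances(InstanceIds=[instance_id])
    # Block until EC2 reports the instance as running.
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])

    response = ssm.send_command(
        InstanceIds=[instance_id],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": [command]},
        CloudWatchOutputConfig={"CloudWatchOutputEnabled": True},
    )
    # Return the command id right away; the script keeps running on the instance.
    return response["Command"]["CommandId"]
```

The SSM agent can take a little time after the running state before it accepts commands, so a real implementation would probably wrap `send_command` in a short retry loop.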

SSM Run Command

Executes a shell script on the EC2 instance without requiring SSH or an open inbound port. Output streams to CloudWatch Logs. In kdparser, the pipeline script is invoked with a flag that stops the instance on completion.

CloudWatch Alarm

Monitors EC2 network traffic at regular intervals and stops the instance after a sustained period with little or no traffic (20 minutes in this design). A stopped instance publishes no metrics, so TreatMissingData is set to notBreaching to keep the alarm from firing while the instance is already stopped.
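One way to express such an alarm is sketched below as the keyword arguments for CloudWatch `put_metric_alarm`. The metric (`NetworkOut`), the byte threshold, and the alarm name are assumptions; four 5-minute periods give the 20-minute idle window, and the `arn:aws:automate:<region>:ec2:stop` action is the built-in EC2 stop action.

```python
def idle_alarm_params(instance_id: str, region: str, threshold_bytes: int = 50_000) -> dict:
    """Build put_metric_alarm arguments for a 20-minute idle-traffic stop alarm."""
    return {
        "AlarmName": f"idle-stop-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "NetworkOut",  # assumed metric; NetworkIn would also work
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Sum",
        "Period": 300,               # seconds per datapoint
        "EvaluationPeriods": 4,      # 4 x 5 minutes = 20 minutes
        "Threshold": threshold_bytes,
        "ComparisonOperator": "LessThanThreshold",
        # A stopped instance emits no datapoints; do not treat that as idle.
        "TreatMissingData": "notBreaching",
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:stop"],
    }


# Usage with a real client would be roughly:
#   cloudwatch.put_metric_alarm(**idle_alarm_params("i-0123", "us-east-1"))
```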

Sequence Diagrams

Manual Wake-Up

[Sequence diagram: manual wake-up]

Scheduled Wake-Up

[Sequence diagram: scheduled wake-up]

Auto-Stop

[Sequence diagram: auto-stop]

Design Decisions

Dynamic DNS instead of Elastic IP

The wake-up Lambda upserts the Route 53 A record on every poll once the instance has a public IP (TTL 60s). This eliminates idle EIP charges without any user-visible impact — users always go through the wake-up flow before accessing the service.
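The upsert can be sketched as follows, assuming a boto3-style Route 53 client is passed in. The function names, hosted zone id, and record name are placeholders; the change batch follows the shape that `change_resource_record_sets` expects.

```python
from typing import Any


def build_upsert(record_name: str, public_ip: str, ttl: int = 60) -> dict:
    """Build a change batch that creates or overwrites a single A record."""
    return {
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "A",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": public_ip}],
                },
            }
        ]
    }


def update_dns(route53: Any, hosted_zone_id: str, record_name: str, public_ip: str) -> None:
    """Point the record at the instance's current public IP."""
    route53.change_resource_record_sets(
        HostedZoneId=hosted_zone_id,
        ChangeBatch=build_upsert(record_name, public_ip),
    )
```

UPSERT is idempotent, so repeating it on every poll is harmless even when the address has not changed.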

Single schedule management point

EventBridge triggers a Lambda that starts EC2 and issues an SSM Run Command directly. There is no cron job on the EC2 instance itself, eliminating the dual-schedule management problem and making the schedule visible and auditable in AWS.

Health check instead of fixed wait

The wake-up Lambda polls the application's health endpoint directly rather than waiting a fixed number of seconds after the instance reaches running state. The page redirects exactly when the application is ready, regardless of boot time variance.
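A minimal probe along those lines, using only the standard library. The URL is whatever health endpoint the application exposes, and the `opener` parameter is a hypothetical hook added here so the function can be exercised without a network.

```python
import urllib.request
from typing import Callable


def is_healthy(
    url: str,
    timeout: float = 3.0,
    opener: Callable = urllib.request.urlopen,
) -> bool:
    """Return True only when the health endpoint answers with HTTP 200."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts, and urllib's URLError/HTTPError,
        # all of which simply mean "not ready yet" during boot.
        return False
```

A short timeout keeps each Lambda invocation brief; the page simply polls again until the probe succeeds.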

Polling uses POST throughout

The wake-up page polls instance state via POST rather than GET. Browsers do not send a request body with GET (XMLHttpRequest silently drops it, and fetch refuses to construct the request), so credentials placed there never reach the Lambda. In the silent case the Lambda returns 401 and the poll loop stalls without a visible error. POST carries the credentials reliably on every request.
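On the Lambda side, reading the credentials from the POST body might look like this sketch. It assumes the function is invoked with the payload format 2.0 event shape (Lambda function URLs and API Gateway HTTP APIs) and that the page sends a JSON body with `username` and `password` fields; both are assumptions, since the source does not say how the Lambda is exposed.

```python
import base64
import json
from typing import Optional, Tuple


def parse_credentials(event: dict) -> Optional[Tuple[str, str]]:
    """Extract (username, password) from a POST event, or None if absent."""
    method = event.get("requestContext", {}).get("http", {}).get("method")
    if method != "POST":
        return None
    body = event.get("body") or ""
    try:
        if event.get("isBase64Encoded"):
            body = base64.b64decode(body).decode("utf-8")
        payload = json.loads(body)
    except ValueError:
        # Bad base64, bad UTF-8, and bad JSON all raise ValueError subclasses.
        return None
    if not isinstance(payload, dict):
        return None
    username, password = payload.get("username"), payload.get("password")
    if not isinstance(username, str) or not isinstance(password, str):
        return None
    return username, password
```

A `None` result would map to a 401 response, which is the failure the GET-based poll runs into.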

Two stop mechanisms, complementary scope

Scheduled runs stop the instance after the job finishes, but only if no users have an active session on port 443. If a session is present, the script skips the stop and the CloudWatch idle alarm takes over, stopping the instance once the session ends naturally. This prevents the pipeline from dropping an active Metabase session mid-use.
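The session check could be sketched as below. It assumes the script inspects the output of `ss -tn` on the instance and counts established TCP connections whose local port is 443; how kdparser actually detects sessions is not stated in the source, and both function names are hypothetical.

```python
def count_sessions(ss_output: str, port: int = 443) -> int:
    """Count established connections to the given local port in `ss -tn` output."""
    count = 0
    for line in ss_output.splitlines():
        fields = line.split()
        # Expected columns: State Recv-Q Send-Q Local:Port Peer:Port
        if len(fields) < 5 or fields[0] != "ESTAB":
            continue
        local_port = fields[3].rsplit(":", 1)[-1]
        if local_port == str(port):
            count += 1
    return count


def should_stop_after_job(ss_output: str, port: int = 443) -> bool:
    """Stop only when nobody is connected; otherwise leave it to the idle alarm."""
    return count_sessions(ss_output, port) == 0
```

When this returns False the script simply exits, and the CloudWatch alarm stops the instance once traffic has been idle for long enough.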

No session tokens

The Lambda re-validates credentials on every request, including each poll during startup. There is no session token mechanism. This keeps the Lambda stateless and eliminates an entire class of token management complexity at the cost of additional verification operations — acceptable at this scale.