Stale orphan EC2 instances not terminated when GitHub runner API returns no state or fetch fails #4724

@rajanikant05

Issue

Stale EC2 instances tagged as orphans (orphan=true) are not terminated during scale-down as expected. This happens when the scale-down lambda cannot fetch runner details from the GitHub API: the request either fails (for example, with a 404) or returns no runner state.

Impact

  • Orphaned EC2 instances remain active, causing resource leakage and increased costs.
  • The termination logic does not handle cases where the GitHub runner state cannot be fetched, resulting in instances persisting indefinitely.

Expected Behavior

  • Orphaned EC2 instances should be terminated if, on re-verification, the GitHub API still fails to return runner details or returns no state (e.g., 404 Not Found).
  • The scale-down process should treat missing runner details as sufficient to proceed with termination.
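
To make the expected behavior concrete, here is a minimal sketch of the decision as a pure function. It is not the project's code: the `RunnerLookup` type and the `shouldTerminateOrphan` function are hypothetical names, and the sketch assumes the lambda records one lookup result for the initial check and one for each re-verification.

```typescript
// Hypothetical model of one GitHub runner lookup (not the project's types).
type RunnerLookup =
  | { kind: "found" }                   // GitHub returned runner state
  | { kind: "notFound" }                // e.g. HTTP 404, runner is deregistered
  | { kind: "error"; message: string }; // fetch failed or returned no state

// Decide whether an instance tagged orphan=true should be terminated.
// `checks` holds the initial lookup followed by any re-verification lookups.
function shouldTerminateOrphan(checks: RunnerLookup[]): boolean {
  // With no lookups at all there is no evidence either way, so keep the instance.
  if (checks.length === 0) {
    return false;
  }
  // Terminate only if no lookup ever found the runner; a single "found"
  // means the runner still exists and the instance is not an orphan.
  return checks.every((check) => check.kind !== "found");
}
```

Under these assumptions, `shouldTerminateOrphan([{ kind: "notFound" }, { kind: "error", message: "timeout" }])` returns `true`, while any list containing a `found` result returns `false`. Treating a failed fetch the same as a 404 follows the request in this issue; a more conservative variant might terminate only on an explicit 404.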

Steps to Reproduce

  1. Deregister a GitHub runner so it no longer exists in the GitHub API.
  2. Ensure the corresponding EC2 instance is tagged with orphan=true.
  3. Observe that the scale-down lambda fails to terminate the instance when the API fetch fails or returns 404.
