
github-runners

balena deployment of self-hosted GitHub runners

Runners are deployed in two variants, vm and container; the vm variant is isolated and therefore safe to use on public repositories.

See github-runner-vm and self-hosted-runners for image sources.

VM Runner Sizes

Firecracker allows overprovisioning or oversubscribing of both CPU and memory resources for virtual machines (VMs) running on a host. This means that the total vCPUs and memory allocated to the VMs can exceed the actual physical CPU cores and memory available on the host machine.

To make the most efficient use of host resources, we slightly overprovision the host hardware so that, if/when all allocated resources are consumed by jobs (e.g. Yocto builds), there is minimal overlap that could lead to performance degradation.
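For a concrete picture, here is a minimal sketch of what oversubscription looks like at the Firecracker level, using its machine-config API over a Unix socket. The socket path, host size (8 cores / 32 GiB), and VM sizes are illustrative assumptions, not values from this repo:

    # Hypothetical: one of four identical VMs on an 8-core / 32 GiB host,
    # i.e. 16 vCPUs and 64 GiB allocated in total -- 2x oversubscription.
    curl --unix-socket /tmp/firecracker.socket \
        -X PUT 'http://localhost/machine-config' \
        -H 'Content-Type: application/json' \
        -d '{"vcpu_count": 4, "mem_size_mib": 16384}'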

See the github-runner-vm README for more.

Provision New Hardware

Hetzner

balenaOS can be deployed to dedicated servers managed via Hetzner Robot.

  1. Order a suitable machine in an ES rack (remote power controls)
  2. Download the balenaOS production image from the target balenaCloud fleet (see the sketch after this list)
  3. For x64 only: unwrap the image
  4. Copy the unwrapped image to the S3 playground bucket and make it public:
    aws s3 cp balena.img s3://{{bucket}}/ --acl public-read
    
  5. Activate the Hetzner Rescue system
  6. Reboot or reset the server
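Steps 2 and 4 can be scripted; a rough sketch using the balena and AWS CLIs (fleet name and bucket are placeholders, flags vary by CLI version, and the x64 unwrap step is not covered here):

    # Hypothetical sketch of steps 2 and 4; adjust the device type and names.
    balena os download generic-amd64 --output balena.img
    balena os configure balena.img --fleet {{fleet}}
    aws s3 cp balena.img s3://{{bucket}}/ --acl public-read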

Single drive

[!NOTE] This leaves the second block device unpaired and empty

  1. Download (and uncompress, if needed) the unwrapped balenaOS image to /tmp using wget (the URL is in the S3 dashboard; the full flow is sketched after this list)
  2. (Optional) Zero out the target disk(s):
    for device in nvme{0,1}n1; do
        blkdiscard -f /dev/${device}
    done
    
  3. Write the image to disk:
    dd if=/tmp/balena.img of=/dev/nvme1n1 bs=$(blockdev --getbsz /dev/nvme1n1)
    
    (check lsblk output to confirm the target block device)
  4. Check the resulting partitions with fdisk -l /dev/nvme1n1
  5. Reboot
  6. Manually power cycle again via the Robot dashboard to work around this issue
  7. The machine should provision into the corresponding fleet
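Taken together, the download-and-write steps amount to something like the following; the object URL pattern and target device are assumptions, so confirm both against the S3 dashboard and lsblk before running dd:

    # Hypothetical end-to-end sketch for the single-drive path.
    lsblk -o NAME,SIZE,TYPE,MODEL          # identify the target disk
    wget -O /tmp/balena.img 'https://{{bucket}}.s3.amazonaws.com/balena.img'
    dd if=/tmp/balena.img of=/dev/nvme1n1 bs=$(blockdev --getbsz /dev/nvme1n1)
    fdisk -l /dev/nvme1n1                  # confirm the balenaOS partitions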

Two drives via RAID1

[!NOTE] Use generic-amd64 or generic-aarch64 balenaOS device type

  1. Remove any existing RAID array:
    mdadm --stop /dev/md127
    mdadm --remove /dev/md127
    
  2. Create RAID array:
    mdadm --create --verbose /dev/md127 \
      --level=1 \
      --raid-devices=2 /dev/nvme{0,1}n1 \
      --metadata=1.0
    
  3. Increase (re)sync speed:
    sysctl -w dev.raid.speed_limit_min=500000
    sysctl -w dev.raid.speed_limit_max=5000000
    
  4. Download the image from S3 via wget (the URL is in the S3 dashboard)
  5. Write the image to the RAID array:
    dd if=balena.img of=/dev/md127 bs=$(blockdev --getbsz /dev/md127)
    
  6. Check the resulting partitions with fdisk -l /dev/md127
  7. Monitor synchronization progress:
    watch cat /proc/mdstat
    
  8. Reboot once the array is 100% synchronized (see the health-check sketch after this list)
  9. Manually power cycle again via the Robot dashboard to work around this issue
  10. The machine should provision into the corresponding fleet
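Before the final reboot it is worth confirming the array is actually healthy; a minimal check (not part of the original steps, but standard mdadm usage):

    # State should be "clean" and both members active before rebooting.
    mdadm --detail /dev/md127
    grep '\[UU\]' /proc/mdstat   # [UU] means both mirror members are in sync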