As part of a disaster recovery plan, you can protect production data on your GitHub Enterprise instance by configuring automated backups.
In this guide
- About GitHub Enterprise Backup Utilities
- Installing GitHub Enterprise Backup Utilities
- Scheduling a backup
- Restoring a backup
GitHub Enterprise Backup Utilities is a backup system you install on a separate host, which takes backup snapshots of your GitHub Enterprise instance at regular intervals over a secure SSH network connection. You can use a snapshot to restore an existing GitHub Enterprise instance to a previous state from the backup host.
Only data added since the last snapshot will transfer over the network and occupy additional physical storage space. To minimize performance impact, backups are performed online under the lowest CPU/IO priority. You do not need to schedule a maintenance window to perform a backup.
For more detailed information on features, requirements, and advanced usage, see the GitHub Enterprise Backup Utilities README.
To use GitHub Enterprise Backup Utilities, you must have a Linux or Unix host system separate from your GitHub Enterprise instance.
You can also integrate GitHub Enterprise Backup Utilities into an existing environment for long-term permanent storage of critical data.
We recommend that the backup host and your GitHub Enterprise instance be geographically distant from each other. This ensures that backups are available for recovery in the event of a major disaster or network outage at the primary site.
Physical storage requirements will vary based on Git repository disk usage and expected growth patterns:
- Storage: five times the primary instance's allocated storage
More resources may be required depending on your usage, such as user activity and selected integrations.
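As a rough sizing aid, the five-times guideline can be turned into a quick calculation. This is only a sketch: the function name `required_backup_gb` and the 200 GB figure are made up for illustration, and real needs depend on repository growth.

```shell
# Hypothetical helper: estimate backup-host storage (in GB) from the
# primary instance's allocated storage (in GB), using the five-times guideline.
required_backup_gb() {
  echo $(( $1 * 5 ))
}

# A primary instance with 200 GB allocated suggests about 1000 GB on the backup host.
required_backup_gb 200
```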
Note: To ensure a recovered appliance is immediately available, perform backups targeting the primary instance even in a Geo-replication configuration.
Download the latest GitHub Enterprise Backup Utilities release and extract the file with the tar command:
tar -xzvf /path/to/github-backup-utils-vMAJOR.MINOR.PATCH.tar.gz
Copy the included backup.config-example file to backup.config and open it in an editor.
- Set the GHE_HOSTNAME value to your primary GitHub Enterprise instance's hostname or IP address.
- Set the GHE_DATA_DIR value to the filesystem location where you want to store backup snapshots.
- Open your primary instance's settings page at https://HOSTNAME/setup/settings and add the backup host's SSH key to the list of authorized SSH keys. For more information, see Accessing the administrative shell (SSH).
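For the two configuration values described above, a minimal backup.config might look like the following. Both values are placeholders chosen for illustration, not defaults; substitute your own hostname and storage path.

```shell
# Example backup.config values (placeholders, not defaults).
GHE_HOSTNAME="github.example.com"   # primary instance hostname or IP address
GHE_DATA_DIR="/data/ghe-backups"    # where snapshots are stored on the backup host
```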
Verify SSH connectivity with your GitHub Enterprise instance with the ghe-host-check command.
To create an initial full backup, run the ghe-backup command.
For more information on advanced usage, see the GitHub Enterprise Backup Utilities README.
You can schedule regular backups on the backup host using the cron(8) command or a similar command scheduling service. The configured backup frequency will dictate the worst-case recovery point objective (RPO) in your recovery plan. For example, if you have scheduled the backup to run every day at midnight, you could lose up to 24 hours of data in a disaster scenario. We recommend starting with an hourly backup schedule, guaranteeing a worst-case maximum of one hour of data loss if the primary site data is destroyed.
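An hourly schedule could be expressed as a crontab entry along these lines. This is a sketch that assumes the utilities were extracted to /opt/backup-utils (a hypothetical path) and that you want verbose output appended to a log file in the same directory; adjust both to your environment.

```shell
# Hypothetical install location of the extracted utilities.
BACKUP_UTILS_DIR="/opt/backup-utils"

# Crontab entry: run at minute 0 of every hour and append all output to a log file.
CRON_LINE="0 * * * * ${BACKUP_UTILS_DIR}/bin/ghe-backup -v 1>>${BACKUP_UTILS_DIR}/backup.log 2>&1"

# Print the entry for review; add it with `crontab -e` on the backup host.
printf '%s\n' "$CRON_LINE"
```

Changing the first two fields to `0 0` would give the daily midnight schedule from the example above, with its 24-hour worst-case RPO.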
If backup attempts overlap, the ghe-backup command will abort with an error message indicating that another backup is already running. If this occurs, we recommend decreasing the frequency of your scheduled backups. For more information, see the "Scheduling backups" section of the GitHub Enterprise Backup Utilities README.
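If you wrap ghe-backup in a larger scheduled script (for example, one that also copies snapshots elsewhere), you may want the whole script to skip a run while the previous one is still going. The following is a generic lock-directory sketch, not part of GitHub Enterprise Backup Utilities; the function name `run_exclusive` and the lock path in the comment are hypothetical.

```shell
# run_exclusive LOCKDIR COMMAND [ARGS...]
# Runs COMMAND only if LOCKDIR can be created; mkdir is atomic, so only
# one caller at a time gets the lock. Returns 75 if the lock is already held,
# otherwise returns the exit status of COMMAND.
run_exclusive() {
  re_lockdir=$1
  shift
  if ! mkdir "$re_lockdir" 2>/dev/null; then
    echo "another run holds $re_lockdir; skipping" >&2
    return 75
  fi
  "$@"
  re_status=$?
  rmdir "$re_lockdir"
  return "$re_status"
}

# Example (hypothetical paths):
#   run_exclusive /tmp/ghe-backup-wrapper.lock /opt/backup-utils/bin/ghe-backup
```

A skipped run still lengthens the effective interval between snapshots, so repeated skips are a sign the schedule is too aggressive.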
In the event of a prolonged outage or catastrophic event at the primary site, you can restore your GitHub Enterprise instance by provisioning another GitHub Enterprise appliance and performing a restore from the backup host. Before restoring, you must add the backup host's SSH key to the target GitHub Enterprise appliance as an authorized SSH key.
Warning: If you are restoring to a GitHub Enterprise 2.14 appliance from a backup taken on a 2.11, 2.12, or 2.13 appliance, then avoid using GitHub Enterprise Backup Utilities. Using GitHub Enterprise Backup Utilities could destroy old Elasticsearch indices not compatible with Elasticsearch 5.X. If you use GitHub Enterprise Backup Utilities to restore a backup to GitHub Enterprise 2.14, then manual reindexing could be necessary.
To restore your GitHub Enterprise instance from the last successful snapshot, use the ghe-restore command. You should see output similar to this:
ghe-restore -c 18.104.22.168
Checking for leaked keys in the backup snapshot that is being restored ...
* No leaked keys found
Connect 22.214.171.124:122 OK (v2.9.0)
WARNING: All data on GitHub Enterprise appliance 126.96.36.199 (v2.9.0) will be overwritten with data from snapshot 20170329T150710.
Please verify that this is the correct restore host before continuing.
Type 'yes' to continue: yes
Starting restore of 188.8.131.52:122 from snapshot 20170329T150710
# ...output truncated
Completed restore of 184.108.40.206:122 from snapshot 20170329T150710
Visit https://220.127.116.11/setup/settings to review appliance configuration.
Note: The network settings are excluded from the backup snapshot. You must manually configure the network on the target GitHub Enterprise appliance as required for your environment.
You can use these additional options with the ghe-restore command:
- The -c flag overwrites the settings, certificate, and license data on the target host even if it is already configured. Omit this flag if you are setting up a staging instance for testing purposes and you wish to retain the existing configuration on the target. For more information, see the "Using the backup and restore commands" section of the GitHub Enterprise Backup Utilities README.
- The -s flag allows you to select a different backup snapshot.
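To choose a snapshot for the -s flag, it helps to see which snapshot IDs exist on the backup host. The sketch below assumes snapshots are stored as directories directly under GHE_DATA_DIR, named with timestamps like the 20170329T150710 ID in the sample output; the function name `list_snapshots` is hypothetical.

```shell
# list_snapshots DATA_DIR
# Prints the names of timestamp-style snapshot directories (YYYYMMDDTHHMMSS)
# found directly under DATA_DIR, oldest first. Other entries are ignored.
list_snapshots() {
  for ls_entry in "$1"/*/; do
    ls_name=$(basename "$ls_entry")
    case "$ls_name" in
      [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]T[0-9][0-9][0-9][0-9][0-9][0-9])
        printf '%s\n' "$ls_name"
        ;;
    esac
  done
}

# Example (HOSTNAME is a placeholder for the target appliance):
#   list_snapshots "$GHE_DATA_DIR"
#   ghe-restore -s 20170329T150710 HOSTNAME
```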