GitHub Enterprise is your organization's private copy of GitHub, packaged as a virtual machine that you configure and control. Supported virtualization platforms include Amazon Web Services (cloud hosting) and VMware (on premises).

This section outlines the basic system and network architecture and options for deploying in your environment.

Operating system environment

  • 64-bit Ubuntu Server 12.04 LTS (Precise Pangolin).
  • Administrative user account: admin.
  • Application user accounts: git, mysql, elasticsearch, redis.

Storage architecture

The GitHub Enterprise virtual machine requires two storage volumes, which must be attached to the virtual machine separately and are mounted under the following locations:

  • / - The root filesystem. It is included in the distributed machine image and contains the base operating system along with the GitHub Enterprise application environment. Treat the root filesystem as ephemeral. Note: Any data on this filesystem will be replaced when upgrading to future GitHub Enterprise releases. The root filesystem maintains the following information:
    • Custom CA Certificates (in /data/user/ca-certificates)
    • Custom networking configurations
    • Custom firewall configurations
    • The replication state
  • /data/user - The user filesystem. This contains all configuration and user data, such as:
    • Git repositories
    • Databases
    • Search indexes
    • Published Pages content

The intent of this architecture is to simplify upgrade, rollback, and recovery procedures by separating the running software environment from persistent application data.
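As a rough illustration of the two-volume layout, the following Python sketch checks whether a path sits on a different device than the root filesystem. It is not part of GitHub Enterprise; the function name on_separate_volume is hypothetical, and comparing device IDs is only a heuristic that assumes each volume appears as its own block device.

```python
import os


def on_separate_volume(path: str, root: str = "/") -> bool:
    """Return True if `path` is stored on a different device than `root`.

    Heuristic: compares the device IDs reported by os.stat().
    """
    return os.stat(path).st_dev != os.stat(root).st_dev


if __name__ == "__main__":
    # /data/user is the user filesystem mount point described above.
    data_path = "/data/user"
    if not os.path.exists(data_path):
        print(f"{data_path} does not exist on this host")
    elif on_separate_volume(data_path):
        print(f"{data_path} is on its own volume, separate from /")
    else:
        print(f"{data_path} shares a device with /; check the volume attachment")
```

Run on the appliance itself, a result of "shares a device" would suggest that the user volume is not attached or mounted as expected.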

Network architecture

GitHub Enterprise may be run as a single virtual machine host or in a two-node active/passive configuration for increased redundancy in high availability deployments.

In the single-node case, network configuration is straightforward: the GitHub Enterprise instance only needs to be reachable by its users and by administrators. You may choose to make the instance accessible over the public internet (in private mode only) or restrict access to specific closed networks. See Network Port Configuration for details on specific network ports required for application and administrative use.

The two-node active/passive configuration is more complex. It requires two identical virtual machines, each provisioned with separate storage and with as much underlying hardware and network redundancy as possible. The instances must be able to reach each other over ports 122/TCP (SSH) and 1194/UDP (OpenVPN) for replication services, and DNS must be configured with short TTL values for network failover.
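Before configuring replication, it can help to confirm that the SSH administrative port is reachable between the two nodes. The Python sketch below is a minimal example under stated assumptions: the helper tcp_port_open and the hostname ghe-replica.example.com are hypothetical. It tests only 122/TCP, because UDP is connectionless and a plain socket probe cannot reliably confirm that 1194/UDP is open.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    peer = "ghe-replica.example.com"  # hypothetical peer hostname
    ssh_admin_port = 122  # SSH port used for replication, per the text above
    if tcp_port_open(peer, ssh_admin_port):
        print(f"{peer}:{ssh_admin_port}/TCP is reachable")
    else:
        print(f"{peer}:{ssh_admin_port}/TCP is NOT reachable")
```

Run the check from each node against the other, since firewall rules may differ by direction.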

See High Availability Configuration for detailed instructions on configuring a two-node active/passive configuration.

Data retention and datacenter redundancy

GitHub Enterprise includes support for online and incremental backups via the GitHub Enterprise Backup Utilities. Incremental snapshots are taken over a secure network link (the SSH administrative port) and may be performed over long distances for off-site or geographically dispersed storage. If a disaster occurs at the primary datacenter, backup snapshots can be restored over the network into a newly provisioned GitHub Enterprise virtual machine.

In addition to network backups, both AWS (EBS) and VMware disk snapshots of the user storage volumes are supported while the appliance is offline or in maintenance mode. If your service level requirements allow for regular offline maintenance, regular volume snapshots can serve as a low-cost, low-complexity alternative to network backups with backup-utils.

We strongly recommend setting up backups of some form and a disaster recovery plan before using GitHub Enterprise in a production environment. See Backups and Disaster Recovery for detailed instructions.