This version of GitHub Enterprise Server was discontinued on 2024-12-19. No patch releases will be made, even for critical security issues. For better performance, improved security, and new features, upgrade to the latest version of GitHub Enterprise Server. For help with the upgrade, contact GitHub Enterprise support.

Replacing a cluster node

If a node fails in a GitHub Enterprise Server cluster, or if you want to add a new node with more resources, mark the nodes you want to replace as offline, then add the new node.

Who can use this feature?

GitHub determines eligibility for clustering and must enable the configuration for your instance's license. Clustering requires careful planning and additional administrative overhead. For more information, see About clustering.

About replacement of GitHub Enterprise Server cluster nodes

You can replace a functional node in a GitHub Enterprise Server cluster, or you can replace a node that has failed unexpectedly.

After you replace a node, your GitHub Enterprise Server instance does not automatically distribute jobs to the new node. You can force your instance to balance jobs across nodes. For more information, see Rebalancing cluster workloads.

Warning

To avoid conflicts, do not reuse a hostname that was previously assigned to a node in the cluster.
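
One way to guard against reusing a hostname is to search the existing cluster configuration for a matching section heading before you provision the replacement. The following is a minimal sketch, not part of GitHub's tooling: the function name hostname_in_conf is hypothetical, and it assumes node sections use the [cluster "HOSTNAME"] heading shown in the examples in this article. It only detects hostnames that are still present in the file, so names that were removed from the configuration earlier will not be caught.

```shell
# Hypothetical helper, not a GitHub Enterprise Server command.
# Usage: hostname_in_conf CONF_FILE HOSTNAME
# Succeeds (exit status 0) if CONF_FILE contains a [cluster "HOSTNAME"] heading.
hostname_in_conf() {
  # Strip leading and trailing whitespace, then look for an exact,
  # literal match of the section heading (-F literal, -x whole line, -q quiet).
  sed 's/^[[:space:]]*//; s/[[:space:]]*$//' "$1" | grep -Fxq "[cluster \"$2\"]"
}

# Example (file path and hostname taken from this article):
#   if hostname_in_conf /data/user/common/cluster.conf ghe-replacement-data-node-3; then
#     echo "Hostname is already present in cluster.conf; choose another." >&2
#   fi
```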

Replacing a functional node

You can replace an existing, functional node in your cluster. For example, you may want to provide a virtual machine (VM) with additional CPU, memory, or storage resources.

To replace a functional node, install the GitHub Enterprise Server appliance on a new VM, configure an IP address, add the new node to the cluster configuration file, initialize the cluster and apply the configuration, then take the node you replaced offline.

Note

If you're replacing the primary database node, see Replacing the primary database node.

  1. Provision and install GitHub Enterprise Server with a unique hostname on the replacement node.

  2. Using the administrative shell or DHCP, only configure the IP address of the replacement node. Don't configure any other settings.

  3. To add the newly provisioned replacement node, on any node, modify the cluster.conf file to add a section for the replacement node. Leave the section for the node you're replacing in place for now; you will mark that node offline in a later step. For example, this section adds the newly provisioned node, ghe-replacement-data-node-3, which will replace ghe-data-node-3:

    [cluster "ghe-replacement-data-node-3"]
      hostname = ghe-replacement-data-node-3
      ipv4 = 192.168.0.7
      # ipv6 = fd12:3456:789a:1::7
      consul-datacenter = PRIMARY-DATACENTER
      git-server = true
      pages-server = true
      mysql-server = true
      elasticsearch-server = true
      redis-server = true
      memcache-server = true
      metrics-server = true
      storage-server = true
    

    You can defer database seeding of a new MySQL replica node so that you can open your appliance to traffic sooner. For more information, see Deferring database seeding.

  4. From the administrative shell of the node with the modified cluster.conf, run ghe-cluster-config-init. This will initialize the newly added node in the cluster.

  5. From the same node, run ghe-cluster-config-apply. This will validate the configuration file, copy it to each node in the cluster, and configure each node according to the modified cluster.conf file.

  6. If you're taking a node offline that provides data services, such as git-server, pages-server, or storage-server, evacuate the node. For more information, see Evacuating a cluster node running data services.

  7. To mark the node you're replacing as offline, on any node, modify the cluster configuration file (cluster.conf) in the relevant node section to include the text offline = true.

    For example, this modified cluster.conf will mark the ghe-data-node-3 node as offline:

    [cluster "ghe-data-node-3"]
    hostname = ghe-data-node-3
    offline = true
    ipv4 = 192.168.0.6
    # ipv6 = fd12:3456:789a:1::6
    
  8. From the administrative shell of the node where you modified cluster.conf, run ghe-cluster-config-apply. This will validate the configuration file, copy it to each node in the cluster, and mark the node offline.
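
Before you run ghe-cluster-config-apply, it can be useful to confirm which node sections actually carry offline = true. The following is a rough sketch rather than an official tool: the function name list_offline_nodes is hypothetical, and the parsing assumes the [cluster "HOSTNAME"] layout shown in the examples above.

```shell
# Hypothetical helper, not a GitHub Enterprise Server command.
# Usage: list_offline_nodes CONF_FILE
# Prints the name of every [cluster "NAME"] section that contains "offline = true".
list_offline_nodes() {
  awk '
    # Node section heading: remember the name between the double quotes.
    /^[ \t]*\[cluster "/ {
      split($0, parts, "\"")
      node = parts[2]
      next
    }
    # Any other heading, such as the top-level [cluster], ends the node section.
    /^[ \t]*\[/ { node = ""; next }
    # Inside a node section, report the node when offline is set to true.
    node != "" && /^[ \t]*offline[ \t]*=[ \t]*true[ \t]*$/ { print node }
  ' "$1"
}

# Example (file path taken from this article):
#   list_offline_nodes /data/user/common/cluster.conf
```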

Replacing a node in an emergency

You can replace a failed node in your cluster. For example, a software or hardware issue may affect a node's availability.

Note

If you're replacing the primary database node, see Replacing the primary database node.

To replace a node in an emergency, install the GitHub Enterprise Server appliance on a new VM, configure an IP address, take the failed node offline, apply the configuration, add the new node to the cluster configuration file, initialize the cluster and apply the configuration, and optionally, evacuate the failed node.

  1. Provision and install GitHub Enterprise Server with a unique hostname on the replacement node.

  2. Using the administrative shell or DHCP, only configure the IP address of the replacement node. Don't configure any other settings.

  3. To mark the failed node offline, on any node, modify the cluster configuration file (cluster.conf) in the relevant node section to include the text offline = true.

    For example, this modified cluster.conf will mark the ghe-data-node-3 node as offline:

    [cluster "ghe-data-node-3"]
    hostname = ghe-data-node-3
    offline = true
    ipv4 = 192.168.0.6
    # ipv6 = fd12:3456:789a:1::6
    
  4. From the administrative shell of the node where you modified cluster.conf, run ghe-cluster-config-apply. This will validate the configuration file, copy it to each node in the cluster, and mark the node offline.

  5. To add the newly provisioned replacement node, on any node, modify the cluster.conf file to remove the failed node and add the replacement node. For example, this modified cluster.conf file replaces ghe-data-node-3 with the newly provisioned node, ghe-replacement-data-node-3:

    [cluster "ghe-replacement-data-node-3"]
      hostname = ghe-replacement-data-node-3
      ipv4 = 192.168.0.7
      # ipv6 = fd12:3456:789a:1::7
      consul-datacenter = PRIMARY-DATACENTER
      git-server = true
      pages-server = true
      mysql-server = true
      elasticsearch-server = true
      redis-server = true
      memcache-server = true
      metrics-server = true
      storage-server = true
    

    You can defer database seeding of a new MySQL replica node so that you can open your appliance to traffic sooner. For more information, see Deferring database seeding.

  6. If you're replacing the primary Redis node, in cluster.conf, modify the redis-master value with the replacement node name.

    Note

    If your primary Redis node is also your primary MySQL node, see Replacing the primary database node.

    For example, this modified cluster.conf file specifies a newly provisioned cluster node, ghe-replacement-data-node-1, as the primary Redis node:

    redis-master = ghe-replacement-data-node-1
    
  7. From the administrative shell of the node with the modified cluster.conf, run ghe-cluster-config-init. This will initialize the newly added node in the cluster.

  8. From the same node, run ghe-cluster-config-apply. This will validate the configuration file, copy it to each node in the cluster, and configure each node according to the modified cluster.conf file.

  9. If you're taking a node offline that provides data services, such as git-server, pages-server, or storage-server, evacuate the node. For more information, see Evacuating a cluster node running data services.
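
In an emergency, the offline = true edit from step 3 can be prepared with a script instead of a text editor. The sketch below is illustrative only: mark_node_offline is a hypothetical name, and the script assumes the [cluster "HOSTNAME"] section layout shown in this article. It places the flag directly after the section heading, which is an assumption about placement (the example above puts it after the hostname line). The result goes to standard output, so you can review it before replacing cluster.conf.

```shell
# Hypothetical helper, not a GitHub Enterprise Server command.
# Usage: mark_node_offline CONF_FILE NODE_NAME
# Prints a copy of CONF_FILE in which the [cluster "NODE_NAME"] section
# contains "offline = true". The original file is not modified.
mark_node_offline() {
  awk -v node="$2" '
    # Every section heading resets the state; a matching node heading sets it.
    /^[ \t]*\[/ {
      in_target = 0
      if ($0 ~ /^[ \t]*\[cluster "/) {
        split($0, parts, "\"")
        in_target = (parts[2] == node)
      }
      print
      if (in_target) print "  offline = true"
      next
    }
    # Drop any existing offline setting in the target section to avoid duplicates.
    in_target && /^[ \t]*offline[ \t]*=/ { next }
    { print }
  ' "$1"
}

# Example (file path and node name taken from this article; the output path is arbitrary):
#   mark_node_offline /data/user/common/cluster.conf ghe-data-node-3 > /tmp/cluster.conf.new
#   diff /data/user/common/cluster.conf /tmp/cluster.conf.new
```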

Replacing the primary database node (MySQL or MySQL and MSSQL)

To provide database services, your cluster requires a primary MySQL node and at least one replica MySQL node. For more information, see About cluster nodes.

If your cluster has GitHub Actions enabled, you will also need to account for MSSQL in the following steps.

If you need to allocate more resources to your primary MySQL (or MySQL and MSSQL) node or replace a failed node, you can add a new node to your cluster. To minimize downtime, add the new node, replicate the MySQL (or MySQL and MSSQL) data, and then promote it to the primary node. Some downtime is required during the promotion process.

  1. Provision and install GitHub Enterprise Server with a unique hostname on the replacement node.

  2. Using the administrative shell or DHCP, only configure the IP address of the replacement node. Don't configure any other settings.

  3. To connect to your GitHub Enterprise Server instance, SSH into any of your cluster's nodes. From your workstation, run the following command. Replace HOSTNAME with the node's hostname. For more information, see Accessing the administrative shell (SSH).

    ssh -p 122 admin@HOSTNAME
    
  4. Open the cluster configuration file at /data/user/common/cluster.conf in a text editor. For example, you can use Vim. Create a backup of the cluster.conf file before you edit the file.

    sudo vim /data/user/common/cluster.conf
    
  5. The cluster configuration file lists each node under a [cluster "HOSTNAME"] heading. Add a new heading for the node and enter the key-value pairs for configuration, replacing the placeholders with actual values.

    • Ensure that you include the mysql-server = true key-value pair.
    • If GitHub Actions is enabled in the cluster, you will have to include the mssql-server = true key-value pair as well.
    • The following section is an example, and your node's configuration may differ.
    ...
    [cluster "HOSTNAME"]
      hostname = HOSTNAME
      ipv4 = IPV4-ADDRESS
      # ipv6 = IPV6-ADDRESS
      consul-datacenter = PRIMARY-DATACENTER
      datacenter = DATACENTER
      mysql-server = true
      redis-server = true
      ...
    ...
    
  6. From the administrative shell of the node with the modified cluster.conf, run ghe-cluster-config-init. This will initialize the newly added node in the cluster.

  7. From the administrative shell of the node where you modified cluster.conf, run ghe-cluster-config-apply. The newly added node will become a replica MySQL node and any other configured services will run there.

    Note

    The example section in step 5 omits the mssql-server = true key-value pair because it assumes that GitHub Actions is not enabled in the cluster.

  8. Wait for MySQL replication to finish. To monitor MySQL replication from any node in the cluster, run ghe-cluster-status -v.

    If GitHub Actions is enabled in the cluster, you will have to wait for MSSQL replication to complete.

    Shortly after adding the node to the cluster, you may see an error for replication status while replication catches up. Replication can take hours depending on the instance's load, the amount of database data, and the last time the instance generated a database seed.

  9. During your scheduled maintenance window, enable maintenance mode. For more information, see Enabling and scheduling maintenance mode.

  10. Ensure that MySQL (or MySQL and MSSQL) replication is finished by running ghe-cluster-status -v from any node in the cluster.

    Warning

    If you do not wait for MySQL (or MySQL and MSSQL) replication to finish, you risk data loss on your instance.

  11. To set the current MySQL primary node to read-only mode, run the following command from the MySQL primary node.

    echo "SET GLOBAL super_read_only = 1;" | sudo mysql
    
  12. Wait until Global Transaction Identifiers (GTIDs) set on the primary and replica MySQL nodes are identical. To check the GTIDs, run the following command from any cluster node.

    ghe-cluster-each -r mysql -- 'echo "SELECT @@global.gtid_executed;" | sudo mysql'
    
    • To check that the super_read_only global MySQL variable was set successfully, run the following command.
    echo "SHOW GLOBAL VARIABLES LIKE 'super_read_only';" | sudo mysql
    
  13. If GitHub Actions is enabled in the cluster, SSH into the node that will become the new primary MSSQL node.

    ssh -p 122 admin@NEW_MSSQL_NODE_HOSTNAME
    
    • From within a screen session, run the following command to promote MSSQL to the new node.
    /usr/local/share/enterprise/ghe-mssql-repl-promote
    

    This will attempt to access the current primary MSSQL node and perform a graceful failover.

  14. After the GTIDs on the primary and replica MySQL nodes match, update the cluster configuration by opening the cluster configuration file at /data/user/common/cluster.conf in a text editor.

    • Create a backup of the cluster.conf file before you edit the file.
    • In the top-level [cluster] section, remove the hostname for the node you replaced from the mysql-master key-value pair, then assign the new node instead. If the new node is also a primary Redis node, adjust the redis-master key-value pair.
    • If GitHub Actions is enabled in the cluster, you will have to include the mssql-server = true key-value pair as well.
    [cluster]
      mysql-master = NEW-NODE-HOSTNAME
      redis-master = NEW-NODE-HOSTNAME
      primary-datacenter = primary
    ...
    
  15. In the administrative shell of the node where you modified cluster.conf, start a screen session and run ghe-cluster-config-apply. This command reconfigures the cluster, promoting the newly added node to the primary MySQL node and converting the original primary MySQL node into a replica.

    Note

    The example [cluster] section in step 14 assumes that GitHub Actions is not enabled in the cluster.

  16. Check the status of MySQL (or MySQL and MSSQL) replication by running ghe-cluster-status -v from any node in the cluster.

  17. If GitHub Actions is enabled in the cluster, run the following command from the new MySQL and MSSQL node.

    /usr/local/share/enterprise/ghe-repl-post-failover-mssql
    
  18. When MySQL (or MySQL and MSSQL) replication is finished, disable maintenance mode from any node in the cluster. For more information, see Enabling and scheduling maintenance mode.
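
The GTID comparison in step 12 is easy to get wrong by eye, because GTID sets are long strings. The following sketch shows one way to compare them mechanically. It is an illustration under stated assumptions: gtid_sets_match is a hypothetical name, and because the output format of ghe-cluster-each is not shown in this article, the function expects you to have already reduced that output to one GTID set per line.

```shell
# Hypothetical helper, not a GitHub Enterprise Server command.
# Usage: COMMAND_PRINTING_ONE_GTID_SET_PER_LINE | gtid_sets_match
# Succeeds (exit status 0) only if at least one non-empty line was read
# and every non-empty line is identical to the first one.
gtid_sets_match() {
  awk '
    NF == 0 { next }                          # ignore blank lines
    !seen { first = $0; seen = 1; next }      # remember the first GTID set
    $0 != first { differ = 1 }                # flag any later mismatch
    END {
      if (seen && !differ) exit 0
      exit 1
    }
  '
}

# Example with placeholder values (not real GTID sets):
#   printf '%s\n' "PRIMARY-GTID-SET" "REPLICA-GTID-SET" | gtid_sets_match \
#     || echo "GTID sets differ; keep waiting." >&2
```

Treat a successful result as a prompt to verify with ghe-cluster-status -v, not as a replacement for the checks in the steps above.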