Skip to main content

Deferring database seeding

You can speed up the process of adding a new MySQL replica node to your cluster by opting to defer database seeding.

Who can use this feature?

GitHub determines eligibility for clustering, and must enable the configuration for your instance's license. Clustering requires careful planning and additional administrative overhead. For more information, see "About clustering."

About deferring database seeding of a MySQL replica node

Note

The ability to defer database seeding was added in patch release 3.12.1 and is available as a public beta.

Adding a new MySQL replica node to your cluster when your primary node has more than seven days of data will normally trigger database seeding which can take several hours depending on the amount of data. You can choose to defer database seeding, allowing the config apply run to complete sooner, resulting in being able to open your appliance to traffic sooner.

You should only defer database seeding if you have already configured at least one MySQL replica, and you are adding an additional MySQL replica. Otherwise, there is no MySQL redundancy until the seeding is complete.

When you defer database seeding, the new MySQL replica will not be configured for replication nor have a copy of the data on your MySQL primary node until seeding is manually completed later.

Configuring deferral of MySQL seeding when adding a new MySQL node

  1. Provision and install GitHub Enterprise Server with a unique hostname on the replacement node.

  2. Using the administrative shell or DHCP, only configure the IP address of the replacement node. Don't configure any other settings.

  3. Create the cluster.conf entry for the new MySQL node and include the skip-data-setup = true field. The example below adds a new node with the hostname ghe-data-node-3 and the mysql-server role.

     ...
     [cluster "ghe-data-node-3"]
       hostname = ghe-data-node-3
       ipv4 = 192.168.0.9
       # ipv6 = fd12:3456:789a:1::7
       mysql-server = true
       skip-data-setup = true
     ...
     
  4. To initialize the new node in the cluster, from the administrative shell of the node with the modified cluster.conf, run ghe-cluster-config-init.

  5. To validate the configuration file, and also copy and configure each node according to the modified cluster.conf file, run ghe-cluster-config-apply.

Manually seeding the data

Once you have run ghe-cluster-config-apply, the MySQL service will be running on your new node but will not be configured as a replica nor will it be seeded with data from the MySQL primary node. To seed data from the MySQL primary node, you will need to configure replication manually.

  1. From the new node, to start manual replication of the MySQL primary node data, run the following command, replacing PRIMARY_IP with the IP address of the node running the MySQL primary.

    /usr/local/share/enterprise/ghe-mysql-repl-start PRIMARY_IP
    

The time required for database seeding is dependent on the size of data. For large datasets, we recommend running the above command in a screen session to ensure it survives SSH disconnects.