Monitoring in AEM TarMK cold standby setup


The feature exposes its status information through JMX MBeans, so you can inspect the current state of the standby and the primary using the JMX console. The information can be found in an MBean of type org.apache.jackrabbit.oak:type="Standby" named Status.
Standby
When you observe a standby instance, one node is exposed. Its ID is usually a generic UUID.
This node has five read-only attributes:
  • Running: boolean value indicating whether the sync process is running or not.
  • Mode: the string Client: followed by the UUID used to identify the instance. Note that this UUID changes every time the configuration is updated.
  • Status: a textual representation of the current state (like running or stopped).
  • FailedRequests: the number of consecutive errors.
  • SecondsSinceLastSuccess: the number of seconds since the last successful communication with the server. It will display -1 if no successful communication has been made.
There are also three invokable methods:
  • start(): starts the sync process.
  • stop(): stops the sync process.
  • cleanup(): runs the cleanup operation on the standby.
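The attributes above can also be read programmatically. The following is a minimal sketch, not a definitive implementation: it assumes you already have an MBeanServerConnection to the standby's JVM (how you obtain it depends on your JMX setup), that the bean's ObjectName matches the pattern shown, and that the numeric attributes are exposed as numbers. The class name, the pattern, and the maxLagSeconds threshold are illustrative assumptions.

```java
import java.util.Set;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

final class StandbyStatusCheck {

    // Assumption: the Status bean is registered under this domain and type,
    // with further key properties (such as an id) that the wildcard absorbs.
    static final String STATUS_PATTERN =
            "org.apache.jackrabbit.oak:type=\"Standby\",name=Status,*";

    /** Decision rule over Running, FailedRequests and SecondsSinceLastSuccess. */
    static boolean isHealthy(boolean running, int failedRequests,
                             int secondsSinceLastSuccess, int maxLagSeconds) {
        if (!running) {
            return false; // sync process is stopped
        }
        if (secondsSinceLastSuccess < 0) {
            return false; // -1: no successful communication yet
        }
        // No consecutive errors, and the last success is recent enough.
        return failedRequests == 0 && secondsSinceLastSuccess <= maxLagSeconds;
    }

    /** Reads the attributes from the first matching bean and applies the rule. */
    static boolean check(MBeanServerConnection conn, int maxLagSeconds) throws Exception {
        Set<ObjectName> names = conn.queryNames(new ObjectName(STATUS_PATTERN), null);
        if (names.isEmpty()) {
            return false; // no standby bean registered on this instance
        }
        ObjectName name = names.iterator().next();
        boolean running = (Boolean) conn.getAttribute(name, "Running");
        int failed = ((Number) conn.getAttribute(name, "FailedRequests")).intValue();
        int lag = ((Number) conn.getAttribute(name, "SecondsSinceLastSuccess")).intValue();
        return isHealthy(running, failed, lag, maxLagSeconds);
    }
}
```

Keeping the decision rule separate from the JMX calls makes it easy to adjust the thresholds without touching the connection code.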
Primary
Observing the primary exposes some general information via an MBean whose ID value is the port number the TarMK standby service is using (8023 by default). Most of the methods and attributes are the same as for the standby, but some differ:
  • Mode: will always show the value primary.
Furthermore, information can be retrieved for up to 10 clients (standby instances) that are connected to the primary. The MBean ID is the UUID of the instance. These MBeans have no invokable methods, but they do provide some very useful read-only attributes:
  • Name: the ID of the client.
  • LastSeenTimestamp: the timestamp of the last request in a textual representation.
  • LastRequest: the last request of the client.
  • RemoteAddress: the IP address of the client.
  • RemotePort: the port the client used for the last request.
  • TransferredSegments: the total number of segments transferred to this client.
  • TransferredSegmentBytes: the total number of bytes transferred to this client.
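As an illustration of how the per-client attributes might be collected on the primary, here is a hedged sketch. It assumes an existing MBeanServerConnection to the primary's JVM and that the client beans live under the same domain and type as the Status bean; the class name, the ObjectName pattern, and the derived average segment size are assumptions made for this example and are not part of the product.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.management.Attribute;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

final class StandbyClientReport {

    // Attribute names as listed in the documentation above.
    static final String[] ATTRS = {
        "Name", "LastSeenTimestamp", "LastRequest", "RemoteAddress",
        "RemotePort", "TransferredSegments", "TransferredSegmentBytes"
    };

    // Treat a missing or non-numeric value as 0.
    static long asLong(Object value) {
        return value instanceof Number ? ((Number) value).longValue() : 0L;
    }

    /** Builds a one-line summary for one client from its attribute values. */
    static String summarize(Map<String, Object> a) {
        long segments = asLong(a.get("TransferredSegments"));
        long bytes = asLong(a.get("TransferredSegmentBytes"));
        long average = segments == 0 ? 0 : bytes / segments;
        return String.format("%s at %s:%s, %d segments, %d bytes, %d bytes/segment avg",
                a.get("Name"), a.get("RemoteAddress"), a.get("RemotePort"),
                segments, bytes, average);
    }

    /** Queries all standby-related beans and summarizes those that look like clients. */
    static List<String> report(MBeanServerConnection conn) throws Exception {
        // Assumption: client beans share the domain and type of the Status bean.
        ObjectName pattern = new ObjectName("org.apache.jackrabbit.oak:type=\"Standby\",*");
        List<String> lines = new ArrayList<>();
        for (ObjectName name : conn.queryNames(pattern, null)) {
            Map<String, Object> attrs = new HashMap<>();
            // getAttributes omits attributes the bean does not have.
            for (Attribute attr : conn.getAttributes(name, ATTRS).asList()) {
                attrs.put(attr.getName(), attr.getValue());
            }
            if (attrs.containsKey("RemoteAddress")) { // skip the primary's own bean
                lines.add(summarize(attrs));
            }
        }
        return lines;
    }
}
```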

Applying Hotfixes to a Cold Standby Setup


The recommended way to apply hotfixes to a cold standby setup is to install them on the primary instance and then clone it into a new cold standby instance that has the hotfixes installed.
You can do this by following the steps outlined below:
  1. Stop the synchronization process on the cold standby instance by going to the JMX Console and invoking stop() on the org.apache.jackrabbit.oak: Status ("Standby") bean. For more information on how to do this, see the section on Monitoring.
  2. Stop the cold standby instance.
  3. Install the hotfix on the primary instance. For more details on how to install a hotfix, see How to Work With Packages.
  4. Test the instance for issues after the installation.
  5. Remove the cold standby instance by deleting its installation folder.
  6. Stop the primary instance and clone it by performing a file system copy of its entire installation folder to the location of the cold standby.
  7. Reconfigure the newly created clone to act as a cold standby instance. For additional details, see Creating an AEM TarMK Cold Standby Setup.
  8. Start both the primary and the cold standby instances.
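The file system copy in step 6 can be done with any tool that copies a directory tree. As one possible sketch, assuming both instances are stopped and the old standby folder has already been removed (step 5), a recursive copy could look like this. The class name is hypothetical, and the source and target paths are whatever your installation uses.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.BasicFileAttributes;

final class InstanceCloner {

    /**
     * Copies the whole tree under source into target.
     * Fails if a file already exists in target, so a stale standby
     * is never silently mixed with the fresh clone.
     */
    static void copyTree(Path source, Path target) throws IOException {
        Files.walkFileTree(source, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs)
                    throws IOException {
                // Recreate each directory at the same relative location.
                Files.createDirectories(target.resolve(source.relativize(dir)));
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                // Copy the file, preserving timestamps and permissions where supported.
                Files.copy(file, target.resolve(source.relativize(file)),
                        StandardCopyOption.COPY_ATTRIBUTES);
                return FileVisitResult.CONTINUE;
            }
        });
    }
}
```

After the copy, the clone still has the primary's configuration, which is why step 7 is required before starting it.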

Failover procedures in TarMK cold standby setup


If the primary instance fails for any reason, you can make one of the standby instances take over the role of the primary by changing its start runmode, as detailed below.
Note:
The configuration files also need to be modified so that they match the settings used for the primary instance.
  1. Go to the location where the standby instance is installed, and stop it.
  2. If you have a load balancer configured for the setup, remove the failed primary from the load balancer's configuration at this point.
  3. Back up the crx-quickstart folder from the standby installation folder. It can be used as a starting point when setting up a new standby.
  4. Restart the instance using the primary runmode:
    java -jar quickstart.jar -r primary,crx3,crx3tar
  5. Add the new primary to the load balancer.
  6. Create and start a new standby instance. For more information, see the procedure above on Creating an AEM TarMK Cold Standby Setup.
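If the restart in step 4 is scripted, the command can be assembled and launched from code. The sketch below only mirrors the command line shown in that step. The class and method names are hypothetical, and it assumes java is on the PATH and that quickstart.jar sits in the installation folder you pass in.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

final class PrimaryRestart {

    /** Builds: java -jar <jarName> -r <runmode1,runmode2,...> */
    static List<String> command(String jarName, String... runmodes) {
        return Arrays.asList("java", "-jar", jarName, "-r", String.join(",", runmodes));
    }

    /** Starts the former standby with the primary runmode from its installation folder. */
    static Process start(File installDir) throws IOException {
        return new ProcessBuilder(command("quickstart.jar", "primary", "crx3", "crx3tar"))
                .directory(installDir) // run from the folder that contains quickstart.jar
                .inheritIO()           // show the instance's console output
                .start();
    }
}
```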