Friday, April 24, 2015

CUCM and CUC Publisher Rebuild

The CUCM and CUC publishers in my lab crashed due to a disk failure.  Luckily my subscribers are on a different LUN, so at least I don't need to rebuild the whole cluster.  In this post I want to share what I did and the hiccups I ran into during the rebuild.

UCM Publisher Rebuild

For UCM I followed this guide, which is well written.  Here is what I did based on it.

1. Gather Cluster Data on Subscriber

Two commands, "show network cluster" and "show version active", give you the existing cluster info.

2. Stop DB Replication on all subscribers

This is important: you do not want the new publisher to sync its NEW database over the existing one on the subscribers.  You want it the other way round, so stop database replication on every subscriber first (the CLI command for this is "utils dbreplication stop").

3. Install the new CUCM Publisher with the same hostname, IP address, domain name, security passphrase, exact UCM version and installed COP files

Install it from bootable media.

4. Update Processnode Values on the Publisher

I am running 10.5(2), so I need to issue the command "utils disaster_recovery prepare restore pub_from_sub" on the new publisher CLI before adding nodes under System > Server.


Retrieve the node list from an existing subscriber with "run sql select name,description,nodeid from processnode".


Once you have the node list, go to the Publisher UCM Admin page and add the nodes.
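If the node list is long, it can help to turn the pasted CLI output into a checklist of nodes to add.  Below is a minimal Python sketch of that idea, not part of the guide.  It assumes the "run sql" output is a fixed-width table with a row of "=" characters under the header; the sample rows, the helper names (parse_run_sql, nodes_to_add) and the host names are hypothetical.

```python
import re


def parse_run_sql(output):
    """Parse fixed-width 'run sql' output into a list of row dicts.

    Assumes a header line followed by a rule made of '=' runs, one run
    per column, then the data rows.
    """
    lines = [ln.rstrip() for ln in output.splitlines() if ln.strip()]
    # The rule line contains only '=' and spaces; the header sits above it.
    rule_idx = next(i for i, ln in enumerate(lines) if set(ln) <= {"=", " "})
    header, rule = lines[rule_idx - 1], lines[rule_idx]
    # Each run of '=' gives one column's start and end offsets.
    spans = [(m.start(), m.end()) for m in re.finditer(r"=+", rule)]
    # Let the last column run to the end of the line.
    spans[-1] = (spans[-1][0], None)
    names = [header[s:e].strip() for s, e in spans]
    return [
        {n: ln[s:e].strip() for n, (s, e) in zip(names, spans)}
        for ln in lines[rule_idx + 1:]
    ]


def nodes_to_add(rows, publisher):
    """Names still to be added under System > Server on the new publisher."""
    # EnterpriseWideData is the built-in entry; the publisher already exists.
    skip = {"EnterpriseWideData", publisher}
    return [r["name"] for r in rows if r["name"] not in skip]


def _row(name, description, nodeid):
    # Pad cells so the sample lines up like pasted CLI output.
    return f"{name:<18} {description:<11} {nodeid}"


# Hypothetical sample, laid out like "run sql" output from a subscriber.
SAMPLE = "\n".join([
    _row("name", "description", "nodeid"),
    _row("=" * 18, "=" * 11, "=" * 6),
    _row("EnterpriseWideData", "", "1"),
    _row("ucm1", "UCM1", "2"),
    _row("ucm2", "UCM2", "3"),
])

if __name__ == "__main__":
    for name in nodes_to_add(parse_run_sql(SAMPLE), publisher="ucm1"):
        print("add node:", name)
```

Slicing by the "=" rule rather than splitting on whitespace keeps empty description cells from shifting the columns.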

5. Reboot Publisher

Use the command "utils system restart".

6. Verify Cluster Authentication

Do this on the publisher after it restarts, and make sure every node in the cluster is in the "authenticated" state.
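With more than a couple of nodes it is easy to miss one that is not authenticated.  Here is a rough Python sketch that scans pasted "show network cluster" output for such nodes.  The line layout it assumes (IP address first, FQDN second, the word "authenticated" somewhere on the line) and the sample lines are my assumptions for illustration, so check them against your own output.

```python
def unauthenticated_nodes(output):
    """Return the FQDN (second field) of every node line that is not authenticated.

    Assumes each node line starts with an IPv4 address; other lines are skipped.
    """
    bad = []
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines and anything that does not start with a dotted address.
        if not fields or fields[0].count(".") != 3:
            continue
        lowered = line.lower()
        # "not authenticated" also contains "authenticated", so test it first.
        if "not authenticated" in lowered or "authenticated" not in lowered:
            bad.append(fields[1] if len(fields) > 1 else fields[0])
    return bad


# Hypothetical sample lines; addresses and host names are made up.
SAMPLE = """\
10.0.0.11 ucm1.lab.local ucm1 Publisher callmanager DBPub authenticated
10.0.0.12 ucm2.lab.local ucm2 Subscriber callmanager DBSub authenticated using TCP since Fri Apr 24 10:00:00 2015
10.0.0.21 cup1.lab.local cup1 Subscriber cups DBSub not authenticated or updated on server
"""

if __name__ == "__main__":
    for node in unauthenticated_nodes(SAMPLE):
        print("not authenticated:", node)
```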

7. Perform a new backup

Add a backup device.  I am using a Linux machine to store the backup.
Start a manual backup.


8. Publisher Restore from the Subscriber DB

I hit an issue during the restore, with the error message "Unable to send network request to master agent.  This may be due to Master or Local Agent being down".

I tried a few things:
- Regenerated the ipsec cert and restarted the DRF master and local agents.  It didn't work.

Solution
  • Remove cup1 and cup2 from the server list on the publisher UCM admin page.  Then it works.  DRF requires every host in the server list to be up and running, and one of my CUP nodes was not responding (due to my disk LUN failure).

Check the Publisher node check box (UCM1) and choose the subscriber DB to restore from, in my case UCM2, then click Restore.
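Since the restore failed because a node in the server list was down, a quick reachability check before clicking Restore could save some time.  The sketch below is only an illustration: it tries a plain TCP connection to each node.  The port is a parameter, and the 4040 in the example (which I believe is the DRF master agent port) is an assumption to verify for your version, as are the host names.  A successful connection only shows the host answers on that port, not that the DRF agent is healthy.

```python
import socket


def unreachable_nodes(nodes, port, timeout=3.0):
    """Return the nodes that refuse or time out on a TCP connection to `port`."""
    down = []
    for host in nodes:
        try:
            # create_connection resolves the name and performs the TCP handshake.
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError:
            # Covers refused connections, timeouts and name resolution failures.
            down.append(host)
    return down


if __name__ == "__main__":
    # Hypothetical server list; port 4040 is an assumption, not from the guide.
    servers = ["ucm1", "ucm2", "cup1", "cup2"]
    for host in unreachable_nodes(servers, port=4040):
        print("not reachable, fix it or remove it from the server list:", host)
```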


9. Restore Status

When the restoration reaches the CCMDB component, the status text shows "Restoring Publisher from Subscriber Backup"


10. Run a Sanity Check on the Publisher DB

These two SQL statements will give you a gut feeling for whether the DB restore worked.

11. Reboot the Cluster after restore

12. Verify Replication Setup




13. Post Restore

Activate services and install device packs

CUC Publisher Rebuild

The steps for the CUC publisher rebuild are similar.

1. Gather Cluster Data


2. Stop Replication on All Subscribers



3. Install the CUC Publisher

4. Update Processnode Values on the Publisher



5. Reboot the Publisher Node


6. Verify Cluster Authentication

7. Connect the Subscriber Server to the New Connection Cluster, and Replicate Data and Messages to the Publisher Server

This step is different.  We are not using DRS to do the DB restore.  Run the command "utils cuc cluster renegotiate" on the subscriber.


The publisher server will restart automatically.

Run "show cuc cluster status" on the subscriber to verify that the new cluster has been configured correctly.




Good luck!