Adding a Replica back into a SQL Server Availability Group

In one of my last blog postings I have already described a common pitfall when you try to add a new Replica into an Availability Group. In today’s blog posting I want to talk a little bit more about how you can add a Replica back into an Availability Group – when you have removed it previously.

When should you remove a Replica from an Availability Group?

First of all I want to talk a little bit about why you should remove a Replica from an Availability Group. Of course a Replica shouldn’t be removed during normal business from an Availability Group. But imagine that a Replica is down, and can’t be brought online again in a reasonable time.

Your Availability Group will still work without any problems, but SQL Server has to store the generated Transaction Log Records on the Primary Replica, and can’t clear the Transaction Log anymore. Therefore the Transaction Log on the Primary Replica (and also on all Secondary Replicas!) will grow. In such a case it would be much safer to remove the unhealthy Replica temporarily from the Availability Group, so that SQL Server can reuse the Transaction Log on the Primary Replica.

ALTER AVAILABILITY GROUP [TPC-E_AG]
REMOVE REPLICA ON N'SQL2016AG-NODE3'
GO

As soon as you remove a Replica from an Availability Group, all corresponding databases are moved into the Restoring State as you can see from the following picture.

Removing a Database from an Availability Group

Therefore you also can’t access the databases, because SQL Server hasn’t yet run the Undo Part of Crash Recovery. If you want to access a database, you have to bring it online with a RESTORE DATABASE TPC-E WITH RECOVERY. But this is not really a wise decision…

Adding a Replica back into a Availability Group

The problem with a RESTORE DATABASE TPC-E WITH RECOVERY is that the database is brought online again, and can’t be added back to the Availability Group anymore. If you want to join the Replica back into the Availability Group, you have to create it from scratch again (maybe through Database Seeding, which is new in SQL Server 2016).

A better option is to leave the databases as-it, because then you have the option to Join the Replica back into the Availability Group.

Joining a Replica back into an Availability Group

If you want to Join the Replica back into the Availability Group via T-SQL code, you can see in the following listing the necessary code:

-- Primary Replica	
ALTER AVAILABILITY GROUP [TPC-E_AG]
ADD REPLICA ON N'SQL2016AG-NODE3' WITH
(
   ENDPOINT_URL = N'TCP://10.10.1.3:5022',
   FAILOVER_MODE = MANUAL,
   AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT,
   BACKUP_PRIORITY = 50,
   SECONDARY_ROLE(ALLOW_CONNECTIONS = NO)
)
GO

-- New Secondary Replica
ALTER AVAILABILITY GROUP [TPC-E_AG] JOIN
GO

ALTER DATABASE [TPC-E] SET HADR AVAILABILITY GROUP = [TPC-E_AG]
GO

This approach works quite well, because all generated Transaction Log Records from the Primary Replica will be just send to the added Secondary Replica, and everything is fine again. But what happens if you have already done a Transaction Log Backup on the Primary Replica since you have removed a Secondary Replica from the Availability Group? Adding back the Replica into the Availability Group will just fail:

Msg 1412, Level 16, State 211, Line 24
The remote copy of database “TPC-E” has not been rolled forward to a point in time that is encompassed in the local copy of the database log.

This is not really a problem, because you have to restore the Transaction Log Backups on the Secondary Replica if you want to join it back to the Availability Group. If you restore a Transaction Log Backup on a temporarily removed Secondary Replica, you have to make sure to restore it with the NORECOVERY option, so that Crash Recovery doesn’t bring your database online again.

RESTORE LOG [TPC-E] FROM  DISK = N'E:\MSSQL13.MSSQLSERVER\MSSQL\Backup\TPC-E.trn'
WITH NORECOVERY
GO

If you have restored all Transaction Log Backups, you can again just Join the Replica back into the Availability Group – easy isn’t it?

Summary

Removing and adding back a Replica into an Availability Group is not a big deal, as long as you leave the databases in the Restoring State. As soon as you have brought a database on a removed Secondary Replica online, you have to reinitialize the Replica from scratch.

Please keep that in mind when you work with larger databases, and you have to remove them temporarily from an Availability Group.

Thanks for your time,

-Klaus

Leave a Comment

Your email address will not be published. Required fields are marked *