Monday, April 15, 2013

Lync Server 2013 Edge server replication issues on Windows Server 2012

Problem

You’ve completed a new greenfield deployment, or successfully migrated from Lync Server 2010 to Lync Server 2013, with Windows Server 2012 as the base operating system, but noticed that your Lync Edge servers are not replicating. Executing the Invoke-CsManagementStoreReplication cmdlet and then the Get-CsManagementStoreReplicationStatus cmdlet displays the following:

image

Note how the Lync front end server has True for UpToDate while the Edge server does not.
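
For reference, the check described above can be run from the Lync Server Management Shell on the front end server roughly as follows. This is only a minimal sketch; the columns picked in Format-Table are my own selection for readability, not something the screenshot requires.

```powershell
# Run in the Lync Server Management Shell on a front end server.
# Ask the Central Management store to push the current configuration to all replicas.
Invoke-CsManagementStoreReplication

# Give the replicas a moment, then list each replica and whether it is up to date.
Get-CsManagementStoreReplicationStatus |
    Format-Table ReplicaFqdn, UpToDate, LastStatusReport, LastUpdateCreation -AutoSize
```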

You’ve tried using the Lync Logging Tool on the front end server to log the following components:

  • XDS_File_Transfer_Agent
  • XDS_Master_Replicator
  • XDS_Replica_Replicator

image

… but could not capture any errors useful for troubleshooting.

Deleting the RtcReplicaRoot folder on the Lync Edge server then running a repair on the Core Components also does not correct this issue.

Reviewing all of the application, system and Lync logs in the event viewer does not reveal any errors.

You’ve tried adding the SendTrustedIssuerList REG_DWORD registry value under HKLM\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL but that did not fix the issue:

image

Browsing the URL https://<lyncEdgeServer.someDomain.internal>:4443/replicationwebservice loads the Windows Communication Foundation service page properly, with one abnormal behavior: you receive a Confirm Certificate prompt with the message:

Confirm this certificate by clicking OK. If this is not the correct certificate, click Cancel.

image

… clicking on OK brings you to the regular expected webpage:

image

Solution

This problem actually got me quite frustrated. I’ve done numerous deployments of Lync, yet I could not figure out why this particular environment gave me a problem, and I couldn’t find any clues that pointed me in the right direction, so I opened a support call with Microsoft.  The engineer spent almost 2 days before he figured out what was wrong.

To make a long story short, Windows Server 2012 is apparently more stringent when performing certificate checks because it implements checks for a higher level of trust for certificate authentication.  A more detailed explanation can be found in the following KB:

Lync Server 2013 Front-End service cannot start in Windows Server 2012
http://support.microsoft.com/kb/2795828

First off, the reason we were getting the strange Confirm Certificate prompt when browsing to the https://<lyncEdgeServer.someDomain.internal>:4443/replicationwebservice URL:

image

… is because I had a certificate in the Current User store:

image

You might be wondering why this certificate was there. I had to use our internal Enterprise CA’s enrollment web page (/certsrv) to obtain the Lync front end server’s certificate, because we had RCC integration with the Avaya AES server and the regular certificate tool in the Lync deployment wizard did not work.  This meant that I had to install the certificate under the logged-on user’s account, export it along with the private key from the Current User store, then re-import it into the computer store.  I never went back to delete the copy in the Current User store. Once I deleted it, closed Internet Explorer and navigated to the https://<lyncEdgeServer.someDomain.internal>:4443/replicationwebservice URL again, I was no longer prompted and landed directly on the expected Windows Communication Foundation service page:

image
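
If you want to look for a leftover certificate like this without opening the MMC snap-in, something along these lines should list what is in the logged-on user’s store. This is a sketch that assumes the certificate sits in the Personal (My) store of the current user; the post above only shows it in a screenshot, so adjust the path if yours is elsewhere.

```powershell
# List certificates in the logged-on user's Personal store.
# HasPrivateKey helps spot a server certificate that was enrolled under the user account.
Get-ChildItem Cert:\CurrentUser\My |
    Select-Object Subject, Thumbprint, NotAfter, HasPrivateKey
```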

Second, as per the KB article, Windows Server 2012 basically does not like certificates that are in the incorrect place.  We noticed that an intermediate certificate had been placed in the Trusted Root Certification Authorities store of the local computer on the Edge server, so we removed it:

image
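
A quick way to spot this kind of misplaced certificate is to list everything in the local computer’s Trusted Root store whose Issuer differs from its Subject, since a real root certificate is self-signed. The sketch below follows that idea; the function name Get-NonSelfSignedCertificate is a hypothetical helper of mine, not a built-in cmdlet.

```powershell
# A root store should contain only self-signed certificates (Subject equals Issuer).
# Pass through any certificate object whose Issuer differs from its Subject.
function Get-NonSelfSignedCertificate {
    param(
        [Parameter(ValueFromPipeline = $true)]
        $Certificate
    )
    process {
        if ($Certificate.Issuer -ne $Certificate.Subject) { $Certificate }
    }
}

# Only query the store where the Cert: drive exists (Windows).
if (Test-Path Cert:\LocalMachine\Root) {
    Get-ChildItem Cert:\LocalMachine\Root |
        Get-NonSelfSignedCertificate |
        Format-List Subject, Issuer, Thumbprint
}
```

Anything this returns is worth reviewing before removal; it only flags candidates and does not delete anything.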

We then proceeded to check the Intermediate Certification Authorities and Trusted People stores to ensure there weren’t any certificates that shouldn’t be in there.  Once we completed this, we restarted the replication services and ran the Invoke-CsManagementStoreReplication cmdlet, after which the status showed that the front end and Edge server had begun to replicate:

image
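
The restart and re-check can be sketched as below. Treat the details as assumptions: REPLICA is what I understand to be the short service name of the Lync Server Replica Replicator Agent on a default install, and the FQDN is the same placeholder used in the URL earlier in this post.

```powershell
# On the Edge server: restart the Lync Server Replica Replicator Agent.
# 'REPLICA' is an assumed short service name; confirm it with Get-Service first.
Restart-Service -Name REPLICA

# On the front end server, in the Lync Server Management Shell:
# trigger replication for the Edge server only, then check its status.
$edgeFqdn = 'lyncEdgeServer.someDomain.internal'   # placeholder FQDN
Invoke-CsManagementStoreReplication -ReplicaFqdn $edgeFqdn
Get-CsManagementStoreReplicationStatus -ReplicaFqdn $edgeFqdn
```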

This was definitely one of the more difficult issues I’ve come across and seeing how I couldn’t find any helpful information on the internet, I hope this post will help anyone who may come across this in their environment.

15 comments:

Anonymous said...

Thanks a lot for this great post. It saved me a lot of time and frustration!

Anonymous said...

I've come across so many of your posts while deploying anything from Lync to VMware and I just wanted to tell you that you've done so much for the community without getting anything back (I noticed you don't have ads on your blog). I'm actually surprised that you're not a Lync MVP after all these years of contribution. Someone from Microsoft should recognize the effort you've put in to helping make their product successful. Thank you.

gunadaya said...

thanks....

vertech perdana said...

pertamax.

Anonymous said...

Thank you for this, saved me several hours for sure.

Jop Gommans said...

Great post Terence, very good information and still useful today!

Marco said...

Not for me. I still could not get a replica from FE.

Anonymous said...

I'm having the exact same issue but with my Lync 2010 Edge servers which are running Server 2008 R2.

Anonymous said...

Perfect! After reading so many other articles, this was the fix for my Lync 2013 Edge on 2012. We had certs expire in the spring. The team hastily resolved this. However, we suspect the server was never rebooted. Months later we discovered updates had not been applied due to a DMZ/FW configuration error. Once we fixed that, we rebooted last week and discovered a replication issue on one of the two edges. Following this post returned this edge to replication as we indeed found a duplicate cert that was already in the intermediate also in the trusted. Thank you so much!

Anonymous said...

Worked Thank you very much. You should sent this to MS so that they can put it on there TechNet database

Paul Bloem said...

Hi Thomas, Cheers for this. Found the suspect cert, removed it and replication started reporting back as a green check. You're still the expert :-) Regards, Paul

Андрей Алешенков said...

Cool! Issue was resolved. Your post made me happy))
Thanks a lot

slgray said...

Same issue here, it helps me a lot. Thanks.

DAVID SUNDAY said...

Thanks so much buddy.....Great! it helped me

Anonymous said...

Good call. Worked for me, looking for this answer for 2 day's now. Thanks. Did exactly all the steps you did above with the same problem. On Skype For business 2015 server. Finally found your post and solved the mystery.