Monday, September 16, 2013

New Lync Server 2013 deployment’s Front-End service on Windows Server 2012 fails to start with multiple errors in event logs

Problem

You’ve deployed Lync Server 2013 on a Windows Server 2012 server, but immediately after successfully installing the services and issuing and assigning certificates, you notice that the Lync Server Front-End service fails to start:

image

Reviewing the Lync Server event logs shows the following events:

  • Error – 12308
  • Error – 32201
  • Information – 32189
  • Error – 30941
  • Error – 32175
  • Error – 32178
  • Warning – 32174
  • Information – 32189
  • Error – 32178
  • Information – 32189
  • Error – 32178
  • Information – 32189
  • Error – 30988
  • Error – 32178

… and so on:

image
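If you would rather pull these events from PowerShell than scroll through Event Viewer, something along these lines should work. Treat it as a sketch: it assumes the event log is named "Lync Server" on your server, and the ID list is simply the events noted above.

```powershell
# Event IDs taken from the list above; adjust as needed.
$ids = 12308, 32201, 32189, 30941, 32175, 32178, 32174, 30988

# Assumption: the Lync event log is named 'Lync Server' on this server.
Get-WinEvent -FilterHashtable @{ LogName = 'Lync Server'; Id = $ids } -MaxEvents 50 |
    Sort-Object TimeCreated |
    Format-Table TimeCreated, Id, LevelDisplayName -AutoSize
```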

The details to the events are as follows:

Event ID Error 12309:

A component could not be started. The service has to stop.

Component: Live Communications User Services

Error code: 80004005!_HRX! (Unspecified error !_HRM!)

image

Event ID Error 32201:

Failed to flush data to backup store.

Cause: This may indicate a problem with connectivity to local or backup database or some unknown product issue.

Resolution:

Ensure that connectivity to local and backup database is proper. If the error persists, please contact product support with server traces.

image

Event ID Information 32189:

The following Fabric service for routing groups have been closed:

{8EC325CB-B512-587D-9D03-E940E7CC1490}

{8EC325CB-B512-587D-9D03-E940E7CC1490}

{8EC325CB-B512-587D-9D03-E940E7CC1490}

{8EC325CB-B512-587D-9D03-E940E7CC1490}

{8EC325CB-B512-587D-9D03-E940E7CC1490}

{8EC325CB-B512-587D-9D03-E940E7CC1490}

{8EC325CB-B512-587D-9D03-E940E7CC1490}

.

image

Event ID Error 30941:

Initialize failure.

Error code: 80004005

image 

Event ID Error 32175:

Server is being shutdown because fabric pool manager could not complete initial placement of users.

Cause: This can happen if insufficient number of Front-Ends are available in the Pool.

Resolution:

Ensure that all the Front-Ends configured for this Pool are up and running. If multiple Front-Ends have been recently decommissioned, run Reset-CsPoolRegistrarState -ResetType QuorumLossRecovery to enable the Pool to recover from Quorum Loss and make progress.

image

Event ID Error 32178:

Failed to sync data for Routing group {8EC325CB-B512-587D-9D03-E940E7CC1490} from backup store.

Cause: This may indicate a problem with connectivity to backup database or some unknown product issue.

Resolution:

Ensure that connectivity to backup database is proper. If the error persists, please contact product support with server traces.

image

Event ID Warning 32174:

Server startup is being delayed because fabric pool manager has not finished initial placement of users.

Currently waiting for routing group: {8EC325CB-B512-587D-9D03-E940E7CC1490}.

Number of groups potentially not yet placed: 1.

Total number of groups: 1.

Cause: This is normal during cold-start of a Pool and during server startup.

If you continue to see this message many times, it indicates that insufficient number of Front-Ends are available in the Pool.

Resolution:

During a cold-start of a large Pool it can take upto an hour for the placement process to finish as it needs to populate all the Front-End databases with data from the Backup Store. If the Pool is running and the Front-End is just started, this is normal for some time. If this repeats for a long time, ensure that all the Front-Ends configured for this Pool are up and running. If multiple Front-Ends have been recently decommissioned, run Reset-CsPoolRegistrarState -ResetType QuorumLossRecovery to enable the Pool to recover from Quorum Loss and make progress

image

You’ve tried using the cmdlet Reset-CsPoolRegistrarState -ResetType QuorumLossRecovery but the front-end service continues to fail to start.
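For reference, the attempt looks roughly like the following when run from the Lync Server Management Shell. The pool FQDN is a placeholder and not taken from this environment. In the scenario described here, the service still fails to start afterwards.

```powershell
# Placeholder: replace with your own pool FQDN.
$pool = 'lyncpool01.contoso.local'

# The reset suggested by events 32174 and 32175 (prompts for confirmation).
Reset-CsPoolRegistrarState -PoolFqdn $pool -ResetType QuorumLossRecovery

# Try the Front-End service again and check its state.
# RTCSRV is the Windows service name of the Lync Server Front-End service.
Start-CsWindowsService -Name RTCSRV
Get-CsWindowsService -Name RTCSRV
```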

Solution

Those who have come across one of my previous posts:

Lync Server 2013 Edge server replication issues on Windows Server 2012
http://terenceluk.blogspot.com/2013/04/lync-server-2013-edge-server.html

Lync Server Access Edge service fails to start with: “… service-specific error code -2146762487”
http://terenceluk.blogspot.com/2013/05/lync-server-access-edge-service-fails.html

… will know that I’ve run into a few challenges with Lync Server 2013 Edge servers on a Windows Server 2012 operating system. As noted in the posts above, Windows Server 2012 is more stringent when it comes to trusted certificates, and mistakes such as putting an intermediate certificate in the trusted root certificate store can cause replication to stop working between the Edge and front-end servers. What’s unfortunate about having certificates in the wrong store is that the event logs don’t mention anything remotely suggesting that the issue has to do with certificates. In this front-end server example, the issue was caused by legacy GPOs placing intermediate QuoVadis certificates into the wrong store, as shown in the following screenshot:

image

Note that certificates such as QuoVadis Issuing Certification Authority 2 and the others highlighted in red are all intermediate certificates but have been placed into the Trusted Root Certification Authorities store:

image image

Having worked with various clients’ Active Directory environments over the past few years, I’ve noticed that something like this happens quite often. The solution is to remove the GPO that is putting the certificates into the Trusted Root Certification Authorities store and then manually delete the certificates on the Lync server or move them to the appropriate store. The Front-End service will start once the certificate issue is resolved:

image
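A minimal sketch of the manual cleanup in PowerShell follows. It assumes that any certificate in the Trusted Root store whose issuer differs from its subject (that is, one that is not self-signed) is misplaced and belongs in the Intermediate Certification Authorities store. Review the list before moving anything. Certificates pushed by a GPO will likely reappear at the next policy refresh unless the GPO is removed first.

```powershell
# Certificates in Trusted Root whose issuer differs from their subject are
# not self-signed, so they are not root certificates.
$misplaced = Get-ChildItem Cert:\LocalMachine\Root |
    Where-Object { $_.Issuer -ne $_.Subject }

# Review what was found before changing anything.
$misplaced | Format-List Subject, Issuer, Thumbprint

# Dry run: -WhatIf only reports what would be moved to the Intermediate store.
# Remove -WhatIf once the list looks right.
$misplaced | Move-Item -Destination Cert:\LocalMachine\CA -WhatIf
```

Run this from an elevated PowerShell prompt on the Lync server, then try starting the Front-End service again.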

In case anyone is looking for a solution to automate removing these certificates from other servers, have a look at one of my old posts here:

How to remove a trusted Certificate Authority from “Trusted Root Certification Authorities” certificate store on workstations in an Active Directory domain
http://terenceluk.blogspot.com/2012/05/how-to-remove-trusted-certificate.html

Comments:

Jason M Hindson said...

There is a KB article addressing the certificate issue you mentioned: http://support.microsoft.com/kb/2795828

The KB includes a useful PS one-liner to identify incorrectly located certs:

Get-Childitem cert:\LocalMachine\root -Recurse | Where-Object {$_.Issuer -ne $_.Subject} | Format-List * | Out-File "c:\computer_filtered.txt"

Jason M Hindson said...

Here is another option which finally fixed my problem with the Lync Front End service not starting:

If that does not solve it, you need to make a registry change. Under the following key:

HKLM\System\CurrentControlSet\Control\SecurityProviders\Schannel\

create a DWORD value named "ClientAuthTrustMode" and set it to 2.

Reboot the server.

Hector Bravo said...

I solved the problem by adding the registry value in Windows Server 2012:

HKLM\System\CurrentControlSet\Control\SecurityProviders\Schannel\

Create a DWORD value named "ClientAuthTrustMode" and set it to 2.


Thanks a lot...

Rob GaatJeNixAn said...

why are the certs needed in the Intermediate Store? On my server 2012R2 with Lync 2013 the certs are only in Personal store. Works fine....

Selma said...

I've tried everything, including the resolution stated in Microsoft KB2795828 and Reset-CsPoolRegistrarState ..., etc., on my Lync 2013 Enterprise server running on Win2012R2, but none of it worked: the Front-End service was stuck on "Starting". Finally, adding the DWORD registry value "ClientAuthTrustMode" with Value=2 in
HKLM\System\CurrentControlSet\Control\SecurityProviders\Schannel\
solved the problem.

Boxxer said...

Encountered this issue as well on our new installation.
Lync Server 2013 on Server 2012 R2.

In the end, the solution to get the service running was as described by Jason and Hector.
Adding a DWORD registry value called "ClientAuthTrustMode" with Value=2
in HKLM\System\CurrentControlSet\Control\SecurityProviders\Schannel\
and a server reboot solved this issue.
Thanks a lot...