Monday, February 21, 2011

Microsoft Lync Server 2010 outbound calls through Mediation server returns fast busy with: “CONNECTION: Failed to complete outbound connection” and “DIAGNOSTIC: Message was not sent because the connection was closed” in Snooper logs

I received a call from a client the other day that their Lync Server 2010 environment could not place outbound calls to the PSTN network.  While on the phone with the administrator I had him do a few tests for me and it appears that Lync calls were working fine but as soon as they make a call that needs to go through the mediation server, they would get a fast busy immediately.  Since it was hard to diagnose the problem without actually having access to the servers, I ended up having the administrator share out his desktop so I can perform a trace with Snooper.  What I saw was the following (note that I had to replicate this in our internal environment because I didn’t do any screenshots during the troubleshooting):

Problem

When a user makes a call to the PSTN network, they would get a fast busy with the error:

Operation was unsuccessful.

clip_image001

Tracing the call with Snooper will show messages similar to the following:

TL_INFO(TF_PROTOCOL) [0]09F4.17FC::02/15/2011-02:22:29.215.00037882 (SIPStack,SIPAdminLog::TraceProtocolRecord:SIPAdminLog.cpp(125))$$begin_record
Trace-Correlation-Id: 3173659191
Instance-Id: 000009ED
Direction: outgoing;source="local"
Peer: 172.20.13.52:3884
Message-Type: response
Start-Line: SIP/2.0 101 Progress Report
From: "Terence Luk"<sip:tluk@domain.com>;tag=1398b83f90;epid=b7128e13bc
To: <sip:+12892596319@domain.com;user=phone>
CSeq: 1 INVITE
Call-ID: ea2c902621034902abcc41f1693207dc
Authentication-Info: TLS-DSK qop="auth", opaque="BB5872E3", srand="97793C11", snum="111", rspauth="2fa796b41f58a0581fb4fc7631dcc1e031e34424", targetname="lync01.domain.com", realm="SIP Communications Service", version=4
Content-Length: 0
Via: SIP/2.0/TLS 172.20.13.52:3884;ms-received-port=3884;ms-received-cid=4E00
ms-diagnostics: 12006;reason="Trying next hop";source="lync01.domain.com";PhoneUsage="Default Usage";PhoneRoute="GW1";Gateway="172.25.4.19";appName="OutboundRouting"
Server: OutboundRouting/4.0.0.0
Message-Body: –
$$end_record

image

TL_ERROR(TF_CONNECTION) [1]09F4.17FC::02/15/2011-02:22:30.216.000378fa (SIPStack,SIPAdminLog::TraceConnectionRecord:SIPAdminLog.cpp(160))$$begin_record
LogType: connection
Severity: error
Text: Failed to complete outbound connection
Peer-IP: 172.25.1.40:5070
Peer-FQDN: ocspool3.domain.com
Connection-ID: 0x6401
Transport: TLS
Result-Code: 0x8007274d WSAECONNREFUSED
Data: fqdn="ocspool3.domain.com";peer-type="InternalServer";winsock-code="10061"
$$end_record

image

TL_ERROR(TF_DIAG) [1]09F4.17FC::02/15/2011-02:22:30.216.00037928 (SIPStack,SIPAdminLog::TraceDiagRecord:SIPAdminLog.cpp(143))$$begin_record
LogType: diagnostic
Severity: error
Text: Message was not sent because the connection was closed
SIP-Start-Line: INVITE sip:+12892596319@172.25.4.19:5070;user=phone;maddr=ocspool3.domain.com SIP/2.0
SIP-Call-ID: ea2c902621034902abcc41f1693207dc
SIP-CSeq: 1 INVITE
Peer: ocspool3.domain.com:5070
$$end_record

image

TL_WARN(TF_DIAG) [1]09F4.17FC::02/15/2011-02:22:30.217.00037e80 (SIPStack,SIPAdminLog::TraceDiagRecord:SIPAdminLog.cpp(145))$$begin_record
LogType: diagnostic
Severity: warning
Text: Routing error occurred; check Result-Code field for more information
Result-Code: 0xc3e93c7f SIPPROXY_E_ROUTING_MSG_SEND_CLOSED
SIP-Start-Line: INVITE sip:+12892596319@172.25.4.19:5070;user=phone;maddr=ocspool3.domain.com SIP/2.0
SIP-Call-ID: ea2c902621034902abcc41f1693207dc
SIP-CSeq: 1 INVITE
Peer: ocspool3.domain.com:5070
$$end_record

image

TL_INFO(TF_DIAG) [1]09F4.1074::02/15/2011-02:22:30.219.00038d8e (SIPStack,SIPAdminLog::TraceDiagRecord:SIPAdminLog.cpp(147))$$begin_record
LogType: diagnostic
Severity: information
Text: Response successfully routed
SIP-Start-Line: SIP/2.0 504 Server time-out
SIP-Call-ID: ea2c902621034902abcc41f1693207dc
SIP-CSeq: 1 INVITE
Peer: 172.20.13.52:3884
Data: destination="tluk@domain.com"
$$end_record

image

TL_INFO(TF_PROTOCOL) [1]09F4.1074::02/15/2011-02:22:30.219.00038dd3 (SIPStack,SIPAdminLog::TraceProtocolRecord:SIPAdminLog.cpp(125))$$begin_record
Trace-Correlation-Id: 3173659191
Instance-Id: 000009EE
Direction: outgoing;source="local"
Peer: 172.20.13.52:3884
Message-Type: response
Start-Line: SIP/2.0 504 Server time-out
From: "Terence Luk"<sip:tluk@domain.com>;tag=1398b83f90;epid=b7128e13bc
To: <sip:+12892596319@domain.com;user=phone>;tag=52AB317D6C67AB3ED7A790C81F68F7DF
CSeq: 1 INVITE
Call-ID: ea2c902621034902abcc41f1693207dc
Authentication-Info: TLS-DSK qop="auth", opaque="BB5872E3", srand="6809B92E", snum="112", rspauth="7ba9c8aba1740720fa76e2d5fc5fb94d89caa601", targetname="lynC01.domain.com", realm="SIP Communications Service", version=4
Via: SIP/2.0/TLS 172.20.13.52:3884;ms-received-port=3884;ms-received-cid=4E00
ms-diagnostics: 1038;reason="Failed to connect to a peer server";WinsockFailureCode="10061(WSAECONNREFUSED)";WinsockFailureDescription="The peer actively refused the connection attempt";Peer="ocspool3.domain.com";Port="5070";source="lynC01.domain.com"
Server: OutboundRouting/4.0.0.0
Content-Length: 0
Message-Body: –
$$end_record

image

Solution

The reason why no PSTN calls were working was actually quite simple and that’s because the mediation service was stopped. 

image

What tipped me off in the Snooper logs was the reference to port 5070 and the pool:

Peer="ocspool3.domain.com";Port="5070";source="lynC01.domain.com"

Referencing the purpose of different ports used as shown in the following TechNet article: http://technet.microsoft.com/en-us/library/gg398833.aspx

image

Port 5070 is “Used by the Mediation Server for incoming requests from the Front End Server to the Mediation Server.” and seeing how the logs show that the front-end server is unable to connect to itself on port 5070, it was obvious that there’s something wrong with the service so starting the mediation server fixed the problem:

image

No comments: