Tuesday, February 1, 2011

“The trust relationship between this workstation and the primary domain failed.” after P2V / converting a physical server to a virtual machine

A few months ago I was part of a datacenter virtualization project where we were P2V-ing / cloning old physical hosts to virtual machines hosted on a new vSphere environment we had built.  On one of the days when I wasn’t working on the project, a colleague called me about the following error message, which appeared when trying to log into a newly virtualized VM:

The trust relationship between this workstation and the primary domain failed.

image

Rather than jumping into what we did for the situation, let me list out some situations that could lead to this:

Scenario #1

Method: You’ve live / hot cloned a physical server using vCenter Converter’s agent.

Reason: When you perform a live / hot clone of a physical server, the source’s data drives are essentially snapshotted so the agent can copy data from a static source.  This means that the server is actually still operating and can change during the cloning process.  With respect to Active Directory, every computer joined to the domain has a computer account password that we do not see, and this password gets changed automatically after a certain number of days (30 days is usually the default).  This ultimately means there is a small chance that the server’s computer password changed during the cloning process because it reached the 30-day mark, leaving the clone with the old password.  This is the equivalent of taking your domain-joined laptop and using a Windows Vista or 7 restore to roll it back a year.  If you ever did that, you would most likely not be able to log onto your corporate domain anymore because the restored laptop is using a computer password that has since been changed.  For a bit more information about this, see one of my earlier posts: http://terenceluk.blogspot.com/2010/12/what-will-happen-to-laptops-computers.html
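To make the timing argument concrete, here is a minimal Python sketch of the arithmetic, not a tool from the project.  It assumes you already know when the source server last changed its computer password (Active Directory records this in the computer object's pwdLastSet attribute) and when you plan to take the source off the network for good.  The function name, the parameter names and the 30-day constant are my own illustrative choices; check the actual maximum machine account password age in your domain.

```python
from datetime import datetime, timedelta, timezone

# Assumption: 30 days, the usual default mentioned above. Your domain's
# policy may use a different maximum machine account password age.
MAX_PASSWORD_AGE = timedelta(days=30)


def password_due_before(pwd_last_set, source_offline_at,
                        max_age=MAX_PASSWORD_AGE):
    """Return True if the source server's computer password is due to
    change at or before the moment the source leaves the network.

    pwd_last_set:      when the password was last changed, as of the
                       moment the clone was taken (hypothetical input).
    source_offline_at: when the source server is permanently disconnected.

    An already overdue password also counts as at risk, because the
    source may change it as soon as it can reach a domain controller.
    """
    next_change = pwd_last_set + max_age
    return next_change <= source_offline_at


if __name__ == "__main__":
    last_set = datetime(2011, 1, 5, tzinfo=timezone.utc)
    cutover = datetime(2011, 2, 5, tzinfo=timezone.utc)
    # The password is due 30 days after January 5, before the cutover,
    # so a clone taken in between could end up with a stale password.
    print(password_due_before(last_set, cutover))
```

If this returns True for a server, that is a hint to shorten the window between cloning and cutover, or to keep the source off the network once the clone has been taken.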

Scenario #2

Method: You’ve cold cloned a physical server and then booted up the physical server again with access to the network after the cloning completed.

Reason: The same reason applies to this scenario as it does with scenario #1.  The difference is that you’ve booted up the original physical server after the P2V process, and there is a small chance that its computer password has reached its maximum age, so the physical server changed the password while the clone still holds the old one.  This is why I always disconnect the NIC connections on the physical server if I ever have to boot the original server back up to, say, validate some settings.

Whether it’s scenario #1 or #2, keep in mind that the chances of this happening are extremely slim; in all the years I’ve been involved with cloning servers, it has only happened maybe 10 times.  With all that being said, I still try my best to always cold clone and not put servers that have already been cloned back on the network because, apart from something like this happening, there is always a chance that the server begins to serve its services again and you end up having users work off of a to-be-decommissioned server.

Resolutions

Whenever I come across such a problem, I usually propose 2 solutions:

Resolution #1

Method: Simply re-clone the server if time permits.

Reason: This will ensure that all the data is up-to-date.

Resolution #2

Method: If the server does not have any services or applications dependent on the domain, reset the computer account then disjoin and rejoin the domain.

Reason: I usually prefer not to do this because there are applications out there that can break if you disjoin and rejoin a server to the domain.  If you do decide to opt for this method, I would like to make it clear that you should RESET the computer account and not DELETE the computer account in Active Directory.  The reason is that when you reset a computer account, the server that is rejoined to the domain retains the same GUID and SID, whereas if you delete the computer account, rejoining the server to the domain creates a new object and therefore a new SID and GUID.

One of the other things you’ll most likely notice is that you can’t log onto the server with your domain account because your domain controllers no longer trust this server.  If you don’t have the local administrator password, one of the ways to get around this is to disconnect the NIC connections for this virtual machine, which would look like this if it was a VMware ESXi virtual machine:

image

Once you’ve disconnected the NIC, you should be able to log onto the virtual machine with any domain account that you’ve used to log onto it before, because Windows falls back to cached credentials when it cannot reach a domain controller.  This method will not work if your domain has set a policy to not allow servers to cache credentials, so if that’s the case, use a password reset CD like Hiren’s to reset the local administrator password.

2 comments:

Anonymous said...

Brilliant article. We had exactly this issue but there are no VCPs here, so the simple fix of disconnecting the virtual NIC sorted us.

PowerFET said...

Thank you Terence,

I read your post a little too late and had already reset the computer account on the domain controller. So the original server was without a connection as well...

As my server with the broken trust was an Exchange Server 2010, a removal from the domain was not an option for me (or at least not the first :-).
I searched around a bit on the reset computer account problem and found a solution using the command NETDOM RESETPWD.
As long as the SID of the server didn't change and the computer account wasn't deleted, that should help as well in case your two suggestions are not applicable.

Maybe this helps as addition to your great post. Thanks again.

Cheers