Running a virtual Domain Controller

One of the things I’ve been dithering over with the new Hyper-V system at home is whether or not to join the host server to the domain served by the virtual DC, which in turn hosts the Active Directory forest and domain for the lab environment.

Running a virtual DC in either a production or a lab environment isn’t difficult, but there are a few gotchas.  I wrote an in-depth piece about this on 4sysops.com, so I won’t repeat it all here.  The only slight difference is that the other domain-joined virtual servers on the Hyper-V host have a start delay of 180 seconds.  120 should be sufficient, but I’m planning on some relatively complex setups so I’m just playing it safe.

Brien Posey over at VirtualizationAdmin.com wrote a very good series of articles about the various pitfalls of setting up an AD environment which includes virtual DCs.  The series is well worth a read (particularly the bit where his entire lab environment gets fried by a lightning strike!) but one of the key things he mentions is the limitation of having a virtual domain sat on top of a Hyper-V host which isn’t part of that domain – namely, backup and management.

If the Hyper-V parent partition is excluded from the domain, backup programs like DPM are not able to protect that partition.  Additionally, it makes remote management a lot more difficult to set up and as for using SCVMM – forget it.  These are the show-stoppers for me, as many of the labs I’m planning revolve around the System Center suite of products, so it looked like there was no option but to get the parent partition on the domain.

Ben Armstrong (Virtual PC Guy) summarises this problem very well in this post, and in a later post details some very useful tips for streamlining the process so that if you have to restart the parent partition, you’re not waiting for ages before you can log in, wondering whether it’s just a waiting game or whether something is rotten in the state of hypervisor…

Ben’s solution is two-fold:

1 – Disable the use of cached credentials – cached credentials are nice things, but they can mask a serious problem.  Basically, just because you’ve been able to log into the parent partition using AD credentials, doesn’t mean that the parent partition has actually authenticated to the DC.  By disabling cached credentials, you’re ensuring that every successful logon attempt has only been successful because the DC has handled the attempt.

To do this, open REGEDIT on the parent partition and navigate to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon.  Find the string called CachedLogonsCount and change the value to 0 (the default is 10).
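If you would rather script the change than click through REGEDIT, something like the following should do the same job from an elevated command prompt on the parent partition. This is only a sketch and not part of Ben's post: it assumes the value keeps its usual REG_SZ type, and it is worth testing on a lab machine first.

```bat
rem Sketch: disable cached domain logons by setting CachedLogonsCount to 0.
rem Assumes an elevated prompt on the parent partition.
reg add "HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon" /v CachedLogonsCount /t REG_SZ /d 0 /f

rem Read the value back to confirm the change.
reg query "HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon" /v CachedLogonsCount
```

To go back to the default, run the same reg add command with /d 10.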

2 – Force re-registration of the DC in DNS – The DNS service on the DC generally comes up after the AD services.  As a result, the DC hasn’t registered itself in DNS by the time AD is running, so no domain-joined machines can find it, hence no logon.  By default, it will attempt to re-register itself in 10 minutes, but that’s another 10 minutes of inactivity, which is boooooring.

To get around this, Ben recommends forcing the DC to re-register itself in DNS on system startup by creating a batch file which does all the necessary work and running it as a Scheduled Task on the DC using the domain Administrator credentials.  The batch file contains the following lines:

  • ipconfig /flushdns
  • ipconfig /registerdns
  • nltest /dsregdns

Create an appropriate task in Scheduled Tasks and give it a test run.
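As a rough sketch of how those two steps might look, here is the batch file followed by a command-line way of creating the task. The file name C:\Scripts\reregister-dns.cmd, the task name and the LAB\Administrator account are placeholders of mine, not taken from Ben's post, so substitute your own.

```bat
@echo off
rem reregister-dns.cmd (hypothetical file name)
rem Flush the resolver cache, re-register the host records,
rem then re-register the DC-specific DNS records.
ipconfig /flushdns
ipconfig /registerdns
nltest /dsregdns
```

```bat
rem Run once on the DC from an elevated prompt.
rem /sc onstart runs the task at system startup; /rp * prompts for the password.
rem Task name, script path and account are placeholders.
schtasks /create /tn "Re-register DC in DNS" /tr "C:\Scripts\reregister-dns.cmd" /sc onstart /ru LAB\Administrator /rp *

rem Give it a test run without rebooting.
schtasks /run /tn "Re-register DC in DNS"
```

Creating the task through the Task Scheduler GUI works just as well; the command line is simply easier to repeat when the lab gets rebuilt.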

From my own experience on the lab system, everything ran perfectly and after its first restart from joining the domain, I was able to log into the parent partition successfully within a couple of minutes of the OS firing up.

The big question is: is this architecture acceptable in a production environment?

My gut reaction is to say Yes. I understand that many IT pros are nervous about the idea of running all the DCs in a forest on virtual platforms, the fear being that should something go wrong, it’s much harder to manage your way to a resolution.  My feeling is that even if you have a DC running on physical hardware, you’re not really minimising the risk.  If it’s all about risk mitigation, then you can run up multiple DCs spread across different physical hosts, make use of a DR site or even spool something up on a cloud platform.

Also, it’s not really that difficult to manage a domain-joined hypervisor host if its domain is down – you just need console access and the local Administrator account, or in the case of VMware, the vSphere client and the root password.  All of these disaster scenarios can be recovered from, and my feeling is that a fully virtual environment gives you more opportunity to protect the systems and thereby recover faster, not slower.

So – everything is on the domain, and now we move on to remote management from my non-domain-joined workstation. Fun!

    5 comments to Running a virtual Domain Controller

    • Good article but one very important thing to remember.

      From the host, if you’re using wbadmin to back up your Host + Guests, this will not back up the System State of the Domain Controller in a Guest. You MUST schedule a separate job from the Guest to back up the System State, otherwise you’ll have a very sick DC with no system state to restore, which is a nightmare!

      The normal command to schedule this from a Physical (Host) is
      wbadmin start systemstatebackup -backuptarget::

      But, to do this from a Virtual (Guest) can be hard, as you do not have a direct attached drive to do this to. So, you must use:
      wbadmin start backup -backupTarget:: OR -backupTarget:\\host\share -allCritical

      On Windows 7 and Windows Server 2008 R2, this creates a backup that includes the system state in addition to any other items that you specified with the -include parameter. The system state contains boot files (Boot.ini, NTLDR, NTDetect.com), the Windows Registry including COM settings, the SYSVOL (Group Policies and Logon Scripts), the Active Directory and NTDS.DIT on Domain Controllers and, if the certificates service is installed, the Certificate Store. If your server has the Web server role installed, the IIS Metadirectory will be included. If the server is part of a cluster, Cluster Service information will also be included.

      You can also use:
      wbadmin start backup -backupTarget:: OR -backupTarget:\\host\share -systemState

      More information on wbadmin can be found here: //technet.microsoft.com/en-us/library/cc754015(WS.10).aspx

      Cheers,

      Greg Lipschitz
      The Summit Group (Australia) Pty Ltd

    • James Bannan

      Thanks for the extra info Greg. I’m definitely running an extra backup job on the DC using Windows Backup to regularly back up the entire system, including system state. Yay for paranoia 🙂

    • mdexch

      My physical host system (say foo) is running Windows 2008 R2 EE and is a member of my company’s domain. On foo.mycompany.com, I’ve enabled Hyper-V and created a new forest, test.local. This forest/domain has only one DC, dc1.test.local, running as a VM on foo.mycompany.com. dc1.test.local is also running DNS. What I’m seeing is that whenever I have dc1.test.local running, I see network problems on foo.mycompany.com, such as slow access to other servers on the LAN. If I’ve got a remote desktop session open to another server, the response becomes slow. If I power off dc1, my network glitches go away.

      So my question is — Is my configuration OK? Host in different domain than hyper-v VM.

      I started capturing packets using netmon 3.4 on foo.mycompany.com with and without dc1 running, but don’t see anything unusual. I also tried disabling forwarders in DNS on dc1.test.local. But nothing seems to work.

    • mdexch

      Something I did on dc1.test.local domain helped me resolve THIS issue (but created another one).
      Some more details on the behavior. If the server is actively accessing some other server over the network (company LAN but a different subnet), I would see the network issue. RDP (from foo.mycompany.com, my host server) would be sluggish and drop the connection momentarily. In general, network access was slow.

      So on dc1.test.local (DC of domain test.local and also running DNS), I changed a property of “Microsoft virtual machine Bus Network Adapter”. I changed “IPv4 checksum Offload” from “Rx & Tx Enabled” to “Disabled” (Advanced page). Once I did that, my network sluggishness disappeared. But it created a new problem.

      With this setting, when I tried to join a server to test.local domain, it failed. “AD DC for domain couldn’t be located …” was the error.

    • Greg

      Hi,

      I am a complete newbie when it comes to virtualization. Is it possible to have the Host Server act as a BDC when having a PDC running as a VM? My thoughts are if the Host Server also has a copy of AD and DNS then there would be no issue with start ups…am I correct in thinking that?

      Cheers
