Thursday, October 18, 2018

Step-by-Step: How PXE Boots a Machine Using SCCM OSD

I had a problem that was quite difficult to work through and in order to figure it out I had to go really deep on PXE.  Figured this step-by-step might help someone else in the future.

  1. The network boot client computer sends a DHCP discover broadcast with option 60 (the vendor class identifier, which marks it as a PXE client). On any normal network this will only actually broadcast on the local subnet, but IP helpers generally get it to the DHCP server.
  2. Both DHCP and the WDS server get the broadcast (either both are assigned as DHCP servers with IP helpers or DHCP options are set to forward the request to the WDS).
  3. DHCP offers an IP address to the client (the keyword is "offers"; it hasn't been accepted yet).
  4. Before the client machine accepts the IP address, it waits for a signal from the WDS server.  Before sending the signal back to the client, the WDS server runs a stored procedure, LOOKUPDEVICE, against the SCCM database.  If the client machine is found in SCCM, or if there is an advertisement for the "Unknown Machines" collection, then WDS signals the client to proceed with the PXE boot.
  5. The client machine now accepts the IP offered by DHCP.
  6. DHCP DORA finally completes when the DHCP server acknowledges the client IP assignment.  The client machine now has an IP address and is ready to proceed.
  7. The client machine downloads WDSNBP.COM from the PXE server to detect the hardware architecture (x86 or x64).
  8. The client downloads the PXEBOOT.COM boot files for its architecture from the PXE server.  The file downloaded at this step is controlled and monitored by SMSPXE.
  9. SMSPXE runs a stored procedure called getbootaction and, depending on the result, gives the PXE boot files to the client.
  10. The client machine now downloads the boot image, bootmgr.exe, and the BCD store.  This is an SMB file transfer; all previous file transfers were TFTP.  The boot image downloaded here depends on the result of the architecture detection done earlier by WDSNBP.COM.
  11. Once the boot image and the other two files have downloaded completely, bootmgr and the BCD store are used to initialize the WinPE environment.
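
To make step 1 a little more concrete, here is a rough Python sketch of what that first broadcast looks like on the wire. It only builds and parses the bytes; it doesn't send anything. The function names (build_pxe_discover, parse_options) are made up for illustration, and the bare "PXEClient" vendor class is a simplification: real PXE firmware usually appends architecture and UNDI details after that prefix.

```python
import struct

DHCP_MAGIC_COOKIE = b"\x63\x82\x53\x63"  # marks the start of the options field
OPT_PAD = 0
OPT_MESSAGE_TYPE = 53
OPT_VENDOR_CLASS = 60   # the "option 60" from step 1
OPT_END = 255
DHCPDISCOVER = 1


def build_pxe_discover(mac: bytes, xid: int,
                       vendor_class: bytes = b"PXEClient") -> bytes:
    """Build a DHCPDISCOVER payload carrying option 60 (illustrative helper)."""
    if len(mac) != 6:
        raise ValueError("expected a 6-byte Ethernet MAC address")
    zero_ip = b"\x00" * 4
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,        # op: BOOTREQUEST
        1,        # htype: Ethernet
        6,        # hlen: MAC address length
        0,        # hops
        xid,      # transaction id chosen by the client
        0,        # secs
        0x8000,   # flags: ask the server to reply by broadcast
        zero_ip, zero_ip, zero_ip, zero_ip,  # ciaddr, yiaddr, siaddr, giaddr
        mac,      # chaddr: struct pads this to 16 bytes with nulls
        b"",      # sname: left empty (null padded)
        b"",      # file: left empty (null padded)
    )
    options = (
        bytes([OPT_MESSAGE_TYPE, 1, DHCPDISCOVER])
        + bytes([OPT_VENDOR_CLASS, len(vendor_class)]) + vendor_class
        + bytes([OPT_END])
    )
    return header + DHCP_MAGIC_COOKIE + options


def parse_options(packet: bytes) -> dict:
    """Return {option_code: value_bytes} from a DHCP payload."""
    if packet[236:240] != DHCP_MAGIC_COOKIE:
        raise ValueError("not a DHCP packet (magic cookie missing)")
    options = {}
    i = 240
    while i < len(packet):
        code = packet[i]
        if code == OPT_END:
            break
        if code == OPT_PAD:      # pad bytes have no length field
            i += 1
            continue
        length = packet[i + 1]
        options[code] = packet[i + 2:i + 2 + length]
        i += 2 + length
    return options


if __name__ == "__main__":
    pkt = build_pxe_discover(bytes.fromhex("001122334455"), xid=0x12345678)
    print(parse_options(pkt)[OPT_VENDOR_CLASS])
```

If you are staring at a packet capture, option 60 in the discover is the thing to look for: it is how the WDS side can tell a PXE client apart from an ordinary DHCP client.
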
For my particular problem it turned out to be a bad switch dropping some packets.  It didn't really present itself with the tiny little TFTP (UDP) downloads but as soon as we hit the first SMB file transfer (TCP) things failed.  Made it look like a DHCP handoff problem when it was actually a file transfer problem.  Would never have found it without understanding how this works.
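
A back-of-the-envelope way to see why the small downloads got through and the big one didn't: the sketch below estimates the chance that a transfer sees at least one dropped packet. Everything in it is an assumption for illustration, not a measurement from my network: the file sizes, the 0.1% loss rate, and the idea that drops are independent of each other. It also doesn't model TCP retransmission, so it shows exposure to loss, not the failure rate itself.

```python
import math


def packets_needed(size_bytes: int, payload_bytes: int) -> int:
    """Number of packets needed to carry size_bytes at payload_bytes each."""
    return math.ceil(size_bytes / payload_bytes)


def chance_of_any_loss(packets: int, loss_rate: float) -> float:
    """Probability that at least one of `packets` is dropped,
    assuming each packet is dropped independently at `loss_rate`."""
    return 1.0 - (1.0 - loss_rate) ** packets


if __name__ == "__main__":
    loss_rate = 0.001  # hypothetical: 1 packet in 1000 dropped by the switch

    # Hypothetical small boot file over TFTP with the default 512-byte blocks.
    small = packets_needed(25 * 1024, 512)
    # Hypothetical 250 MB boot image over TCP with 1460-byte segments.
    large = packets_needed(250 * 1024 * 1024, 1460)

    print(f"small transfer: {small} packets, "
          f"P(any loss) = {chance_of_any_loss(small, loss_rate):.3f}")
    print(f"large transfer: {large} packets, "
          f"P(any loss) = {chance_of_any_loss(large, loss_rate):.3f}")
```

With numbers anywhere in that ballpark, the tiny transfers mostly never meet a dropped packet, while the big one is all but guaranteed to hit many of them.
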

Enjoy!

Another good reference on network boot process:
https://blogs.technet.microsoft.com/dominikheinz/2011/03/18/sccm-pxe-network-boot-process