June 19, 2008, 11:54 AM
This week I was busy with a number of things, including trying to deploy a distributed file system (DFS) namespace using several servers running Windows Server 2008. Setting up the namespace was simple enough: I created the namespace root, added namespace folders, and configured various folder targets. Then I tried to configure replication between the folder targets for one of the namespace folders, but the replication wizard wouldn't start -- it said something about "RPC Server unavailable" and that the shared folders underlying the folder targets couldn't be found.
I logged on to each file server and verified that the shared folders were present and had appropriate permissions. I checked the services running on each file server and everything seemed OK there too. The event logs told me nothing more, just "RPC Server unavailable." I rebooted each server in the namespace and tried the replication wizard again, but still got the same error.
I tried disabling Windows Firewall on each server, but the connection problems persisted. Growing frustrated, I tried running Dfsdiag.exe on the namespace servers, and what I saw didn't encourage me. The domain controller test passed and the namespace configuration test passed, but the namespace integrity test indicated that one of the namespace servers couldn't be contacted. How could that be?
I had set up the namespace using the DFS Management console running on my administrative workstation, and I never encountered any problems connecting to the various file servers while doing so, except for the annoying fact that the console occasionally took 30 seconds or more to validate a connection with a server. That should have provided me with a clue, but when you're in the midst of trying to resolve an issue you sometimes fail to notice what's staring you in the face.
Finally, I thought "Let's go back and start from the beginning. Can I ping each of the file servers from my management station?" I tried pinging each server by its IP address and immediately got a response. Then I tried pinging them using their NetBIOS names and one of the servers failed to respond. In fact, the output from the ping command looked something like this (I've obfuscated the actual domain name and IP addresses):
C:\>ping WPG-SRV2
Pinging WPG-SRV2.<domain> [172.25.16.33] with 32 bytes of data:
Destination Host Unreachable
What I noticed right away was that the address for the server was wrong in the above command output. Server WPG-SRV2 actually has address 172.25.16.12 and I verified that by logging on to the server and running ipconfig.
So why did ping think the address of the server was 172.25.16.33? Obviously the DNS server must be giving it the wrong information, so I opened the DNS console and checked the A record for WPG-SRV2 and sure enough it was the wrong address. How could that have happened?
Then I recalled that I had reinstalled that particular server last week and had failed to delete its old DNS entry. I fixed the A record immediately, ran ipconfig /flushdns on my management station, and verified that I could ping WPG-SRV2 successfully. Then I tried running the DFS replication wizard again and this time it worked.
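The comparison I did by hand (what the name resolves to versus what the server really has) is easy to script. Here's a rough Python sketch of the idea, not anything that ships with the DFS tools. The EXPECTED inventory and the function names resolve_ipv4 and find_mismatches are hypothetical names I made up for illustration; the server name and address are the obfuscated ones from above.

```python
import socket

# Hypothetical inventory: server name -> the address it is supposed to have.
EXPECTED = {"WPG-SRV2": "172.25.16.12"}


def resolve_ipv4(name):
    """Return the set of IPv4 addresses the local resolver gives for name."""
    try:
        infos = socket.getaddrinfo(name, None, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        # The name did not resolve at all.
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def find_mismatches(expected, resolver=resolve_ipv4):
    """Return {name: resolved_addresses} for every name that does not
    resolve to exactly the address listed in the inventory."""
    mismatches = {}
    for name, address in expected.items():
        resolved = resolver(name)
        if resolved != {address}:
            mismatches[name] = resolved
    return mismatches
```

Calling find_mismatches(EXPECTED) on the management station would have flagged WPG-SRV2 with the stale 172.25.16.33 address. Keep in mind that the answer comes from the local resolver, so a cached entry can still show up until you flush the cache.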
What's the moral of the story? One might be tempted to fault Windows itself for throwing such a mysteriously worded error as "RPC Server unavailable," but actually the error tells you just what it's supposed to, namely that the server is unavailable for some reason. Whenever a server is unavailable, you should start from the beginning and check network connectivity, followed by name resolution. Only if those checks fail to shed any light on the matter should you start troubleshooting at the level of the application itself.
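To make that order of operations concrete, here's a small Python sketch that runs the checks bottom-up and reports the first layer that fails. It's an illustration under a few assumptions, not a diagnostic tool: the function names are hypothetical, and since plain Python can't send ICMP pings without raw sockets, the connectivity check is a stand-in that opens a TCP connection to port 135 (the RPC endpoint mapper) instead of pinging.

```python
import socket


def reachable_by_ip(address, port=135, timeout=3.0):
    """Layer 1: can we open a TCP connection to the known-good IP address?
    Port 135 (RPC endpoint mapper) is an assumed stand-in for a ping."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False


def resolves_to(name, address):
    """Layer 2: does the name resolve to the address we expect?"""
    try:
        return socket.gethostbyname(name) == address
    except OSError:
        # Covers resolution failures (socket.gaierror is an OSError).
        return False


def triage(name, address, check_ip=reachable_by_ip, check_name=resolves_to):
    """Run the checks in order and name the first layer that fails.
    Returning "application" means both lower layers looked fine."""
    if not check_ip(address):
        return "network connectivity"
    if not check_name(name, address):
        return "name resolution"
    return "application"
```

In my case, triage("WPG-SRV2", "172.25.16.12") would presumably have come back with "name resolution," which is exactly where the problem turned out to be.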















