Getting Distributed Cache started on SharePoint 2016 Beta 2

First of all, I’d like to thank mvan for the post located here: http://www.sharepointconsultant.ch/2013/03/07/adding-a-local-sharepoint-2013-development-server-as-a-cache-host-to-appfabrics-cache-cluster/. Please read it before continuing, as the following post assumes you’ve read about mvan’s issue.

The distributed cache would not start

During a recent install of SharePoint 2016 Beta 2, the Distributed Cache would not start and did not show up on the Servers in Farm page (_admin/FarmServers.aspx), even though it had not been accidentally excluded during the Connect-SPConfigurationDatabase action (via -SkipRegisterAsDistributedCacheHost). Hopefully this will not happen in the RTM; it may have been because the farm I built had only one of each of the MinRole servers deployed (e.g. one app, one front end, one search, and one distributed cache). So in case you face this, here’s what I encountered and the steps I took to overcome it:

Couldn’t see cluster

When I tried to run Use-CacheCluster, I got a message about the cluster configuration not being present, even though I was logged on to the correct VM.

Registry modification, rejoin, and resize fixed it

I set the following two registry values at this location: HKLM\SOFTWARE\Microsoft\AppFabric\V1.0\Configuration

The Connection String was empty, so it was modified to:

Data Source=SharePointAlias;Initial Catalog=PRD_SharePoint_ConfigDB;Integrated Security=True;Enlist=False;Pooling=True;Min Pool Size=0;Max Pool Size=100;Connect Timeout=15;Application Name=SharePoint[psconfigui][1][PRD_SharePoint_ConfigDB]

 


Here, “SharePointAlias” is the value used in cliconfg.exe for the SharePoint alias, so that all SharePoint databases can easily be ported from one SQL Server to another. In an environment that is not using a SQL alias on the client, you’d use the name of your SQL cluster, or the server\instance name. “PRD_SharePoint_ConfigDB” is the name of the farm’s configuration database.

The Provider value was also empty, so it was modified to:

SPDistributedCacheClusterProvider
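
As a rough sketch, the same two values could be inspected and set from an elevated PowerShell prompt instead of regedit. The value names ConnectionString and Provider are an assumption based on the labels used above, and the alias and database name are the ones from this post, so check all of them against your own registry and farm before running anything.

```powershell
# Sketch only: the value names "ConnectionString" and "Provider" are assumed
# from the labels used in this post; verify them in regedit first.
$path = 'HKLM:\SOFTWARE\Microsoft\AppFabric\V1.0\Configuration'

# Connection string from this post; substitute your own alias and config DB.
$connectionString = 'Data Source=SharePointAlias;Initial Catalog=PRD_SharePoint_ConfigDB;Integrated Security=True;Enlist=False;Pooling=True;Min Pool Size=0;Max Pool Size=100;Connect Timeout=15;Application Name=SharePoint[psconfigui][1][PRD_SharePoint_ConfigDB]'

# Show what is there now (empty values were the problem described above).
Get-ItemProperty -Path $path | Select-Object ConnectionString, Provider

# Write the two values.
Set-ItemProperty -Path $path -Name 'ConnectionString' -Value $connectionString
Set-ItemProperty -Path $path -Name 'Provider' -Value 'SPDistributedCacheClusterProvider'
```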

The machine was then joined to the farm using this PowerShell one-liner:

Connect-SPConfigurationDatabase -DatabaseName PRD_SharePoint_ConfigDB -DatabaseServer SharePointAlias -Passphrase (ConvertTo-SecureString "SomeSuperHardPassWord" -AsPlainText -Force) -localserverrole distributedCache

 

The cache still did not start, but at least the AppFabric service in Windows Services no longer showed as disabled.


 

Next I attempted to register the cache host and received a message that the host name was already present in the cluster configuration. Yet the cache still would not start, and it showed as down when running

Restart-CacheCluster

followed by

Get-CacheHost

 


Resize it

I ran

Update-SPDistributedCacheSize -CacheSizeInMB 819

 

(the VM had 2 GB of memory, and 819 MB is 40% of that). When I then ran Get-CacheHost, the host was started, and the Servers in Farm page (_admin/FarmServers.aspx) showed that all was well with the world again.
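
The 819 figure is simply 40% of 2048 MB, rounded down. Here is a minimal sketch of that arithmetic, assuming the 40% rule of thumb used in this post. Get-SuggestedCacheSizeMB is a hypothetical helper name, not a SharePoint cmdlet.

```powershell
# Hypothetical helper: returns a cache size in MB as a fraction of installed
# memory. The 0.4 default mirrors the 40% figure used in this post.
function Get-SuggestedCacheSizeMB {
    param(
        [Parameter(Mandatory)] [int] $TotalMemoryMB,
        [double] $Fraction = 0.4
    )
    # Round down so the result never exceeds the chosen fraction.
    return [int][math]::Floor($TotalMemoryMB * $Fraction)
}

Get-SuggestedCacheSizeMB -TotalMemoryMB 2048   # 819
```

The result could then be passed to Update-SPDistributedCacheSize -CacheSizeInMB.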

 

On a separate occasion, the above steps didn’t work and I had to run Remove-CacheHost to unconfigure the host. I then took a look at the cache cluster to confirm that it was initialized.


After that I ran Add-CacheHost, e.g.:

add-cachehost -Provider SPDistributedCacheClusterProvider -ConnectionString "Data Source=SPAlias;Initial Catalog=PRD_SharePoint_ConfigDB;Integrated Security=True;Enlist=False;Pooling=True;Min Pool Size=0;Max Pool Size=100;Connect Timeout=15;Application Name=SharePoint[psconfigui][1][PRD_SharePoint_ConfigDB]" -Account "Tailspintoys\svc_install"

 

and then Register-CacheHost.

What was the reason for that?

 

The connection string had been entered wrong: the data source had a different alias (it had SharePointAlias instead of SPAlias). If you’re not sure where to find the alias, run cliconfg from the Run box, then look at the Alias tab. If no alias is configured there, that is worth looking into, but you don’t need to worry about it for this situation: just go to the Servers in Farm page, get the name of the SQL server, and then log in to it to see whether there is a named instance you need to include.
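
A mismatch like this is easy to miss by eye. As a minimal sketch, the Data Source can be pulled out of a connection string and compared with the alias you expect. Get-ConnectionStringValue is a hypothetical helper, and its naive split on ';' assumes no value contains a quoted semicolon, which holds for the connection strings in this post.

```powershell
# Hypothetical helper: returns the value for one key in a connection string,
# or $null if the key is not present. Key comparison is case-insensitive.
function Get-ConnectionStringValue {
    param(
        [Parameter(Mandatory)] [string] $ConnectionString,
        [Parameter(Mandatory)] [string] $Key
    )
    foreach ($pair in $ConnectionString.Split(';')) {
        # Split on the first '=' only, so values containing '=' survive.
        $parts = $pair.Split('=', 2)
        if ($parts.Length -eq 2 -and $parts[0].Trim() -eq $Key) {
            return $parts[1].Trim()
        }
    }
    return $null
}

# Example values from this post: the string had SharePointAlias,
# while the alias actually configured was SPAlias.
$cs = 'Data Source=SharePointAlias;Initial Catalog=PRD_SharePoint_ConfigDB;Integrated Security=True'
$expectedAlias = 'SPAlias'

$actual = Get-ConnectionStringValue -ConnectionString $cs -Key 'Data Source'
if ($actual -ne $expectedAlias) {
    Write-Warning "Data Source is '$actual' but expected '$expectedAlias'"
}
```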

 

And finally, on yet another round with this service, the connection string had to be modified to:

 

Data Source=SharePointSQL;Initial Catalog=Config;Integrated Security=True;Persist Security Info=False;Enlist=False;Pooling=True;Min Pool Size=0;Max Pool Size=100;Asynchronous Processing=False;Connection Reset=True;MultipleActiveResultSets=False;Replication=False;Connect Timeout=15;Encrypt=False;TrustServerCertificate=False;Load Balance Timeout=0;Packet Size=8000;Type System Version=Latest;Application Name=".Net SqlClient Data Provider";User Instance=False;Context Connection=False;Transaction Binding="Implicit Unbind";ApplicationIntent=ReadWrite;MultiSubnetFailover=False;TransparentNetworkIPResolution=True;ConnectRetryCount=1;ConnectRetryInterval=10;Column Encryption Setting=Disabled


 

Things to try if still unsuccessful:

Make sure Windows Server and SharePoint products have all important and critical patches applied

Make sure the Windows service “Remote Registry” is started

Create a new Distributed Cache server

Check for a possible CU for AppFabric (doubtful that this will help as of the time of this post, since the SharePoint 2016 install is already at AppFabric CU7: https://support.microsoft.com/en-us/kb/3092423)

 

Cheers,