Fixing degraded search index replicas in a ginormous search farm

Credit for this post:  Scott Fawley, SharePoint Guru

Here is a handy script that MS sent while troubleshooting a Search issue in a customer’s 2013 Prod farm.  It shows you the state of all partitions and replicas, along with the generation that each replica is on.  If a replica is on an older generation than the other replicas in its partition, it has not been updated properly from the primary index replica and should be cleaned up (see below for info on doing this).

First create a GetIndexStatus.ps1 using the following code:

# =========================================
param (
    [boolean]$quiet = $false
)
 
$ssaName = $args[0]
$ssa = Get-SPEnterpriseSearchServiceApplication -Identity $ssaName
 
if ($ssa -eq $null)
{
    Write-Host -ForegroundColor Yellow "No valid SSA specified, using default one"
    $ssa = Get-SPEnterpriseSearchServiceApplication
}
 
if ($ssa -eq $null)
{
    Write-Host -ForegroundColor Red "No valid SSA found"
    return
}
 
if ($quiet -eq $false) { Write-Host "Checking"$ssa.Name }
 
$nonActive = Get-SPEnterpriseSearchStatus -SearchApplication $ssa | where {$_.Name.StartsWith("IndexComponent") -and ($_.state -ne "Active")} 
 
if ($quiet -eq $false) {
    if ($nonActive -eq $null) {
        Write-Host -ForegroundColor Green "All indexers are active"
    }
    else {
        Write-Host -ForegroundColor Yellow "Indexers not active:"
        foreach($indexer in $nonActive) {
            Write-Host -ForegroundColor Yellow $indexer.Name"`t" $indexer.state"`t"$indexer.Details
        }
    }
    Write-Host -nonewline "Finding index state "
}
 
$running = Get-SPEnterpriseSearchStatus -SearchApplication $ssa | where {$_.Name.StartsWith("IndexComponent") -and ($_.state -ne "Unknown")}  
$numericDetails = @("Partition")
 
foreach ($indexer in $running) {
    if ($quiet -eq $false) { Write-Host -nonewline "."}
    $status = Get-SPEnterpriseSearchStatus -HealthReport -Component $indexer.Name -SearchApplication $ssa 
    $out = New-Object PSObject
    $out | Add-Member IndexComponent $indexer.Name
    $out | Add-Member State $indexer.State
    foreach($key in $indexer.Details.Keys) {
        if ($numericDetails -contains $key) {
            $value = [long]$indexer.Details.Item($key)
        } else {
            $value = $indexer.Details.Item($key)
        }
        $out | Add-Member $key $value
    }
 
    $stat = ($status | where {$_.Name.StartsWith("plugin: newest generation id")})
    if ($stat -eq $null) { $outtxt = 0 } else { $outtxt = [long]$stat.message }
    $out | Add-Member Generation $outtxt
 
    $stat = ($status | where {$_.Name.StartsWith("plugin: number of documents")})
    if ($stat -eq $null) { $outtxt = 0 } else { $outtxt = [long]($stat.message) }
    $out | Add-Member Items $outtxt
    
    $stat = ($status | where {$_.Name.StartsWith("plugin: size of newest checkpoint")})
    if ($stat -eq $null) { $outtxt = 0 } else { $outtxt = [long]($stat.message) }
    $out | Add-Member CheckpointSize $outtxt
    
    $stat = ($status | where {$_.Name.StartsWith("plugin: initialized")})
    if ($stat -eq $null) { $outtxt = $false } else { $outtxt = ($stat.message.ToLower() -eq 'true') }
    $out | Add-Member Initialized $outtxt
        
    $stat = ($status | where {$_.Name.StartsWith("plugin: master merge running")})
    if ($stat -eq $null) { $outtxt = $false } else { $outtxt = ($stat.message.ToLower() -eq 'true') }
    $out | Add-Member MasterMerge $outtxt
 
    $stat = ($status | where {$_.Name.StartsWith("plugin: unrecoverable error detected")})
    if ($stat -eq $null) { $outtxt = $false } else { $outtxt = ($stat.message.ToLower() -eq 'true') }
    $out | Add-Member Unrecoverable $outtxt
 
    $stat = ($status | where {$_.Name.StartsWith("plugin: shutting down")})
    if ($stat -eq $null) { $outtxt = $false } else { $outtxt = ($stat.message.ToLower() -eq 'true') }
    $out | Add-Member Shutdown $outtxt
    
    $stat = ($status | where {$_.Name.StartsWith("part: number of documents including duplicates")})
    # Numeric field: default to 0 (not $false) when the health report has no entry
    if ($stat -eq $null) { $outtxt = 0 } else { $outtxt = $stat | foreach {[long]$_.Message} | Measure-Object -Sum | foreach {$_.Sum}}
    $out | Add-Member ItemsAllGroups $outtxt
    
    $stat = ($status | where {$_.Name.StartsWith("part: number of exlisted documents")})
    # Numeric field: default to 0 (not $false) when the health report has no entry
    if ($stat -eq $null) { $outtxt = 0 } else { $outtxt = $stat | foreach {[long]$_.Message} | Measure-Object -Sum | foreach {$_.Sum}}
    $out | Add-Member ExcludedAllGroups $outtxt
    
 
    Write-Output $out
}
if ($quiet -eq $false) {Write-Host " Done"}
# =================================================

Then run:

.\GetIndexStatus.ps1 | sort partition | ft Host,State,Partition,catch_up,generation -auto

It should give you an output that looks something like this:

Checking SharePoint Search Service
Indexers not active:
IndexComponent6          Degraded        [catch_up, True] [Partition, 1] [Host, Search-Query1]
Finding index state ........ Done
 
Host          State    Partition catch_up Generation
----          -----    --------- -------- ----------
Search-Query3 Active           0              116983
Search-Query2 Active           0              116983
Search-Query1 Active           0              116983
Search-Query4 Active           0              116983
Search-Query4 Active           1              116983
Search-Query2 Degraded         1 True         115079
Search-Query1 Active           1              116983
Search-Query3 Active           1              116983
 

Notice that the generation value is substantially lower on this one partition replica than on the others.  This was resulting in the customer getting some stale entries in their search results.
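
Spotting the lagging replica by eye gets harder as the farm grows. Here is a minimal sketch of one way to flag it automatically, assuming the input objects carry the Partition and Generation properties that GetIndexStatus.ps1 emits. The function name Find-StaleReplica and its -Tolerance parameter are hypothetical and are not part of the script MS sent.

```powershell
# Hypothetical helper, not part of the MS script: flags replicas whose
# generation trails the newest generation seen in the same partition.
function Find-StaleReplica {
    param (
        [Parameter(Mandatory = $true)]
        [object[]]$Replica,
        # Assumption: how many generations a replica may trail before it is flagged.
        [long]$Tolerance = 0
    )

    foreach ($group in ($Replica | Group-Object -Property Partition)) {
        # The highest generation in this partition is treated as the reference.
        $newest = [long]($group.Group | Measure-Object -Property Generation -Maximum).Maximum
        foreach ($r in $group.Group) {
            $lag = $newest - [long]$r.Generation
            if ($lag -gt $Tolerance) {
                # Emit a copy of the replica object with the lag attached.
                $r | Select-Object *, @{ Name = 'GenerationLag'; Expression = { $lag } }
            }
        }
    }
}

# Usage on a SharePoint server (the script's -quiet parameter takes a Boolean):
# $replicas = .\GetIndexStatus.ps1 -quiet $true
# Find-StaleReplica -Replica $replicas | ft Host,Partition,Generation,GenerationLag -auto
```

A replica that is simply a generation or two behind during an active crawl may just be catching up, so a non-zero -Tolerance is probably the more useful setting on a busy farm.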

 

To clean this up do the following on the server that the replica lives on:

  1. Stop the following services via Services.msc:
     • SharePoint Search Host Controller
     • SharePoint Timer Service
  2. Drill down to the folder where the replica lives.  In this case it was E:\Microsoft Office Servers\15.0\Data\Office Server\Applications\Search\Nodes\7D70D3\IndexComponent6.  Continue to drill down until you see the data folder (E:\Microsoft Office Servers\15.0\Data\Office Server\Applications\Search\Nodes\7D70D3\IndexComponent6\storage\data).
  3. Delete the data folder.
  4. Start the SharePoint Search Host Controller and SharePoint Timer Service services.
  5. It will take some time for search to connect back to the replica and for the replica to be rebuilt, so just let it run until it is back to healthy.
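
If you would rather script the cleanup than click through Services.msc, the same sequence can be sketched in PowerShell. This is a sketch under assumptions, not something MS supplied: it assumes the two services are registered under the Windows service names SPSearchHostController and SPTimerV4 (confirm with Get-Service on your server), and it reuses the example path from above, which will differ on your farm.

```powershell
# Sketch only. Run in an elevated PowerShell session on the server hosting the replica.
# Assumption: these are the service names behind the display names
# "SharePoint Search Host Controller" and "SharePoint Timer Service".
$services = @("SPSearchHostController", "SPTimerV4")

# Example path from this post; substitute the data folder of your own replica.
$dataFolder = "E:\Microsoft Office Servers\15.0\Data\Office Server\Applications\Search\Nodes\7D70D3\IndexComponent6\storage\data"

# Stop both services; abort here if either one fails to stop.
Stop-Service -Name $services -ErrorAction Stop

# Delete the replica's data folder, but only if the path actually exists.
if (Test-Path -LiteralPath $dataFolder) {
    Remove-Item -LiteralPath $dataFolder -Recurse -Force
} else {
    Write-Warning "Data folder not found: $dataFolder"
}

# Start the services again so search can rebuild the replica.
Start-Service -Name $services
```

Adding -WhatIf to the Remove-Item line is a cheap way to do a dry run and double-check the path before anything is deleted.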

 

You may find that there are multiple replicas that are out of sync, so just repeat the above on each of those as well.
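
While the rebuild runs, you can poll the component state instead of rerunning the check by hand. The loop below is a minimal sketch that reuses the same Get-SPEnterpriseSearchStatus filter as GetIndexStatus.ps1. The 60 second interval is an arbitrary choice of mine, and the sketch assumes the farm has a single Search Service Application.

```powershell
# Sketch only: run from the SharePoint 2013 Management Shell.
# Assumption: the farm has a single Search Service Application.
$ssa = Get-SPEnterpriseSearchServiceApplication
$pollSeconds = 60   # Assumption: arbitrary polling interval.

do {
    # Same filter the status script uses to find index components that are not Active.
    $nonActive = @(Get-SPEnterpriseSearchStatus -SearchApplication $ssa |
        where { $_.Name.StartsWith("IndexComponent") -and ($_.State -ne "Active") })

    foreach ($indexer in $nonActive) {
        Write-Host ("{0}  {1}  {2}" -f (Get-Date -Format T), $indexer.Name, $indexer.State)
    }

    if ($nonActive.Count -gt 0) { Start-Sleep -Seconds $pollSeconds }
} while ($nonActive.Count -gt 0)

Write-Host -ForegroundColor Green "All indexers are active"
```

An Active state on its own does not prove the replica has caught up, so once the loop finishes it is worth running GetIndexStatus.ps1 again and comparing the generation values.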

 

Please let me know if you have any questions on the above.