In one of my previous articles I showed how to exclude system pages like AllItems.aspx from search results: Exclude AllItems.aspx from search results in SharePoint 2013. In this post I will show how to create search crawl rules via PowerShell. It is useful when you need to exclude a lot of content from crawling and doing it manually would mean a lot of work (e.g. when you restored a large content database from production but don't need to crawl all of its sites). Here is the script:
# Ensure SharePoint PowerShell Snapin
if ((Get-PSSnapin "Microsoft.SharePoint.PowerShell" -ErrorAction SilentlyContinue) -eq $null)
{
    Add-PSSnapin "Microsoft.SharePoint.PowerShell"
}

[xml]$xmlinput=(Get-Content "CrawlRules.xml")

foreach($WebApplication in $xmlinput.SelectNodes("Build/WebApplication"))
{
    foreach($SearchService in $WebApplication.SelectNodes("SearchService"))
    {
        #Get search service
        $strServiceName=$SearchService.Name;
        $spService=Get-SPEnterpriseSearchServiceApplication -Identity $strServiceName;

        #Clear rules if needed
        $Rules=$SearchService.SelectNodes("Rules");
        $strClearRules=$Rules.ItemOf(0).Clear;
        if ($strClearRules -eq "True")
        {
            $spRules=Get-SPEnterpriseSearchCrawlRule -SearchApplication $spService;
            foreach ($spRule in $spRules)
            {
                if ($spRule -ne $null)
                {
                    Write-Host "Deleting rule:" $spRule.Path -ForegroundColor Yellow
                    $spRule.Delete();
                }
            }
        }

        #Add new rules
        foreach($CrawlRule in $SearchService.SelectNodes("Rules/Rule"))
        {
            $FollowComplexUrls=$false;
            if($CrawlRule.FollowComplexUrls -eq "True")
            {
                $FollowComplexUrls=$true;
            }
            if ($CrawlRule.Type -eq "ExclusionRule")
            {
                #In exclusion rules FollowComplexUrls actually means "Exclude complex URLs"
                $FollowComplexUrls=!$FollowComplexUrls;
                New-SPEnterpriseSearchCrawlRule -Path $CrawlRule.URL -SearchApplication $spService `
                    -Type $CrawlRule.Type -FollowComplexUrls:$FollowComplexUrls
            }
            else
            {
                $CrawlAsHttp=$false;
                if($CrawlRule.CrawlAsHttp -eq "True")
                {
                    $CrawlAsHttp=$true;
                }

                $SuppressIndexing=$false;
                if($CrawlRule.SuppressIndexing -eq "True")
                {
                    $SuppressIndexing=$true;
                }
                New-SPEnterpriseSearchCrawlRule -Path $CrawlRule.URL -SearchApplication $spService `
                    -Type $CrawlRule.Type -FollowComplexUrls:$FollowComplexUrls `
                    -CrawlAsHttp:$CrawlAsHttp -SuppressIndexing:$SuppressIndexing
            }
        }
    }
}
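Once the rules file (described below) is placed next to the script, running it and checking the result could look like this. It is only a sketch: the script file name Create-CrawlRules.ps1 is an assumption, and the service application name must match yours.

# Run the script (Create-CrawlRules.ps1 is an assumed file name)
# from the folder that contains CrawlRules.xml.
.\Create-CrawlRules.ps1

# List the crawl rules of the search service application to verify the result.
$ssa = Get-SPEnterpriseSearchServiceApplication -Identity "Search Service Application"
Get-SPEnterpriseSearchCrawlRule -SearchApplication $ssa |
    Select-Object Path, Type, FollowComplexUrls, CrawlAsHttp, SuppressIndexing |
    Format-Table -AutoSize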
Rules are defined in the CrawlRules.xml file, which has the following structure:
<?xml version="1.0" encoding="utf-8"?>
<Build>
  <WebApplication>
    <SearchService Name="Search Service Application">
      <Rules Clear="True">
        <Rule URL="*://*/_layouts/*" Type="ExclusionRule" FollowComplexUrls="False" />
        <Rule URL="*://*/_catalogs/*" Type="ExclusionRule" />
        <Rule URL="*://*/_vti_bin/*" Type="ExclusionRule" />
        <Rule URL="*://*/forms/AllItems.aspx*" Type="ExclusionRule" />
        <Rule URL="*://*/forms/DispForm.aspx*" Type="ExclusionRule" />
        <Rule URL="*://*/forms/EditForm.aspx*" Type="ExclusionRule" />
        <Rule URL="*://*/forms/NewForm.aspx*" Type="ExclusionRule" />
      </Rules>
    </SearchService>
  </WebApplication>
</Build>
As a result, the script creates exclusion rules for layouts pages, for pages under _catalogs and _vti_bin, and for the list forms AllItems.aspx, DispForm.aspx, EditForm.aspx and NewForm.aspx. If you have many sites that should be excluded, you can generate this XML file programmatically and then pass it to the script above, which saves a lot of manual administrative work.
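A minimal sketch of such a generator is shown below. The $sites list and the output location are assumptions made for the example; in practice the URLs would come from your own source (e.g. an export of the restored content database).

# A minimal sketch that builds CrawlRules.xml for a list of site URLs.
# The $sites list is a hard-coded assumption; replace it with your own source of URLs.
$sites = "http://intranet/sites/archive1", "http://intranet/sites/archive2"

$xml = New-Object System.Xml.XmlDocument
$xml.AppendChild($xml.CreateXmlDeclaration("1.0", "utf-8", $null)) | Out-Null
$build = $xml.AppendChild($xml.CreateElement("Build"))
$webApp = $build.AppendChild($xml.CreateElement("WebApplication"))
$service = $webApp.AppendChild($xml.CreateElement("SearchService"))
$service.SetAttribute("Name", "Search Service Application")
$rules = $service.AppendChild($xml.CreateElement("Rules"))
$rules.SetAttribute("Clear", "True")

# One exclusion rule per site
foreach ($site in $sites)
{
    $rule = $rules.AppendChild($xml.CreateElement("Rule"))
    $rule.SetAttribute("URL", "$site/*")
    $rule.SetAttribute("Type", "ExclusionRule")
}

$xml.Save("$PWD\CrawlRules.xml")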