Saturday, September 14, 2013

Problem with uncrawled managed metadata fields in SharePoint 2013

In this post I would like to describe an interesting problem with the search crawler and managed metadata fields in SharePoint 2013. In some situations you may find that your taxonomy fields are not crawled. Let’s assume that we provisioned a site and created several managed metadata fields using the following declarations:

<!-- Hidden Note field that backs the managed metadata field -->
<Field Type="Note"
  DisplayName="MyManagedMetadataField_0"
  MaxLength="255"
  Group="My Fields"
  ID="{4FCE0732-EF53-4b43-B678-3D2FC28D9A29}"
  StaticName="MyManagedMetadataField_0"
  Name="MyManagedMetadataField_0"
  Hidden="TRUE"
  ShowInViewForms="FALSE"
  Description="" />
<!-- Managed metadata field; its TextField property references the Note field by ID -->
<Field ID="{8D3A1B9D-A50E-47e4-8778-53E8717E4FC6}"
  SourceID="http://schemas.microsoft.com/sharepoint/v3"
  Type="TaxonomyFieldType"
  DisplayName="My Managed Metadata field"
  ShowField="Term1033"
  Required="FALSE"
  EnforceUniqueValues="FALSE"
  Group="My Fields"
  StaticName="MyManagedMetadataField"
  Name="MyManagedMetadataField"
  Hidden="FALSE"
  Mult="TRUE">
  <Default></Default>
  <Customization>
    <ArrayOfProperty>
      <Property>
        <Name>IsPathRendered</Name>
        <Value xmlns:q7="http://www.w3.org/2001/XMLSchema" p4:type="q7:boolean" xmlns:p4="http://www.w3.org/2001/XMLSchema-instance">true</Value>
      </Property>
      <Property>
        <Name>TextField</Name>
        <Value xmlns:q6="http://www.w3.org/2001/XMLSchema" p4:type="q6:string" xmlns:p4="http://www.w3.org/2001/XMLSchema-instance">{4FCE0732-EF53-4b43-B678-3D2FC28D9A29}</Value>
      </Property>
    </ArrayOfProperty>
  </Customization>
</Field>

Here, as always, we provision 2 fields: a hidden Note field and the managed metadata field itself, which references the Note field. After that we create some content in a document library or list using a content type which contains this field (this is an important step: without at least 1 item with non-empty values in the managed metadata fields, crawled and managed properties for them won’t be created) and run a full crawl of our site. During the crawl, if everything goes properly, SharePoint creates 2 crawled properties and 1 managed property for each managed metadata field (it actually creates them for other field types as well, but in this article we are talking only about managed metadata):

image

As shown in the picture above, the following crawled and managed properties are created automatically:

Crawled property          | Mapped managed property
ows_{field name}          | (none)
ows_taxId_{field name}    | owstaxId{field name}

One crawled property (ows_MyManagedMetadataField) is initially (after the first full crawl) not mapped to any managed property; the other (ows_taxId_MyManagedMetadataField) is mapped to a managed property (owstaxIdMyManagedMetadataField), which is also created automatically during the crawl. After that you should create a 2nd managed property and map it to the unmapped crawled property. In all your queries, Content by Search web parts, search result sources, display templates, etc. you should use this second, manually mapped managed property, not the one which was created automatically (for the manually created managed property you may set properties like Searchable, Queryable, etc. as you need, while for the automatically created property SharePoint sets them and it is better not to change them). After that, run a full crawl again.

This is how the search schema should look if everything went correctly. However, if you provisioned managed metadata fields using the code shown above, you will have the following problem: after crawling, either no crawled properties will be created at all, or only 1 crawled property will be created, the one without a mapping to a managed property. This fact in itself is not critical. The problem, however, is that your KQL queries which filter data based on managed metadata fields won’t return any data. This indicates that something went wrong during crawling.

The problem is caused by the way the hidden Note field for the managed metadata field is provisioned. As you can see above, it has the name MyManagedMetadataField_0, i.e. it uses the format {managed metadata field name}_0. But as it turned out, in order for managed metadata fields to be crawled correctly, the Note field should use another format:

{managed metadata field name}TaxHTField

i.e. it should be named MyManagedMetadataFieldTaxHTField:

<Field Type="Note"
  DisplayName="MyManagedMetadataFieldTaxHTField"
  MaxLength="255"
  Group="My Fields"
  ID="{4FCE0732-EF53-4b43-B678-3D2FC28D9A29}"
  StaticName="MyManagedMetadataFieldTaxHTField"
  Name="MyManagedMetadataFieldTaxHTField"
  Hidden="TRUE"
  ShowInViewForms="FALSE"
  Description="" />

This is not a documented fact, but this is how SharePoint provisions its own managed metadata fields (e.g. Enterprise Keywords). If you search the SharePoint search assemblies for “TaxHTField” in Reflector, you will see usages of this suffix in the code, i.e. the search crawler really depends on it. This is the only change needed in the above example to make MyManagedMetadataField properly crawlable. Now you know how to properly provision managed metadata fields declaratively :).
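
The naming rule above can be checked mechanically before deployment. The following is a minimal Python sketch, not part of any SharePoint tooling: the function name `find_misnamed_note_fields` is hypothetical, and it assumes the field declarations are wrapped in a single root element without a default XML namespace (a real Elements.xml usually declares one, which would need extra handling).

```python
import xml.etree.ElementTree as ET

SUFFIX = "TaxHTField"  # suffix the post found the crawler relies on


def find_misnamed_note_fields(fields_xml):
    """Return (taxonomy field, actual Note name, expected Note name) tuples.

    Assumption: fields_xml has one root element wrapping <Field> elements
    and uses no default XML namespace.
    """
    root = ET.fromstring(fields_xml)
    fields = list(root.iter("Field"))
    # Index all fields by normalized GUID (no braces, lowercase).
    by_id = {f.get("ID", "").strip("{}").lower(): f for f in fields}

    problems = []
    for field in fields:
        # Covers TaxonomyFieldType and TaxonomyFieldTypeMulti.
        if not field.get("Type", "").startswith("TaxonomyFieldType"):
            continue
        # The TextField property holds the ID of the hidden Note field.
        note_id = None
        for prop in field.iter("Property"):
            if prop.findtext("Name", "").strip() == "TextField":
                note_id = prop.findtext("Value", "").strip().strip("{}").lower()
        note = by_id.get(note_id)
        if note is None:
            continue  # Note field is declared elsewhere; nothing to compare
        expected = field.get("Name", "") + SUFFIX
        if note.get("Name") != expected:
            problems.append((field.get("Name"), note.get("Name"), expected))
    return problems
```

Running it over the first pair of declarations in this post would report the `_0` Note field together with the expected `MyManagedMetadataFieldTaxHTField` name; an empty list means every Note field that could be resolved follows the convention.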

8 comments:

  1. This is very strange behavior indeed! Thanks - I tried to create a result type to display a custom icon based off a managed metadata column I created through the term store (not in code). Obviously not knowing what you described above, I mapped the unmapped crawled property to the taxID managed property and could not, for the life of me, get the condition in the result type to work! I read this blog, and removed the 2nd mapping and created my own managed property and lo and behold - it was able to filter. What a nightmare :)

  2. one of the cases when only experience matters.

  3. Thanks for sharing! I was really stuck with the indexing of MMS fields and your post just saved me time.

  4. Hey Alex,
    This didn't seem to work for us, and I can see that this approach can lead to other issues..

    Note that internal field names should not go over 32 characters, as this is a limit in SharePoint for a number of things (e.g. view field names). By creating a field with the name "{managed metadata field name}TaxHTField", that 32-character limit might be hit..

    Note however that we were able to fix this issue by using an alternate naming convention (which SharePoint OOTB also uses) for the Note field:
    "i{GUID without dashes and first character}", where GUID is the managed metadata field ID.

    This naming convention will always have exactly 32 characters, which does not break any SharePoint limits.

    And it works well! :)

    Cheers,
    João Lopes

    Replies
    1. One correction to the above.. The [NoteField].InternalName should be almost the same as the [CustomTaxonomyField].Id without the dashes and brackets (-{}), and the first character of the [NoteField].InternalName has to be a letter; however that letter is not always "i" (as mentioned above).

      If the first Guid character of [CustomTaxonomyField].Id is between [a-f], the [NoteField].InternalName should have the same first character of that Guid; however if the first character of the Guid is between [0-9], then the first letter of the Note field should be one of the next available letters after "f".

      Example:
      - If you declare a TaxonomyFieldType field with Id="{d0e43df7-450f-40cf-ab24-80e7dcdb4a5e}", then the corresponding Note field InternalName and StaticName should be InternalName="d0e43df7450f40cfab2480e7dcdb4a5e".
      - If Guid is "{006b136a-18cf-4fae-945d-312d19bcbad2}" the InternalName="g06b136a18cf4fae945d312d19bcbad2";
      - If Guid is "{1f773c91-3376-4510-b197-c66be07c2b82}" the InternalName="hf773c9133764510b197c66be07c2b82"
      - etc.. (0:g, 1:h, 2:i, 3:j, 4:k, 5:l, 6:m, 7:n, 8:o, 9:p)
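
The mapping described in this reply can be sketched in Python as follows. This is only an illustration of the rule as stated in the comment: the function name `note_field_internal_name` is hypothetical, and lowercasing the GUID is an assumption based on the lowercase examples above.

```python
import uuid

# Replacement letters for a leading digit, as listed in the reply:
# 0:g, 1:h, 2:i, 3:j, 4:k, 5:l, 6:m, 7:n, 8:o, 9:p
DIGIT_TO_LETTER = "ghijklmnop"


def note_field_internal_name(taxonomy_field_id):
    """Derive the Note field name from the taxonomy field's GUID.

    Assumption: the result is lowercase, as in the examples in the reply.
    """
    # uuid.UUID accepts braces and dashes; .hex is 32 lowercase hex chars.
    hex32 = uuid.UUID(taxonomy_field_id).hex
    first = hex32[0]
    if first.isdigit():
        # A name must start with a letter, so a leading digit is swapped.
        first = DIGIT_TO_LETTER[int(first)]
    return first + hex32[1:]
```

The result is always exactly 32 characters long, which is the point of this convention.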

    2. interesting, thank you for sharing!

  5. João hi,
    thanks for sharing your solution. However, the approach with {...}TaxHTField is also used by SharePoint. It may be that it uses both approaches. What worries me about the second approach is that the "TaxHTField" constant is used in the SharePoint assembly, i.e. at least in some places it assumes that the hidden field is named like that.
    About view fields: we are talking here about a hidden system field, which most probably won't be used in list views, i.e. this is not a big problem. However, it is a notable limitation.
