Monday, February 28, 2011

Problem with SPSiteDataQuery and TaxonomyFieldTypeMulti

Recently I encountered with problem in Sharepoint with looks like a bug: SPSiteDataQuery doesn’t work properly with TaxonomyFieldTypeMulti fields, i.e. with taxonomy fields which allow multiple values. As you probably know SPSiteDataQuery class can be used for retrieving data from Sharepoint lists and document libraries from several sites at once (you can retrieve data from one web site, from web sites and all of its subsites and from whole site collection). It gives you very good performance so it is widely used in cross site queries scenarios.

In order to retrieve all documents from site collection you can use the following code (in my examples I will use Camlex.NET library):

   1: var q = new SPSiteDataQuery();
   2: q.ViewFields = Camlex.Query().ViewFields(x => x["FileRef"]);
   3: q.Webs = "<Webs Scope='SiteCollection' />";
   4: q.Lists = "<Lists BaseType='1' />";
   5: q.Query = Camlex.Query().OrderBy(x => x["FileRef"]).ToString();
   6: var data = web.GetSiteData(q);

Which is equivalent of the following:

   1: var q = new SPSiteDataQuery();
   2: q.ViewFields = "<FieldRef Name=\"FileRef\" />";
   3: q.Webs = "<Webs Scope='SiteCollection' />";
   4: q.Lists = "<Lists BaseType='1' />";
   5: q.Query = "<OrderBy>" +
   6:               "<FieldRef Name=\"FileRef\" />" +
   7:           "</OrderBy>";
   8: var data = web.GetSiteData(q);

Now suppose that we have custom Taxonomy field “Language” which allows multiple values:

   1: <Field Type="Note"
   2:   DisplayName="Language_0"
   3:   MaxLength="255"
   4:   Group="Custom"
   5:   ID="{7211375B-BC77-49F2-9B3F-1835DD479502}"
   6:   StaticName="Language_0"
   7:   Name="Language_0"
   8:   Hidden="TRUE"
   9:   ShowInViewForms="FALSE"
  10:   Description="" />
  11: <Field ID="{0E70EA75-C6FA-4549-832D-F2CDBE8DB1FF}"
  12:   Type="TaxonomyFieldType"
  13:   DisplayName="Language"
  14:   ShowField="Term1033"
  15:   Required="TRUE"
  16:   EnforceUniqueValues="FALSE"
  17:   Group="Custom"
  18:   StaticName="Language"
  19:   Name="Language"
  20:   Hidden="FALSE"
  21:   Mult="TRUE">
  22:   <Default></Default>
  23:   <Customization>
  24:     <ArrayOfProperty>
  25:       <Property>
  26:         <Name>IsPathRendered</Name>
  27:         <Value xmlns:q7="" p4:type="q7:boolean" xmlns:p4="">
  28:           false
  29:         </Value>
  30:       </Property>
  31:       <Property>
  32:         <Name>TextField</Name>
  33:         <Value xmlns:q6="" p4:type="q6:string" xmlns:p4="">
  34:           {7211375B-BC77-49F2-9B3F-1835DD479502}
  35:         </Value>
  36:       </Property>
  37:     </ArrayOfProperty>
  38:   </Customization>
  39: </Field>

I added here also supplementary field of Note type as each taxonomy field should be defined with such fields. Suppose that we added this field to OTB Document content type. Our field is required so we need to specify value for each uploaded document now. Now we need to modify our cross site CAML query showed above in order to retrieve also values of Language field of each document. The first solution which you can think about may be to add additional column in ViewFields property:

   1: q.ViewFields = Camlex.Query().ViewFields(x => new object[] { x["FileRef"], x["Language"] });

or its equivalent:

   1: q.ViewFields = "<FieldRef Name=\"FileRef\" /><FieldRef Name=\"Language\" />";

All other code is the same. I was surprised when this code didn’t return to me any items. After some investigation I found that ViewFields may affect number of returned items from SPSiteDataQuery (see SPSiteDataQuery.ViewFields):

By default, if a list included in the query does not contain one of the fields specified in the ViewFields tag, no items from that list appear in the results. To return an empty value for a field on items in lists that do not contain that field, set the Nullable attribute to TRUE on that FieldRef tag.

So I tried to add Nullable=“TRUE” attribute to the FieldRef for Language field:

   1: q.ViewFields = "<FieldRef Name=\"FileRef\" /><FieldRef Name=\"Language\" Nullable=\"TRUE\" />";

After that code returned the same amount of documents as before. It means that SPSiteDataQuery always treats fields of type TaxonomyFieldTypeMulti as empty fields (remember that Language field is mandatory and all documents have some values in it). It looks like a bug for me.

As workaround I used regular SPQuery class and make a loop through all sites in site collections manually using recursive function call. Surprisingly that SPQuery works with TaxonomyFieldTypeMulti properly, e.g. if some document has 3 values in Language fields SPQuery will return the following values in this field:

   1: English|84325810-eb21-4a5b-b9ee-533726450674;Finnish|239d8804-b7d7-4193-8bc1-5fdf9f98
   2: 8f7c;Russian|547e35b7-4d8d-4bb0-904d-93e4be9386f7

I found information in internet about problems with SPSiteDataQuery and LookupMulti fields, but not with Taxonomy fields, so probably this information will be helpful.

Update 2011-04-20: colleague snaggywolf shared solution: TaxonomyFieldTypeMulti and SPSiteDataQuery work well if in ViewFields you will use hidden text field instead of taxonomy field. Although I didn’t test it by myself, it can be useful to mention.


  1. Excellent post! For those interested in the details of the workaround snaggywolf mentions:

    1) Get the Hidden Textfield of the TaxonomyFieldMulti
    2) Do a SiteDataQuery using that field

    public static string GetTaxonomyTypeHiddenField(string siteUrl, string aFieldName)
    SPField taxonomyHiddenField = null;
    string aTaxHiddenFieldName = "";

    using (SPSite site = new SPSite(siteUrl))
    using (SPWeb web = site.OpenWeb())
    var taxonomyField = (TaxonomyField)site.RootWeb.Fields.TryGetFieldByStaticName(aFieldName);

    if (taxonomyField != null)
    taxonomyHiddenField = site.RootWeb.Fields[taxonomyField.TextField];
    aTaxHiddenFieldName = taxonomyHiddenField.InternalName;

    catch (Exception ex)
    return null;

    return aTaxHiddenFieldName;

    //launch one request on the site collection and all sites, searching Pages lib which has "TagColumnName" column
    protected DataTable LaunchQuery(int listType)
    string lists = String.Format("", listType);
    string viewFields = String.Format("", InternalNameNotesField);
    string webs = "";
    CrossListQueryInfo query = new CrossListQueryInfo();
    query.RowLimit = 10000;
    query.WebUrl = SPContext.Current.Site.ServerRelativeUrl;
    query.Lists = lists;
    query.Webs = webs;
    query.Query = String.Format("", InternalNameNotesField);
    query.ViewFields = viewFields;
    query.UseCache = true; //use the cache for optimize results

    CrossListQueryCache cache = new CrossListQueryCache(query);

    return cache.GetSiteData(SPContext.Current.Site);

  2. Thanks for the post. Can you please elaborate on how to use taxanomyHiddenField Internal name while querying?

  3. Priyanks,
    at first you need to add it to ViewFields:
    q.ViewFields = "[FieldRef ID=\"...\" Nullable=\"TRUE\" /]" + ...;

    then use simple Contains condition:

    [FieldRef ID="... /]
    [Value Type="Text"]Test|[/Value]

    (In the examples above I used [] instead of tags angle brackets because blogger won't show them otherwise)
    Note about trailing "|" symbol in Value tag. You need to add it because Note field contains value in the following format: Name of term|GUID of term.

  4. Hi Alex...Thanks for the response. It's still not returning any result. I am trying to traverse site collections, looking for document libraries with a certain content type. I am trying to display all of the documents in a grid which matches certain condition. I was using SPSiteDataQuery for doing the same. But my problem is that the columns which are managed metadata are not showing any items.

  5. well, I used it by myself in the post project and can confirm that it works. Try to simplify your query - just ensure that it works with simple condition e.g. on Title field. Then add reference to hidden managed metadata note field to the ViewFields and to the query as I showed above.
    Notice that you should use Note field in your code instead of managed metadata field (e.g. for binding to the grid column), because real managed metadata field will be empty.

  6. Thanks Alex. I will try that again. By Note field, do you mean aTaxHiddenFieldName as shown in the above example. Just wondering, what value shall I use in the query? I want to get all of the documents. In the above example, you are using 'Test'.

  7. yes, in the example above I used value "Test|" (see my comment above why do you need extra "|" symbol). This query should return all documents which are tagged with "Test" term.

  8. Hi Alex,

    Just a note that this also affects ContentByQyeryWebPart, as to my understading it uses SPSiteDataQuery internally.

    In one of my projects, I had to query for a field of type TaxonomyFieldTypeMulti, and using the managed metadata field directly in the element resulted in no items being returned.

    I was able to work around the issue by querying for the corresponding taxonomy hidden field instead of the actual managed metadata field, and then parsing the returned value in my custom xsl.

  9. Hi Jarno,
    thanks for comment, it is useful to know that the same problem exists in CQWP.

  10. You should use TermGuid if you're comparing using CONTAINS, otherwise you'll end up having wrong items if you have many terms with same label.