Recently I encountered with problem in Sharepoint with looks like a bug: SPSiteDataQuery doesn’t work properly with TaxonomyFieldTypeMulti fields, i.e. with taxonomy fields which allow multiple values. As you probably know SPSiteDataQuery class can be used for retrieving data from Sharepoint lists and document libraries from several sites at once (you can retrieve data from one web site, from web sites and all of its subsites and from whole site collection). It gives you very good performance so it is widely used in cross site queries scenarios.
In order to retrieve all documents from site collection you can use the following code (in my examples I will use Camlex.NET library):
1: var q = new SPSiteDataQuery();
2: q.ViewFields = Camlex.Query().ViewFields(x => x["FileRef"]);
3: q.Webs = "<Webs Scope='SiteCollection' />";
4: q.Lists = "<Lists BaseType='1' />";
5: q.Query = Camlex.Query().OrderBy(x => x["FileRef"]).ToString();
6: var data = web.GetSiteData(q);
Which is equivalent of the following:
1: var q = new SPSiteDataQuery();
2: q.ViewFields = "<FieldRef Name=\"FileRef\" />";
3: q.Webs = "<Webs Scope='SiteCollection' />";
4: q.Lists = "<Lists BaseType='1' />";
5: q.Query = "<OrderBy>" +
6: "<FieldRef Name=\"FileRef\" />" +
7: "</OrderBy>";
8: var data = web.GetSiteData(q);
Now suppose that we have custom Taxonomy field “Language” which allows multiple values:
1: <Field Type="Note"
2: DisplayName="Language_0"
3: MaxLength="255"
4: Group="Custom"
5: ID="{7211375B-BC77-49F2-9B3F-1835DD479502}"
6: StaticName="Language_0"
7: Name="Language_0"
8: Hidden="TRUE"
9: ShowInViewForms="FALSE"
10: Description="" />
11: <Field ID="{0E70EA75-C6FA-4549-832D-F2CDBE8DB1FF}"
12: Type="TaxonomyFieldType"
13: DisplayName="Language"
14: ShowField="Term1033"
15: Required="TRUE"
16: EnforceUniqueValues="FALSE"
17: Group="Custom"
18: StaticName="Language"
19: Name="Language"
20: Hidden="FALSE"
21: Mult="TRUE">
22: <Default></Default>
23: <Customization>
24: <ArrayOfProperty>
25: <Property>
26: <Name>IsPathRendered</Name>
27: <Value xmlns:q7="http://www.w3.org/2001/XMLSchema" p4:type="q7:boolean" xmlns:p4="http://www.w3.org/2001/XMLSchema-instance">
28: false
29: </Value>
30: </Property>
31: <Property>
32: <Name>TextField</Name>
33: <Value xmlns:q6="http://www.w3.org/2001/XMLSchema" p4:type="q6:string" xmlns:p4="http://www.w3.org/2001/XMLSchema-instance">
34: {7211375B-BC77-49F2-9B3F-1835DD479502}
35: </Value>
36: </Property>
37: </ArrayOfProperty>
38: </Customization>
39: </Field>
I added here also supplementary field of Note type as each taxonomy field should be defined with such fields. Suppose that we added this field to OTB Document content type. Our field is required so we need to specify value for each uploaded document now. Now we need to modify our cross site CAML query showed above in order to retrieve also values of Language field of each document. The first solution which you can think about may be to add additional column in ViewFields property:
1: q.ViewFields = Camlex.Query().ViewFields(x => new object[] { x["FileRef"], x["Language"] });
or its equivalent:
1: q.ViewFields = "<FieldRef Name=\"FileRef\" /><FieldRef Name=\"Language\" />";
All other code is the same. I was surprised when this code didn’t return to me any items. After some investigation I found that ViewFields may affect number of returned items from SPSiteDataQuery (see SPSiteDataQuery.ViewFields):
By default, if a list included in the query does not contain one of the fields specified in the ViewFields tag, no items from that list appear in the results. To return an empty value for a field on items in lists that do not contain that field, set the Nullable attribute to TRUE on that FieldRef tag.
So I tried to add Nullable=“TRUE” attribute to the FieldRef for Language field:
1: q.ViewFields = "<FieldRef Name=\"FileRef\" /><FieldRef Name=\"Language\" Nullable=\"TRUE\" />";
After that code returned the same amount of documents as before. It means that SPSiteDataQuery always treats fields of type TaxonomyFieldTypeMulti as empty fields (remember that Language field is mandatory and all documents have some values in it). It looks like a bug for me.
As workaround I used regular SPQuery class and make a loop through all sites in site collections manually using recursive function call. Surprisingly that SPQuery works with TaxonomyFieldTypeMulti properly, e.g. if some document has 3 values in Language fields SPQuery will return the following values in this field:
1: English|84325810-eb21-4a5b-b9ee-533726450674;Finnish|239d8804-b7d7-4193-8bc1-5fdf9f98
2: 8f7c;Russian|547e35b7-4d8d-4bb0-904d-93e4be9386f7
I found information in internet about problems with SPSiteDataQuery and LookupMulti fields, but not with Taxonomy fields, so probably this information will be helpful.
Update 2011-04-20: colleague snaggywolf shared solution: TaxonomyFieldTypeMulti and SPSiteDataQuery work well if in ViewFields you will use hidden text field instead of taxonomy field. Although I didn’t test it by myself, it can be useful to mention.