Tuesday, June 7, 2011

ContentIterator for SharePoint Server

In this post I would like to describe a useful class that can help in your everyday SharePoint development: ContentIterator. It is defined in the Microsoft.Office.Server.dll assembly (namespace Microsoft.Office.Server.Utilities) and addresses common tasks such as iterating through list items (SPListItem), files (SPFile), sites (SPWeb), etc. The nice thing is that it is implemented on top of SPQuery and retrieves items in batches, so it performs well on big lists. Also, as the documentation says:

SharePoint Server provides a new API, ContentIterator, to help with accessing more than 5,000 items in a large list without hitting a list throttling limit and receiving an SPQueryThrottleException

For example, suppose that we have a reference to an SPFolder and want to iterate through all files in that folder. We can use the ContentIterator.ProcessFilesInFolder method:

    SPFolder folder = web.GetFolder(...);
    SPList doclib = web.Lists[...];
    ContentIterator contentIterator = new ContentIterator();

    bool isFound = false;
    contentIterator.ProcessFilesInFolder(doclib, folder, false,
        f =>
            {
                // this is the item iteration handler; f is the current SPFile
                ...
            },
        (f, e) =>
            {
                // error handler; e is the exception thrown while processing f
                ...
            });

Here we pass the document library and the folder in which we want to iterate. The third parameter specifies whether the iteration should be recursive. The fourth parameter is the function that receives each SPFile instance as it is iterated. The last parameter is the error handler.
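To make the two handlers more concrete, here is a minimal sketch of how they might be filled in. The FolderScanner class, the ContainsFile method and the fileName parameter are hypothetical names introduced only for illustration. The sketch also assumes that the file error handler returns a bool with the same meaning as in the list item variant shown later in this post (true = rethrow, false = continue), which I have not verified for this overload.

```csharp
using System;
using Microsoft.SharePoint;
using Microsoft.Office.Server.Utilities;

// Hypothetical helper, not part of SharePoint.
public static class FolderScanner
{
    // Returns true if the folder directly contains a file with the given name.
    public static bool ContainsFile(SPList docLib, SPFolder folder, string fileName)
    {
        bool isFound = false;
        ContentIterator iterator = new ContentIterator();

        iterator.ProcessFilesInFolder(docLib, folder, false,
            file =>
            {
                // Called once per SPFile in the folder.
                if (string.Equals(file.Name, fileName, StringComparison.OrdinalIgnoreCase))
                {
                    isFound = true;
                }
            },
            (file, ex) =>
            {
                // Assumption: false means "swallow the error and continue",
                // true means "rethrow".
                Console.Error.WriteLine("Failed to process a file: " + ex.Message);
                return false;
            });

        return isFound;
    }
}
```

Note that this sketch keeps iterating after a match is found; it only records the result in the captured isFound variable.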

Let's see how it is implemented. Under the hood it calls another method, ContentIterator.ProcessListItems. First it constructs an SPQuery object:

    public void ProcessListItems(SPList list, string strQuery, uint rowLimit,
        bool fRecursive, SPFolder folder, ItemsProcessor itemsProcessor,
        ItemsProcessorErrorCallout errorCallout)
    {
        ...
        SPQuery query = new SPQuery();
        if (!string.IsNullOrEmpty(strQuery))
        {
            query.Query = strQuery;
        }
        query.RowLimit = rowLimit;
        if (folder != null)
        {
            query.Folder = folder;
        }
        if (fRecursive)
        {
            query.ViewAttributes = "Scope=\"RecursiveAll\"";
        }
        this.ProcessListItems(list, query, itemsProcessor, errorCallout);
    }

The strQuery string used here is computed higher up the call stack, from the following property:

    public static string ItemEnumerationOrderByPath
    {
        get
        {
            return "<OrderBy Override='TRUE'><FieldRef Name='FileDirRef' /><FieldRef Name='FileLeafRef' /></OrderBy>";
        }
    }

The most interesting part is the implementation of the ProcessListItems overload that takes an SPQuery (decompiled code, lightly cleaned up):

    public void ProcessListItems(SPList list, SPQuery query, ItemsProcessor itemsProcessor, ItemsProcessorErrorCallout errorCallout)
    {
        string str2;
        SPListItemCollection items;
        ...
        // Nothing to do for an empty (non-external) list.
        if (!list.HasExternalDataSource && (list.ItemCount == 0))
        {
            return;
        }
        if (list.HasExternalDataSource && (query.RowLimit == 0))
        {
            query.RowLimit = int.MaxValue;
        }
        else if ((query.RowLimit == 0) || (query.RowLimit == int.MaxValue))
        {
            // No explicit limit: 200 items per batch, or 2000 when ViewFields is set.
            query.RowLimit = string.IsNullOrEmpty(query.ViewFields) ? 200u : 2000u;
        }
        if (!list.HasExternalDataSource && this.StrictQuerySemantics)
        {
            query.QueryThrottleMode = SPQueryThrottleOption.Strict;
        }
        string strListId = list.ID.ToString("B");
        // Pick up a previously saved paging position, if there is one.
        this.ResumeProcessListItemsBatch(strListId, out str2);
        if (!string.IsNullOrEmpty(str2))
        {
            query.ListItemCollectionPosition = new SPListItemCollectionPosition(str2);
        }
        int batchNo = 0;
        do
        {
            // Fetch the next batch of at most RowLimit items.
            items = list.GetItems(query);
            int count = items.Count;
            batchNo++;
            try
            {
                itemsProcessor(items);
                this.OnProcessedListItemsBatch(strListId, items, batchNo, count);
            }
            catch (Exception exception)
            {
                // Rethrow if there is no error handler or the handler returns true.
                if ((errorCallout == null) || errorCallout(items, exception))
                {
                    throw;
                }
            }
            if (this.ShouldCancel(IterationGranularity.Item))
            {
                break;
            }
            // Move to the next page; null means there are no more items.
            query.ListItemCollectionPosition = items.ListItemCollectionPosition;
        }
        while (query.ListItemCollectionPosition != null);
    }

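If the caller did not specify a row limit, it sets one: 200, or 2000 when the ViewFields property is specified. Then, if StrictQuerySemantics is enabled and the list is not backed by an external data source, it sets the SPQuery.QueryThrottleMode property to Strict. According to the documentation this means: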

Throttling for both the number of items and for number of Lookup, Person/Group, and Workflow Status fields will apply to the query regardless of user permissions.

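It also uses the SPQuery.ListItemCollectionPosition property to retrieve items in batches, with RowLimit as the number of items per batch. After each batch it copies the position returned with the item collection back into the query and repeats until that position is null. If the items processor throws, the exception is rethrown when there is no error handler or when the error handler returns true; otherwise iteration continues with the next batch.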
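As an illustration of the batching described above, here is a minimal sketch that calls the ProcessListItems overload taking an SPQuery and counts items batch by batch. The ListCounter class, the CountItems method and the batch size of 500 are hypothetical and chosen only for the example; the overload itself and the ItemEnumerationOrderByPath property are the ones shown in this post.

```csharp
using System;
using Microsoft.SharePoint;
using Microsoft.Office.Server.Utilities;

// Hypothetical helper, not part of SharePoint.
public static class ListCounter
{
    // Counts all items in the list, one batch at a time.
    public static int CountItems(SPList list)
    {
        int processed = 0;

        SPQuery query = new SPQuery();
        query.Query = ContentIterator.ItemEnumerationOrderByPath;
        query.RowLimit = 500; // hypothetical batch size

        ContentIterator iterator = new ContentIterator();
        iterator.ProcessListItems(list, query,
            (SPListItemCollection items) =>
            {
                // Called once per batch of at most RowLimit items.
                processed += items.Count;
            },
            (SPListItemCollection items, Exception ex) =>
            {
                // true = rethrow the exception and stop iterating.
                return true;
            });

        return processed;
    }
}
```

The lambda parameters are typed explicitly so that the compiler picks the batch-level (SPListItemCollection) delegates rather than a per-item overload.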

As you can see, ContentIterator does a lot of hidden infrastructure work for you, so it can save you time and let you concentrate on the business tasks.
