Thursday, March 28, 2013

Drawbacks of using the search-driven approach in SharePoint 2013

At SPC 2012, where SharePoint 2013 was officially presented to the public (although it had been released a few weeks before), one of the questions raised was: why not use the search-driven approach? The idea of this approach is quite simple: there is an authoring site where content owners create content. This site is not available to regular visitors. Content from the authoring site is crawled by the search crawler and stored in the search index. There is also a publishing site, where content producers define how content from the authoring site should look. For that they use the new standard Content by search web parts, catalogs, cross-site publishing, and Catalog item reuse web parts. In the web parts, designers define display templates for the content. This concept is shown in the following picture from TechNet:

[Image: the cross-site publishing model with authoring and publishing sites, from TechNet]

In conjunction with the new continuous crawl feature of SharePoint search, which decreases the gap between the moment content is published and the moment it becomes available in the search index (and thus shown to end users), this technique is quite powerful. It also gives performance advantages compared with Content by query web parts: in the latter case, queries to the content database are performed synchronously when a user loads the page with the web part, which consumes server resources, while in the search-driven approach queries go to the search index, which already contains the crawled content. (Queries to the search index are not a panacea either: in SharePoint 2010, queries by all fields except the standard rank property could be very slow, because at the search database level an index was built only for the rank property. I still need to verify how this works with search in SharePoint 2013.) Crawling itself is often performed on a separate application server with a dedicated WFE which is not used by the load balancer, i.e. which doesn't process user requests.
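To make this difference concrete, here is a minimal sketch in the server object model comparing the two query paths. The site URLs, list name and query text are made up for illustration; only the SPQuery and KeywordQuery/SearchExecutor APIs themselves are standard:

using System;
using System.Data;
using Microsoft.SharePoint;
using Microsoft.Office.Server.Search.Query;

class QueryPathsDemo
{
    static void Main()
    {
        using (SPSite site = new SPSite("http://publishing"))  // hypothetical site URL
        using (SPWeb web = site.OpenWeb())
        {
            // Content by query style: a synchronous query against the content database,
            // executed while the user's page request is being processed
            SPQuery caml = new SPQuery();
            caml.Query = "<Where><Eq><FieldRef Name='ContentType'/>" +
                         "<Value Type='Computed'>Article Page</Value></Eq></Where>";
            caml.RowLimit = 10;
            SPListItemCollection items = web.Lists["Pages"].GetItems(caml);
            Console.WriteLine("Content database returned {0} items", items.Count);

            // Search-driven style: the same kind of request served from the search index,
            // which already contains the crawled content of the authoring site
            KeywordQuery query = new KeywordQuery(site);
            query.QueryText = "ContentType:\"Article Page\" path:\"http://authoring\"";
            query.RowLimit = 10;
            ResultTableCollection results = new SearchExecutor().ExecuteQuery(query);
            foreach (ResultTable table in results.Filter("TableType",
                KnownTableTypes.RelevantResults))
            {
                DataTable dt = new DataTable();
                dt.Load(table, LoadOption.OverwriteChanges);  // ResultTable implements IDataReader
                Console.WriteLine("Search index returned {0} rows", dt.Rows.Count);
            }
        }
    }
}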

The advantages are good, but are there any problems with this approach? By now we already have experience of using SharePoint 2013 in real projects, and I can share some information about the drawbacks of the search-driven approach.

The first, classic problem is delay: even with continuous crawl there is no guarantee that data will be crawled immediately. For some business cases this is a crucial requirement, e.g. when a company publishes an annual report on its public site and it should become available at exactly the same time it is published in other media.

The second problem affects the general content creation model. With the search-driven approach content owners can't use web parts on the authoring site. Or, to be more precise: they can use them, but the content inside them won't be shown on the publishing site. The crawler will crawl the content fields of the pages on the authoring site, and these fields will appear in the crawled properties list, which can then be mapped to managed properties. On the publishing site, Content by search web parts will contain queries which use these properties in order to display the content. So unless you show the whole page as plain text, you won't be able to display web part content on the publishing site.
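For illustration, the kind of query a Content by search web part issues can be sketched with KeywordQuery as follows. The URLs are made up, and ArticleSummaryOWSTEXT is a hypothetical managed property assumed to be mapped from a crawled ows_ArticleSummary page field:

using Microsoft.SharePoint;
using Microsoft.Office.Server.Search.Query;

class PageFieldsQueryDemo
{
    static void Main()
    {
        using (SPSite site = new SPSite("http://publishing"))  // hypothetical URL
        {
            KeywordQuery query = new KeywordQuery(site);
            // Restrict results to the crawled authoring content (illustrative path)
            query.QueryText = "path:\"http://authoring/news\"";
            query.SelectProperties.Add("Title");
            query.SelectProperties.Add("ArticleSummaryOWSTEXT");  // assumed mapped page field
            query.SelectProperties.Add("Path");
            ResultTableCollection results = new SearchExecutor().ExecuteQuery(query);
            // Only crawled-and-mapped field values come back; anything that lived
            // inside web parts on the authoring page is not among them
        }
    }
}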

As you probably know, starting with SharePoint 2010 it is possible to add web parts not only to web part zones, but also to the rich HTML fields of publishing pages. So maybe this would be a solution? Unfortunately no: we tested this scenario, and all web parts which were added to the content field were lost in the search index.

The problem with web parts can be very significant for users who worked with SharePoint 2007 and 2010 before, because it means that they can't use the classic SharePoint working model, where it is possible to create pages and put web parts on them. Instead they should use field controls and put content inside them. In turn, this means losing WYSIWYG. It is possible to create page layouts with field controls defined the same way as they will be displayed on the publishing site, but in this case you will need to have predefined lists of page layouts on the authoring and publishing sites which correspond to each other (e.g. on the authoring site page layouts should use field controls, while on the publishing site page layouts may still use web part zones).

There is also another problem, related to displaying correct targeted content in the site search when the search-driven approach is used with cross-site publishing and managed metadata navigation. At the moment we are still looking for a solution to this problem and doing more experiments. Once I have more details, I will update this post.

Hope that knowing the drawbacks of the search-driven approach will help you make correct decisions. If you have faced other problems, please share them in the comments.

7 comments:

  1. Hi, nice post. I implemented the cross-site collection feature and CSWP (Content search web part) in SP2013. My goal is to get the metadata of the document library items, but somehow I am only able to show the documents, not the metadata.
    Just wondering, will it be possible to show the metadata of the document library items using the cross-site collection feature?

    1. This is the flow I want to follow here. Document library->Items->Metadata of the document library item.

    2. hi SP Pro,
      yes, it is possible. First you need to ensure that the metadata you want to display is crawled by search (i.e. there are crawled properties and managed properties in the search service application for your metadata). After that you may configure the Content by search web part to show these properties in your display templates using the "Property Mappings" web part properties category.

  2. Hi,
    thanks for describing these drawbacks. I have a question: what about documents? Imagine you are using XSP in an intranet/extranet scenario, where you build a customer site for each of your customers. Then you want to share some documents without duplicating them. The documents are stored in your intranet (permissions have already been given to your customers too), but your customers should see these documents only in the context of their site. With the right combination of tags and queries you can retrieve the documents with the CSWP, but what about the URL?
    Thanks for any comments or suggestions

  3. Hello,
    when you connect to a catalog you can construct the URL which will be used for the content, relative to the current site (see the sketch below). We used this approach for publishing pages, but I'm not sure whether it will also work for documents.
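    A minimal sketch of what I mean, assuming the catalog connection already exists; the URLs, the catalog path and the rewrite template are illustrative:

    using Microsoft.SharePoint;
    using Microsoft.SharePoint.Publishing;

    class CatalogUrlRewriteDemo
    {
        static void Main()
        {
            using (SPSite site = new SPSite("http://customer-site"))  // hypothetical consuming site
            {
                CatalogConnectionManager manager = new CatalogConnectionManager(site);
                CatalogConnectionSettings settings =
                    manager.GetCatalogConnectionSettings("http://intranet/docs");  // assumed catalog path

                // Rewrite catalog item URLs so they resolve in the context of the current site
                settings.RewriteCatalogItemUrls = true;
                settings.IsManualCatalogItemUrlRewriteTemplate = true;
                settings.CatalogItemUrlRewriteTemplate = "/documents/{Title}";  // illustrative template
                manager.UpdateCatalogConnection(settings);
                manager.Update();  // commit the change
            }
        }
    }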

  4. Hi Alexey,

    I have an authoring SC and a publishing SC. In the authoring SC I have a list with some data, and the data has been crawled into the search server. Now I want to pull the crawled list data into the publishing SC. How can I do that?
    Can you please provide some steps or a document?

    1. hello,
      you need to subscribe to the authoring site, i.e. add a catalog connection. Here you can see how to do it programmatically: http://msdn.microsoft.com/en-us/library/microsoft.sharepoint.publishing.catalogconnectionmanager.aspx (see also the sketch below). After that there should be a new search result source in the publishing site collection, which corresponds to the authoring site's content in the search index. The last step is to specify this result source in the Content by search web parts.
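      A minimal sketch of that subscription, based on the CatalogConnectionManager API from the link above; the site URL and catalog path are made up:

      using Microsoft.SharePoint;
      using Microsoft.SharePoint.Publishing;

      class CatalogSubscriptionDemo
      {
          static void Main()
          {
              using (SPSite site = new SPSite("http://publishing"))  // publishing site collection
              {
                  // Read the connection settings exposed by the authoring site's catalog
                  CatalogConnectionSettings settings = PublishingCatalogUtility.GetPublishingCatalog(
                      site, "http://authoring/lists/products");  // assumed catalog path

                  CatalogConnectionManager manager = new CatalogConnectionManager(site);
                  if (!manager.Contains(settings.CatalogUrl))
                  {
                      manager.AddCatalogConnection(settings);
                      manager.Update();  // commits the new catalog connection
                  }
              }
          }
      }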
