Monday, June 28, 2021

Use SQL LIKE operator in CosmosDB

One of known pain of developers who works with Azure table storage is lack of SQL LIKE operator which allows to perform queries with StartsWith, EndsWith or Contains conditions. There is another NoSQL database available in Azure - CosmosDB. It is not free (although Table storage is also not free but is very cheap) but with some notes it supports LIKE operator.

The first thing you should do when create new CosmosDB instance is to select underlying API:

At the moment of writing this post the following APIs were available:
  • Core (SQL)
  • MongoDB
  • Cassandra
  • Azure table
  • Gremlin

If you will create CosmosDB with Azure table API - it will look the same as native table storage and will have the same 6 operations available for queries: <, <=, >, >=, =, <>. I.e. no LIKE operator still:


But if you will create it with Core (SQL) API it will look different (it will work with "documents" instead of tables and rows):

and which is more important will allow to use SQL-like API and queries from Microsoft.Azure.Cosmos nuget package. E.g. if we have Groups table we may fetch groups which have some user as member using the following query (here for simplicity we assume that members are stored in JSON serialized string):

string userName = ...;
string sqlQueryText = $"SELECT * FROM c WHERE c.members like '%{userName}%'";

var queryDefinition = new QueryDefinition(sqlQueryText);
var queryResultSetIterator = this.container.GetItemQueryIterator<Group>(queryDefinition);
var items = new List<Group>();
while (queryResultSetIterator.HasMoreResults)
{
    var currentResultSet = await queryResultSetIterator.ReadNextAsync();
    foreach (var item in currentResultSet)
    {
        items.Add(item);
    }
}

So if you need NoSQL database with support of LIKE operator consider using CosmosDB with Core (SQL) API.

Thursday, June 24, 2021

Compare performance of Azure CosmosDB (with Azure Table API) vs Table storage

As you probably know Table storage is basic NoSQL db option available in Azure. It is very cheap and used in many Azure components internally (e.g. in Azure functions or bots). From other side CosmosDB is powerful NoSQL db engine available in Azure which support several different APIs. You need to select this API when create CosmosDB instance and it can't be changed after that:

  • Core (SQL) 
  • MondoDB
  • Cassandra
  • Azure Table
  • Gremlin

I.e. if you have existing application which already uses Table storage - you may create CosmosDB with Azure Table API and switch to it without rewriting application code (although this is in theory. E.g. if you app targets .Net Framework you will most probably need to retarget it to .Net Standard or .Net Core. But this is topic of different post).

In our product we use Table storage and decided to compare performance of CosmosDB (with Azure table API) and native Table storage. For testing the following data was used:

  • table with about 100K rows
  • query against system PartitionKey field: PartitionKey eq {guid} (this is fast query itself because PartitionKey is indexed field)
  • each query returned about 300 rows from overall 100K rows
  • execution time was measured

Same test had been ran 10 times in order to get average execution time - for CosmosDB and Table storage separately. Here are results:

Test run# CosmosDB (msec) Table storage (msec)
1 3639 2825
2 3249 2008
3 3626 1613
4 3016 1813
5 2764 1942
6 2840 1713
7 4704 1776
8 2975 2036
9 3353 1808
10 2833 1699
Avg (msec) 3299.9 1923.3

I.e. average execution time for native Table storage was less than for CosmosDB with Azure Table API.

Friday, June 4, 2021

Issue with PnP JS sp.site.exists() call when page url contains hashes

Today we faced with interesting problem when tried to use sp.site.exists(url) method from PnP JS for determining whether site collection with specified url exists or not: all calls returned 403 with System.UnauthorizedAccessException:

During analysis I've found that it tried to make HTTP POST calls to the following url for all sites: https://{tenant}.sharepoint.com/sites/_api/contextinfo which is definitely not correct because https://{tenant}.sharepoint.com/sites is not correct url but managed path. Then we noticed that page where this code was executed contains #/ in url:

https://{tenant}.sharepoint.com/sites/mysitecol#/favorites

and it seems like PnP JS parses it incorrectly in this case. As workaround we checked which API is used inside sp.site.exists(url) call (/_api/SP.Site.Exists) and rewrote code with explicit API call instead:

let promise = context.spHttpClient.post(context.pageContext.site.absoluteUrl + "/_api/SP.Site.Exists",
  SPHttpClient.configurations.v1, {
  headers: {
	"accept": "application/json;"
  },
  body: JSON.stringify({
	url: url
  })
}).then((response: SPHttpClientResponse) => {
  response.json().then((exists: any) => {
	if (exists.value) {
	  // site exists
	} else {
	  // site doesn't exist
	}
  });
});

Hope that PnP JS will fix this issue at some point.

Update 2021-07-05: issue inside pnp js was fixed by adding initialization call:

sp.setup({
  spfxContext: this.context
});

See this discussion for more details.