Showing posts with label Azure table storage. Show all posts
Showing posts with label Azure table storage. Show all posts

Wednesday, February 16, 2022

Disadvantages of Azure table storage as NoSQL data storage

We used Azure table storage as NoSQL data storage for quite a time and got some experience which may be useful to share with community. First of all I want to say that I like Azure table storage: it is very simple to use (there are different SDKs), flexible with it's dynamic data schema for tables (although this must be a common thing in NoSQL world), fast (if you use it correctly and avoid full table scan scenarios) and last but not least - it is very cheap.

However at the same time it has number of cons which may be significant depending on amount of data you need to process and performance which you need to achieve. I collected them in the following table:

Cons Comments
No LIKE operator Can't perform search by sub string. Need to introduce complex data duplication patterns with different PartitionKey/RowKey for avoiding full table scan and performance issues
Only 2 system indexed columns (PartitionKey/RowKey) without possibility to add own indexes Can't add custom indexed fields for improving queries performance
No full pagination support on API level There is possibility to limit returned resultset size, but no possibility to skip number of rows
No native backup options There are 3rd party (including open source) solutions which can be used but no built-in support from MS

Like I said I personally like Azure table storage. But due to these cons you may also consider other options for data storage e.g. Azure CosmosDB which is more expensive but also more powerful at the same time.

Friday, January 14, 2022

How to get Azure storage account associated with Azure function app via PowerShell

As you probably know Azure function app is provisioned with Storage account behind it which is used for internal needs of function app (e.g. storing compiled dlls of Azure functions in blob storage of this storage account or storing App insights logs in tables). If we need to get instance of this storage account associated with Azure function app we may use the following PowerShell:

$app = Get-AzWebApp -Name "MyFunctionApp" -ResourceGroupName "MyResourceGroup"
$kv = $app.SiteConfig.AppSettings | Where-Object { $_.Name -eq "AzureWebJobsDashboard" }
$found = $kv.Value -match ".*;AccountName=(.+?);"
$storageAccountName = $matches[1]
$storageAccount = Get-AzStorageAccount -StorageAccountName $storageAccountName -ResourceGroupName "MyResourceGroup"

In above code we first get instance of function app using Get-AzWebApp cmdlet and find AzureWebJobsDashboard application setting which contains connection string of associated storage account. After that we retrieve storage account name from connection string using regex and finally call Get-AzStorageAccount to get actual storage account instance. Hope it will help someone.

Monday, June 28, 2021

Use SQL LIKE operator in CosmosDB

One of known pain of developers who works with Azure table storage is lack of SQL LIKE operator which allows to perform queries with StartsWith, EndsWith or Contains conditions. There is another NoSQL database available in Azure - CosmosDB. It is not free (although Table storage is also not free but is very cheap) but with some notes it supports LIKE operator.

The first thing you should do when create new CosmosDB instance is to select underlying API:

At the moment of writing this post the following APIs were available:
  • Core (SQL)
  • MongoDB
  • Cassandra
  • Azure table
  • Gremlin

If you will create CosmosDB with Azure table API - it will look the same as native table storage and will have the same 6 operations available for queries: <, <=, >, >=, =, <>. I.e. no LIKE operator still:


But if you will create it with Core (SQL) API it will look different (it will work with "documents" instead of tables and rows):

and which is more important will allow to use SQL-like API and queries from Microsoft.Azure.Cosmos nuget package. E.g. if we have Groups table we may fetch groups which have some user as member using the following query (here for simplicity we assume that members are stored in JSON serialized string):

string userName = ...;
string sqlQueryText = $"SELECT * FROM c WHERE c.members like '%{userName}%'";

var queryDefinition = new QueryDefinition(sqlQueryText);
var queryResultSetIterator = this.container.GetItemQueryIterator<Group>(queryDefinition);
var items = new List<Group>();
while (queryResultSetIterator.HasMoreResults)
{
    var currentResultSet = await queryResultSetIterator.ReadNextAsync();
    foreach (var item in currentResultSet)
    {
        items.Add(item);
    }
}

So if you need NoSQL database with support of LIKE operator consider using CosmosDB with Core (SQL) API.

Thursday, June 24, 2021

Compare performance of Azure CosmosDB (with Azure Table API) vs Table storage

As you probably know Table storage is basic NoSQL db option available in Azure. It is very cheap and used in many Azure components internally (e.g. in Azure functions or bots). From other side CosmosDB is powerful NoSQL db engine available in Azure which support several different APIs. You need to select this API when create CosmosDB instance and it can't be changed after that:

  • Core (SQL) 
  • MondoDB
  • Cassandra
  • Azure Table
  • Gremlin

I.e. if you have existing application which already uses Table storage - you may create CosmosDB with Azure Table API and switch to it without rewriting application code (although this is in theory. E.g. if you app targets .Net Framework you will most probably need to retarget it to .Net Standard or .Net Core. But this is topic of different post).

In our product we use Table storage and decided to compare performance of CosmosDB (with Azure table API) and native Table storage. For testing the following data was used:

  • table with about 100K rows
  • query against system PartitionKey field: PartitionKey eq {guid} (this is fast query itself because PartitionKey is indexed field)
  • each query returned about 300 rows from overall 100K rows
  • execution time was measured

Same test had been ran 10 times in order to get average execution time - for CosmosDB and Table storage separately. Here are results:

Test run# CosmosDB (msec) Table storage (msec)
1 3639 2825
2 3249 2008
3 3626 1613
4 3016 1813
5 2764 1942
6 2840 1713
7 4704 1776
8 2975 2036
9 3353 1808
10 2833 1699
Avg (msec) 3299.9 1923.3

I.e. average execution time for native Table storage was less than for CosmosDB with Azure Table API.

Monday, May 18, 2020

Query limits in Azure Table storage API

As you probably know Table API used by Azure table storage is limited by the following operators which can be used in queries:

  • =
  • <>
  • >
  • <
  • <=
  • >=

We needed to run analogue of NOT IN Sql operator against RowKey column:

RowKey NOT IN (id1, id2, …, id_n)

where these ids were guids represented by strings. In order to do that with mentioned operators we needed to use not equal operator (<>) and construct query like that:

RowKey <> id1 AND RowKey <> id2 AND … RowKey <> id_n

The problem is that this list of guids didn’t have fixed size and could be potentially big. So it was interesting is there limit for max query length and if yes – what is exact value of this maximum query length.

For testing I created test app and generated conditions over guids like that (in addition to mentioned conditions on RowKey we needed to get groups from particular partition only – so this extra condition was added there as well. Exact partition name is not important – we only need to know that its length was 8 symbols – it will be used for overall query length calculation below):

var tableClient = storageAccount.CreateCloudTableClient(new TableClientConfiguration());

var table = tableClient.GetTableReference("MyTable");
var queryConditions = new List();
queryConditions.Add(TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, "Test1234"));
int numOfConditions = 50;
for (int i = 0; i < numOfConditions; i++)
{
 queryConditions.Add(TableQuery.GenerateFilterCondition("RowKey", QueryComparisons.NotEqual, Guid.NewGuid().ToString().ToLower()));
}
string filter = CombineUsingAnd(queryConditions);
var query = new TableQuery().Where(filter);

var watch = System.Diagnostics.Stopwatch.StartNew();
var result = table.ExecuteQuery(query).ToList();
watch.Stop();


public static string CombineUsingAnd(List queryConditions)
{
 if (queryConditions == null || queryConditions.Count == 0)
 {
  return string.Empty;
 }
 string filter = string.Empty;
 foreach (string queryCondition in queryConditions)
 {
  if (string.IsNullOrEmpty(queryCondition))
  {
   continue;
  }

  if (string.IsNullOrEmpty(filter))
  {
   filter = queryCondition;
  }
  else
  {
   filter = TableQuery.CombineFilters(filter, TableOperators.And, queryCondition);
  }
 }

 return filter;
}

E.g. here how query looks like with condition on 50 guids:

((((((((((((((((((((((((((((((((((((((((((((((((((PartitionKey eq 'Teamwork') and (RowKey ne '60712ffe-061b-4a46-8402-01381177c4da')) and (RowKey ne '85694313-2776-485c-833d-ae78c185adba')) and (RowKey ne 'dac6f7fd-1198-486c-b3ea-e90337ee9711')) and (RowKey ne '5faf85cb-4662-4843-87f6-ac6dc20b04e1')) and (RowKey ne '8f524940-9822-46e1-b961-59a2e8fa51b4')) and (RowKey ne 'bacd765e-64d3-464a-a57e-bb30eeca0c2f')) and (RowKey ne '75ba6126-f529-4b30-ae41-882f2b223bff')) and (RowKey ne '5e9950ac-e611-4d58-8c91-8fcf16f2886a')) and (RowKey ne '8f08bd7b-7d6e-4c21-8b94-d5ca2a15f849')) and (RowKey ne 'a965f6f6-f03c-4b4a-91be-98481a35ba11')) and (RowKey ne '5858de75-addb-4f5a-acc1-6f9f55078b4f')) and (RowKey ne '9ada1d04-dd03-4b1f-ae26-bc51b7f35ec3')) and (RowKey ne 'cbb976e7-87c9-4b83-8007-355f3ce1061d')) and (RowKey ne '59e06da8-f343-4a98-9382-0bbf1ce21d3f')) and (RowKey ne 'ea0763d9-cb64-4750-a7e0-c1869ca82a15')) and (RowKey ne '926106f9-bf64-42db-b27f-be9bcb45907e')) and (RowKey ne '3cdcaf1e-ff36-4243-b865-9424e1f46427')) and (RowKey ne '8fcfc4ca-7510-4fa6-95d7-3ff2999de509')) and (RowKey ne '21b04d3a-32e0-4a59-900a-30babfa8fbd2')) and (RowKey ne 'df181ee8-4dc0-4232-b2bc-7e13dc8f2abf')) and (RowKey ne '6a0038d0-40a4-4a11-b626-9850b8178caa')) and (RowKey ne 'ffb78a96-85c9-4648-aba7-3baa56be6c1f')) and (RowKey ne '0c56afd8-164e-4574-9054-8824e6fc1969')) and (RowKey ne '51d95602-cfee-40bf-b9e3-b0f935083742')) and (RowKey ne 'f286d1fb-b9bb-4850-a911-58a77ffc7991')) and (RowKey ne '50280089-e61b-4874-8f99-8dc437f74181')) and (RowKey ne '05f72847-3936-4efc-9902-540a9d9c5ca8')) and (RowKey ne '066fcbe4-afee-46ff-b6d2-18e072f850a0')) and (RowKey ne '602f7cd4-7697-49ca-af51-e96c70eefed5')) and (RowKey ne '6858f832-a58e-4bcb-b57b-7c40d32cbcb6')) and (RowKey ne 'f9e2cc4f-a579-46f4-820b-e7d8ab9711c8')) and (RowKey ne 'cbb68c38-7e73-463b-8839-bef040803c37')) and (RowKey ne '1bf0d7ff-ece6-4ef1-9d48-ae424feea108')) and (RowKey ne 'de1dbdbf-8a75-40b4-bb9b-81e95ff4a542')) and (RowKey ne '31f5c94e-3eea-4c5a-8e60-8899eddc7829')) and (RowKey ne '94bc24a8-28e4-4b6d-965e-9360104bff21')) and (RowKey ne '93d7f84e-eb4b-4c7d-826b-196e8b46255e')) and (RowKey ne 'bd0d0989-7256-4ffc-8904-6135955b8aae')) and (RowKey ne '9c38865a-913c-468c-bd26-25e606f9b73a')) and (RowKey ne 'c2f9b899-76c1-4b04-aa68-fb3f903d81d4')) and (RowKey ne '9b478051-78b2-4a67-a25f-6cc09251ec33')) and (RowKey ne '908f80ea-e31f-4a85-9ff8-8cb80b601434')) and (RowKey ne '7775f8d4-2b9c-4903-a191-b54d214e380a')) and (RowKey ne '55b24a87-3259-4ca4-8869-f6d68f9c95c3')) and (RowKey ne '0476263d-cf8a-49b3-89f1-0bc8df9fd906')) and (RowKey ne 'f7a45c30-eba7-4944-b08b-a875d4b871e0')) and (RowKey ne '4136dc3c-10d3-4d58-afb9-23d8b422822e')) and (RowKey ne 'b6a40f8e-5284-4cbd-9356-2296df415650')) and (RowKey ne 'd63ecb62-5f48-4314-8a15-08c0b62ee835')) and (RowKey ne 'e3acdc4f-70a5-4a73-b023-1845ccf9ea51')

Testing results are summarized in the following table:

Num of conditions Filter length Query elpsed time
50 2876 12075
100 5726 10550
112 6410 10361
113 6467 10967
114 6524 Bad request
> 114   Bad request

So Table storage query maximum length is between 6467 and 6524 symbols. For us this knowledge was enough to proceed. Hope that this information will help someone.