Monday, January 31, 2022

How to get all items from large lists in Sharepoint Online

As you probably know Sharepoint lists have default throttling limit 5000 items which means that if query result to more than 5000 items it will throw resource throttling exception. One solution which you may try is to use RowLimit:

<View>
    <Query>
        <Where>
            <Eq>
                <FieldRef Name=\"Title\" />
                <Value Type=\"Text\">test</Value>
            </Eq>
        </Where>
    </Query>
    <RowLimit>5000</RowLimit>
</View>

Or with Camlex:

var camlQuery = Camlex.Query()
    .Where(x => (string)x["Title"] == "test")
    .Take(5000)
    .ToCamlQuery();

However if list contains more than 5000 items this query will anyway result in resource throttling exception regardless of RowLimit being set. Another possible solution is to try to add index to set some field as indexed and use condition against this field in query. But what if you can't do that (e.g. if you don't have control over the list schema)? How to get all items from large list then?

For that you may use query ID > 0 (which is always true since ID is auto generated integer) and CamlQuery.ListItemCollectionPosition property which allows to get paginated results. Code will look like this:

ListItemCollectionPosition position = null;
var items = new List<ListItem>();
do
{
    var camlQuery = Camlex.Query()
        .Where(x => (int)x["ID"] > 0)
        .ViewFields(x => new []
        {
            x["ID"],
            x["Title"]
        })
        .Take(5000)
        .ToCamlQuery();
    camlQuery.ListItemCollectionPosition = position;
    var listItems = list.GetItems(camlQuery);
    ctx.Load(listItems);
    ctx.ExecuteQueryRetry();

    items.AddRange(listItems);
    position = listItems.ListItemCollectionPosition;

    if (position == null)
    {
        break;
    }
} while (true);

It will fetch items by 5000 items per page until all items from large list will be fetched.

Thursday, January 27, 2022

Delete O365 group with associated site collection via Graph SDK

In order to delete O365 group (Azure AD group + Sharepoint Online site collection) via .Net Graph SDK the following code may be used:

public static bool DeleteGroup(string groupId)
{
    try
    {
        if (string.IsNullOrEmpty(groupId))
        {
            return false;
        }

        var graphClientApp = new GraphServiceClient(new AzureAuthenticationProviderAppPermissions());
        if (graphClientApp == null)
        {
            return false;
        }

        graphClientApp.Groups[groupId].Request().DeleteAsync().GetAwaiter().GetResult();

        return true;
    }
    catch (Exception x)
    {
        return false;
    }
}

public class AzureAuthenticationProviderAppPermissions : IAuthenticationProvider
{
    // implement own authentication logic for app permissions
}

Azure AD group will be deleted immediately while associated site collection will be deleted with some delay (usually few minutes).

Friday, January 21, 2022

CosmosDB: compare queries performance against strings and array properties

Imagine that we have CosmosDB instance running in Azure (created with SQL syntax enabled) which has groups entities. Each group has members property - JSON array of members of this group. Also each group has the same members serialized to string in membersStr property:

{
    "id": "cf66e4ea-39b6-43bf-a237-8ae997c8cd6a",
    "membersStr": "[{\"loginName\":\"user.automatic17@mytenant.onmicrosoft.com\",\"displayName\":\"User Automatic17\",\"id\":\"53bcf941-fa64-4af2-ac47-c7f48430df29\"},{\"loginName\":\"user.automatic52@mytenant.onmicrosoft.com\",\"displayName\":\"User Automatic52\",\"msGraphUserId\":\"515ee5ae-71dc-4c5a-a844-41af5f9d419c\"},{\"loginName\":\"user.automatic67@mytenant.onmicrosoft.com\",\"displayName\":\"User Automatic67\",\"msGraphUserId\":\"21106ae1-25fb-43ac-a5e5-aa1f0d3f9f2f\"}]",
    "members": [{
        "loginName": "user.automatic17@mytenant.onmicrosoft.com",
        "displayName": "User Automatic17",
        "id": "53bcf941-fa64-4af2-ac47-c7f48430df29"
    }, {
        "loginName": "user.automatic52@mytenant.onmicrosoft.com",
        "displayName": "User Automatic52",
        "msGraphUserId": "515ee5ae-71dc-4c5a-a844-41af5f9d419c"
    }, {
        "loginName": "user.automatic67@mytenant.onmicrosoft.com",
        "displayName": "User Automatic67",
        "msGraphUserId": "21106ae1-25fb-43ac-a5e5-aa1f0d3f9f2f"
    }]
	...
}

The question which we wanted to clarify is if we want to get groups where specific user is member - what query will be faster:

1. which uses membersStr property and LIKE operator:

SELECT c FROM c WHERE c.membersStr LIKE '%{userId}%'

2. or which uses array and object-oriented query against members properties:

SELECT c FROM c JOIN m IN c.members WHERE m.id = '{userId}'

For running this test I used CosmosDB with 10K entities with random users data. For testing I used the following C# methods:

private static List<object> GetGroupsByArray(string userId)
{
    string query = $"SELECT c FROM c JOIN m IN c.members WHERE m.id = '{userId}'";
    QueryDefinition queryDefinition = new QueryDefinition(query);
    var groupsContainer = cosmosClient.GetContainer(DBName, ContainerName);
    FeedIterator<object> queryResultSetIterator = groupsContainer.GetItemQueryIterator<object>(queryDefinition);
    var groups = new List<object>();
    while (queryResultSetIterator.HasMoreResults)
    {
        FeedResponse<object> currentResultSet = queryResultSetIterator.ReadNextAsync().GetAwaiter().GetResult();
        foreach (var group in currentResultSet)
        {
            groups.Add(group);
        }
    }

    return groups;
}

private static List<object> GetGroupsByString(string userId)
{
    string query = $"SELECT c FROM c WHERE c.membersStr LIKE '%{userId}%'";
    QueryDefinition queryDefinition = new QueryDefinition(query);
    var groupsContainer = cosmosClient.GetContainer(DBName, ContainerName);
    FeedIterator<object> queryResultSetIterator = groupsContainer.GetItemQueryIterator<object>(queryDefinition);
    var groups = new List<object>();
    while (queryResultSetIterator.HasMoreResults)
    {
        FeedResponse<object> currentResultSet = queryResultSetIterator.ReadNextAsync().GetAwaiter().GetResult();
        foreach (var group in currentResultSet)
        {
            groups.Add(group);
        }
    }

    return groups;
}

Each method was called with 10 random user ids. In both cases time of query depended from number of groups user belongs to (the more groups user was member of the more time query took). I made 10 test runs of both methods and measured execution time with System.Diagnostics.Stopwatch. Then I calculated average execution time for each run of 1st method and separately for 2nd method. Finally I measured average execution time for all 10 test runs.

Result was that query against string property (JSON-serialized array) was a bit faster (929.53 milliseconds) than query agianst array property (1362.3). Conclusion is that if performance is not critical for you and you want to have nice-looking entities in CosmosDB you may use array properties. But if performance is critical you may consider to use JSON-serialized property (or use both as an option).

Update 2022-04-14: we also compared internal query engine execution time and got different results. For array-based approach performance was better. For string-based approach query took 38ms:

while for array-based approach only 18ms:

Probably .Net SDK which was used during 1st test added own delays. Consider that when you will choose approach for your scenario.

Friday, January 14, 2022

How to get Azure storage account associated with Azure function app via PowerShell

As you probably know Azure function app is provisioned with Storage account behind it which is used for internal needs of function app (e.g. storing compiled dlls of Azure functions in blob storage of this storage account or storing App insights logs in tables). If we need to get instance of this storage account associated with Azure function app we may use the following PowerShell:

$app = Get-AzWebApp -Name "MyFunctionApp" -ResourceGroupName "MyResourceGroup"
$kv = $app.SiteConfig.AppSettings | Where-Object { $_.Name -eq "AzureWebJobsDashboard" }
$found = $kv.Value -match ".*;AccountName=(.+?);"
$storageAccountName = $matches[1]
$storageAccount = Get-AzStorageAccount -StorageAccountName $storageAccountName -ResourceGroupName "MyResourceGroup"

In above code we first get instance of function app using Get-AzWebApp cmdlet and find AzureWebJobsDashboard application setting which contains connection string of associated storage account. After that we retrieve storage account name from connection string using regex and finally call Get-AzStorageAccount to get actual storage account instance. Hope it will help someone.