Querying CosmosDb for a dynamic list of properties

Author: Stephen McDowell

Background and Problem

I have a site that exposes a custom report generation functionality where a user is able to dynamically select the fields they would like to include on a report. The fields map to the properties in documents that are stored in CosmosDb. A single document averages about 4kb. A typical report might consist of between 100 to 10k+ documents depending on a date range criteria and the amount of data the tenant has.

The document contains nested relationships similar to this:

public class MyObject
{
    [JsonProperty(PropertyName = "id")]
    public string Id { get; set; }

    public List<MyNestedObject> MyNestedObjects { get; set; }

    public string BusinessId { get;set; }

    public DateTime SomeDate { get; set; }

    public string SomeString { get; set; }

    ...
}

public class MyNestedObject
{
    public string SomeOtherString { get; set; }

    public List<AnotherLevelOfNesting> Items { get; set; }

    ...
}

public class AnotherLevelOfNesting
{
    public string YetAnotherString { get; set; }

    ...
}

If the user selects only MyObject fields the end result is a single report row per document. If the user selects MyNestedObject fields the end result is a report row per MyNestedObject. In this case data points for the MyObject fields would be duplicated per row.

My current implementation is returning the entire document from CosmosDb and then shaping the results in code to match only the fields the user selected. This is the over-fetching problem that I am attempting to solve.

Possible Solution

I've attempted to build a dynamic projection based on an input like this:

public class SearchRequest
{
    public string BusinessId { get; set; }

    public MyObjectFieldsToInclude MyObjectFieldsToInclude { get; set; }

    public MyNestedObjectFieldsToInclude MyNestedObjectFieldsToInclude { get; set; }

    public AnotherLevelOfNestingFieldsToInclude { get; set; }
}

public class MyObjectFieldsToInclude
{
    public bool SomeDate { get; set; }

    public bool SomeString { get; set; }
}

public class MyNestedObjectFieldsToInclude
{
    public bool SomeOtherString { get; set; }   
}

public class AnotherLevelOfNestingFieldsToInclude
{
    public bool YetAnotherString { get; set; }
}

The boolean fields marked as true on the search request would drive the properties to include on the request to CosmosDb.

public async Task<IEnumerable<dynamic>> SearchDynamicAsync(SearchRequest request)
    {
        var queryOptions = new FeedOptions { MaxItemCount = -1 };
        var query = _client.CreateDocumentQuery<MyObject>(UriFactory.CreateDocumentCollectionUri(DatabaseName, CollectionName), queryOptions)
                    .Where(p => p.BusinessId == request.BusinessId);

        // This projection needs to be dynamic based on the fields requested
        var projectedQuery = query.Select(x => new
        {
            x.Id,
            // if (request.MyObjectFieldsToInclude.SomeDate
            // x.SomeDate
            // if (request.MyObjectFieldsToInclude.SomeString
            // x.SomeString
            // x.MyNestedObject = new {
            // if (request.MyNestedObjectFieldsToInclude.SomeOtherString
            // x.SomeOtherString
            // }
            // ...          
       });

       return await CosmosHelper.QueryAsync(projectedQuery);
    }


public class CosmosHelper
{
    public static async Task<IEnumerable<T>> QueryAsync<T>(IQueryable<T> query)
    {
        var docQuery = query.AsDocumentQuery();

        var results = new List<T>();
        while (docQuery.HasMoreResults)
        {
            results.AddRange(await docQuery.ExecuteNextAsync<T>());
        }

        return results;
    }
}

Putting if statements in the projection obviously doesn't work. I've also unsuccessfully attempted to build the dynamic projection based off of this:Create Dynamic Func from Object.

Questions:

  1. Am I on the right path or should I be considering a different approach?
  2. How do I build a dynamic expression off of my SearchRequest object?

Thanks in advance!

Originally Sourced from: https://stackoverflow.com/questions/57946572/querying-cosmosdb-for-a-dynamic-list-of-properties