Dataverse Elastic Tables

When you create a new table in Microsoft Dataverse, you will probably notice a new option in the table type selector: elastic tables. They use Azure Cosmos DB as the table data store and save records as JSON documents in that database. "Why bother?" you may ask. Are they faster? Better? Is working with them any different from working with standard tables? In this article, I describe some experiments I ran with elastic tables and try to identify scenarios in which document-based storage may be a suitable option.

Create, update, delete (CUD)

The first thing I did was measure the time required to create, update, and delete a number of records in an elastic table and compare it with the time required for a classic Dataverse table. During the test, I created, updated, and deleted a predefined number of records inside a foreach loop, calling the Dataverse API separately for every single create, update, and delete. The results (measured in seconds) are shown in the table below.

 

| Table type and record count | CREATE | UPDATE | DELETE | TOTAL |
| --- | --- | --- | --- | --- |
| Standard Table – 1000 records | 141.93 | 149.06 | 182.97 | ~473.986 |
| Elastic Table – 1000 records | 151.05 | 170.84 | 162.67 | ~484.58 |
| Standard Table – 5000 records | 633.53 | 770.97 | 770.97 | ~2039.69 |
| Elastic Table – 5000 records | 735.04 | 744.36 | 722.99 | ~2202.39 |
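The measurement loop described above can be sketched roughly as follows. This is a minimal Python illustration, not the code I used for the test: `time_sequential` and `fake_create` are hypothetical names, and `fake_create` only stands in for a real single-record Dataverse call.

```python
import time
from typing import Callable, Iterable, List


def time_sequential(operation: Callable[[dict], None], records: Iterable[dict]) -> float:
    """Call `operation` once per record and return the elapsed time in seconds."""
    start = time.perf_counter()
    for record in records:
        operation(record)  # in the real test, this is one API round trip per record
    return time.perf_counter() - start


# Hypothetical stand-in for a single-record create request.
calls: List[str] = []


def fake_create(record: dict) -> None:
    calls.append(record["pg_name"])


records = [{"pg_name": f"Test {i}"} for i in range(1000)]
elapsed = time_sequential(fake_create, records)
print(f"created {len(calls)} records in {elapsed:.4f} s")
```

The same helper can wrap the update and delete steps, so each of the three phases is timed separately.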

OK. As you can see, it is nothing special. Overall, elastic tables are even a little slower than standard ones. But what happens when we execute CreateMultiple, UpdateMultiple, and DeleteMultiple requests against the Dataverse API? Let’s take a look.

 

| Table type and record count | CREATE MULTIPLE | UPDATE MULTIPLE | DELETE MULTIPLE / DELETE * | TOTAL |
| --- | --- | --- | --- | --- |
| Standard Table – 1000 records | 11.00 | 19.37 | 180.35 | ~210.73 |
| Elastic Table – 1000 records | 5.93 | 6.96 | 5.49 | ~18.39 |
| Standard Table – 5000 records | 50.48 | 103.26 | 893.84 | ~1047.59 |
| Elastic Table – 5000 records | 29.58 | 35.14 | 28.18 | ~92.90 |

* The DeleteMultiple request is not available for standard tables; it works only for elastic tables. So, for the standard table, I used sequential Delete requests to remove the created test data set.

We can see that when creating and updating multiple records in a single request, elastic tables are roughly 1.7 to 3 times faster than standard ones. And of course, deleting multiple records with a single request is much faster than deleting the same records from a standard table one by one. Whether that last comparison is a fair one is up to you to decide.
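To make the comparison concrete, the speedup for each bulk operation can be computed directly from the timings in the table above. The snippet below is only arithmetic on those published numbers; the dictionary layout is my own.

```python
# (standard table seconds, elastic table seconds), taken from the table above
timings = {
    "CreateMultiple, 1000 records": (11.00, 5.93),
    "UpdateMultiple, 1000 records": (19.37, 6.96),
    "CreateMultiple, 5000 records": (50.48, 29.58),
    "UpdateMultiple, 5000 records": (103.26, 35.14),
}

# Speedup = how many times faster the elastic table was.
speedups = {name: standard / elastic for name, (standard, elastic) in timings.items()}

for name, ratio in speedups.items():
    print(f"{name}: elastic is {ratio:.2f}x faster")
```

Creates come out a little under 2 times faster and updates a little under 3 times faster.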

To summarize, importing, updating, and deleting large data sets may be much faster when using elastic tables. The only requirement is to send multiple operations to the server as a single request.
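Sending multiple operations as a single request means grouping records into batches first. The sketch below shows one way to do that in Python. It assumes the request body shape I understand the Web API CreateMultiple action to use (a `Targets` array whose items carry an `@odata.type` annotation), so verify it against the official documentation. The table name `pg_elastictest`, the column `pg_name`, and the batch size of 100 are hypothetical.

```python
import json
from typing import Iterator, List


def chunked(items: List[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_create_multiple_body(logical_name: str, records: List[dict]) -> dict:
    """Build a request body for one bulk create (assumed shape, see lead-in)."""
    odata_type = f"Microsoft.Dynamics.CRM.{logical_name}"
    return {"Targets": [{**record, "@odata.type": odata_type} for record in records]}


records = [{"pg_name": f"Test {i}"} for i in range(250)]
bodies = [
    build_create_multiple_body("pg_elastictest", batch)
    for batch in chunked(records, 100)
]

print([len(body["Targets"]) for body in bodies])  # prints [100, 100, 50]
print(json.dumps(bodies[-1]["Targets"][0]))
```

Each body would then be sent as one request, so 250 records cost three round trips instead of 250.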

Retrieving data

Let’s compare data retrieval performance between standard and elastic tables. I’ve prepared two test cases. The first one queries both tables using the Dataverse context and a LINQ provider, retrieving all records with specific names from the standard and the elastic table.

For the second test, I’ve added one more condition based on the PartitionId column. This special column is available only for elastic tables and lets us divide data into logical partitions. For the standard table, I’ve used a custom pg_partitionid string column. Of course, it has nothing to do with physical or logical partitioning, but it lets us compare queries with the same number of conditions.

The results of both tests are presented in the table below.

| Test records | 2 standard table queries with a single condition | 2 elastic table queries with a single condition | 2 standard table queries with 2 conditions | 2 elastic table queries with 2 conditions (one of them based on partition id) |
| --- | --- | --- | --- | --- |
| 1000 | 0.9248 | 2.0580 | 0.7123 | 1.5957 |
| 5000 | 3.0746 | 9.5361 | 1.7601 | 4.7870 |

OK. We can see that elastic tables are not performance demons. Every query (whether it uses a PartitionId or not) takes 2-3 times longer than the same query against a standard table.

By the way, be careful when using PartitionId with elastic tables. The DeleteMultiple request won’t work for records that have this value set. Of course, this error will probably be fixed in the future because we’re still working with the preview version of elastic tables, but yeah... Be careful.

Another performance test I ran was querying elastic tables with the Cosmos DB SQL query language.

I executed a very simple statement that filters data by the value of a text column. The results for 1000 and 5000 records are presented below:

  • Single query returning 500 records from the total 1000: 0.6450 sec.

  • Single query returning 2500 records from the total 5000: 1.0615 sec.

Finally, for the 5000-record data set, this option seems to work faster than the standard LINQ to Dataverse provider. Unfortunately, the developer experience of implementing and running Cosmos SQL queries may not be the best. It reminds me of the old days of ADO.NET and calling SQL code directly from C# applications. So, when implementing queries against elastic tables, we may face the dilemma of choosing between code performance and maintainability.
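To give an idea of what such a query looks like, here is a small Python sketch that builds a parameterized Cosmos SQL statement filtering on a text column. It only prepares the query text and its parameters. How they are passed to ExecuteCosmosSqlQuery depends on the SDK or Web API you use and is not shown here. The helper name `build_text_filter_query` and the column `pg_name` are hypothetical, and the `c.props.<column>` path follows the query syntax used for elastic tables.

```python
from typing import Dict, Tuple


def build_text_filter_query(column: str, value: str) -> Tuple[str, Dict[str, str]]:
    """Return (query text, parameters) for an equality filter on one column."""
    # Allow only simple identifiers, because the column name is embedded in the text.
    if not column.replace("_", "").isalnum():
        raise ValueError(f"unexpected column name: {column!r}")
    query_text = (
        f"select c.props.{column} from c "
        f"where c.props.{column} = @value"
    )
    # The value travels as a parameter instead of being concatenated into the text.
    return query_text, {"@value": value}


query_text, parameters = build_text_filter_query("pg_name", "Test A")
print(query_text)
print(parameters)
```

Keeping the value in a parameter avoids string concatenation, but the query itself is still a plain string that the compiler cannot check, which is exactly the maintainability cost mentioned above.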

Some other nice features

In the previous chapters, I focused mostly on performance. I would also like to mention some nice features that are available only for elastic tables and may be useful in specific scenarios.

"Time to live" value

Every elastic table contains a "ttlinseconds" column. Its value defines how many seconds must elapse after the last modification of a record before that record is removed automatically. This is great for scenarios in which we produce a huge number of temporary records and don’t want to take care of removing them once they are no longer needed.
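As a small illustration of the rule above, the expected removal time is simply the last modification time plus the "ttlinseconds" value. The Python sketch below uses a hypothetical record and a made-up timestamp. The actual removal is done by the platform, so the exact moment may differ.

```python
from datetime import datetime, timedelta, timezone


def expected_removal(last_modified: datetime, ttl_seconds: int) -> datetime:
    """Earliest moment at which the record becomes eligible for automatic removal."""
    return last_modified + timedelta(seconds=ttl_seconds)


# Hypothetical temporary record that should live for one hour after its last change.
record = {"pg_name": "Temporary import row", "ttlinseconds": 3600}
last_modified = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)

removal_time = expected_removal(last_modified, record["ttlinseconds"])
print(removal_time.isoformat())  # prints 2024-01-01T13:00:00+00:00
```

Because the countdown starts from the last modification, every update to the record pushes the removal time further into the future.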

Querying JSON data

It is possible to store JSON data inside text columns in standard tables. But what if we would like to run queries that use values inside the JSON itself? That used to be difficult to achieve. Thanks to elastic tables, it is much easier now. First, it is possible to create a JSON-type text column inside an elastic table. At the moment, this is only possible through the metadata REST API, but I believe the option will soon be added to the Power Apps Studio editor. Then, when using ExecuteCosmosSqlQuery, it is possible to use syntax like "where c.props.pg_myjsoncolumn.id > 100", where pg_myjsoncolumn is a Dataverse JSON text column inside the elastic table and "id" is a property of the JSON object stored in this column.
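To show what the condition "c.props.pg_myjsoncolumn.id > 100" selects, here is a local Python simulation over a few sample documents. It does not call Dataverse or Cosmos DB. The document shape (column values nested under "props", with the JSON column stored as a nested object) is an assumption inferred from the query syntax, and the sample data is made up.

```python
from typing import List

# Made-up documents, shaped the way the query path c.props.<column> suggests.
documents = [
    {"props": {"pg_name": "A", "pg_myjsoncolumn": {"id": 50, "kind": "sensor"}}},
    {"props": {"pg_name": "B", "pg_myjsoncolumn": {"id": 150, "kind": "sensor"}}},
    {"props": {"pg_name": "C"}},  # record without a JSON value
]


def json_id_greater_than(document: dict, threshold: int) -> bool:
    """Local equivalent of: where c.props.pg_myjsoncolumn.id > threshold."""
    value = document.get("props", {}).get("pg_myjsoncolumn", {}).get("id")
    # A missing property never satisfies the comparison.
    return isinstance(value, (int, float)) and value > threshold


selected: List[str] = [
    doc["props"]["pg_name"] for doc in documents if json_id_greater_than(doc, 100)
]
print(selected)  # prints ['B']
```

The point is that the filter reaches into the JSON object itself, which is not something a plain text column in a standard table can offer.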

Summary

So, to summarize my final thoughts: elastic tables seem to be a great option when we perform operations on multiple data rows with the dedicated bulk requests. Querying data with Cosmos SQL syntax could also work faster for sufficiently large data sets, especially when our data is stored as JSON text and we don’t mind giving up LINQ in our code. Automatic data removal after a defined time sounds like a helpful option for some scenarios. However, when we need many independent CRUD operations on single records and the other elastic table features are not important to us, standard Dataverse tables should definitely remain our first choice.