Monthly Archives: December 2020

Incremental refresh for the Dataflows which are loading data from SQL to Dataverse

In one of the earlier posts this month I was seemingly unhappy about the missing “incremental refresh” option: https://www.itaintboring.com/dynamics-crm/power-platform-dataflows-have-i-missed-incremental-refresh-option-somehow/

But, having thought about it a bit more, I don’t think it’s such a big issue. Not when working with the SQL data sources, at least.

All we really need is a view that will pre-filter source data:

image

So, I could just create a view like this:

CREATE VIEW ETLTest_Modified
AS
SELECT * FROM ETLTest
-- only rows modified within the last 6 hours (360 minutes)
WHERE DATEADD(minute, 360, ModifiedOn) > GETUTCDATE()

And I could use it as a source table for the dataflow to limit the amount of data this dataflow has to transfer to Dataverse on every run.

This would do it, since, once the dataflow starts, it would grab 6 hours (in the example above) of modified data and push it to the Dataverse table.

I might still see a bit of excessive utilization on the API side, though, since, if the dataflow refresh schedule is not aligned properly with the view’s time window, the dataflow might still be loading data that has already been loaded.

For example:

  • We might set up the view to only show data that was modified in the last 2 hours
  • We might want that data to go to Dataverse every 5 minutes
  • So, on the first run, the dataflow will take 2 hours of data and push it to Dataverse
  • On the second run (which would be 5 minutes later), the same dataflow would take 2 hours of data again… but that would include 1 hr 55 minutes of data that’s already been pushed to the Dataverse on the previous run

 

Depending on how often new data comes into the SQL table (and, also, on how often we want to push it to Dataverse), this might or might not be a problem. For example, if that SQL table is only updated once a day at a specific time, we could set up the dataflow to refresh 3 hours after the scheduled SQL table update, and, except for the rare occasions where the SQL table is not updated on time, this would be a good enough approximation of the incremental refresh.
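For that “updated once a day” scenario, the view could simply pick up everything modified since the start of the current UTC day. A minimal sketch (the view name here is just made up for the example):

CREATE VIEW ETLTest_Today
AS
SELECT * FROM ETLTest
-- only rows modified since midnight UTC of the current day
WHERE ModifiedOn >= CAST(CAST(GETUTCDATE() AS date) AS datetime2)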

Also, there is one feature which is currently in preview and which might make this whole problem irrelevant, but, it seems, it’s not working with Power Platform Dataflows yet:

https://powerbi.microsoft.com/en-us/blog/announcing-dataflows-power-automate-connector-public-preview/

Somehow, when using “refresh dataflow” action, I’m not able to pick the environment where my Dataflow is – it only works with the default environment, it seems.

If this worked, I could just delegate the “scheduling” part to Power Automate, in which case I could do it like this:

  • Update my View above to only include data where “modifiedon” is between start and end dates
  • Those start and end dates might come from another table
  • In the scheduled Power Automate flow, I could set those “start” and “end” dates, then use “dataflow refresh” action to refresh the dataflow
  • For the next scheduled run, I’d use the previous “end” date as the “start” date, and “current time” as the “end” date (see the sketch right after this list)
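Here is a minimal sketch of what that updated view might look like. The ETLTest_Watermark control table and its column names are hypothetical – that’s just where the scheduled Flow would store the “start” and “end” dates before refreshing the dataflow:

CREATE VIEW ETLTest_Incremental
AS
SELECT t.*
FROM ETLTest t
    CROSS JOIN ETLTest_Watermark w
-- only rows modified within the window maintained by the Flow
WHERE t.ModifiedOn > w.StartDate
  AND t.ModifiedOn <= w.EndDate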

 

This would have taken care of the incremental refresh.

So… just waiting for that action to start working properly in my tenant 🙂

 

Canvas App on the Dashboard – how do we handle application url?

Environment variables can be very helpful in a lot of situations. Here is yet another scenario where I found them very useful the other day.

Imagine you have a Canvas Application you wanted to display in a web resource. Maybe even on the dashboard… something like this:

image

The app there is very simple, but it’s not the purpose of this post to talk about Canvas Apps development. It’s, mostly, to talk about the “integration” this time.

Every Canvas App has a web link:

image

When we are working in a single environment, it’s not a problem. However, that url changes when the app moves from one environment to the other, so how do we put it on the dashboard?

Traditionally, this is where we would create a configuration entity in the model-driven app and use it to store this kind of property. We could then use javascript to read the application url from the configuration entity and, once we knew the url, open it.

It’s, basically, the same idea with the environment variables. Except that we don’t need to create those configuration entities anymore.

The web resource below will read the application url value from the environment variable called ita_DashboardAppUrl and then open that url in an iframe:

 



<!DOCTYPE html>
<html>
<head>
    <title>full screen iframe</title>
    <style type="text/css">
        html {
            overflow: auto;
        }

        html,
        body,
        div,
        iframe {
            margin: 0px;
            padding: 0px;
            height: 100%;
            border: none;
        }

        iframe {
            display: block;
            width: 100%;
            border: none;
            overflow-y: auto;
            overflow-x: hidden;
        }
    </style>
</head>
<body>
    <iframe width="98%" height="98%" id="canvasAppIFrame" src="about:blank" frameborder="0" marginwidth="0" marginheight="0" scrolling="auto">
    </iframe>

    <script>
        var appUrl = null;

        function setIFrameSource() {
            // read the current value of the ita_DashboardAppUrl environment variable through the Web API
            fetch("/api/data/v9.1/environmentvariabledefinitions?$select=defaultvalue,displayname&$expand=environmentvariabledefinition_environmentvariablevalue($select=value)&$filter=schemaname eq 'ita_DashboardAppUrl'")
                .then(response => response.json())
                .then(data => {
                    // point the iframe at the url stored in the environment variable value
                    appUrl = data.value[0].environmentvariabledefinition_environmentvariablevalue[0].value;
                    document.getElementById("canvasAppIFrame").src = appUrl;
                });
        }

        setIFrameSource();
    </script>
</body>
</html>
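On a side note, since the query above already selects defaultvalue, the callback could also fall back to the default value when no environment-specific value has been set yet. That would be a variation of my own (not something the web resource above is doing) – roughly like this:

function setIFrameSourceWithFallback() {
    fetch("/api/data/v9.1/environmentvariabledefinitions?$select=defaultvalue&$expand=environmentvariabledefinition_environmentvariablevalue($select=value)&$filter=schemaname eq 'ita_DashboardAppUrl'")
        .then(response => response.json())
        .then(data => {
            var definition = data.value[0];
            var values = definition.environmentvariabledefinition_environmentvariablevalue;
            // use the current value if one exists, otherwise fall back to the default value
            var url = (values && values.length > 0) ? values[0].value : definition.defaultvalue;
            document.getElementById("canvasAppIFrame").src = url;
        });
}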

There is an api call there that’s reading the environment variable value, and that environment variable is set up like this:

image

So, now, I can just take that solution, deploy it in another environment, find the Canvas App url for that environment, and use it to set the current value of the environment variable in the new environment.

There is no need for the configuration entities, there is no need for unmanaged customizations in the production environment… it has all been taken care of with the environment variables (and with that simple web resource above).

Happy Holidays Everyone!

Power Automate – how do we check if a property exists in the object?

Every now and then, I would run into this error:

image

What’s happening there is that I’m going through a list of records, and, occasionally, there is a field that’s not populated. So, in the example above, here is where it fails:

image

That substring function expects a value, but how do we check if the value is there?

I could add a condition:

image

But you can see how it still does not work:

image

I could check if the property is empty:

image

That does not help either:

image

How do we work around this?

It’s, actually, simple.

There is a difference between this syntax (which I used in the conditions above):

items('Apply_to_each')['_ita_location_value@OData.Community.Display.V1.FormattedValue']

and the following syntax:

items('Apply_to_each')?['_ita_location_value@OData.Community.Display.V1.FormattedValue']

And that question mark before the property name makes all the difference. When using such syntax, the expression will return “null” if the property does not exist. And, when there is no question mark, the expression will fail, since it’ll try to access that non-existent property.

There are, probably, different ways to use it in the expressions/conditions, but here is an example:

empty(items('Apply_to_each')?['_ita_location_value@OData.Community.Display.V1.FormattedValue'])

And now it works just fine:

 

With that in place, my condition works just fine when testing for an empty value:
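For completeness, the check and the substring can also be collapsed into a single expression – a sketch based on the same field as above:

if(empty(items('Apply_to_each')?['_ita_location_value@OData.Community.Display.V1.FormattedValue']), '', substring(items('Apply_to_each')?['_ita_location_value@OData.Community.Display.V1.FormattedValue'], 0, 3))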


PS. For the record, BELOW IS AN OLDER VERSION OF THIS POST – IT WORKS, BUT IT’S NOT HOW IT SHOULD BE DONE – SEE ABOVE INSTEAD

It seems the following works pretty well (though I’d think it’s a bit resource-intensive?):

  • We can use string function to convert our object to string
  • Then we can use “contains” on that string to check for the presence of the property in that object
  • And we can add “if” to the expression, so that, if the property is there, we’ll be doing something with it. And, if not, we’ll just use some default value

 

It becomes a relatively long expression in the end  (I could have done it in C# much faster), but it seems to work:

if(contains(string(items('Apply_to_each')), '_ita_location_value@OData.Community.Display.V1.FormattedValue'), substring(items('Apply_to_each')?['_ita_location_value@OData.Community.Display.V1.FormattedValue'], 0, 3), '')

The same works fine with object variables, too.

And, once that is in place, my Flow starts working just fine:

image

CDS file/image fields and Power Automate flows

I got a little stuck the other day trying to upload files from Power Automate to a “file” field in Dataverse – it turned out all the functionality was there, I was simply looking in the wrong place.

CDS (current environment) connector has a couple of special actions to work with the file/image attributes:

image

This might be a little confusing at first, since, when trying to use “Update” action, File/Image attributes don’t even show up:

image

And those are the only attributes that show up for the 2 actions above:

image

So, well, don’t make the same mistake as I did trying to update File attributes through the regular Update action.

That begs the question of why I would need a File attribute if I can use SharePoint, and, to be fair, there is no definitive answer, but there are a couple of considerations:

  • Unlike files in SharePoint, files and images stored in File/Image attributes are protected by the standard Dataverse security model (including field security if needed)
  • Files stored in SharePoint (through the out of the box integration) are not structured at all – that’s just a bunch of files associated with the Dataverse record. With the actual table columns in Dataverse, we (and the users) would know for sure what kind of file is stored in which attribute.

Those are, probably, the main two reasons.

Of course, we are using file storage instead of SharePoint storage, so the costs can be different, and that’s also something to keep in mind.

However, those would be project-specific decisions. And, as far as technology goes, just keep in mind Power Automate CDS (current environment) connector offers two separate actions specifically for File/Image columns.


How to: intercept record assignment and keep it under the same team all the time?

There was a slightly unusual scenario to cover. With user-owned tables in Dataverse (see, I’m practicing different pieces of new terminology in the same sentence 🙂 ), how do we keep all records under the same team?

You might probably question the scenario to start with, and I’d be happy to just say that there is a hungry team here which wants all the records to itself:

image

But it’s not that. In this particular situation, this is just how we wanted to control security. There is a business unit, there is a team, and there are users from this and other business units that may need access to the records. Within the business unit, those records can be shared – that’s not a problem. Users can move between business units, they may need to be granted access to the BU data, and they may need to be removed.

It could probably be handled by giving such users a security role, but we may forget to remove those roles later. So we might give the roles to the team instead and keep adding/removing users to/from the team. It would not be quite clear, though, what to do with the “owner” field if the user moves to another BU and no longer needs access to the data.

Plus there is data migration, so all those migrated records have to be assigned somewhere.

Anyway, it’s not necessarily the only option, but it’s an option, so how can we do it?

Well… let’s create a plugin.

Why?

  • We can register it in the pre-operation so we won’t need any post-processing to reassign records (which would be the case with Power Automate Flows and/or with the classic workflows)

  • We can use the same plugin for any entity (although we will need to register 3 steps per entity)

And what are those steps?

  • Assign message (in PreValidation – can’t replace “Assignee” parameter of the request in the later stages)
  • Create message (PreOperation works fine – this is where we’ll be replacing “ownerid” column of the record)
  • Update message (PreOperation – same note as for the “Create” above)

Since the plugin will be running in Pre stages, it won’t need to consume additional API calls. Instead, it’ll be replacing “Assignee” (or “Owner”) in the incoming requests as they keep coming in.

using System;
using Microsoft.Xrm.Sdk;

// the class name is arbitrary - the important part is the Execute method
public class KeepOwnershipUnderTeamPlugin : IPlugin
{
    public void Execute(IServiceProvider serviceProvider)
    {
        IPluginExecutionContext context = (IPluginExecutionContext)
            serviceProvider.GetService(typeof(IPluginExecutionContext));

        // standard boilerplate - not actually used by the logic below
        IOrganizationServiceFactory serviceFactory = (IOrganizationServiceFactory)serviceProvider.GetService(typeof(IOrganizationServiceFactory));
        IOrganizationService service = serviceFactory.CreateOrganizationService(context.UserId);

        // the team that should own all the records (hardcoded here for simplicity)
        var teamRef = new EntityReference("team", Guid.Parse("33844034-ac38-eb11-a813-0022483ba225"));

        // Create/Update (pre-operation): replace "ownerid" on the target record
        if (context.InputParameters.Contains("Target") && context.InputParameters["Target"] is Entity)
        {
            var target = (Entity)context.InputParameters["Target"];
            if (target.Contains("ownerid"))
            {
                target["ownerid"] = teamRef;
            }
        }
        // Assign (pre-validation): replace the "Assignee" parameter of the request
        else if (context.InputParameters.Contains("Assignee"))
        {
            context.InputParameters["Assignee"] = teamRef;
        }
    }
}

And then you just need to add SDK steps for every entity you want to handle that way. For example, this would be for the “Assign” message:

image

Of course you could make the plugin more intelligent – for instance, you might want to look at specific fields and only assign records to the team when those fields have certain values.
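For example, a hypothetical version of that check might look like this – the ita_recordcategory column and the option value are made up for illustration, and this would replace the simple assignment inside the Create/Update branch above:

            // illustrative only: re-route ownership to the team just for "category 1" records
            if (target.Contains("ita_recordcategory")
                && target.GetAttributeValue<OptionSetValue>("ita_recordcategory")?.Value == 1)
            {
                target["ownerid"] = teamRef;
            }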

But those would be enhancements, and I just wanted to share basic code here.

Building a “dictionary” object in Power Automate Flow

I was toying with the idea of creating a dictionary variable in Power Automate, and, it seems, I got onto something. Unless there is an easier way?

Well, just for the sake of experiment, what if we wanted to count the number of accounts per owner in our Dataverse instance and store those numbers in an object variable like this:

image

There are only a few expressions we need to work with to get this done:

    setProperty(<object>, <property name>, <value>) – this one will set object property value

    variables('Dictionary')['PROPERTY_NAME'] – this one will give us object’s property value, though that value has to be there already

And, if we wanted to test whether the property already exists, we could just use string() function to convert object variable to a json string, and, then, use “contains” condition. Like this:

image

The only other thing to consider is that we can’t use setProperty to update the same variable on which we are setting that property. So, I used two variables. The “Dictionary” itself, and another one, “TempDictionary”. Then I can use setProperty with the “Dictionary” variable and assign expression results to the “TempDictionary”. Then I can use “TempDictionary” variable to update my “Dictionary” variable. It’s kind of awkward, but…
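Just to illustrate, if the “does the counter exist” check were done inline with if() rather than with a separate condition action, the two expressions could look roughly like this (a sketch – the owner attribute name and the variable names are assumptions based on this example):

Value of the “Set variable” action for TempDictionary:

setProperty(
    variables('Dictionary'),
    items('Apply_to_each')?['_ownerid_value@OData.Community.Display.V1.FormattedValue'],
    if(
        contains(string(variables('Dictionary')), items('Apply_to_each')?['_ownerid_value@OData.Community.Display.V1.FormattedValue']),
        add(int(variables('Dictionary')[items('Apply_to_each')?['_ownerid_value@OData.Community.Display.V1.FormattedValue']]), 1),
        1
    )
)

Value of the “Set variable” action for Dictionary:

variables('TempDictionary')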

Anyway, the Flow below will iterate through all Dataverse accounts, extract owner name for the “current” account, check if there is a counter for that owner already, increment it if it’s there or add a counter and set its value to 1:

image

And here is the remaining part:

image

The last two actions are all about looking at the “content” of that Dictionary object.

Have fun!

Promote to Admin and Check Access – two buttons we can use to investigate access issues

It might be yesterday’s news (or, at least, it’s probably at least a month old now) that we have a “Check Access” button in the model-driven apps:

image

This one is available from the command bar, and it allows us to check record access for any user in the system:

image

It can also track record sharing – for example, if the same record were shared with the test user above, that screen would have changed:

image

Which means this button makes it much easier to see which roles are involved, and, also, what kind of sharing has been implemented to give specific user access to that particular record.

Sometimes, though, that might not be enough to see why the user is experiencing security errors – there could be a plugin, a workflow, or even a javascript that requires access to something else behind the scenes. You won’t see that from the screenshot above, but what you do see is enough to set up a user account for a bit more detailed testing.

Except that you might need a user account to do that kind of security testing, and, sometimes, it might not be available.

Don’t you worry, though. The next button has you covered:

image

What does it do, exactly?

It allows you to promote yourself (or, I think, any other user) to a system administrator. Even if you are not a system administrator at that time. So, the way these two buttons might work together is:

  • Using “Check Access”, you can figure out which permissions you need to test
  • You can assign required roles to your own user account, do record sharing as required, then remove “System Administrator” role from your account
  • From there, you can do the testing, figure out what’s wrong, and use “Promote to Admin” to get your “System Administrator” role back

There is a catch, though, so don’t jump into it yet. Before you do anything else, you’ll need to make sure you have “Promote to Admin” privilege:

image

There is a hidden “Support User” role which has that permission, but I find it awkward working with a hidden role – instead, it might be easier to create a new role with that one permission, assign it to yourself, and, from there, you’ll be good to go. As soon as that role is associated to your user account, you can always get your “system admin” privileges back.

Power Platform Dataflows – have I missed the “Incremental refresh” option somehow?

This month seems to be all about data integration/data migration for me, so, even though I’m still using good old SSIS in many cases, I wanted to look at the Dataflows a little closer.

After all, they don’t require any additional deployment/licensing – we just need to go to the maker portal and start building them:

image

But, it seems, there is something missing.  I’ll get to it closer to the end of this post – right now let’s just try something first.

We will need XrmToolBox and Record Counter tool (I used the one with 5 stars below):

image

There is a special “ETL Test Entity” I just created – it has a name field and a “Test Id” text field (which is also an alternate key). Right now there are 0 records in this entity:

image

Then, there is a SQL table which is pre-populated with 10000 records:

drop table ETLTest
go
CREATE TABLE ETLTest (
    Id uniqueidentifier,
    FullName nvarchar(250),
    ModifiedOn DATETIME2(7)
)
go
-- "GO 10000" repeats the batch below 10000 times, so this inserts 10000 test rows
INSERT INTO ETLTest(Id, FullName, ModifiedOn) VALUES(newid(), 'Test', getutcdate())
GO 10000

 

image

 

There is a dataflow which is loading data from the SQL table above to the ETL Test Entity in Dataverse:

image

image

 

And, below, I’ll give it a try:

image

 

10000 records moved in just under 2 minutes – that was pretty fast:

image

 

image

This is a good result so far, since it could be very suitable for all sorts of data integrations. Besides, on a related note, it seems Dataflows can handle API throttling limits intelligently, though I did not take it far enough to test it this time:

https://powerusers.microsoft.com/t5/Power-Apps-Ideas/Remove-Service-Protection-API-throttling-limits-for-Dataflows/idi-p/484654

However, what if we do another refresh?

image

This is where I was hoping that, somewhat miraculously, the data flow would know that none of the records have changed, so none of them would have to be updated in Dataverse.

Instead, I can see all those records coming in again:

image

3 minutes later, they have all been updated:

image

image

There are still only 10000 records, so none were added (pretty sure that’s because my entity had an alternate key):

image

Still, I was hoping none of them would even be updated in Dataverse, since I had not changed any of them in the source, right?

And this is where I am finding this particular scenario (using Standard dataflows to load data to Dataverse) somewhat unclear. It seems there is supposed to be an incremental refresh option, according to the documentation page below:

https://docs.microsoft.com/en-us/power-query/dataflows/incremental-refresh

image

I can see this functionality for analytical data flows (those which would be dumping data to the data lake, for instance):

image

But I don’t see it for the Standard v1/v2 dataflows:

image

Which means using dataflows to load large data tables into Dataverse might be a waste of API calls for the entitlement limits.

Did I miss that feature somehow? Let me know if I did.

Otherwise, it might actually be more practical to use Azure Data Factory (or, possibly, something else such as SSIS) in the scenario above. Which is a little unfortunate since, compared to those other options, setting up a dataflow takes almost no effort at all.

 

PS. And here is a confirmation, thanks to fellow MVP Carina M Claesson:

“Considerations and limitations: Incremental refresh in Power Platform dataflows is only supported in dataflows with an Azure Data Lake Storage account, not in dataflows with Dataverse as the destination.” It’s right there, on the same docs page, just all the way down to the bottom.

The challenges of using external data in the model-driven applications

Integrating external data with model-driven applications in the Microsoft Dataverse can be a challenging task for a number of reasons:

  • Data volumes
  • Storage costs
  • API throttling and entitlement limits
  • Technology limitations
  • Lack of experience with the tools/technologies
  • Vague requirements where a little change can affect the whole approach to the integration

 

In this post, I just wanted to provide a high-level overview of how this could be looked at, and, also, a few references to the Microsoft documentation which might be useful in this context.

Basically, it seems there are two options:

  • We can load data into Dataverse and deal with the ongoing data integration
  • Or we can try integrating external data without actually storing any/most of it in the Dataverse

 

There are a few major challenges with the “full sync” approach:

  • API limits (throttling and entitlement). This is because we can’t get around that API layer on the diagram
  • Storage costs and capacity model
  • Having to create and maintain those ETL jobs

 

For the API limits, have a look here: Requests limits and allocations

For the capacity model, here is another link: New Microsoft Dataverse storage capacity

For the storage costs, here is a link:  Latest Customer Engagement Licensing Change Could Lead to Dynamics 365 Storage Cost Increase

As for the technology limitations, there can be a bunch of problems:

  • MFA support
  • Ability to handle API throttling gracefully
  • Ability to run under application accounts
  • Ability to “rotate” user accounts (to work around those throttling limits)
  • Data protection (will your data be stored somewhere other than in the Microsoft cloud?)
  • And probably a lot more…

 

Lack of experience in the team can hit you badly, really. You can go with one ETL tool since it seems to be suitable, and, then, you’ll realize half-way through the project that something is missing. If only you knew from the beginning…

And this is where, once everything has been lined up and you are moving forward with the integration, a small change in the requirements can bring you back to where you started. Because, what if you were assuming you don’t have to do full sync and, suddenly, you have to do it? You might start hitting those thresholds, you may have to find a better tool, you may have to train  the team, etc.

And, as for the “partial sync” or “no sync”, the biggest challenge there is that you do have to tread very carefully.

Virtual entities might give your integration the most “natural” feeling, since you’ll be able to show them in the views, you’ll be able to use them in the lookups, and you’ll be able to use familiar model-driven forms (I’m still looking to do a few more posts on that topic, but have a look here for what I have so far). There will be limitations, though, since those are “virtual” entities, not the real ones. So, for example, if you think of building various advanced find views, you might just hit a wall there – essentially, you may have to combine data from two different data sources somehow.

You might use canvas apps, but, even though you can use them to surface data in the model-driven app, that’s just as far as you can go. Using external data in the lookups? No way. Using external data in the advanced find? That would not be an option. Also not an option (with the data not present in Dataverse as at least a virtual entity): Excel export, Excel import, Word templates, etc.

And, yet, the problem of data integration keeps coming up over and over – it just seems every case is special, so I’d be curious to hear how you decide on the approach (for different data volumes, etc.).

On my side, and based on some of the discussions I was part of, it seems Dataflows can be the way to go for a number of reasons when bringing huge amounts of data to Dataverse, so that might be the topic of one of the next posts.

Virtual Entities Revisited: using embedded canvas app for updates

As you could see in the first post of this series, virtual entities are still alive, Microsoft itself is relying on them for integrations, and we can also do some interesting things with them.

In the second post, I walked you through the process of setting up a virtual entity and creating a custom provider.

The only remaining part which has not been covered yet is turning virtual entities into a first-class citizen of your model-driven app by adding “update” support. And, for that, embedded canvas apps are almost a perfect fit. Well, as long as there is a connector, of course. Which is exactly the case for the virtual entities sourced from SQL.

So… I’m assuming you are familiar with embedded canvas apps – will just mention a few things, then.

1. Screen size and scale to fit

Unless you want font sizes to keep changing as the user is resizing browser windows, disable scale to fit and set custom screen size.

image

2.  Enhanced SQL connector – disable it

image

Or you can try keeping it enabled (which is the default). However, if you start getting “Bad Request” on update of the SQL data, you might have to remove all SQL connectors from your app, disable that setting, then re-add SQL connectors.

3. Add controls and “Save” button to the app

image

image

Add a formula to Patch SQL data source on select of that “Save” button:

image

And use it to refresh the model-driven form, too (with a call to ModelDrivenFormIntegration.RefreshForm).
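Just to give an idea, the “OnSelect” formula of that “Save” button could look more or less like this – a rough sketch where the SQL data source name ('[dbo].[ExternalData]'), the key column (Id), the varCurrentId variable, and the txtFullName control are all placeholders, not the actual names from my app:

Patch(
    '[dbo].[ExternalData]',
    LookUp('[dbo].[ExternalData]', Id = varCurrentId),
    { FullName: txtFullName.Text }
);
// ask the hosting model-driven form to refresh so it picks up the updated data
ModelDrivenFormIntegration.RefreshForm(true);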

The end result should look more or less like this:

image

Unfortunately, we can’t get rid of that “Read-only” warning at the top, but, well, that’s not too bad.

But, before we wrap it up, I just realized that I missed something in the custom data provider. For the embedded canvas app to work properly, the filtering part in the plugin had to be somewhat more complicated. So, make sure to revisit the part where I was describing the custom data provider since there are a few changes there.

PS. As a bonus, since you’ll be revisiting… Notice that I’ve also moved connection string to the secure configuration of the plugin step, so it’s not hardcoded anymore.

Have fun!