Sometimes I think that Microsoft Cloud is not quite a cloud – it’s, actually, more like an ocean (which is, probably, similar to how things are with other “clouds” to be fair).
As an on-premise consultant, I did not use to appreciate the depth of Microsoft cloud at all. As a Power Platform consultant, I started to realize some of the extra capabilities offered by the Power Platform, such as:
- Canvas Applications
- Power Automate Flows
- Different licensing options (can be good and bad)
- Integration with Azure AD
Yet I was suffering quite often since, you know, there is “no way I can access the database”.
And, then, I tried the Dataflows recently. Which took me on a little different exploration path and made me realize that, as much as I’m enjoying swimming in the familiar lake, it seems there is so much more water out there. There is probably more than I can hope to cover, but I certainly would not mind going on a cruise and see some of it. So, this post is just that – a little cruise into the cloud ETL/ELT capabilities:
And, by the way, normally, you don’t really do deep diving on a cruise. You are out there to relax and see places. Here is the map – there will be a few stops, and, of course, you are welcome to join (it’s free!):
Stop #1: On-Premise ETL tools for Dynamics/Power Platform
If you have not worked with Dynamics on-premise, and I am assuming it’s about time for the pure-breed cloud consultants to start showing up, on-premise ETL tools might be a little unfamiliar. However, those are, actually, well-chartered waters. On-premise ETL tools have been around for a long time, and, right off the top of my head, I can mention at least a few which I touched in the past:
- Scribe(now Tibco – thank you Shidin Haridas for mentioning they were acquired)
They all used to work with Dynamics CRM/Dynamics 365 just fine. Some of them turned into SAAS tools (Scribe online, for example), and some of them took a different route by merging into the new cloud tools (SSIS). Either way, in order to use those tools we had to deploy them on premise, we had to maintain them, we had to provide required infrastructure, etc. Although, on the positive side, the licensing was never about “pay per use” – those tools were, usually, licensed per the number of connections and/or agents.
We are still just near the shore, though.
Stop #2: PowerPlatform ETL capabilities
This is where we are going a little beyond the familiar waters – we can still use those on-premise ETL tools, but things are changing. Continuing the analogy, the cruise ship is now somewhere at sea.
Even if you’ve been working with the Power Platform for a while now, you might not be aware of the ETL capabilities embedded into the Power Platform. As of now, there are, actually, at least 3 options which are right there:
And, of course, we can often still use on-premise tools. After all, we are not that far from the shore. Though we are far enough for a bunch of things to have changed. For example, this is where an additional Power Platform licensing component kicks in since Power Apps licenses come with a certain number of allowed API calls.
Still, why would I call out those 3 options above? Technically, they are offering everything you need to create a ETL pipeline:
- A schedule/a trigger/manual start
- A selection of data sources
- A selection of data destinations
Well, data lake export is special in that sense, since it’s hardwired for the CDS to Azure Data Lake export, but, when in the cloud, that’s an important route, it seems.
How do they compare to each other, though? And, also, how do they compare to the on-premise ETL tools (let’s consider SSIS for example):
The interesting part about Data Lake Export is that it does not seem to have any obvious advantages over any of the other tools EXCEPT that setting up CDS to Data Lake export looks extremely simple when done through “data lake export”.
Stop #3: Azure Data Factory
Getting back to the analogy of Azure being the ocean, it should not surprise you that, once in the ocean, we can probably still find the water somewhat familiar, and, depending on where we are, we might see familiar species. Still, the waters are certainly getting deeper, and there can be some interesting ocean-only life forms.
Hey, there is one just off the port side… Have you seen Azure Data Factory? That’s a real beast:
This one is strong enough to survive in the open waters – it does not care about Power Platform that much. It probably thinks Power Platform is not worth all the attention we are paying it, since here is what Azure Data Factory can offer:
- It has data flows to start with
- It can copy data
- It has connectors
- It has functions
- It has loops
- It is scalable
- Pipeline designer looks somewhat similar to SSIS
- It can actually run SSIS packages
- It allows deployment of self-hosted(on-premise) integration runtime to work with on-premise data
- It offers pipeline triggers
- If has the ability to create reusable data flows
- It has native support for CI CD (so, there is dev-test-prod)
And I think it has much more, but, well, it’s a little hard to see everything there is to it while on a cruise. Still, this screenshot might give you an idea of what it looks like:
In terms of data transformations, it seems there is a lot more one can do with the Data Factory than we can possibly do with the Dataflows/Data Lake Export/Power Automate Flows.
Although, of course, Data Factory does not really care about the Power Platform (I was trying to show it Power Platform solutions, and it just ignored them altogether. Poor thing is not aware of the solutions)
Finally, going back and relaxing in the sun…
It’s nice to be on a cruise, but it’s also great to be going home. And, as we are returning to the familiar Power Platform waters, let’s try putting all the above in perspective. The way I see it now, and I might be more than a little wrong, since, really, I did not have an opportunity to do a deep dive on this cruise, here is how it looks like:
- SSIS will be becoming less and less relevant
- Azure Data Factory will take over (probably has already done it)
- Power Platform’s approach is almost funny in that sense. And, yet, it’s extremely useful. Following the familiar low code/no code philosophy, Power Platform has introduced its own tools. Which often look like simplified (and smaller) versions of their Azure counterparts, but which are meant to solve common Power Platform problems, and which are sometimes optimized for the Power Platform scenarios (environments, solutions, CDS data source, etc). The funny part there is that we, Power Platform consultants, are treated a little bit like kids who can’t be trusted with the real things. But, well, that approach does have some advantages:)