In this on-demand webinar, learn how to easily and wisely share datasets created in Power BI Desktop. We cover Power BI dataflows and certified datasets, the time-saving features that allow analysts to share their valuable work with other report writers.
With shared datasets, a single dataset can be used by multiple reports, across workspaces. In this recording, you learn how Power BI dataflows and certified datasets work, why they exist, and how they are going to improve your business intelligence reporting performance and overall enterprise analytics.
Until recently, analysts working in Power BI Desktop could not easily extend their datasets for use by other report writers. Those datasets could only be used and accessed through an analyst’s workspace and apps. To share the dataset, there were two options: one was to perform a do-over, re-cleansing and transforming the data; the other was a full enterprise-level data cleansing and transformation process. Thankfully, Microsoft provided dataflows, shared datasets and certified datasets as an in-between solution.
Because Power BI administrators can tag datasets as certified, Power BI analysts can take advantage of someone else’s data enrichment work with full confidence that the dataset has been vetted by the organization. Dataflows make the process of data preparation, and the creation of datasets, more manageable.
If you’re using Power BI Desktop, shared and certified datasets enable an effective data culture. Learn how to use them to quickly build reports, make decisions on trusted data and remix to create new insights.
BI Solution Architect
A certified Microsoft business intelligence architect and developer with over 20 years of experience in software development, Andrew has made regular appearances at the Power Platform World Tour events over the years. He also serves as assistant organizer of the NJ/NY branch of the Power BI User Groups.
Q: Can I only create dataflows in Power BI Pro or Power BI Premium and not the free version?
A: Correct. You cannot create dataflows with the free version of Power BI; dataflows require a Pro or Premium license.
Q: Are Power BI dataflows an alternative to implementing the gateway?
A: These are two separate technologies. You can use on-prem data in a dataflow if you have a gateway installed within the network, but dataflows on their own do not provide another means of accessing on-prem data.
Q: Do Power BI dataflows do query folding just like regular Power Query?
A: Yes, if the source allows query folding, then it behaves the same way as Power BI Desktop Power Query. Some source types such as .csv files do not allow query folding.
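Query folding means the filter or transformation is translated into the source's native query language so the source does the work. A rough, hypothetical analogy in Python (using sqlite3 as a stand-in source; this is not how Power BI works internally) is pushing a predicate into the SQL versus loading every row and filtering locally, which is effectively the .csv situation:

```python
import sqlite3

# Build a tiny in-memory table standing in for a SQL source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("East", 100.0), ("West", 250.0), ("East", 75.0)])

# "Folded" filter: the predicate travels to the source as SQL,
# so only matching rows ever leave the database.
folded = conn.execute(
    "SELECT region, amount FROM sales WHERE region = ?", ("East",)
).fetchall()

# "Unfolded" filter: pull every row first (as with a .csv file),
# then filter locally in the client.
all_rows = conn.execute("SELECT region, amount FROM sales").fetchall()
unfolded = [row for row in all_rows if row[0] == "East"]

assert folded == unfolded  # same result; only where the work happens differs
```

Either way the result is the same; folding just moves the work, and the data reduction, to the source.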
Q: Can security be applied to specific datasets or database objects based on a person’s role when accessing cloud datasets? For example, a business user makes a report, grabs a dataset, but is unable to view HR data.
A: Power BI allows for table and row-level security. If you created a dataset that only allowed certain groups to see the HR_Salary table, then if a user wasn’t in a group that had access to that specific table, then the user could still use the dataset but not see that table. Column-level is currently not available in Power BI.
Q: Is security setup on the Azure side or the Power BI service side?
A: Most companies have an on-prem Active Directory that is synchronized with their Azure Active Directory. It is best to use Active Directory to create groups; as employees come and go, the company can best manage this in Active Directory. Specific groups can be created for Power BI. Once this is established, those security groups are used within the Power BI service. In the demo we used the PowerBICertifiers Active Directory group to give access to the certification process in the Power BI admin portal.
Q: Can I do paginated reports with Power BI datasets?
A: Yes. As part of the Power BI Premium subscription you can create and publish paginated reports using the Report Builder tool. This tool is downloaded from the Power BI Service portal and can use a Power BI dataset as a data source.
Q: Can I schedule automatic refresh from SharePoint and from a database for the same report?
A: Yes. Our example had two different sources: SharePoint file and Azure database. This dataflow can use a scheduled refresh.
Q: Can I use a dataflow as a source for Power Query in Excel to populate a table on an Excel sheet?
A: Not yet. Use this link to vote for this feature and also track progress to when it will be added: https://ideas.powerbi.com/forums/265200-power-bi-ideas/suggestions/37479172-connect-to-dataflows-from-excel-s-power-query
Q: How does data in a dataflow get refreshed? Are they scheduled or done manually?
A: You can do both. You can manually refresh from the portal and you can create a repeating schedule from the portal.
Q: If we need an hourly report updated, how can we schedule that? Last time we tried there was a limit of six times per day?
A: Power BI Premium allows for up to every half hour. Power BI Pro lets you schedule refresh up to eight times per day. Check out the refresh schedule: https://docs.microsoft.com/en-us/power-bi/refresh-scheduled-refresh
Q: Can an existing dataset be published as a dataflow?
A: Not at this time. But this is a good idea as you can use existing Power Query models. Vote for this idea: https://ideas.powerbi.com/forums/265200-power-bi-ideas/suggestions/39015112-connectivity-to-power-bi-datasets-in-dataflows
Q: When I have a live connection, does it reduce the DAX calculations that can be applied to the data?
A: In live connection mode, you cannot edit the model that you’re connected to. But within Power BI, using live connection allows you to create calculated measures for the report. For instance, we can connect to an existing dataset using Power BI and add a Month To Date calculation using an existing [Amount] measure within the model. The new measure is only available within your Power BI .pbix file. It does not update the original dataset.
Q: What are the main differences between dataflows and datasets? Which is better to use for self-service BI?
A: They are two different technologies that satisfy different use cases. Dataflows allow you to transform data and save it to the cloud as entities, or tables. You can then use that dataflow in a downstream process that takes it as a data source and adds more data sources and transformations. Dataflows can also be used as a source for Power BI Desktop, where you can create relationships, further transformations and calculated measures. That Power BI Desktop file can then be published to the Power BI service to create a dataset, which includes both the source data from the dataflows and the relationships, transformations and measures created within Power BI. In short, a dataflow produces data to be included in a data model; the published data model becomes a complete dataset ready for analytics.
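The division of labor described above can be sketched in plain Python; this is a loose analogy with made-up data, not Power BI code:

```python
# Stage 1 ("dataflow"): extract and transform raw rows into clean tables.
def build_dataflow():
    raw = [("east", "2024-01", "100"), ("WEST", "2024-01", "250")]
    # Cleansing: normalize region names, convert amounts to numbers.
    return [{"region": r.title(), "month": m, "amount": float(a)}
            for r, m, a in raw]

# Stage 2 ("dataset"): consume the dataflow's tables and add the model
# layer (relationships, calculated measures) ready for reporting.
def build_dataset(sales):
    total = sum(row["amount"] for row in sales)
    return {"sales": sales, "measures": {"Total Amount": total}}

dataset = build_dataset(build_dataflow())
print(dataset["measures"]["Total Amount"])  # 350.0
```

The first stage corresponds to what a dataflow persists; the second to the model you publish as a dataset.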
Q: How do I edit an existing dataset?
A: In Power BI Service, click on the dataset. Go to File menu and choose Download report. This will open it in Power BI Desktop.
Q: Are the cloud servers based in the U.S.?
A: Microsoft operates data centers in geographic regions (geos) around the world. Here’s a link to Microsoft’s Trust Center site, which lists the worldwide locations: https://www.microsoft.com/en-us/TrustCenter/CloudServices/business-application-platform/data-location
Q: With the cloud solution, does the Power BI gateway also get hosted on a cloud server as well?
A: The Power BI gateway creates the connection between Power BI cloud-based data analysis technology and the data source located on-premises. Gateway is an application that can be installed on any server in the local domain.
Q: Currently, when do we need to have on premise gateway and reporting server versus Power BI online?
A: If your company’s security policies rule out a cloud environment, you may want to use the on-prem version.
Q: After I create a Power BI report using a shared dataset, can I change it to a different shared dataset? (DEV vs. PROD)
A: No, Power BI doesn’t allow you to change datasets while maintaining the visuals in the reports. You would have a dev workspace and prod workspace. When the dev testing process is completed, you can publish the dataset and reports to the prod workspace.
Q: How do I add a table to our sales dataset?
A: In Power BI Service, click on the dataset. Go to File menu and choose Download report. This will open it in Power BI Desktop. You can add a table to the model, save it and publish it back up to Power BI Service.
Q: Will this session be available for download to be viewed later?
A: Yes, you can download the presentation and recording from our website at: https://senturus.com/resources/how-to-share-power-bi-datasets-dataflows-and-certified-datasets/
Greetings everyone and welcome to the latest installment of the Senturus Knowledge Series. Today, we’re pleased to present how to share Power BI datasets: dataflows and certified datasets. We’ll be doing a demo and having a discussion of how to extend enterprise sharing from Power BI Desktop.
First, a couple of housekeeping items. You’re viewing the GoToWebinar control panel; you can restore or minimize it by clicking the orange arrow. We have everyone’s microphones muted except for the presenters’, but we encourage you to submit questions via the questions pane, and you can see a graphic of that there.
Please get those questions in and we’ll answer them at the end of the webinar, time permitting. If we’re unable to answer any questions live, we complete the question log afterward and put it up on our website along with the presentation deck and a recording of the presentation, which brings us naturally to the next slide. Everyone asks us: can I get a copy of today’s presentation? The answer is an unequivocal absolutely. It will be available on Senturus.com/resources.
Bookmark that page for all kinds of great tips, tricks and past webinars, or click the link that one of our panelists has helpfully put into the GoToWebinar control panel.
On to today’s agenda. After a couple of brief introductions, we’ll get into the meat of the presentation, describing what exactly dataflows are, their role in the organization and Microsoft’s plans for them, and we’ll talk a little about shared and certified datasets toward the end. After that, for those of you who are unfamiliar, we’ll do a brief Senturus overview and talk about some great additional, almost entirely free resources. Then at the end, stick around, as we’ll be getting into the aforementioned question and answer. I’m very pleased to be joined today by Andy Kinnear. He is a certified Microsoft BI architect and developer with in excess of 20 years of experience in software development and BI application development.
He’s made regular appearances at the Power Platform World Tour events over the years and also serves as an assistant organizer of the New Jersey/New York branch of the Power BI user group, so we’re thrilled to have him here with us today presenting on this topic. My name is Mike Weinhauer and I wear several hats here, one of them being the MC for our webinars and events. Before we get into the presentation, we like to get a finger on the pulse of our audience, so I’ve got a couple of polls to share with you. First of all: what is your role using Power BI in your organization? I’ve displayed the poll for you, so please go ahead and
vote. Are you an ETL developer, a Power BI power user, a consumer of data, or a manager/executive? Go ahead and weigh in there, if you don’t mind. I’ll give a few more seconds for that. We’ve got about two-thirds of you voting. Democracy in action.
About three-quarters to 80% now; you guys are participatory today, I like it. Alright, we’re going to close that out and I’ll share the results with you. We’ve got a lot of power users out there, not too surprising, but a fifth of you are managers/executives and a third are ETL developers. So this has a little something for all of you; hopefully you find it valuable. I’m going to move along to the next poll.
And that is: what version of Power BI do you use? Hopefully you’re aware of this. There’s Power BI Desktop, which is a free local download; the next level is Power BI Pro; and then there’s Power BI Premium, which is the more enterprise level. Go ahead and choose the one that is most applicable to you.
And again, about two-thirds have voted; we’ll let this run for just a little bit longer. We’re up at about three-quarters now. Get those votes in; you’re practicing for the primaries in November. It looks like about two-thirds are using Power BI Pro, good, 25% or so are using Premium, and a full half are using just the basic version. That’s great for those of you in those various tiers: Desktop and Pro and Premium.
You may be wondering about some of the differences there, and we’ll touch on that. We do have other resources where we get into it, and it’s one of the things we help our clients with. Thanks for weighing in. Moving on to the next slide, we’ll get into the heart of the presentation and I’ll hand the floor over to Andy.
We want to understand why Microsoft is adding these features to the toolset and what the void is that needs filling, and we’ll talk about some of the objects that are in play here and the features they’ve added. We’ll start by looking at what a dataset is. In Microsoft Power BI you have the ability to bring data into Power BI, do some transformations on it, and create a data model with relationships and calculated measures; in the end you’re producing a data cube. If you’ve heard the term cube in the past: many production environments have data cubes that were built by the IT department and then put into a production environment. Here Microsoft is using the exact same technology as Analysis Services cubes and has made it part of the Power BI Desktop solution.
So when you create a Power BI file, a .pbix file, the data is married to the report that you’re developing as well. Analysts can download the free version, have it on the desktop and bring the data in; in the background, you’re creating a cube as you do all your transformations and calculated measures and whatnot. Then you create a report, or a number of reports, based on that, and it’s for your own use.
It’s self-service BI. Now, if you take that and publish it up to the Power BI service, which is the Power BI portal that allows you to share your reports and datasets, then once you publish, Power BI separates the visuals out as a report and the cube, or model, out as a dataset. It is still connected to this one, but there’s a slight degree of separation: if I wanted to create a new Power BI report, I could create one and connect into this existing dataset.
All right, so it’s been shareable within the workspace that you publish to, but, at least until very recently, only within that workspace. Then there’s the enterprise model. Enterprise BI, which has been going on for years, gives you an enterprise environment where the cube is totally separate from the front-end tools. Take a look at this picture here.
Okay, this graphic shows what a typical cloud environment looks like for enterprise-level BI. You’re grabbing data from different sources, and you might even be pulling in streaming data. The data goes down into data lake storage, and within the data lake itself, transformations run over the data: you’re adding new fields, calculating new fields. Once that processing is finished, you can move it off into, let’s say, an Azure SQL database or Azure SQL Data Warehouse, and from there down into, say, an Azure Analysis Services cube. Again, it’s the same technology, but now it is published up to the cloud, it’s developed using Visual Studio instead of Power BI, and it gets the stamp of approval from the organization. This is all done by the IT department, by developers who know all these different coding technologies. And here is one of the issues with the enterprise level: I now have the cube, great, I’m really doing well with it, but I need three new fields. You put in a request and it sits there for a month, right? If it gets approved, it gets put in for the next release, but you have to wait, and you’re dependent upon the IT department. So those are three ways of looking at datasets and cubes.
A cube typically sits off in its enterprise environment; a dataset is what you get when you publish from Power BI Desktop up to the Power BI service. But just to make sure everyone understands: it is the same technology under the covers.
So we’re saying the technology there is a columnar, tabular model, like a high-performance Analysis Services structure, right, Andy? Just different permutations of it, local or in the cloud. Yes, a cube is really a data store that’s created to be very efficient. It’s in memory, it’s compressed data, and it allows you to create calculated measures and relationships that complete a model that can be used in an environment. So if you want to compare the current month against the prior month, calculated measures will work during runtime to handle that type of aggregation and comparison, and if you were to slice it on something different, those calculated measures would recalculate in real time and produce the results needed. That’s the basic benefit of having your data in cubes: the speed, and the flexibility to look at the current context of what you’re reporting on.
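That context-dependent recalculation can be sketched in Python; a loose analogy with hypothetical data, not the actual in-memory engine:

```python
# Hypothetical sales rows standing in for the model's fact table.
sales = [
    {"month": "2024-01", "region": "East", "amount": 100.0},
    {"month": "2024-01", "region": "West", "amount": 250.0},
    {"month": "2024-02", "region": "East", "amount": 75.0},
]

# A "calculated measure": defined once, recomputed for whatever slice
# (filter context) the report currently shows.
def total_amount(rows):
    return sum(r["amount"] for r in rows)

# Slice by month...
jan = total_amount(r for r in sales if r["month"] == "2024-01")
# ...or re-slice by region; the same measure recalculates in context.
east = total_amount(r for r in sales if r["region"] == "East")

print(jan, east)  # 350.0 175.0
```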
Self-service BI: the idea behind this with Power BI is that analysts can create their own models and are not dependent upon the IT department to access data. Typically, the business analysts or power users who use Power BI get very good at what they do; they know how to use the tool. You don’t need to have a degree in computer science to build your own models.
The tool is made so that you can bring the data in, do transformations on it and create calculations in DAX, which is a programming language, but a more simplified one. It’s very similar to Excel, which a lot of analysts are used to, so it is something analysts can pick up fairly quickly and learn how to use. It’s simplified in the sense that you don’t need that computer science degree. And then the Power BI service gives analysts a means to share their reports and models.
And that is the first step of moving beyond pure self-service. With self-service you bring data in, build the models you want, do analytics on them and help your company make better business decisions. But now you’ve done great work: you’ve created a model that’s pretty impressive, everyone’s proud of you for doing so, and the first question they ask is, can you share that with this department over here, or with your co-workers in this area? So it’s growing beyond self-service BI; it’s also a means of saying, alright, there’s some good work done here, and we need the rest of the organization to benefit from it. But some resulting issues can come about: analysts create useful datasets for the organization, but they were only being used departmentally or individually.
There are two phases to Power BI dataset development. When you bring data in, you can do transformations at the table level in the Power Query editor: you can split columns, merge and append tables, do a lot of good ETL-type work (extract, transform and load). You’re extracting, or really ingesting, the data and transforming it. Then, when you’re done with the Power Query editor, you close and apply, and it goes into the Power BI model, which allows you to create relationships and calculated measures, and some new columns too with DAX. So there are two phases to the data transformation process that creates a single dataset. One of the things Microsoft decided they could do is create a process that allows saving and sharing of the data after the Power Query phase. Right now that doesn’t exist in Power BI Desktop: you go into the Power Query editor, you click close and apply, and it goes into the model. The tables you were working on haven’t been saved to your local environment; they go into the model, the model gets saved, and no one can access those tables you worked on. So the goal is a process to save and share that data, and also a process to share datasets across the organization.
So now you have the ability to create a dataset from a dataflow, and we’ll get more into that; then you want other departments to be able to work with the data itself. When that happens you’re moving away from the production-level environment, where you say, here’s the data we consider to be the single source of truth, and you come here to get it. We need something not at that production level, but something that says what we’re working on has the confidence of the organization as the correct data to use, and we’ll show you that with the certification process. Now, let’s quickly go into Power BI to show what the Power Query editor is, if you’re not familiar with Power BI. At this point I have data ingested already and a model that’s built, but if I go into Edit Queries, that’s going to open up the Power Query editor. This is all part of Power BI Desktop, and it’s a tool that allows you to do data transformations, as you can see up top. On the Home ribbon you have the ability to remove columns, use the first row as headers (very useful with Excel), change the data type of columns, and merge queries; you’re merging tables, in essence.
Then there’s a Transform tab where you can do a whole host of other transformations: split columns based on a certain value, plus a lot of the math and restructuring of the data. Once you finish with that, you click close and apply (I’m just going to click close here), it goes back into the model, and it adds those changes to the data model. To distinguish it from that first phase, this is what a data model looks like. I’m in the model view here; this is where I create the relationships. I can also go to the table view of the data model, where you can create measures, new columns and all sorts of other processing, but mostly calculated measures.
Say I want to compare this month to date with the same period last year: I can create a calculated measure that does that and look at the two values side by side. That’s the power of the data model itself. So what Microsoft wanted to do was take the Power Query editor out, give it to you as a cloud platform separated from Power BI, and allow you to save the data after the Power Query process so it can be used by other people within the organization.
So dataflows bridge the gap between enterprise and self-service Power BI and allow the analyst or power user to create dataflows, which are ETL processes: extract, transform and load.
You’re extracting from different data sources: databases, files, even files from SharePoint. You bring them into the Power Query editor tool and do your transformations just like you did in Power BI, but when the Power Query process is finished, you click refresh and it loads the data into your cloud storage. Now, where is this cloud storage? It’s part of your Power BI service, whether you have Power BI Pro or Power BI Premium; you can create dataflows in both, and each has its own storage limits. Under the covers, that cloud storage uses Azure Data Lake Storage, but it is part of your own Power BI subscription that you’re using to store the data. Now, say you create a dataflow that produces ten tables. So I’ve got ten tables; I’ve cleansed and conformed the data; it’s ready to go in tabular form, and it’s saved out to the data lake.
Another person could come along and say, that’s great; I need three of those tables, and I have five tables of my own that I’m going to pull in, and I’m going to create my own dataflow out of that. What this does is allow you to link to an existing dataflow and create another dataflow. In this way an organization can help build out an ETL process for the self-service Power BI group, if not the production-level environment.
My analysts are creating dataflows. Again, analysts have done some great work, and now they need the ability to share it across the organization. They can do that in the form of dataflows that persist data in the cloud; other developers and analysts can then come in, create their own dataflows and, if they have the access security for it, reuse the data that’s already out there.
And then these dataflows can be used as a source in Power BI to create a dataset. When you open up Power BI and pull one in, it looks like a database: you’ll see a list of tables, you say I want these three tables and pull them in, and now you create your Power BI report and create the resulting dataset when you publish up to the service.
So, just to recap: it’s the Power Query editor in the cloud, separated from Power BI. Microsoft has taken that element of Power BI, moved it off as a separate application and made it available in the Power BI service. Dataflows extract data from sources and save it in the cloud; the data persists up there, allows for transformations and mashups, and can be used by others as seeds for a new dataflow.
Okay, so that’s the reason behind it. Why don’t we go take a look and do it ourselves?
I’m going to come over to my Power BI environment here. I’ve got this workspace all set to go, and if I go to dataflows, I can see I’ve got a dataflow already created. That one is a hundred percent ready to go; we’ll use it to save a little time in certain areas, but let’s create one from scratch. I’m going to go to the create button here and choose dataflow. Again, we’re in a cloud environment, so this application opens entirely in your Power BI portal. There are a few options here; what we’re going to do is add new entities, essentially creating a new dataflow.
Alright. So first thing we’re going to do is we’re going to draw data from an Azure SQL database.
I’ve put in my credentials before, so they’re remembered; click next. Now you can see I’m connected to a database, just as you would be with Power BI, and we’re going to pull in a few tables. This is the AdventureWorks sample database; we’ll pull in a handful of tables here, bringing in the full product and category tables, and Internet Sales will be our fact table. We just click transform, and right now it’s bringing that data into the environment.
As you’ll see, if you’re used to the Power Query editor, this is the same functionality. It doesn’t have a hundred percent of the same functionality yet, but it is based on M, the same M language that’s in the Power Query editor in Power BI.
As they build out this tool, they’re going to add more and more functionality until it matches the one in Power BI. It does run on the M language, so if you find some features aren’t there yet, you can always copy code from the Power Query editor in Power BI and pull it up in here. We have the same applied steps over here on the right, the queries over here on the left, and you can see each table as we go.
There are a couple of things we’ll do out of the box here. Just wait for DimCustomer to show up. Alright. We’ll create a full name field just by clicking away: I’m highlighting three fields here, going to the transform menu to merge them, making the separator a space, and we’ll call this full name.
Click OK. Now, in the course of a few clicks, we’re doing what a SQL developer might have to write a SQL query to do: derive a column. We now have a full name field in our customer table. That’s the whole point of making an ETL tool available to data analysts and power users, people who aren’t necessarily SQL coders: give them the opportunity to build out an ETL package with clicks and GUI functions.
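The merge step just demonstrated, deriving a full name without SQL, can be sketched in Python; the column names follow the AdventureWorks DimCustomer table, but the rows are made up and this is not the M the editor generates:

```python
# Rows standing in for DimCustomer's name columns (hypothetical sample data).
customers = [
    {"FirstName": "Jon", "MiddleName": "V", "LastName": "Yang"},
    {"FirstName": "Eugene", "MiddleName": "L", "LastName": "Huang"},
]

# Equivalent of the merge-columns step: join the three fields with a
# space separator into a new FullName column.
for row in customers:
    parts = [row["FirstName"], row["MiddleName"], row["LastName"]]
    row["FullName"] = " ".join(p for p in parts if p)

print(customers[0]["FullName"])  # Jon V Yang
```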
Alright, coming back here, we’re going to mash up some more data now. Instead of a database, we’re going to go to a SharePoint folder and grab that URL.
All right, so it’s going to be Microsoft account. Okay next.
So now it’s grabbing this file off the web.
It’s an item budget file. If you’re used to budgeting, a lot of times budgets are done in Excel, and they want to have the months across the top. What we’ve done here is use the AdventureWorks category field as the categories for the budgets. So this is a particular budget in a particular year, 2013 data, with what they’re expecting sales to be for each category.
It’s bringing it in now, and there are a few steps to bringing in an Excel file. In this particular case we want this file, so click OK. If you’re familiar with bringing SharePoint folder files into the Power Query editor, it’s the same process: it brings everything in as a binary, then allows me to expand that binary, and eventually it starts to look like the Excel file. So I click the binary, and you can see over here that with each click the applied steps show a new step. M is a sequential language, so as it brings the data in, it adds a new step, and you can always go back and delete a step. That’s how you’d go about building this out. And what’s underneath the covers? If I go to the advanced editor, this is what the M language looks like if you were to program it directly; the UI is building that for us.
Okay, another step we need here: we need to expand this table out, and it’s going to show us the columns in the next step. Now it should start to look like what we saw back in the Excel file itself. One of the things I notice right away is that the first row here holds my column names, so I can go up and say use first row as headers, a nice nifty feature, especially when bringing in Excel. I’ll put that in, and you can see the step appear over here: promoted headers. And for this particular table, when we start to do analytics, you don’t want a matrix-style format; you want to put this in a tabular-style format.
In other words, you want Accessories to have 12 rows, January through December, with the amounts in one single column, so that we can refer to that column and also slice it on a date dimension and a category dimension. So we need to unpivot these rows, and again, the query editor makes this very simple. But first, let’s get rid of some extraneous columns: remove this column.
And if you've ever done an unpivot in SQL Server using SQL code, it's a little bit of a tricky business. Here, I just choose Unpivot Columns,
and it does it for you. So the GUI is interacting with the code behind the scenes to create the M language, and here we have an unpivoted table. Right away, I'm going to rename this column
to Month, and this one to Budget.
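Under the covers, the unpivot and renames come out as something like this sketch; the step and column names are assumed from the demo:

```m
// Unpivot everything except Category, turning the twelve month
// columns into Attribute/Value pairs, one row per month
UnpivotedColumns = Table.UnpivotOtherColumns(RemovedColumns, {"Category"}, "Attribute", "Value"),
// Rename to the friendlier Month and Budget
RenamedColumns = Table.RenameColumns(UnpivotedColumns, {{"Attribute", "Month"}, {"Value", "Budget"}})
```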
And one other thing we can do here: right now we know this is 2013, and we have the months. But in order to hook it into a date dimension, we're going to have to have a key that shows the first of the month. I want a date key, in a year/month/day format, for each one of these months. So I go to Add Column and add a conditional column,
just to show how that works. If the short month here equals January,
then you go over and put in 2013/01/01, and you keep going through this for each month. That's why I've saved the other workflow, to move things along, but we are going to do this one here. Let's do February,
so at least I can close this out accurately.
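The conditional column the UI builds is just a chain of if/then/else expressions in M; here is a trimmed sketch showing only the two months from the demo (the column and step names are assumed):

```m
// Build a first-of-month date key from the short month name
AddedDateKey = Table.AddColumn(RenamedColumns, "DateKey", each
    if [Month] = "Jan" then #date(2013, 1, 1)
    else if [Month] = "Feb" then #date(2013, 2, 1)
    else null,
    type date)
```

In practice you might replace the twelve-branch conditional with a parsed date, but the conditional column mirrors what the demo clicks through.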
Again, we have these steps here, all sequentially created one on top of the other. This conditional column step would make no sense earlier in the list, because the columns hadn't been unpivoted yet; that's why the processing is sequential. As you can see, for those two months we now have the date key that's going to allow me to connect to the date dimension, and we have Accessories, which is going to allow me to connect to the category dimension.
And with the columns here, we have all the values in a single column called Budget. I'm going to cancel out of here; you could save and close it and then refresh the data, but I'll cancel out, since we do have that dataset fully cleaned and fully developed right here. Okay, so that dataflow is created. Sorry if I inadvertently called it a dataset; I meant to say dataflow. It is a dataflow, and with this dataflow the data has been refreshed
and it is sitting in our cloud environment as tables. If I come over to Power BI Desktop at this point,
I’m going to go to get data.
and I'm signed in with my account, the Get Data dialog pops up and shows me the different sources as normal.
If I go to Power Platform, choose Dataflows, and click Connect, it's going to check all of the workspaces I have access to, find the dataflows that are available to me, and list them. So I have one dataflow, and if I click here you can see the dataflow webinar workspace with the tables I'm interested in. Now, instead of importing them at this point, to save a few minutes we're going to go directly to a file where I've imported them already. Okay, so you can see here, in fact budget,
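For reference, the Power Platform dataflows connector generates a drill-down query along these lines; the workspace, dataflow, and entity names here are placeholders, and the exact record fields can vary by Desktop version:

```m
let
    // Enumerate the workspaces and dataflows visible to the signed-in account
    Source = PowerPlatform.Dataflows(null),
    Workspaces = Source{[Id = "Workspaces"]}[Data],
    // Drill into one workspace, one dataflow, one entity (names are hypothetical)
    Workspace = Workspaces{[workspaceName = "Dataflow Webinar"]}[Data],
    Dataflow = Workspace{[dataflowName = "Budget Dataflow"]}[Data],
    FactBudget = Dataflow{[entity = "FactBudget", version = ""]}[Data]
in
    FactBudget
```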
we have our date key and our short months, and then we have fact internet sales, where we have the sales amount. When I brought the data in, I was able to create those relationships: the date key in fact budget now points up to the date key in the date dimension, and category goes directly to dim product category. Notice that fact internet sales has a product key, so it has a many-to-one relationship to product; product has a many-to-one relationship to product subcategory, which in turn has a many-to-one relationship to product category.
So the purpose of mashing your data together with this tool is so that you can do analytics with it. If I come here, first I want to see my sales amount.
I'll put that into a visual,
along with the English month name. Okay, and you can see here, January through December, the amounts that we made. Now, I've budgeted certain amounts for this data, so I want to see what the budget itself produces; I drag that on here, and now I can see the sales as compared to the budget, a classic example in retail analytics. You can see the budget is a little higher: sales aren't doing too well at the beginning of the year
but started to catch up as the year wore on, to where the sales started to eclipse the actual budget amounts. And because we have mashed up the product category, I can come in here, drag over product category, and make a slicer out of it. Now if I want to see just Bikes, I've got a slicer on both the budget and the actual sales.
Remember, the spreadsheet allowed someone to put in a budget at the category level, and you can see we're looking at amounts close to a million: 1.2 million is the budget, 800,000 the sales. If I look at Clothing now, it slices both of them, so both are much lower, around 27,000 in sales with the budget very close at this point. But I am slicing across both fact tables, because we brought in data from two different sources and mashed them up together.
We made useful foreign keys out of them, and once I have that, the analytics becomes drag-and-drop, filter as needed. I guess Components hasn't sold anything, but in this particular case you can see the value of having data brought together from different sources; now I can do an analysis against both at the same time. So let's take this report,
and we'll save it, and we're going to publish it up to that same workspace.
So now we are in Power BI Desktop, and we used the dataflow to come in and create a model within Power BI Desktop. Now I'm going to publish this up to the Power BI service, and when it goes up to the service...
Okay, success. So now it's up in the service. I'll go over here; I want to see
My reports, okay, so I have the report up here.
Take a quick look at that. Okay.
And if I go back to my workspace, I now have a dataset that's been published up as well. So that's the separation: the report and the dataset are separate technologies. Now if I come into Power BI Desktop again and go to Get Data,
I could be a different user now, but I have access to that workspace. If I go to Power Platform, I know I want to see the Power BI datasets that I have available, and I can come in here
and make use of that same model, right? The model is a step beyond a dataflow: you can't connect to a dataflow to create reports directly, you connect to a dataflow to pull data in. But as I connect to this model, this is a live connection to an existing published model, so if I have those calculated measures that we talked about, say prior-month calculations, I'll be able to use them. Watch what happens here on the left: the table view and the model view disappear.
I don't have the ability to change this model; I am using a published model up in my service. All right, so I'm just going to do a quick, let's say, subcategory down here, and I'll go to fact internet sales.
Now I’m going to save this as something different.
So now I'm reusing a dataset, and that's a good thing. If you constantly pull in the same data and then publish up to Power BI, you're creating dataset after dataset, and because that takes up space, it may cost you more money. With a Pro license you have a limited amount of space; with Premium, even though you have much more space, it is still limited. And if you have a lot of people publishing up, you're taking valuable resources. So in this case, we're utilizing the same
dataset and publishing back up to the same workspace with a different report. Okay, so if I come back in here and look at reports, I've got two of them, but I only have one dataset, so it's a much more efficient use of the existing data that's out there. Now, one of the things Microsoft recently opened up is the ability to share these datasets not just within the workspace but across multiple workspaces.
People have said, this is great, we want the operations department to see it too; how can we make that happen without adding everyone in operations to the workspace? You might have confidential material in the workspace itself, but you want them to see this particular model. So how do we go about doing that? There's an administration portion that allows you to control it. Right now
I'm a Power BI admin, and as I come in here I can go to Settings and then the Admin portal. Within that admin portal are some of the new things we're talking about; I'll go back to the slides just for this. Okay: shared and certified datasets. If I come in here, I now have this ability, and it's enabled by default, which is new. So datasets can now be used across the entire organization,
or you can set it up so that only certain groups have that ability. That gives you some governance over who can share those datasets. But for the most part, and as we have it here, the entire organization can share those datasets. If we come down here,
the other part of this is the certification process. You can allow certain people to stamp a dataset as certified, meaning the organization has tested it and it looks good. Say I trust Andy Kinnear to be one of the certifiers: if he's done his testing process, he's validated the data, and everything looks good, we're going to allow him to tag datasets as certified. I am part of the Power BI certifiers group, which is an Azure Active Directory group, and my name is one of the members of this group.
So I'm allowed to do it, but the rest of the organization isn't; only whoever is in this group is. That's the admin piece that's been added, both for sharing datasets and for certifying datasets. So let's take a look at that. Shared datasets allow a dataset to be used in Power BI Desktop and published to a different workspace. The dataset is not recreated in the new workspace; a link in the new workspace points back to the published dataset. So once again, we're not constantly pushing
new data in to create new datasets when we already have a good clean model out there, and that saves space and compute power in your Power BI service subscription. All right, so on to the shared dataset demo. If I want to take this report, let's save it as something else, close that, and now I'm going to publish this to a different
workspace, the one called Dataflows and Certified Datasets Two. So as this goes up,
and it's very quick, because the data is already there. Now I come into my workspace here.
This is the second one, and if I click on Datasets, there it is, the dataflow model. That's the same one; you can see there's a little chain-link icon here. If I go back to the other one
and look at datasets, this is my dataflow model, and there's no little chain icon here. What the chain signifies
is that it is a linked dataset, linked from another workspace. So you're allowed to publish up into other environments now, which allows you to share across the organization. And if I'm going to do that, I want to know about these datasets: which ones are good, which ones are ready to go? I might create a dataset that's in test mode and publish it up
to my workspace. It's not finished, but I wanted to publish it to see where I'm at and how it reacts with the system. It's not ready to go, and I don't have any way of telling anyone else in the organization whether or not that dataset is useful or accurate, right?
So if I come back to my model here and come down this way
into the dataset, I've got the ellipsis here that I can click to go into Settings.
And since I am a certifier, I do have this Endorsement section available, and it gives me three different options. Everything goes up with the default, which means it's just out there: if you want to use it, you have to trust that you know what's going on with this dataset and that it's ready to use, but there's nothing from the organization saying it's good. So we kind of look at it three ways. Default is potluck, good luck:
we're not saying it's accurate; it's being used, but it might not be finished and might not be validated. Anyone can promote their own datasets, so if you create a dataset, you can promote it.
You click the Promoted option in here, and it gives some semblance of confidence: someone who's not an organizational certifier but has been working with Power BI datasets has a model out there, and if he's clicked Promoted on it, I have enough confidence that I could use it as well. The last piece, Certified, is a level that only designated certifiers can attach. Every organization can have a different process; there's no built-in workflow that says it must go through a particular type of validation or user testing. It's basically saying, for this organization, I'm going to trust you
to certify it. Some organizations will have an elaborate process for that; some will be smaller and say, you know what, if Kinnear knows what he's doing and he certifies it, that's good for me. Either way, it's some level of governance in the Power BI process. I click Apply. And now what I'm going to do is go into Power BI again and go to Get Data. So I'm back in Power BI Desktop now.
I go to Get Data, choose Power Platform, and now when I pick Power BI datasets and click Connect,
you can see that dataset shows up with the endorsement value Certified. And if other datasets were promoted, you might say, okay, this person's dataset is pretty good, he's clicked Promoted on it, I feel comfortable using it. The others here? Who knows, right?
Who knows what you're getting. That's the process they've put into place: sharing the datasets, and then giving some level of governance to say this is something the organization has approved, or this is something that particular dataset developer has endorsed, so you can have a certain confidence working with those datasets. Okay, and that concludes our presentation. I think that's really fascinating stuff; stick around for the questions, guys, we got a lot of great questions.
Andy, if you have a second to look at the questions while we're flipping through these Senturus slides, we'll get to the Q&A at the end. As you can see, the pendulum sort of swings in the BI world. Many of you online come from a Cognos background, where everything was really centralized, and then tools like Power BI and Tableau came along as a response to that. But the challenge with those tools was doing everything in the desktop and having multiple versions of the truth, different copies of datasets, and so on.
So we're really seeing a move by organizations like Microsoft to find the happy medium between the two courses: the wild west, where everything is on people's desktops or replicated infinitely in the cloud, and the tightly controlled, with the challenges around agility that organizations complain about. Dataflows really provide this kind of happy medium, where you have the power and the flexibility of all the things you can do
in Power BI Desktop by way of transformations and modeling, but you get the nice governance, certification, and reuse, and all the good things that come with that, in the form of Power BI service certified datasets and Power Query in the cloud. So hopefully you were able to follow along while Andy gave us some great demonstrations.
You may feel a little daunted by some of this, because it does get complex, and we can help you in many different ways, whether that's our comprehensive Power BI training offerings, including Transforming Data in Power BI, which we're offering at the end of this month, or, if you're trying to move your organization from that desktop wild-west paradigm to more of an enterprise scale,
we can help you get on the path to more efficient, accurate, shareable dashboard-type reporting. Or, if you just need help with a specific problem, we can offer you a bucket of hours that you can allocate however you want. A couple quick slides about Senturus: we are the authority in business intelligence.
We focus exclusively on business intelligence, and we have a depth of knowledge across the entire BI stack. Our clients know us for bridging the gap between disparate data sources and between IT and business users, delivering solutions that give you access to reliable, analysis-ready data across your organization, enabling you to quickly and easily get answers at the point of impact: the decisions you make and the actions you take. On the next slide, you can see that we offer a full spectrum of BI services. Our consultants are leading experts in the field of analytics, with years of pragmatic, real-world experience, including experience advancing the state of the art.
We're so confident in our team and our methodology that we back our projects with an industry-unique 100% money-back guarantee. We've been doing this for nearly two decades, with 1,300 clients ranging from the Fortune 500 down to the mid-market; you'll recognize more than a couple of those logos up there. We solve business problems across various industries and functional areas, including the office of finance, sales and marketing, manufacturing, operations, HR, and IT. We're large enough to meet all of your business analytics needs, ranging from architecture and design to implementation,
from data lakes and data warehouses to BI solutions, front-end reporting, dashboard development, upgrades, migrations, and performance optimization, as well as comprehensive training; and yet we're small enough to provide personalized attention.
A couple of quick events and resources here. If you want to expand your knowledge, please visit Senturus.com/resources and bookmark it: hundreds of free resources, ranging from webinars like this one on all things BI to our famous up-to-the-minute, easily consumable, bite-size blogs. We have a great event coming up on connecting to and prepping data in Power BI.
I will be talking about the pros and cons of different data connection scenarios there, so we're adding to our quiver of Power BI events. We're also adding a Snowflake demo in February, so if you're interested in that, come see us. In March we have a really exciting one: we're adding a class on complex aggregations in Tableau, and we'll be giving you a little sample of that. So please come visit us there. Along those lines,
we do have complete training offerings across Cognos, Power BI, and Tableau. So whether you need corporate training or individual training, we offer everything from customized, tailored group sessions to one-to-one or one-to-many mentoring, to instructor-led online courses, to self-paced eLearning.
We offer several different resources on our site, where you can find product reviews, technical tips, insider viewpoints, and product demos, including our analytics connector, plus additional upcoming events. So please go take a look. All right, we have about five minutes left here, so we'll tuck into the questions at this point.
I need to find the question log here; it disappeared on me.
So Andy, I see there's a ton of great questions out there. Did you have one already picked out, or do you want me to tee a couple up for you? I can see some here that would be informative. Let's see: one participant asks, so to confirm, you can only create Power BI dataflows in Power BI Pro or Power BI Premium? You can't create
those in Power BI Desktop, right, the free version? And yes, that is correct: there is no way of saving the data from Power Query in Power BI Desktop to your desktop; dataflows live in the cloud service. All right, another good question: do dataflows do query folding, just like regular Power Query? The answer is yes, where the data source allows for it. Just a little explanation of how that works:
if you go through the step-by-step process, the applied steps in Power Query, that's the M language that gets created, and when the query is processed, it creates a query that goes back to the engine. So let's say the last thing you do in your transformations is to say, I only want 2013 data, I don't want the other years that are out there, and that might be your last step.
The query is not bringing all the data in and then filtering it. What the M language does is a process called query folding, which takes all of the steps and uses them to create a query that gets sent back to the source, so that the only data coming across is 2013 data. It's an excellent question, and yes, dataflows have the exact same capabilities as Power Query.
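To make the folding example concrete, a trailing filter step like this sketch is translated back into the source query when the source supports folding; the server, database, and column names here are hypothetical:

```m
let
    // Relational source (server and database names are placeholders)
    Source = Sql.Database("myserver", "AdventureWorksDW"),
    Sales = Source{[Schema = "dbo", Item = "FactInternetSales"]}[Data],
    // Folds to roughly: SELECT ... FROM dbo.FactInternetSales
    // WHERE YEAR(OrderDate) = 2013
    Filtered = Table.SelectRows(Sales, each Date.Year([OrderDate]) = 2013)
in
    Filtered
```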
We saw a couple of questions regarding the Power BI gateway; I think there's some confusion around that. So one question is, with a cloud solution, can you host the Power BI gateway? And are dataflows an alternative to implementing the gateway? Take a stab at that one.
So it is a separate technology. What you saw today in the demonstration hooked into cloud services my ID had access to, so I did not need a gateway to go to the Office 365 environment, and I did not need a gateway to go to Azure SQL Server. That was purposeful: I did not want to have to mess with gateways during the demonstration. In any case, if you want to go to an on-prem source, it's the same as with Power BI: dataflows also need a gateway if you need to move data from your
on-premise environment up into the cloud environment. Great, yeah, thanks for defining that; I wasn't sure everybody necessarily knew it. So you still need the gateway, but it can be used to access on-premise data as part of a dataflow.
There's a question about security being applied to specific datasets or database objects based on AD roles when accessing cloud datasets. Can you talk a little bit about the application of security? I know that's something Microsoft does a pretty good job with in terms of AD, and something they continue to enhance.
So the security is applied at the workspace level: if you have access to that workspace, you have access to the datasets, and it is all Azure Active Directory when you're up in the cloud environment. If you have groups that you fit into in the organization, and the organization has Active Directory on-prem, you can synchronize your Azure Active Directory with your organization's Active Directory and use those groups to control security in the Power BI environment.
Got it. And I think that's where people often pick up the phone and ask us for help: hybrid environments, right? You've got on-premise and cloud, different authentication needs, different levels of security when you're combining different data sources; that's where things get a little murkier. There's a question about paginated reports, Andy. Do you want to tackle that one? Can we do paginated reports?
Sure. So paginated reports are a separate technology from a dataflow. Paginated reports are what used to be Reporting Services, the on-prem technology. If you have Power BI Premium, you can create a paginated report using the same tool Reporting Services had, which is Report Builder, and publish it as part of your Power BI Premium environment.
The Report Builder tool itself is free; you can always download it from the portal and create reports with it. But if you want to publish and share them, right now that's only available in Premium.
Great. We're at the top of the hour here, so to be respectful of everybody's time, let's do one more. There are two questions related to the refresh of data using dataflows. One was around scheduling an automatic refresh from SharePoint and from a database for the same report, and then a more general question about how dataflows affect data refresh. So maybe you can respond to that: how does refresh
work with dataflows? Yeah, you can schedule a dataflow refresh; it is
similar to the dataset refresh you saw. If you have Premium, you can schedule it to refresh every half hour. We also showed the linked dataflows earlier: if the dataflow a linked dataflow points at is refreshed, the linked dataflow refreshes as well. So whatever is dependent on it downstream gets refreshed, and then a linked dataset would be subsequently refreshed, if you will.
You can schedule the refresh, and if you have Premium you can do an incremental refresh. That's a nice thing to have if your datasets are very big: just bring in the new and changed data by setting up an incremental refresh. That is Power BI Premium-only functionality.
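For context, incremental refresh is driven by two reserved datetime parameters, RangeStart and RangeEnd, that you define in Power Query and filter on; at refresh time the service substitutes each partition's boundaries. A sketch, with the table and column names assumed:

```m
// RangeStart and RangeEnd are reserved datetime parameters;
// Power BI fills them in per partition during an incremental refresh
FilteredRows = Table.SelectRows(FactInternetSales, each
    [OrderDate] >= RangeStart and [OrderDate] < RangeEnd)
```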
That's great. If you could just do me a favor and advance to the last slide there. So with that, we'll wrap up here. We will respond to the questions we couldn't answer and post those to the webinar page. Keep going to the next one; is there a slide right after that? Yeah, there you go.
And if you have any questions or want to reach out to us, you can reach us at info@Senturus.com, or if you still like to pick up a phone, there's an 888 number there. We encourage you to visit our website or give us a call for any of your analytics or training needs. Thank you for joining us today. We want to thank Andy for an excellent presentation, and all of you for taking time out of your day to spend with us and hopefully learn a little about Power BI and its capabilities. We look forward to seeing you at an upcoming Senturus event. Thanks, and have a great rest of your day.