The world of data analytics is continually evolving. The latest chapter in this transformation story is the advent of Microsoft Copilot, an AI-powered system designed to streamline business insights. The anticipation surrounding this breakthrough tool is enormous, with potential benefits ranging from time-saving data organization to advanced KPI calculations. However, it is essential to temper these high expectations with a balanced perspective, considering the potential challenges of integrating Copilot into your data management routines in addition to the significant rewards.
Microsoft Copilot unveiled
Copilot is a groundbreaking AI tool that automates data analytics processes, making it easier for businesses to gain actionable insights from their data. By providing comprehensive reports, detecting patterns and calculating KPIs, it promises to revolutionize the way businesses operate.
With the integration of Copilot into Power BI, Microsoft has enhanced the capability of Power BI at every stage. Extensive language models help users to achieve more and extract greater value from their data.
For example, using Copilot, you can specify the visualizations and insights you desire, and Copilot will handle the rest. Novice users can create and modify reports, produce and amend DAX computations, generate narrative summaries, and query their data, all in a conversational language. By effortlessly adjusting the tone, range and style of the narratives and incorporating them smoothly within the reports, Power BI can convey data insights with greater effectiveness through easily comprehensible text summaries.
High hopes and great expectations with Copilot
The buzz around Copilot is hard to ignore, with stakeholders envisaging it as a solution to their data woes. The notion that it will handle virtually everything, from sifting through vast datasets to producing digestible reports is incredibly appealing. The time that could be saved on organizing and prepping data alone makes it an enticing proposition.
Beware! The important caveats of working with Copilot
However, as is often the case with any advanced technology, it is crucial to be aware of potential pitfalls. While Copilot can deliver sophisticated outputs, the inner workings of these AI calculations are opaque to users. This creates what is commonly referred to as the black box problem, a term used to describe instances where the internal workings are not visible or understandable to the user. This lack of transparency into the Copilot’s methodologies used to derive figures can lead to potential misunderstandings or inaccurate interpretations. Let’s look at some examples:
Key or ID columns
Consider numeric ID columns in your dataset. An AI like Copilot might mistake these for measurable metrics and erroneously incorporate them into its calculations, resulting in skewed averages or totals. Users need to be vigilant about their data types and the way AI handles them.
Date and time data
Suppose you have a dataset where the date and time are recorded as a single string, such as 2023-06-08 14:30. If you ask Copilot to calculate the average or sum of this column, it might misinterpret the request because date-time strings are not numerical values that can be averaged or summed.
In this case, Copilot might attempt to convert the date-time strings into numerical values for the calculation, which could lead to nonsensical results. For example, it might treat 2023-06-08 14:30 as a large number like 202306081430, which would not provide a meaningful average or sum.
Missing or NULL values
Then there is the issue of missing or null values in your dataset. Suppose you have a dataset with some missing values in a numerical column. If you ask Copilot to calculate the average or sum of this column without specifying how to handle missing values, it might interpret these nulls as zeros, which could skew the results of the calculation.
For instance, if you ask Copilot to average a column of sales data that has some missing values, it might include the missing values as zeros in its calculation. This could result in a lower average than if the missing values were excluded from the calculation.
Predication or classification tasks
Let’s say you have a dataset where one of the columns is a categorical variable, such as Product Category, with values like Electronics, Apparel, Groceries, etc. If you ask Copilot to predict the Product Category based on other features, it might struggle if it hasn’t been properly trained to handle categorical predictions.
In this case, Copilot might attempt to treat the categories as numerical data or might misinterpret the ordinality of the categories (i.e., treating them as if they have a specific order like low, medium, high). This could lead to inaccurate predictions or classifications.
Outlier handling
A dataset with a few extreme values or outliers could be problematic as well. If you ask Copilot to calculate statistical measures like the mean or standard deviation without specifying how to handle outliers, it might include these extreme values in its calculations, which could significantly skew the results.
For instance, if you have Copilot calculate averages sales from a column of sales data with a few extremely high values, it might include these outliers in its calculation. This would result in a higher average than if the outliers were excluded or treated separately.
All of these examples underscore the importance of understanding your data and how AI systems like Copilot might interpret it. It’s crucial to ensure that date and time data are correctly handled and that any requests made to Copilot are appropriate for the data type.
You still need to know the fundamentals of DAX
Copilot’s integration into Power BI Desktop is not limited to generating complete reports from your data. It will also be embedded at various stages within the platform to assist report creators and data modelers in extracting, shaping and modeling the underlying dataset.
Microsoft is emphasizing the introduction of the DAX Query view. This feature enables developers to verify measure definitions, update measures in bulk and generate DAX for measures, calculated columns and calculated tables.
While this might seem like the ultimate solution to all challenges, it’s important to note that it doesn’t eliminate the need for a fundamental understanding of DAX. While it excels at simple calculations, it may occasionally produce incorrect syntax or return DAX for calculated tables. This feature can certainly accelerate the creation of measures, but it shouldn’t be seen as a substitute for a proficient DAX developer.
Power BI’s ability to handle complex business queries is one of its strengths. Therefore, while Copilot can enhance productivity and streamline processes, the expertise of a skilled DAX developer remains invaluable.
Data prep best practices with Copilot
Despite these challenges, businesses can leverage Copilot effectively by focusing on the organization and structure of their data. Copilot might not need data that’s as pristine as a human analyst would need, but it can’t make accurate assumptions without clear, well-structured data.
Here are some quick tips:
- Clearly define and differentiate between IDs and measurable metrics in your data.
- Regularly review and verify the outputs generated by Copilot.
- Conduct periodic data audits to ensure the consistency and integrity of your data.
- Be ready to step in and perform manual checks and analysis when required.
Human discernment, rigorous data prep
Copilot has the potential to significantly boost efficiency in developing reports and deriving business insights. Nevertheless, its power should not override the importance of rigorous data preparation and scrutiny. As you embark on this novel period of AI-driven analytics, keep in mind that even though Copilot might serve as your guide, you must remain the one in control, navigating your data exploration with discernment and understanding.
At Senturus, we are actively exploring Fabric and Copilot, both the potentials and the pitfalls. We will continue to post updates as we learn more. If you have any questions about Fabric, the Microsoft Azure/Power BI stack or data architecture and prep, get in contact with us.
About the author
This article was written by our own Steve Nahrup, senior Microsoft solutions architect. Fluent in the Microsoft stack, Steve has more than a decade of experience overseeing the architecture and design of data warehouse, BI, reporting and dashboard solutions. Curious about new technologies, he is constantly downloading free trials of new platforms and arranging meetings with their product teams to discuss an ongoing relationship where he is granted a license free-of-charge in return for ongoing feedback and thoughts on the current state and future releases.
|