Looking to move to the cloud and modernize your analytics environment? Microsoft’s Azure data platform may fit the bill. Highly secure and providing limitless scalability, Azure data-centric architectures facilitate easy collaboration and help speed time to market.
Get an overview of Azure and the key technologies in its data platform in this on-demand webinar. Learn how to use Azure data services to solve common data challenges using best-practice architectures.
Topics we cover include
- Data workloads on Azure: transactional vs. analytical
- Popular Azure services for relational and non-relational data scenarios
- Benefits of Azure for data-centric architectures
- Azure greenfield projects and data migrations
- Storage approaches: data lake, data warehouse, data lakehouse
- Technologies: Azure Storage, Azure SQL Database, Azure Synapse, Azure Data Factory, Azure Databricks, Power BI
Presenter
Vitaly Livshits
Azure Consultant & Microsoft Certified Trainer
Vitaly is a Microsoft Certified Trainer (MCT) with deep experience in all aspects of the Microsoft data platform. He started his career on an IBM mainframe in 1998, authoring printed reports for a bank. Today he is helping consulting clients succeed with Azure. He is also actively training students on Azure data engineering, data analysis, database administration and solution architecture topics.
Read moreMachine transcript
Welcome to today’s webinar. Today’s topic is an overview of the Azure data platform.
0:13
The GoToWebinar control panel is here to help you. Feel free to use this, to make this session interactive. We are usually able to respond to your questions while the Webinar is in progress, and if we don’t reply immediately, we’ll cover it at the Q and A session at the end of the presentation.
0:40
The first question we usually get is can I get a copy of the presentation? And the answer is absolutely yes. It is available on senturus.com. Just click resources and then go to Knowledge Center or you could just click on the link that was magically posted in the GoToWebinar control panel.
1:01
Today’s agenda, We’ll do some brief introductions. Then, we’ll jump right into Azure. We’re going to cover some data workloads, services, architectures, implementations, technologies.
1:12
We’ll wrap up with a quick Senturus overview and discuss some additional resources, and then we’ll get into your questions in the Q and A.
1:22
Joining us today is Vitaly Livshits, he is a Microsoft certified trainer with deep experience in all aspects of the Microsoft data platform.
1:32
Started his career on an IBM mainframe in 1998, authoring printed reports for a bank, and today he’s helping clients succeed with Azure.
1:42
He’s also actively training students on Azure data engineering, data analysis, database administration, and solution architecture topics.
1:51
My name is Todd Schuman, I’ll be your host today. I run the practice lead here at Senturus for installations, upgrades and performance tuning.
2:00
Before we get, start with the presentation, wanted to do two quick polls.
2:05
Poll number one, what Azure data related services do you currently use? Please, select all that apply.
2:12
We’ve got Azure SQL Databases, Synapse Analytics, Databricks, Data Lake Storage. Is there something else that you’re using?
2:21
Give it a couple of seconds here to kind of collect the majority of people attending.
2:29
Obviously, SQL databases is a big one here, I am seeing a lot of other.
2:40
Almost there.
2:46
OK, so, obviously, SQL Database is a big one.
2:51
Other, I’m curious to know, feel free to drop a note in the chat as to what other services you’re using, but, outside of that, almost 19% with Synapse Analytics and Databricks and then a little bit higher on the Datalake Storage. So, good to know. Thank you for that.
3:09
We have one more poll.
3:12
What on premise database platforms are you interested in migrating to Azure?
3:18
Please select all apply, SQL server, Oracle DB two, other non major Microsoft vendors, open sources such as MySQL, Postgre SQL, something else, or you already have everything in Azure.
3:33
Again, we’ll kind of keep it open here to get about 70% of the people in. We’ve got about 53.
3:44
It looks like, again SQL Server, being a big one here, followed up by Oracle’s DB2 is something else.
3:51
And then open source, other, and surprisingly, a lot of you, 9% of you have, already got all your stuff, and because you’re a database, which is great.
4:01
So, OK, I’m going to turn it over to Vitaly now and we’ll wrap up in a little bit and take your questions at the end. Thank you.
4:09
All right. Thank you very much Todd. Pleasure to be here. So a little bit more about what we’ll be talking about today. We’ll be talking about the different types of workloads that we can put an Azure, and there are different ways to categorize them.
4:24
But one way to do it is to think about them as either transactional workloads or analytical workloads. We will talk about the distinction in just a few moments, and see exactly what we can do with both our workloads.
4:41
The next thing we will do, also, is, we’ll talk about the different services that we can use in Azure for different types of data. We have different ways to categorize data. one of the ways to do it, is, to think about it is, either relational data, which is highly structured, typically, in tables with relationships between the primary keys, Foreign Keys, cardinality and so on, and non relational. It spans all kinds of systems from document databases, graph databases, column value databases, And basically everything that’s not relational goes into that non relational pocket, but some specialized systems will talk about how to do that in Azure as well. At the third thing that we’ll be talking about, this, is an Azure.
5:28
There are some concepts, some foundational technologies, some of those architectural foundations, if you will, that are baked into the platform itself, that apply to all of these services. And we’ll talk about what they are, and then we’ll talk about how, you can take advantage of them in your own deployments in your own organizations.
5:49
The next thing we’ll be talking about is in the context, especially in the context of the Azure SQL platform, we’ll be talking about how to get it deployed for both brand new projects, which are greenfield projects, and also for migrating existing deployments.
6:04
And I can see from the survey, from the poll that we just did, that 42% of you are running or planning to run the Azure SQL database. So, I think it will be very relevant for, those groups of individuals, for those companies that are doing that.
6:27
It will also be relevant to just over half of you I see 51% are currently writing SQL server, which is pretty much where we’re at.
6:36
A lot of my career has been, as well with the Microsoft stock, taking those SQL Server deployments, making them better and better, and eventually, of course, taking them to Azure us. We’ll talk about how to take your SQL Server deployments and migrate them, and start driving them on top of the Azure cloud. And the final thing, we will be talking about this, with storage approaches.
6:57
There are three different ways to structure your, especially your analytical workloads, data lake, data warehouse, and finally, the data lake house as well. Data Lake House, being the newest approach. Perhaps some of you have seen, as perhaps some of you have not. So, we’ll talk about what this is and how you can benefit from each of these approaches. We’ll talk about some of the pros and cons as well.
7:23
All right, wonderful. And, yeah, just a reminder, if you have any questions, please feel free to keep those questions coming, and I’ll try to answer them as we go through this presentation.
7:34
And the final thing there is a sample of, the technologists will be covering, every deployment and Azure is using, typically, a whole set of technologies. It’s very rare that we deploy something in Azure, and there’s just one service or two services, or even three services. It’s typically going to be even for smaller deployments it’s often ending up being Combination of, 15, 20, 30 services, which can sound a little bit overwhelming at first, but that’s just the way it is. Just the way things work.
8:06
And, it’s, actually beneficial, because each of those services, they are specialized in specific, you know, addressing specific requirements.
8:17
So, by taking those purpose, design services, you’re actually making your architecture more robust, and you’re making your implementation time and cost as minimal as possible. So, it actually has a lot of value in using those specialized technologies that we’ll be talking about throughout this this hour today.
8:41
Alright, so, just a quick intro to Azure, and I know some of you may already be running Azure. I think 90% of you, I can see from the poll already, have all of your database in Azure, and some of you probably have some of your database already in Azure. So, what’s Azure were very briefly. Azure is Microsoft’s public cloud platform. The competition being, of course, Amazon Web Services, and the Google Cloud Platform, AWS, and GCP.
9:09
And in Azure, has been officially released back in February first, 2010, it was originally called Microsoft Azure then was renamed to Windows Azure. Just highlight the fact that it’s not just Windows. Workloads that you can run on Linux workloads can run any kind of workloads in Azure. And today, Azure is available around the world in more than 60 regions in Azure terms means at least one data center.
9:43
Some locations have a single data center.
9:45
Some locations have almost like a cluster grouping of three data centers.
9:50
So, it’s globally distributed, no matter where you are, no matter where your customers are, and no matter where you are workloads, are databases, front end applications, you can co-located things.
10:01
You can take full advantage of any data residency considerations, to make sure you stay in compliance with those things.
10:09
So data latency, all of those things are enabled by worldwide distribution, and in each of those data centers, there is a selection of services available altogether in Azure, or depending on exactly how we count it.
10:25
There are more than 200 different services, So, services, think about this as types of software.
10:31
So, Azure SQL database as a service, Azure App Service, we’re running, websites as a service. Azure Data Lake for storing files, is a service. So all of these specialized software components, if you will, they are called services. And today, of course, we are focused on the data category. We could be talking about many other categories, like application development, internal things, and so on.
10:56
But yeah, just to keep it manageable, we are focused exclusively on the data space, and the way I look at the Azure, I think Microsoft has it on one of these pages as well. It’s really about fast implementation, it’s about high productivity, it’s also, of course, about cost efficiencies, and it’s about taking your idea from just paper into a fully working system as quickly, as efficiently as possible.
11:26
And today, you can deploy any kind of workload on Azure, and make it work, I think faster than anywhere else.
11:36
Certainly, if I look at the speed, that I can do things. With it, for example, even SQL Server on premises, I can probably do the same thing in Azure, in order of magnitude faster. Not just initially, but also in terms of ongoing maintenance and enhancements.
11:53
So it’s a, great cloud, too.
11:56
Deploy your workloads too.
11:59
So speaking of workloads, there are two ways to think about database workloads, one is transactional.
12:06
It is also known as OLTP Online transaction processing. And that’s your typical operational database.
12:12
So for example, if you’re writing an e-commerce website, and it has a front end, well, that’s going to be your, operational workload. And that front end talks to that transactional database, which traditionally has been used running on SQL server in the Microsoft space.
12:31
We’ll see in a few moments, how to do it in Azure and this database that we are using for transactional workloads.
12:39
It’s there, it’s, optimized, it’s best suited for combined, it reads And data modifications writes basically inserts, updates, and deletes. So it’s optimized for this purpose in terms of its throughput in terms of its backups in terms of its pricing, even. So it’s all about this, combine the reads and writes at the happening an ongoing basis. If you think about e-commerce, for example, your customer is purchasing something.
13:10
So they are looking at your product catalog and then they are placing an order when they look at the product catalog. Of course, the city, it’s when they place an order, it’s going to be some rights happening as well as there.
13:23
Others information is saved as their credit card. Information is saved, as their shopping cart, saved, and so on.
13:32
On the right-hand side, we have the analytical workloads.
13:35
It’s been known as all up for quite some time. Sometimes nowadays is just called analytical workloads. It’s pretty much the same thing for our purposes. And this is a very different kind of database. This is a database that doesn’t have an ongoing transactional type of workload.
13:52
This database is be read only for most of the day, and the data is just being retrieved from the database to run somebody porting to run analytics to run their business intelligence type of workloads.
14:06
The only time this databases typically change this is during the batch loading process, which is typically overnight, maybe throughout the day, but it’s, fairly intermittent nature. It’s not being changed hundreds of times, a second is being changed, maybe once in 24 hours, maybe once, every hour, every two hours, by separate batchload process, and then the users are just writing reports from this database.
14:34
So, often, it ends up being colon, Data warehouse, if it’s a data warehouse, it covers your entire organization.
14:42
The term for that is Enterprise Data Warehouse, also known as EDW.
14:48
So now, we’re just going to expand this and see how to do it in Azure. As with all the other architectures we are going to be looking at here. This was just one of the options in Azure.
14:59
There is a variety of possibilities for every architecture. And you really need to consider, I think very carefully. The pros and cons all the ramifications, if you will, of all of these architectural decisions we’re making. Because they are, of course, going to be there for the duration of the projects. And they’re going to be a little bit tricky to change the project goals. So, it’s worth spending a little bit more time on the design, phase.
15:23
And architecture phase will make your implementation and ongoing maintenance so much easier.
15:30
So, in this case, this is wild approach for typical transactional type of architecture or database. And we’re going to focus on this part here, which is Azure SQL database. So, those of you who are running SQL Server today or running Oracle DB two or any of the commercial vendors, and they see some of you are running open source as well, MySQL and Postgres SQL or really any other database. This is typically in the relational database.
15:57
This is typically where you would go to if it’s a transactional type of workload, you would go to Azure a SQL database service.
16:07
And in this case, what’s happening is the services used together with a whole range of other services, to ultimately, present a front end to the customer.
16:17
So, the customer is going to your e-commerce website, through the internet.
16:22
Of course, they’re going through the application that has the, our front end, with all its HTML, JavaScript, and images, and scripts, and so on.
16:32
And that application then has its own API, some backend API, rest API, and the API is using the database as its backend, so all of those transactions, for the shopping cart, for the purchases, or browsing the product catalog, all of those items are going to be then going to this particular Databases as Transactions.
16:58
So, that’s the, that’s our focus here on this database. Of course, there’s a lot more going on, There is a developer who creates all those web pages. They have a DevOps process, but that’s for, kind of another, session, too.
17:13
Dive deeper into those aspects.
17:17
The other kind of workload is this analytical workload, and analytical workloads, you can see from a data perspective, it’s a little bit more complex, if you will, because there are more services on the walls And because there is typically just by the nature of the business rules.
17:34
There is a fair degree of complexity in transforming data from its original state, which is here on the left hand side of the diagram.
17:45
To ultimately, the visuals that your data analysts, that your business users, that your executives, That your project manager. So whoever is consuming the reports ultimately wants to look at, which is over here in the Power BI space.
18:02
So, let’s go through this diagram and see what those middle boxes are doing, and how, it all works in Azure.
18:11
So, with, as far as the data sources are concerned, in this case, we have two types of data sources. We have our on-premises data source, which is a SQL database. So, there’s your SQL server, there’s your Oracle DB2, MySQL, Postgres SQL. Any of those SQL databases that you have currently within your on-premises data center, they are here. And perhaps you have some third party information, maybe some market research data, maybe some financial information from the stock exchange, for example, maybe some geographical information from your maybe local government, and things like zip codes, postal codes, and things like that.
18:51
So all of this external data and your proprietary yeah.
18:56
On premises database is then being ingested as being loaded into now this Azure clouds, all of this.
19:07
All of the other items are the host at the inside Azure.
19:12
So, the first thing that happens is they being loaded into the service called storage blobs which is part of the Azure storage account. And what the service is doing is the services is a very large scale file repository. That’s one way you can think about it sometimes called the data lake as well.
19:31
So you can ingest practically unlimited volume of files of practically unlimited size into this storage account with storage blobs.
19:41
And then you have those files there. You can use any file format that you would like.
19:46
You can have them as CSVs, XML, JSON, or kf file. There are so many file formats that you can use. Any file formats are suitable for this purpose.
19:58
The next thing that you do is you then transform those files into this service, into this platform called Azure Synapse. Azure Synapse will talk about it more in a few moments. It’s an end to end Analytics platform. So it’s a platform that has a built-in database.
20:16
It has built-in also at ETL an ELT technology called Data Factory, which is used to transform and load data. It has integration with Power BI. It has so many different components there that are essential to modern, analytical workloads.
20:39
The next thing that happens is there’s a. The next step. That’s here is something called a cube.
20:46
So within analysis services, we can create a cube, we can create a special database that’s ideal for analytics. And this special database allows us to do what’s called a slicing and dicing data, exploring it through different dimensions.
21:02
It lets us create what’s known as a star schema or snowflake schema and make it very easy to use for our report builders. In our report builders are here using the Power BI platform on the right-hand side.
21:19
So that’s an end to end analytics pipeline. This is very common to do either this exact analytical pipeline or something very, similar to this. Sometimes, I find some of my customers skip the analysis services.
21:34
Part here, sometimes they may use data lakes instead of storage blobs, but it is fairly similar.
21:42
To, is this fairly similar to this?
21:45
So that’s your analytical. That’s your analytical workload.
21:50
All right.
21:50
The next thing I want to talk about is the two types of file migrations.
21:55
And let me just see if there are any questions so far so good. All right, So the next time, next thing I want to talk about is the two types of workloads in, that you can deploy to Azure.
22:10
The first type of workload is known as a greenfield workload.
22:15
And a greenfield workload means we are starting a brand new project.
22:19
So we either have a project that just arrived or perhaps we are completely rebuilding an existing project. So we don’t need to migrate an existing database.
22:31
That is, in some ways an ideal scenario because, but this scenario, what happens is we have a lot of freedom to choose any platforms we want.
22:41
We are not restricted by legacy data considerations. Now, in real life, this may not be happening all that often. Perhaps, more of your projects will be migration projects, but it is good to know the options for both. Hence, we are going to cover both here.
23:00
In the migration scenario, we have an existing database. So all of those existing SQL Server databases, for example, you can take them and you can deploy them to Azure, and you can benefit from all the things that Azure offers.
23:12
We’ll talk about more about some of those considerations in that in a few moments.
23:18
This projects tend to be a little bit more complex or why they more complex. Well, because those are systems that we have deployed in our enterprises, they’ve typically be around for some time between the round for sometimes many years, sometimes perhaps even many decades. And they have all kinds of integrations with that. Maybe your SQL Server database has 20 or 30 or 50 different integration points with other systems. And it becomes a little bit challenging to decouple those integrations and separate the database from all its applications and all its related data pipelines and everything else, that it’s connected to.
23:55
So, there are some strategies that can be deployed to gradually shifted over, to Azure. In addition to shifting the data to Azure, we may of course, need to shift the code base to Azure. That’s a whole separate conversation. How do we migrate to Azure, But this is, it is possible as well. It is not required. So, you can keep your code on premises, and keep your in sort of move your database driver.
24:23
So, now, you are writing what’s known as a hybrid cloud environment for that system, where your code is in your own data center, and your databases, and the Azure data center Deployment topology that is used by quite a few organizations, as well. And often, this is just a path on them eventually shifting.
24:43
They are called base to Azure as well, but it gives them a gradual migration path to reduce risk and speed up, ultimately, that time to market for them, for their system.
24:53
When we do the migrations, we can use different approaches to perform that migration, so the first approach is known as the lift and shift.
25:02
Lift and shift means you are taking the system as is. So, those are 55% of that D square, I think, SQL server.
25:10
You can take your existing SQL server databases, and you can just keep running SQL server in Azure in a virtual machine.
25:19
Pretty quick, fairly easy.
25:22
Of course, may not give you all the benefits of Azure, but it’s great as a stepping stone towards a more complete migration and more complete migration may.
25:35
Oh, when you do that? When you take it to the next level, which you can do right away, or you can do it in phases, first, lift, and shift, and then you start taking advantage of more and more Azure features. That requires a little bit more effort that requires a little bit of, re architecture, requires a little bit of development.
25:51
And, oftentimes, what I see many of my customers do is, they gradually benefits for more and more Azure features.
25:59
So, firstly lift and Shift, take your SQL server, VM from on prem, put it in Azure, and then, the next thing is, maybe use extra Azure features one by one, until the benefit from more and more Azure capabilities.
26:17
All right, so, let’s talk about now, how do we do it with specific Azure technologies?
26:23
So, the first technology that they want to talk about in more detail is this Azure SQL database service.
26:31
So, Azure SQL Database Service is what’s known as platform as a Service. Platform as a Service. Very, very key. Term here to know when you work with Azure, abbreviate that as pass, as well.
26:43
So, what’s Platform as a Service platform as a Service is what Microsoft is, calling fully managed service, which means the Azure Data Center, the Azure Platform, takes care of many of those things that traditionally, we had to do, manually and interracially spend a lot of effort.
27:02
On Doing, for example, backups, I’m sure many of you who are in the, in the DBA type of role, the database administrator, all know all about backups and the importance of backups.
27:15
Setting up a backup schedule and testing them regularly, and putting the backups off site and, and then, of course, when you, when it comes time to do the restore, is also all the complexities that are involved there. It’s a fair, degree of complexity to get data, disaster recovery with backup set up on any on prem database, even on SQL server. And here on Azure, they are largely taken care of for us.
27:43
So as soon as we create an Azure SQL database, the platform, we’ll automatically do backups and it will keep them for a customizable length of time within the system.
27:55
With long-term backup storage, we can pick keep backups for many, many years.
28:00
Within the Azure cloud, we can put them in a different data center for redundancy. So, lots and lots of flexibility, and most of this is either automated completely, or just point and click. And very, very little effort, very, very fast to set up, so huge benefits.
28:17
The other thing that some baked into this platform is very high availability.
28:23
SLA stands for Service level Agreement, so it’s a financial guarantee that Azure provides to us as customers.
28:31
And with Azure SQL database, we can go up to 99.995%, which translates into, at most, just over two minutes of downtime per month.
28:42
Which, if you, think about it, that’s it.
28:44
That’s incredibly reliable platform, Very, very high availability. And it’s there out of the box, just by provisioning service.
28:55
The other thing that’s traditionally been taking a lot of time, and it will still is taking a lot of time for our on prem deployments, is this whole topic of upgrades and patches.
29:04
So, in terms of upgrades or future upgrades, when we want to move from SQL Server, for example, when we are upgrading our SQL server to, the 2019 deployments to SQL Server 2022, it. So, it’s a fair bit of work. We need to regression test everything you need to deploy it. We need to consider the licensing. And per user, cost per server cost, per core course cost. It’s a bit of a mini project to complete an upgrade because of all those moving pieces. Because of all those things that are involved, the same thing with security patches. As we know, it’s very important to stay up to date on those security patches for our well in everything operating system, of course, and databases. So, when we have those SQL server service, we have the operating system. We need to keep up to Date to Windows or Linux if you’re writing SQL server on Linux.
29:58
And on top of that, we also need to stay up to date on security patches for SQL server engine itself.
30:04
So same thing, it becomes a fairly heavy maintenance burden to do it with a traditional environment. And here, this is automatic.
30:14
The platform just patches it.
30:16
So as soon as you create a SQL database, the Azure Cloud will automatically maintain it up to date on the latest version, and on all the latest security patches.
30:32
Let’s see if there are any other questions. OK, so we have one question that came up. And the question is what does the recommended unless the risky approach, Greenfield, or migration? Yeah, good question. So certainly it’s less risky to do Greenfield because we are typically creating a new system.
30:50
So if we want to make any adjustments in our architecture for Greenfield, it becomes typically relatively low risk because we don’t have users on the system yet. We don’t have the transactions going through, we don’t have the analytical workloads. So, we have some briefing room if you will, to do it with Greenfield. Whereas for Migrations, we typically have kind of like a ticking clock, right? When we toggle over to a new system and we migrated, it becomes much more of a high pressure, high risk type of situation.
31:23
Because we’re moving data eventually for our production systems. So it becomes very important to, have the full testing.
31:34
And then if we don’t plan it correctly, then, of course, the risk goes up. So, yes. I would say in general, migrations of course, can be, can end up being high risk. But there are all kinds of techniques we can we can use there as well to reduce the risk and make it a very controlled, very reliable, a very safe migration process.
31:54
Alright, the next point there is on the security front. So, on the security front, this is another great example of how Azure makes things easier. I just had to actually, one of my clients, literally last week, was securing their Azure SQL Database, and that was helping them with some security audits.
32:11
So, there are so many features that are available within the platform, to make sure it’s secure, and it’s, many of those features are even enabled out of the box.
32:22
And the ones that are not, You can, just purchase them for very small fee through the Azure interface, and then just enable open at this point and click. So, just some examples is very strong authentication.
32:35
So, if you are using your account for connecting to a database, for example, you know, that’s one of the best security measures today is multi factor authentication, or have a second factor being, for example, that SMS message, or a phone call, Laura or Microsoft Authenticator app or a custom hardware token.
32:57
So, all of those things can be used to authenticate to Azure SQL database out of the box, just because it’s using your Azure account. So, right away, you have strong authentication security.
33:08
You have built-in encryption in transit.
33:12
When you connect to Azure SQL database, you can encrypt the connection, in fact, it’s encrypted by default.
33:20
Can we do the same thing with SQL Server or other databases? Yes, we can. But it’s more work. We need to purchase a certificate. We need to deploy it. We need to maintain. We need to renew it.
33:31
And here, the platform is doing it for us completely automatically.
33:36
In the other example on this front will be that will be the defender service that’s available for SQL database.
33:43
And it can protect us from fairly advanced scenarios. For example, from SQL injection attacks that somebody can be launching through. The website, for example, is connected to this database, And this is just a sampling of the features that are more security features that are available in other example, will be databases themselves are encrypted at rest.
34:05
And they can be encrypted with an Azure key. You can bring your own key.
34:09
if your regulatory framework requires that if the interest in which you work requires customer encryption keys, you can easily bring them into Azure as well.
34:20
The other thing that’s available there is, the sum whole idea of instance, scalability and elasticity.
34:26
So often, what happens is, when we run the workload, our workload today, in terms of its complexity and number of users, is going to be very different from, its, from the workload, to be around tomorrow, from the workload, around a year, or two, or three from now.
34:42
And traditionally, what has been a big challenge for many deployments, I mean, certainly, the ones that I’ve been participating in, the over the two plus decades I’ve been working with Microsoft databases is, we always get to a point where the database is just not fast enough. So we have SQL server in the virtual machine. And that’s great today.
35:04
But then a year from now, more users, more systems. And now it’s a slow database. In our users are complaining malware. ETL jobs are timing out. And now we need to upgrade our server to a bigger server. We need to upgrade our database. It is offered a fairly time consuming process. Even if you’re running in a virtualized environment to get more capacity, Whether it’s disk space or ROM or networking or CPUs, of course as well, CPU cores, and with Azure, it is as easy as just dragging a slider on the screen and you drag it to a little bit more to the right. It becomes more powerful. Of course, more expensive as well, You drag it to a bit more to the left, it becomes less powerful, more, cost efficient, and you decide how powerful you need it.
35:58
And the final point I have here is around Artificial Intelligence optimizations.
36:02
So the database’s trying to be as quick as possible automatically.
36:08
So, for example, you can turn on automatic index tuning.
36:12
It will look at the SQL that you are writing and automatically create any missing in.
36:17
This is something that to be traditionally had to do manually with the index tuning advisor, for example, in SQL Server. Now, we can do it just by selecting a checkbox in the interface, and it will do it for us on an ongoing basis and make sure the right indices are in place.
36:36
All right. Now, let’s talk a little bit about migration. So, we have an existing database and we need to take them to Azure. So, how do we do it?
36:43
Well, the simplest way to do it will it be this lift and shift migration?
36:47
So, we’re still running SQL Server.
36:49
You’re still running it on a virtual machine. You’re just trading it in an Azure.
36:52
virtual machine as opposed to, on prem, in something like hyper V, or VMware as the hypervisor. The Azure has its own hypervisor, It’s actually called the Azure hypervisor, is similar to hyper V, or VMware if you’re currently using those platforms today.
37:12
So, pretty simple to, to pull off this type of migration.
37:15
You just take your VM backup, you restore it into Azure, and then you have your SQL Environment trading into Azure.
37:25
SQL Server Environment, trading into Azure
37:29
The great thing about this option is it gives you access to the operating system.
37:33
So whether it’s Windows or Linux, you have access to the C drive.
37:38
You have access to installing the other applications, and keep writing those applications from the database server, if you’d like.
37:46
And that’s the fastest option.
37:48
It is not the option that gives you the most return, because after you do this, you still have some administrative burden. You still have some administrative effort. For example, Windows patches, You still have to deploy them when you have to configure them for deployment.
38:05
SQL Server upgrades, of course, you’re still responsible for the SQL Server upgrades.
38:11
It’s not quite as easy to scale. It’s not, it doesn’t have the automatic backups, for example. So, those benefits, we talked about on the previous slide, they are not going to be realized by infrastructure as a service. If you want to realize those benefits, which I totally recommend you do, please consider Azure SQL managed instance.
38:32
The Azure SQL Managed Instance service is platform as a service.
38:36
So it has fully automatic, patching, backups, easier scalability, And the great thing about this service, as opposed to Azure SQL database, is it’s almost fully compatible with your SQL Server deployments. No matter what version of SQL Server, you are writing, it goes many, many versions back from the latest and greatest SQL Server release. So, an example of what you can do there, is you can do, for example, C, allows store procedures, have dot net store procedures.
39:08
You can do them on top of managed instance, in Azure, you cannot do them on top of Azure SQL database.
39:17
Another example will be, many of you probably have SQL agent, jobs set up for things like backups, for example, or anything else.
39:27
So, those SQL agent jobs, you can run them on top of managed instance. Managed instance, has a SQL engine runtime, whereas Azure SQL database requires some other automation approaches for those administrative jobs.
39:42
So, this is a great choice and this is where I usually recommend you migrate a database to try to get to managed instance. It’s a little bit more effort than putting it to the virtual machine. But, I find the benefits often are pretty easy to justify.
40:02
Alright, so now, let’s do a quick live demo here, just to see how easy it is to ultimately create those SQL database and start using them.
40:12
So, I’m just going to pull up what’s known as the Azure Portal.
40:16
So, Azure Portal is our interface to Azure. Well, one of the interfaces to Azure. We have all kinds of scripting interfaces API, but this is our graphical user interface, and we’re just going to go from scratch. We’re going to create the database.
40:32
So, create the database.
40:34
I’m going to say it’s an Azure SQL database, and we’re going to take that database all the way from a state where we just have a idea that we have to use a database all the way to a writing working database that you can run SQL on in literally a few, minutes. Alright.
40:52
So, database name, we’re just going to call it, Azure data webinar demo server. I already have two servers available. I’m going to pick this server. You can see the locations are there.
41:05
I can basically put it in any location around the world. In this case, I’m putting it in the East US region. There are all kinds of other options here.
41:15
And here’s the capacity. So this is where I can scale the database to be as big or as small, as powerful, or as cost effective as I want it to be. Huge range of options.
41:27
Everything from the most basic database, which is, I’m located in Canada than myself. So, this is in Canadian dollars, This is about $5, for those of you who are United States, So five, $5 per month, right? So this is almost three.
41:43
Very basic database, great for experiments, learning all the way to something like this, for example, which is an incredibly powerful database, that’s now going to cost you about $20,000, about 24,000 Canadian dollars. So, you decide what you need.
41:59
And within a few seconds, Azure will provide it.
42:04
So, in this case, I don’t want to be getting a $20,000 bill. I just want to get a $5 bill.
42:10
So I will create this database all kinds of advanced options here. We’re not going to go through all of them.
42:17
I just want some sample data.
42:19
And the sample data is just Adventure Works database, which is, it’s just like a fictitious online store that has some outdoor gear sales so I’m just going to use that simple data set to be able to confirm that the database is behaving the way we need to behave.
42:39
So I’m creating this database and what’s happening now is going to the skew to the provisioning queue. And typically depending on the service, it sits there from anywhere from a few seconds to a few minutes. And once it’s ready, it’s going to give us a confirmation saying, yep, database is ready to use.
42:55
Please go ahead and start using it.
43:05
Involves doing it. Let’s see if there are any other questions OK. There was one more question that came in. The question is, do we need to migrate our database to SQL Server to complete the migration? A great question there. And the answer is no.
43:16
You can actually take your database from its current platform, from, for example, Oracle or MySQL, or Postgres SQL, and you can pull it directly into Azure into many of those services. You look at, yep, so you don’t have to migrate to the SQL server first. Yeah, very good question there. If you want to look up the specific tool that lets you do that, Let me see if I remember from memory, What the tool is database.
43:42
Migration assistant use the tool or DMA. OK, if you look up for DMA for Oracle or DMA, for Open SQL. You’ll find there are those tools available?
43:51
And they actually help you without vacation, But yes, absolutely.
43:53
You can go direct from, non SQL server database to Azure.
43:58
Very good question, OK, so while we were talking there, the SQL database got created, and we’re just going to confirm that it’s your SQL database that we can run SQL on. So, I’m going to login.
44:11
And you can see a sample data set here with our sample of customers and products.
44:15
For example, we’re going to go to product table and do the things we are all familiar with, which is running SQL. So, we’re going to run SQL.
44:22
There’s the SQL, and there are the results coming back from that SQL database.
44:27
So literally, in less than five minutes, we created the new database in Azure, in the data center of our choosing be, the capacity that’s be wanted, and now the databases operational and ready to use.
44:46
Let’s talk a little bit more about this technology called Data Lakes. And this was one of those things that kind of sounds complex when you first look at it. But actually, once you work with it for a beta, it is actually pretty straightforward concept, i think.
45:00
And the concept really is just cloud based file storage, that’s massive scale, high security, and so on.
45:09
But really, all you’re doing is you’re storing files in this hierarchical namespace with folders and subfolders. So the same way you store files on your C drive, for example, This is now the cloud sized version of the same idea. And once it’s deployed to the cloud, you get very high performance. You can configure it as well in terms of things like i-ops, for example, in terms of throughput of the database of that storage account. You are highly likely to run out of fast storage limits.
45:45
You can store petabytes of information in the same storage account.
45:50
And if you need even more than that, you can just create multiple storage accounts in your Azure or tenant in your Azure environment.
45:59
The other thing that is very popular as well is you can then take your files and you can distribute them to other data centers, you can replicate them to other data centers with just point and click.
46:13
So, for example, let’s say you’re deploying your workload to the East US region, you can click a button, and it will replicate it automatically in the background between East US and West US.
46:25
And it will just handle all the updates automatically.
46:28
So, as soon as you make any changes to the file in East US, the same change will replicate to west US in the background.
46:37
All kinds of security features, from encryption in transit, encryption, addressed, authentication, authorization, custom keys, things like ACLS, access control list on individual folders. So you can set up any security that you want on this database.
46:54
And on top of that, if you have any big data systems that use Hadoop today, it has an API available.
47:00
That’s Hadoop compatible, So you can take your Hadoop clusters today that have the HDFS Hadoop file system on them, and you can just pretty easily lift and shift the storage component of that into Azure. So, this Azure storage accounts and data lakes, and this gives you practically unlimited file storage.
47:23
For non relational databases, there is this platform called Cosmos DB.
47:28
And Cosmos DB has those specialized database types that be, I talked about a little bit earlier today.
47:36
So all of that non relational data can go to Cosmos DB, and, you know, we can be talking about this platform on its own probably for. For a few days. There was actually a Microsoft courses multi-day course, which is focused on Cosmos DB. So it’s a pretty big platform.
47:49
Just to remember, this is for your non relational data workloads.
47:55
Alright, let’s see, another question that came in, can you recommend an inter training for Azure Management, which is the foundation for what we are looking at right now? Yeah, absolutely. In fact, towards the tail end of this presentation soon, that Google will talk a little bit about some of the training opportunities as well. So, yeah. So I think there’ll be a greater chance to for us to give you some pointers there.
48:19
And the final thing I want to cover today is this slide here in the city slides. We’ll break it down after that.
48:29
And this is for Analytics workloads. Now, how can we do analytics in terms of architectures on Azure, in addition to the ones that we already looked at? How can we structure our analytical systems?
48:43
So the first one is data warehouses.
48:46
When I first started with analytics, back in the early two thousands, this is how pretty much all the organizations were doing that. They were creating a Data Warehouse database usually on SQL Server, in the Microsoft chops and the Microsoft environments. So we have this structure data coming in, for example, in CSV files, let’s say, customers and products and orders.
49:09
We go through this ETL pipeline, you get into the specialized database, and then the users run the reporting.
49:16
Everything is good, except the things that are a little bit challenging, which is time to market. It takes a long time to develop those ETL pipelines. And the data in data warehouses, or also is, always going to be a little bit behind, Maybe a day, behind, maybe, maybe longer. So, some challenges there, and also, data warehouses become fairly expensive to store large volumes of data, so.
49:41
And they can be a little bit hard to scale as well. Especially, scaling out of data warehouses is a little bit challenging. So then, the industry moved towards this idea of data lakes and data lakes.
49:53
You have file based storage, and then your users are, for the most part, trying to access data lakes directly. Sometimes they build data warehouses, but you’re more advanced users. Just go to files and data lakes.
50:08
Then the newest approach is this combined term data lake house, data lake and data warehouse together, which is a data lake with a special technology in the middle, which gives you the best of both worlds and gives you the easy access of data warehouses and a fast time to market of data lakes.
50:32
Alright. So how did we do it? So, Data Warehouse. Pretty much the way we looked at. This is very similar to the slide we looked at already. So I think we have gone through it enough.
50:41
The data Lake, how does data Lake look like? Well, you create an Azure Storage account.
50:47
You make it a data Lake, is just a checkbox in the interface.
50:50
When you create the account, then you direct all of your users and all of your automatic workloads towards that data lake. So, fairly simple architecture.
51:01
The challenge with that is, it can be a little bit tricky for the users to consume data directly from the data lake, because, especially for users who are less technical, less of a power user. You go there to a data lake.
51:16
You see maybe 10 files, and you need to try to create the report based on combining data from 10 files. It becomes a challenging task to do the reporting level. So.
51:27
Then, we get the best of both worlds, which is Lake House. So, how do we build the Lake house on Azure? Well, there are a few different ways, This is one option.
51:39
So, we have a storage account with a data lake.
51:43
Then we have two types of users. We have our advanced users, our data scientists, our machine learning engineers, our power users for the ports as well.
51:53
And they can just write some some code and run it on those Spark pools so they can run code and for example, in things like R or Python or if they use Synapse Analytics, even using .net code or even Spark SQL code.
52:08
And they write some code and they access the data and the data lake.
52:14
The rest of the users, and this is the magic here, if you will, this is the part that makes it easy to consume files and data lakes.
52:24
There is a serverless SQL pool engine as part of Azure Synapse analytics and that engine allows us to present files in the data lake.
52:41
So, the data is still all in the data lake.
52:43
And this is just the virtualization layer that makes files look like SQL tables, but the relationships and the data types and column names and everything else we are used to from the SQL world.
52:56
And now, once you have the SQL tables, you can just consume them directly from Power BI, or you can build that allows a services cube and then make it even easier to slice and dice by dimensions using that star snowflake schema. So, that’s one option.
53:11
Another option would be to use something called Azure databricks as well. So, like I said, every architecture, we look at here has all kinds of varieties we can build upon. But this is, this is one way to do it.
53:25
Let’s see, OK, another question came in.
53:28
Is a data lake only for small volume of data files. A data lake is for any volume of files. So I have some deployments where data Lake is used for, you know, just a few gigabytes of files, and have some deployments where it goes into hundreds of terabytes or even petabytes of files. So really, it’s any volume of files.
53:45
And the great thing with data lakes, and most of the services we looked at today, is you can scale them, too, fulfill the volume requirements you have.
53:56
So if you are storing just a few files, you’re only paying to store those few files, so maybe you’re paying a few cents per month. And if you are storing large volume of files, of course, you’re paying for the larger volume falls, but you’re not forced to buy capacity and pay for something you’re not using. So that dynamic, scaling that elasticity, that skill ability, works for performance works or pricing works for everything else.
54:20
So data lakes are there for any volume of files are a very good question.
54:31
I’ll pass it back to Todd. Thank you so much Vitaly, ah, lots of great information here. Hopefully, everyone enjoyed today’s topic.
54:40
We do offer a broad selection of Azure training classes, all taught by today’s presenter Vitaly. So, see our website for more details on that.
54:50
We also have some additional resources.
55:01
We have hundreds of free resources on our website. We’ve been committed to sharing our BI expertise for over a decade now. A couple months ago.
55:18
We have a couple upcoming events. Thursday, September 1, we have a webinar on Introduction to Data Visualizations Using Python.
55:27
If you’re interested in learning more about that, make sure you sign up for that one.
55:31
We don’t have it on the slide deck, but we also have a What’s New in Cognos 11.2.3 tentatively slated for September 22nd.
55:41
Hopefully, we’ll have Rachel, Su from IBM give you a first look at all the new features and functionality available in, the latest version of Cognos, which we expect out next month.
55:51
So again, check our website for the details and sign up for that one if you’re interested as well.
55:57
And a little background on Senturus. We concentrate on BI modernizations and migrations across the entire BI stack.
56:05
We provide a full spectrum of BI services, training and power BI, Cognos, Tableau, Python and Azure and proprietary software to accelerate bimodal BI and migrations.
56:16
We particularly shine in hybrid BI environments.
56:21
We’ve been focus exclusively on business analytics for about 20 years now, and our team is large enough to meet all your business analytic needs, yet small enough to provide personal attention.
56:33
We are currently hiring as well.
56:35
If you’re interested in joining us, we’re looking for the following positions: look at the job descriptions on our site and if you’re interested, e-mail us your resume at jobs@senturus.com.
56:47
And then finally, the Q and A. Let me now, turn to your question. one question that came in and the other question is, for more govern reporting to recommend data models using Lake House tables. Yeah, so, it’s kind of a big question. I guess. I’ll try to give you a few pointers as to govern the reporting. Yes, so, you can certainly, go, Social. Lake House Tables will be great. Azure, Synapse analytics has a great new feature that, lets say, looks wasn’t preview. But it’s probably going to be generally available soon. The release to production soon. Which is called the Lake Databases. So definitely check it out. You can do it using the Serverless SQL pools as well. the way we’ve seen it before. Lake databases have some new benefits. For example, you can set up relationships. They’re using primary keys and foreign keys, like you can in your relational database. What else to consider on govern the reporting?
57:50
Definitely check out our purview. We didn’t talk about it here just because so many, services we can cover in a short period of time. But yeah, Azure Purview or what’s known as Microsoft Purview, is, is a great service for anything to do with data governance. So you can do discovery of your services, you can do all kinds of documentation. I was almost think about it like a like a enterprise grade, the data dictionary, in a way. So you can document your sources. You can make sure that people know where the data is. So when they do the reporting, the reporting is easier and it produces ultimately higher quality results. So yeah, just a few pointers there.
58:30
Great, thank you.
58:34
Think everyone else is just saying thank you for a great presentation, so I will leave the Q&A window open for another couple minutes. If you have anything that comes up, or any questions you wanted to ask before we end the webinar, feel free to put them in.
58:47
We will post the responses on our website, as long, as well as the recording and the slide deck.
58:52
And with that said, thanks again, Vitaly for a great presentation.