PL-300 – Section 16: Part 2 Level 4 – Transform – Text and Numbers

  • By
  • May 5, 2023
0 Comment

127. Transform/Add Column – Text – Format

In our practise activity that we just did, I created this year analysis, and it’s looking, okay, fairly good. It gives me information, it gives me, for instance, that East of England and Southwest keep changing places. But the thing that it doesn’t quite do if you have a look at the months is have the months in order. January, February, May, April, March, June, July. It’s got May and March in the wrong order. In fact, it’s the right order, it just depends on what you’re searching.

Now, the reason why it’s not treating them in the right order is because it’s treating in descending order of the average price. And you can see that we did a drill through, so that we’re looking at the year 2008. If I clear that filter, you can now see that the months go December, November, September, October. It’s largely in descending month order, but not quite. September and October are in the wrong place.

Now, you can see how it is sorted. This little arrow here is showing that by default, this is sorted by average price. And I’ve got to this drop-down by just double-clicking on these three dots here. So, if I wanted to sort by month, which could be a bit more ordinary and month ascending, you might be expecting to get January, February, March, but you’re not, you’re getting April, August, December, because this is the order in alphabetical order. So, what we need to do is convert this so that it actually means something to the computer. We need it to be in a date.

Now, at the moment, we have got the year and the month in two separate columns. And that is quite a problem for the computer to actually understand that it is a date. We have year, month. So, month is just text, year is a number, neither of them are dates.

Now, we’re going to have a look. Instead of going to transform date column, which you might think we’re doing immediately, we’re not. We’re going to have a look at the text column.

Now, I should point out that most of this is replicated elsewhere. So, the split column, which is the first thing. We’ve already had a look at the split column. So, you can split by delimiter, by number characters, by position, when it goes from lowercase to uppercase, uppercase to lowercase, when it goes from being a digit to not being a digit, and so on. So, we’ve already had to look at split column. We don’t need to have a look at it again. The rest of it, format, merge column, extract, and pass, is in the transform menu, but is also in the add column menu. So, what’s the difference between transform and add column?

Well, the answer is, do you want to overwrite the existing column that’s there, in which case use transform, or do you want a completely new column to be added? In which case, use add column. Of course, you could always just use add column and then manually delete the existing column if it’s surplus to requirements. So, let’s just have a look at what we’ve got here in the text column. First of all, we’ve got the formatting. So, if I click on the month, we could format everything as lowercase, and you can see it’s using the Table.TransformColumns, and there’s Text.Lower. We could transform it into uppercase, no surprise, watch, we’ll there Text.Upper. You could transform it into capitalise each word, also known as proper case, and that’s what it uses Text.Proper, so we would have every word starting with a capital letter so if I transform region, for instance, then the of in East of England would also be capitalised. You can also trim. So that removes white spaces, in other words blanks or spaces from the beginning, that’s called leading, and at the end, that’s called trailing. So, the awesome language is why you have to do leading trim and trailing trim as two separate functions but here you don’t need to. You only need to do it as one. So, if I were to trim this, there would be nothing happening because there are no blank spaces but if there were, then they would be removed. So, if you’re having a problem with blank spaces, that is very useful. In addition to trimming, you’ve also got something called cleaning, so that is removing characters which don’t print. So, you might have some very weird characters, and that’s Text.Clean.

Additionally, you can also add a suffix or a prefix so a prefix goes at the beginning, suffix goes at an end, so suppose I wanted a prefix of Region: no space. That’s fine, that can be done. And you can see here how the computer’s done it. It has gotten & to connect two texts together. And the underline that you can see here, that is your original text. So, we have this column called region, and it becomes what each row actually had, prefixed with region, so it combined the two. And you can see the similar formula for the suffix.

128. Transform/Add Column – Text – Merge Columns

Now, these next two are related. Merge columns, these concatenate, these join together two columns into one. Of course, it need to be two, it could be more than two. But what I’m going to join together is the year and the month in order to create something that’s looking like a date. So, if I merge these columns together, and I’m using transform, you’ll notice. And it’s asking me what’s the new column going to be called. So even though I’m using transform, it’s still going to give me a new column. So, I’m going to call this, date combined. And it’s saying, okay, how are you going to combine them? Well, I want a space between them, but equally. I could say I want something custom. So, I could have the word, and, for instance, in between. But I’m going to have a space.

So, let’s combine those, and you’ll see that the computer has created a new column, date combined, and it’s removed the two existing columns. So, it’s not like it’s taken an existing column, made that be combined, and then got ride of the others, it’s actually just merged them all into one big column. And a new column, at that.

Now, if you wanted your existing columns to remain, what do you do? You don’t use transform merge columns, you use add column merge columns. And you could see the formula, slightly different, it’s Table.AddColumns, where it was previously Table.TransformColumns, But it leaves existing columns in place. So, we’ve combined these as text. And how do we actually get them into a date format? Well, you might think we’ll be going over here into the date, but we don’t have to. All we have to do, is change the data type. So, we can change the data type by here, next to the column, or we could do it in transform any column, or we could do it in home transform data type. So, I’m just going to change this to date.

Now, notice the format that the computer currently has it in, 1995, Feb. It doesn’t have this as Feb, 1995. Will that make a difference? Let’s change it to date, see what happens. And the answer is no, the computer has successfully worked out what it is, that it’s worked out that is a year followed by a month in the English format. We will be looking at examples of how to do non, your language date, so in my version non English date formats, later. So now we’ve got this date combined we can close and apply.

So, let’s do that, and we get back to our Power BI desktop. And they’re saying something’s wrong with one or more the fields, average price, and what’s happened is that at some point I’ve accidentally deleted the transformation into a number. So, let’s just turn that back now into a whole number. There we go. Close and apply. So, if you see this sort of thing, it means something’s wrong with what you’ve just done, please correct. So now we have changed absolutely nothing, because we have added a new field, date combined. Had we deleted the columns month and year, then a lot of the stuff that we have previously done will no longer work. So rather than doing that, I’m adding extra columns, and then if I want to completely go on to this new date combined I can do that, and then delete these columns afterwards, once they’re no longer used. So, let’s click on this visualisation, and it’s currently going by month, so instead I’ll drill down into the hierarchy to this version of month. So, just as a reminder, we’re currently sorting by month that is sorted in alphabetical order.

Now, if I get rid of this month, and drag in the date hierarchy month, you can now see that it is coming in in the proper order that we would expect, January than February than March and so forth.

Similarly, I could go through all the other examples that I’ve got the, region analysis, for instance. Click on this and change the axis that I currently got for year and month, so delete those, drag in year and then month. Same functionality, except that when you drill down, you actually get January, February, March, instead of having it in different order. So, I could go through the entirety of this dashboard, change all of my existing individual year and month preferences to the year and month in the date combined date hierarchy, and then, I could edit the query again and delete year and month separate because they’re no longer in use.

Now, just to clarify, let’s just do that just to show that it is actually possible to remove these columns. Date combined relies on the year and month, but I’ve just got rid of the year and month. If you were to try to do this in something like Excel, so here’s an example in Excel. If I was to then put have a formula date combined and just concatenate them together, and then delete the existing columns, we’d get an error, because I’ve removed the data that I’m relying on.

However, in Power BI, that’s not the case. It’s going through a series of steps, and it’s at this step here, the insert merge columns that it’s done all the calculations. I no longer need to rely on these two columns to populate this. It’s something that’s done in the past, as opposed to Excel, where it’s something that’s done in the present. So, I can now remove them, and date combined will be unaffected. I’m not going to do that, however, because there are some elements in this visualisation which actually rely, still, on the existing year and month, I think it may be just be the year now. So, I’d have to go through all of them, which is probably a good practise to do, to be honest, go through all of them and change them to date hierarchy, and only then, now nothing is dependent on this individual column, can I remove it.

So, in this video, we had a look at the start of the transform, or add column for the date. We’ve had to look at formatting and merging, we’ve already seen that we already are splitting the column previously. And you saw once we have something that looks like a date, you can just transform it into a date just by changing the data type. In the next video, we’re going to have a look at how to extract data.

Comments
* The most recent comment are at the top

Interesting posts

The Growing Demand for IT Certifications in the Fintech Industry

The fintech industry is experiencing an unprecedented boom, driven by the relentless pace of technological innovation and the increasing integration of financial services with digital platforms. As the lines between finance and technology blur, the need for highly skilled professionals who can navigate both worlds is greater than ever. One of the most effective ways… Read More »

CompTIA Security+ vs. CEH: Entry-Level Cybersecurity Certifications Compared

In today’s digital world, cybersecurity is no longer just a technical concern; it’s a critical business priority. With cyber threats evolving rapidly, organizations of all sizes are seeking skilled professionals to protect their digital assets. For those looking to break into the cybersecurity field, earning a certification is a great way to validate your skills… Read More »

The Evolving Role of ITIL: What’s New in ITIL 4 Managing Professional Transition Exam?

If you’ve been in the IT service management (ITSM) world for a while, you’ve probably heard of ITIL – the framework that’s been guiding IT professionals in delivering high-quality services for decades. The Information Technology Infrastructure Library (ITIL) has evolved significantly over the years, and its latest iteration, ITIL 4, marks a substantial shift in… Read More »

SASE and Zero Trust: How New Security Architectures are Shaping Cisco’s CyberOps Certification

As cybersecurity threats become increasingly sophisticated and pervasive, traditional security models are proving inadequate for today’s complex digital environments. To address these challenges, modern security frameworks such as SASE (Secure Access Service Edge) and Zero Trust are revolutionizing how organizations protect their networks and data. Recognizing the shift towards these advanced security architectures, Cisco has… Read More »

CompTIA’s CASP+ (CAS-004) Gets Tougher: What’s New in Advanced Security Practitioner Certification?

The cybersecurity landscape is constantly evolving, and with it, the certifications that validate the expertise of security professionals must adapt to address new challenges and technologies. CompTIA’s CASP+ (CompTIA Advanced Security Practitioner) certification has long been a hallmark of advanced knowledge in cybersecurity, distinguishing those who are capable of designing, implementing, and managing enterprise-level security… Read More »

Azure DevOps Engineer Expert Certification: What’s Changed in the New AZ-400 Exam Blueprint?

The cloud landscape is evolving at a breakneck pace, and with it, the certifications that validate an IT professional’s skills. One such certification is the Microsoft Certified: DevOps Engineer Expert, which is validated through the AZ-400 exam. This exam has undergone significant changes to reflect the latest trends, tools, and methodologies in the DevOps world.… Read More »

img