How to… Fill Down cells

With Tableau Prep being aimed at people who are newer to preparing their own data sets for analysis, one data source is encountered more than any other — spreadsheets. One of the main reasons why data is found in spreadsheets more than any other is the flexible entry and storage options the format offers. The same flexibility offered is actually a hinderance when it comes to preparing data for analysis.

For example, the way that humans perceive how data is stored in a spreadsheet is very different from how data software will read the values. Take this example of project financial reporting data (inspired by an output from Xero):

Figure 1 Spreadsheet structured data

When you look at this data, you probably recognise there are three different projects in the data set with a series of expenses and invoices listed under each in turn. However, most data software would not read the values in this way. The software might recognise the three project names but as there are no expense or invoice values listed in the same row, the values wouldn’t be attached in the below rows in the same way you would have perceived.

Getting the project details

To get the details you need, you need to create some logic that reads across those rows. With early versions of Tableau Prep, you didn’t really have any options as how to do this. In 2022, the elves at Tableau gave us the ability to set a Source Row Number. This meant we could specify what we wanted to do to a specific row using logic calculations. However, on large data sets with differing numbers of rows per piece of logic you would want applied, this could be very painful!

Tableau Prep Builder 2023.3 offers us the ability to Fill Down cells. This may not sound evolutionary but it does open up the ability to solve data preparation challenges that have been previously impossible. Therefore, we have a few options to explore how we might be able to associate the right project to the right invoice or expense.

Separating the Project Name from the Project Field

To gather the actual project name, it will be easier to hold it in its own data field. You might be able to identify the project name through the values that aren’t:

  • Null

  • Contain ‘Invoice’

  • Or, contain ‘expense’

But what if we have a project with ‘Invoice-o-rama’, that logic will break.

Therefore, the easiest constant way to recognise the project names is the fact the rest of the rows contain data fields with nulls. Let’s use the function ISNULL() to recognise if the fields of Cost and Invoice Amount are null. If they are both found to be null then we’ll return the project name in a separate field called Project Name:

Figure 2 Identifying the project name

This formula gives you a new data structure to work with:

Figure 3 Project Name field added to the original data

Time for Fill Down

You might now be seeing where you’re going to be able to utilise the new Fill Down functionality — on the Project Name field. Tableau Prep Builder has created a straight-forward interface to allow you to not only select that you want to fill down but it’s also allowing you to see the effect the action will make. Fill Down is an option you will find under the ellipsis menu > Create Calculated Field > Fill Down. By hovering over the Project Name column on the Profile Pane, the ellipsis menu will appear allowing you to choose Fill Down for that field.

Figure 4 Selecting the Fill Down function

Like all of the options in the Create Calculated Field menu that works across multiple rows of data, you will need to set an order to the function. This can be done by adding a Source Row Number into your dataset, a future How To… post will illustrate how to do this. Adding the Source Row Number into the data set will produce a new data field that numbers each row 1, 2, 3… etc until the end of the dataset.

When setting up the Fill Down, you will select the Source Row Number as the field the Fill Down will be ordered by. If you don’t have any other ordinal field (numeric id or date field), then Source Row Number will automatically be inserted into the Order by dropdown. You can see in Figure 5 that you can name:

  • Name the calculation (you can’t name the field any name already used in another data field in this data set)

  • See the values in the incoming field (left most column)

  • Are you breaking up the data field in to any particular groupings that would restart the calculation (Group by column)

  • Order the calculation filling the nulls in the column below any other values (you can see the results of this in the Results column)

  • The Compute Using column allows you to select which column will have the fill down applied to (in case you select fill down in a different data fields menu by accident).

Figure 5 Setting up the Fill Down function

You may notice the + icons that are next to the Group by and Order by options incase you want a more detailed grouping or nested order.

This results in the new column, Complete Project Name, containing a full set of values with no null values.

Figure 6 No nulls in Complete Project Name field

Finishing the task

To complete the process, we now need to tidy up the data set to get it ready for analysis. The following steps will be required:

  • Remove the Project Name field

  • Remove Source Row Number

  • Remove the rows of data that just represent the project title but don’t contain any Cost or Invoiced Amount data. This can be done with a filter based on if both of those fields are null like when we used that concept to find the project names in the first place.

  • Remove the number of the invoice or expense in the Project column so they can be aggregated or otherwise analysed.

Summary

Fill Down is going to be a great additional piece of functionality in the data preparers arsenal. It’s going to be a common task to complete when working with spreadsheet data sources so definitely one to build familiarity with.

Previous
Previous

How to… Change your Headers in Prep

Next
Next

How to… make complex calculations easier