32,39 €
Tableau is unlike most other BI platforms that have a single data modeling tool and enterprise data model (for example, LookML from Google’s Looker). That doesn’t mean Tableau doesn’t have enterprise data governance; it is both robust and highly flexible. This book will help you effectively use Tableau governance models to build a data-driven organization.
Data Modeling with Tableau is an extensive guide, complete with step-by-step explanations of essential concepts, practical examples, and hands-on exercises. As you progress through the chapters, you’ll learn the role that Tableau Prep Builder and Tableau Desktop each play in data modeling. You’ll also explore the components of Tableau Server and Tableau Cloud that make data modeling more robust, secure, and performant. Moreover, by extending data models for Ask and Explain Data, you’ll gain the knowledge required to extend analytics to more people in their organizations, leading to better data-driven decisions. Finally, this book will guide you through the entire Tableau stack and the techniques required to build the right level of governance into Tableau data models for the correct use cases.
By the end of this Tableau book, you’ll have a firm understanding of how to leverage data modeling in Tableau to benefit your organization.
Das E-Book können Sie in Legimi-Apps oder einer beliebigen App lesen, die das folgende Format unterstützen:
Seitenzahl: 357
A practical guide to building data models using Tableau Prep and Tableau Desktop
Kirk Munroe
BIRMINGHAM—MUMBAI
Copyright © 2022 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Publishing Product Manager: Arindam Majumder
Senior Editor: Tazeen Shaikh
Technical Editor: Kavyashree K S
Copy Editor: Safis Editing
Project Coordinator: Farheen Fathima
Proofreader: Safis Editing
Indexer: Pratik Shirodkar
Production Designer: Shankar Kalbhor
Marketing Coordinators: Nivedita Singh
First published: November 2022
Production reference: 1301122
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-80324-802-8
www.packt.com
For my wife and business partner, Candice; thank you for your patience and for keeping things running at Paint with Data while I was working on this book.
Kirk Munroe is a Tableau Desktop Certified Professional, Tableau Certified Data Analyst, Tableau Certified Partner Architect, and Tableau Certified Partner Consultant, with over 20 years of work experience in business analytics.
He is the co-founder of Paint with Data, a Tableau partner and visual analytics coaching consulting firm. Kirk works with clients to improve their analytics skills, from data modeling to storytelling and presenting. Kirk has worked at analytics software companies, including Salesforce/Tableau, IBM/Cognos, and Kinaxis, in senior roles in product management, marketing, sales enablement, and customer success.
I dedicate this book to everyone in the Tableau #DataFam. I’ve never experienced such support and encouragement from a technical community before. I hope this book adds to the resources and conversations and leads to better dashboards everywhere!
Michael Warling is a tenured analytics professional specializing in helping people realize the full value of their analytics investments. With a background in fine arts and statistics, he has been able to develop extensive data visualization skills and has supported a variety of corporate business functions with their analysis needs.
Dwight Taylor, with more than 22 years of experience, is the CEO, chief solution architect, and Tableau evangelist of VIZYUL LLC, a certified minority-owned business enterprise and consulting firm that helps clients tell stories with data. Many of VIZYUL LLC’s consulting engagements involve building enterprise-scale data sources for use across multiple Tableau dashboards.
Understanding how to build a professional Tableau data model is critical to the success of enterprise-scale Tableau projects. If you ever wanted to peek behind the curtain to better understand how professional data architects create complex Tableau data models, this book is for you! Thanks for the opportunity to review this book, Packt Publishing!
Ethan Lang is an award-winning data visualization designer and engineer. He is the co-leader of the Veterans Advocacy Tableau User Group, a Tableau User Group ambassador, and a technical reviewer on multiple best-selling Tableau books. His data visualization strategies have been shared in dozens of cities across North America, including Orlando, Portland, Boston, and Kansas City.
Tableau is powered by a proprietary language, called VizQL. VizQL combines querying, analysis, and visualization into a single framework. Other BI tools offer querying (SQL, MDX), arranging the data through analysis, and then charting the data in three distinct steps. VizQL is what makes Tableau so quick and easy for analysts.
For VizQL to work properly, the underlying data needs to be in a very neat, tabular format with the following characteristics:
Every column needs to contain values representing a distinct field, ideally free from null valuesThe data needs to be at the lowest level of aggregation that the analyst needs to answer questions with no missing rowsFrom Tableau’s release in 2005 to the release of Tableau Prep Builder in 2018, Tableau mostly left the shaping of data to other technologies. In other words, Tableau assumed you, the analyst, would connect to data that was already in the ideal format. Tableau did however let you add rows from more than one table through unions, add additional columns through joins, and adjust the metadata of your data model by renaming fields, changing field types, and creating hierarchies, groups, and folders.
Tableau Prep Builder was released in the first half of 2018. It brought robust data cleaning and shaping to the Tableau platform. Tableau has gained many more data modeling capabilities since that time including relationships, data catalogs and lineage, and virtual connections. Tableau also released artificial intelligence and machine learning features in the form of Ask Data and Explain Data. Each of these has an impact on how we best model our data for Tableau.
This book explores all the data modeling components and considerations of the Tableau platform. It provides step-by-step explanations of essential concepts, practical examples, and hands-on exercises. You will learn the role that Tableau Prep Builder and Tableau Desktop each play in data modeling. The book also explores the components of Tableau Server and Cloud that make data modeling more robust, secure, and performant. Moreover, by extending data models for Ask Data and Explain Data, you will gain the knowledge required to extend analytics to more people in their organizations, leading to better data-driven decisions.
The book wraps up with a final chapter that explains when it is best to use Tableau Desktop and when it is best to use Tableau Prep Builder. To complete your understanding of the topic, the final chapter also goes through four real-world scenarios and the data modeling components needed in each case. This is a practical guide that will help you put your knowledge to the test.
This book is equally targeted at data analysts and business analysts looking to expand their data skills. The book offers a broad foundation on which you can build better data models in Tableau to allow for easier analysis and better query performance.
This book is also for those individuals responsible for making trusted and secure data available to their organization through Tableau. These people often carry the title of data steward and work to take enterprise data and make it more accessible to business analysts.
Chapter 1, Introducing Data Modeling in Tableau, starts with connecting to data to create our first data model. We will look at the ideal data structure for Tableau and will connect to multiple tables.
Chapter 2, Licensing Considerations and Types of Data Models, starts with an overview of Tableau product licensing and the impact it has on data modeling. We will then explore the foundational knowledge of how data modeling fits into the overall Tableau platform.
Chapter 3, Data Preparation with Tableau Prep Builder, focuses on the Prep Builder user interface, the process of connecting to data, and the first step in any data preparation process – the cleaning step. We will also create row-level calculations, including optimizing string fields.
Chapter 4, Data Modeling Functions with Tableau Prep Builder, explores extending the width of data by adding new columns through joins and extending the length of data by adding rows through unions. We discuss consolidating fields from columns and adding new fields from data in rows. We also cover the strategy and techniques for aggregating data to the proper level for analysis.
Chapter 5, Advanced Modeling Functions in Tableau Prep Builder, covers two advanced modeling functions, namely, adding new rows and pivoting rows to columns. These are two important data modeling techniques that are unique to Tableau Prep Builder. We also look into extending our flows by integrating data science models.
Chapter 6, Data Output from Tableau Prep Builder, looks at the four output options available from Tableau Prep Builder. The last step of our flows is always one or more output steps.
Chapter 7, Connecting to Data in Tableau Desktop, is all about using Tableau Desktop to connect to data, the first step whenever we use Tableau Desktop. This chapter looks at all the different data types that we can connect to in Tableau Desktop.
Chapter 8, Building Data Models Using Relationships, looks at how to combine multiple data sources into a single data model. The focus in on combining data sources at the logical layer through a feature called relationships.
Chapter 9, Building Data Models at the Physical Level, explores situations where you, the data modeler, need to be one level deeper than the logical layer. For these use cases, we must go to the physical layer of the data source by creating joins.
Chapter 10, Sharing and Extending Tableau Data Models, focuses on sharing and extending Tableau data models using published data sources and extending the model using hierarchies, folders, descriptions, grouping, and calculations. We also look at the implications of live versus extracted data models.
Chapter 11, Securing Data, covers key concepts and steps for securing data models. We cover adding users and groups and setting up project security, as these are fundamental for understanding access and authorization as they relate to our data models and the data contained in them. We also look at securing the data in our data models through row-level security options.
Chapter 12, Data Modeling Considerations for Ask Data and Explain Data, looks at the powerful machine learning features that put analysis in the hands of casual users. For these casual users to get answers to their own questions, the data models and available fields supporting them must be well thought out or users may end up frustrated with answers that don’t make sense.
Chapter 13, Data Management with Tableau Prep Conductor, explores the additional features of the Data Management that enhance our data models. These features are Tableau Prep Conductor, data catalog, data lineage, and data quality warnings. We will also look at certified data models, which are a standard feature of Tableau Server and Cloud.
Chapter 14, Scheduling Extract Refreshes, focuses on keeping data extracts created in both Tableau Desktop and the web client up to date using the scheduling services of Tableau Server and Cloud. We will also look at the role Tableau Bridge plays in making on-premises data available on Tableau Cloud.
Chapter 15, Data Modeling Strategies by Audience and Use Case, puts the entire book together by looking first at the general use cases for Tableau Desktop versus Tableau Prep Builder. We will then explore which pieces of the platform we should use based on our audience and use case using four real-world scenarios.
To get the most of out of this book, you will need to have a basic understanding of querying a database using basic SQL or having used a business intelligence tool in the past. If you have done planning, reporting, or analysis in Excel or another spreadsheet tool before, you will also have a good basis of understanding to get started with this book.
In the book, you will be using Tableau Desktop, Tableau Prep Builder, and the web interface of Tableau Server or Cloud. If you don’t already have a license for these products, you can download a free, 14-day trial of Tableau Desktop, Tableau Prep Builder, and Tableau Cloud from https://www.tableau.com/products/cloud-bi. The trial versions of the products are fully featured and can be used in all the exercises in the book. If you need to use the trial version, it is highly recommended that you don’t start the trial until you are ready to jump into the book as it only lasts for 14 days.
The files used in the book are found on the GitHub link found below. It is recommended to download all the files as a ZIP file before you start. The easiest way to do this is to click on the green <> Code button and select Download ZIP.
Software/hardware covered in the book
Operating system requirements
Access to Tableau Server or Tableau Cloud
Windows, macOS, or Linux
Tableau Desktop version 2022.2 or higher
Tableau Prep Builder version 2022.2 or higher
If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.
You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Data-Modeling-with-Tableau. If there’s an update to the code, it will be updated in the GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
There are a number of text conventions used throughout this book.
Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “Mount the downloaded WebStorm-10*.dmg disk image file as another disk in your system.”
Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: “Select System info from the Administration panel.”
Tips or important notes
Appear like this.
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, email us at [email protected] and mention the book title in the subject of your message.
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Once you’ve read , we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.
Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.
Thanks for purchasing this book!
Do you like to read on the go but are unable to carry your print books everywhere? Is your eBook purchase not compatible with the device of your choice?
Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.
Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application.
The perks don’t stop there, you can get exclusive access to discounts, newsletters, and great free content in your inbox daily
Follow these simple steps to get the benefits:
Scan the QR code or visit the link belowhttps://packt.link/free-ebook/9781803248028
Submit your proof of purchaseThat’s it! We’ll send your free PDF and other benefits to your email directlyThe Tableau platform has several components that play a role in data modeling. This section covers the way Tableau models and queries data natively. The section sets the stage for all the hands-on learning that continues throughout the book.
This part comprises the following chapters:
Chapter 1, Introducing Data Modeling in TableauChapter 2, Licensing Considerations and Types of Data ModelsWelcome to data modeling in Tableau. You might know Tableau as a great self-service analytics tool that provides both powerful analytics and is also easy to use. You might also think that Tableau is light on the key enterprise analytics requirement of data security, data model robustness, and data maintainability. In this book, you will learn that Tableau has all these key data requirements covered. You will learn how data is best structured for Tableau analysis and performance, and understand the functionality of Tableau Prep Builder and Tableau Desktop and the role each plays in building data models. You’ll then publish these data models to Tableau Server or Online and optimize them for performance, governance, and security.
By the end of this book, you will have all the strategies and techniques needed to enable individuals in your organization to answer their own questions with data, regardless of their level of expertise. You will also drastically reduce the calls you receive from these same individuals about confusing data and dashboards that are slow to load.
Tableau is very different from most other BI tools in that the model can be either implicit or explicit. For instance, many analysts open Tableau Desktop, connect to data, and immediately begin creating visuals. In this instance, Tableau implicitly created a data model (that is, made a connection, executed a query, and created metadata) without an analyst having to do anything to create the model.
This implicit data modeling works well when your data source has already been prepared for analysis and you are the person creating charts and dashboards. Often, our data is not structured this way. It comes from different sources and needs to be combined and defined in meaningful ways. In these instances, Tableau provides the tools for you to create data models that are scalable, secure, and targeted to the different skills of a broad class of developers and consumers.
Tableau uses a data model as the foundation for the creation of all analyses. A Tableau data model contains the following:
Connection information to the underlying data source.The queries required to retrieve the data.Additional metadata, or data about the data, added to the underlying data. Metadata can include more readable field names, field types, the grouping of data into hierarchies, and calculations not in the underlying data.Tableau works best when your data is in a traditional spreadsheet table format – that is, Tableau assumes that the first row of your data consists of column headers and each column header maps 1:1 to a field name, with additional rows of data each containing one record of data. If the underlying data is not formatted in this way, analysis within Tableau becomes very difficult and performance will suffer. To address this, you can model your data in a format that works best with Tableau. The best practices to model data properly are the primary content of this book.
This chapter demonstrates how Tableau automatically creates a data model when you connect to a data source, how it interprets rows and columns in your data, and how you can shape and combine additional data into your data model.
In this chapter, we’re going to cover the following main topics:
What happens when you connect to data in Tableau DesktopThe ideal data structure for TableauShaping data for TableauConnecting multiple tablesTableau Desktop (and Tableau Prep Builder version 2022.2 or higher in future chapters) version 2022.2 or higher is needed to complete the exercises in this chapter.
If you don’t have a licensed version of Tableau Desktop, you can obtain a 14-day free trial from https://www.tableau.com/products/desktop.
Another alternative is Tableau Public. The free Tableau Public version of Desktop contains almost all the same features as the paid version, with the exception of a small number of data source connection options, and output can only be saved to the Tableau Public site. However, it often has enough features to perform visual analysis as long as the data isn’t confidential. The Tableau Public Desktop version can be found at https://public.tableau.com/s/.
The files used in the exercises in this chapter can be found at https://github.com/PacktPublishing/Data-Modeling-with-Tableau/. We recommend downloading all the files before getting started. The quickest way to do this is to click on the green <>Code button and then select Download ZIP. Expand the ZIP file and make note of the directory. We will be referencing it throughout the book.
Note
The aforementioned requirements are applicable to all chapters in this book.
When you connect to data in Tableau Desktop, Tableau will begin to interpret your data. First, it will create a field for each column of your data. Second, it will assign a data type to each of the fields. Tableau does this because it is powered by a proprietary query technology, called VizQL. VizQL is the technology that underpins Tableau, enabling a visual analytics experience by automatically creating visualizations for a user. This is very different than most business intelligence tools that rely on the user to tell the tool how they would like the data visualized through the picking of a chart type.
For VizQL to work, Tableau needs to know the type of each field. The two main field types in Tableau are discrete and continuous:
Discrete fields: Colored blue in Tableau. By the Oxford Dictionary’s definition, discrete means independent of other things of the same type. When placed on a Tableau visualization, discrete fields usually create a header – similar to a column header in a spreadsheet.Continuous fields: Colored green in Tableau. Again, using the Oxford Dictionary, continuous is defined as happening or existing for a period of time without being interrupted. When placed on a Tableau visualization, continuous fields create an axis – that is, they create a visual display of data.One way to think about the relationship between discrete and continuous fields is that continuous fields are recording measurements and discrete fields are describing those measurements. In a statement, this can usually be phrased as continuous by discrete – for example, sales (continuous) by region (discrete), as shown in Figure 1.1.
Figure 1.1 – Demonstrating discrete and continuous fields
Within these two main field types, there are additional field types that inform VizQL how to create a visual display when they are brought onto the Tableau canvas. These can be seen in Figure 1.2 and are as follows:
Number (decimal): A number that allows fractions. Represented by a # symbol in the Tableau UI.Number (whole): An integer or a number with no decimals. Also represented by a # symbol.String: A field that contains alphanumeric characters. Represented by abc.Date: Tableau accepts several date formatting options. Represented by a calendar icon.Date & Time: A date field with granularity down to the second of a day. Represented by a calendar icon plus an analog clock.Geographical/Spatial: A field that can be plotted on a map. There are many subtypes of geographical fields, including country, state/province, city, postal/zip code, airport, congressional district, NUTS (Europe), and a latitude or longitude value. Represented by a globe icon.Binary/Boolean: A field that takes a true/false or yes/no condition. Represented by a T/F icon.Figure 1.2 – Additional field types
Let’s open Tableau Desktop and connect to the Superstore sales 2022.csv file. This file contains the sample data that comes along with the Tableau installation. It is a sample (and fictional) retail dataset that is useful for demonstration and learning purposes. We will use this data throughout the book when we can. This will help you as you increase your Tableau learning journey, as most of Tableau’s training videos use the same data:
Open Tableau Desktop.Click on the Connect to Data blue hyperlink near the top-left-hand side of the Tableau Desktop UI:Figure 1.3 – Connect to Data in Tableau Desktop
You are now presented with many different options for data sources. We will discuss some of these in upcoming chapters. For now, find the To a file section and click on the Text file option. Navigate to the Superstore sales 2022.csv file in the location you saved it on your computer. Click Open.Tableau will bring the data in and bring focus to the Data Source tab, as follows:Figure 1.4 – The data pane in Tableau Desktop
The top part of the screen acts as a visual canvas where we can bring in additional data through relationships, joins, and unions. For now, we will look at the bottom part of the screen.
This part of the user interface is broken into two sections. The section on the left displays the metadata for the fields in the data source. The metadata list contains the type, field name, physical table from where the data is being queried, and remote field name.
Type is the Tableau field type, which allows VizQL to guide the analyst to the best visual display for the data. Field Name is Tableau’s attempt to take the remote field name and map it to a business-friendly name. In our dataset, the column names already translate easily to business-friendly names. Imagine if our source file had contained postal_code and not Postal Code in the first row. Tableau would automatically transform postal_code to Postal Code in the field name, making our data modeling job easier for us. We can always change the default name Tableau assigns to any field.
The section to the right contains a sample of data that will be queried, called the Table Details pane. By default, Tableau returns the first 100 rows as a sample. The sample size can be changed in terms of the number of rows to return.
Next up, we will look at what a table format looks like and why it is so important for Tableau.
Tableau performs best and is easiest to use when every column in the source data corresponds to a single field and each row represents a record of data at the lowest level required for analysis. As you can see in the following screenshot (Figure 1.5), Tableau will put all the field names in columns, including the type of each field. This is the metadata in your data model. Every row below the row of field headers will contain data, with each row representing one record of data:
Figure 1.5 – The Tableau data format
When data is structured in this manner, it allows Tableau to perform optimally based on query performance, the ease of building analyses, and combining data from different sources.
Tableau will automatically assume your data is stored this way when you connect to a new data source. We saw this in the previous exercise. Tableau took the first row from the comma-separated values file and used it to create field names, restarting after the comma separator. Next, Tableau assumed after the first carriage return that the second row would be a row of values, neatly falling into the columns above it. For each additional carriage return, Tableau assumed another data record.
It turned out that Tableau was right in the assumptions it made. Data prepared for analysis is often already stored in this format, with each row being a unique record of data, separated into individual fields based on fields in database tables or headings in flat files and Excel.
What happens when it isn’t? This puts an analyst in the position of creating more complicated analyses (calculations, sheets, dashboards, and stories) and query performance almost always suffers.
In the next section, we will look at one of the most common ways where we see data structured in a manner that is not Tableau-friendly.
In the previous section, we looked at the data format that works best in Tableau. We will now look at one of the two main examples where data is shaped ineffectively for Tableau and how we can easily change it to the correct format before beginning our data analysis.
Time-based data, especially financial data, is often stored in Excel with the dates in the column headers, the fields spread across columns in the first row, and values falling in the cells in the intersection. We can see this in Figure 1.6. This is a planning sheet for sales targets for the year 2022 for category sales of our SuperStore data:
Figure 1.6 – SuperStore sales targets format
If we connect this data to Tableau, it gives us the following metadata:
Figure 1.7 – SuperStore target default metadata
Thinking back to having fields in columns, a field type, and values in rows, we can see that this interpretation of the data is not helpful at all. What we want are three fields of the following types:
Date: Type – date. There are 12 values, one for each month in 2022.Category: Type – string. There are three values: Furniture, Office Supplies, and Technology.Revenue target: Type – whole numberIn order to get these fields into the right format, we need to pivot those date columns into rows and then rename two fields. Tableau makes this easy for us. Here are the steps:
Open Tableau Desktop.Click on the Connect to Data blue hyperlink near the top left-hand side of the Tableau Desktop UI.Choose Microsoft Excel. Navigate to the SuperStore 2022 Budget.xlsx file in the location you saved it on your computer. Click Open.You will be presented with a screen similar to Figure 1.6.In the Table Details pane, click on the header of Jan-22, hold down the Shift key, and click on the Dec-22 header to select all the date columns. You might find this easier if you first collapse the metadata pane:Figure 1.8 – Multiselecting columns in the Table Details pane
Hold your cursor over the top-right-hand corner of the Dec-22 header to bring up the down arrow.Click on the down arrow to bring up the menu of options. Choose Pivot.Figure 1.9 – Menu options from the Table Details pane
You will now be presented with three fields. We are almost there!Figure 1.10 – Table Details after the pivot
Our last step is to rename our three fields. We can do this by clicking on the field name and typing over the ones that are there, or by clicking on the same down arrow that we used in step 8 and using the Rename option.Let’s rename our fields as follows: F1 to Category, Pivot Field Names to Date, and Pivot Field Values to Sales Target:Figure 1.11 – Table details after renaming
Our last step is making sure our field types are correct. Category is a string field (alphanumeric) so it should show Abc, which it does. Similarly, Sales Target is a number and Tableau has it correct. Date is showing as a string. We would really like this as a date field because Tableau has special date-handling capabilities to make analysis much easier for us. To change the Date field to a type of date, click on Abc over Date and change the field type to Date:Figure 1.12 – Table details – changing the data type
Don’t close this workbook; we will pick up from this point in the exercise in the next section, where we will combine data to our data model from other tables.
In this section, we looked at how we may need to reshape our data to make better data models in Tableau. In the next section, we will look at a use case where we need to add additional columns (fields) of data from other data sources to our model.
Now that we understand the ideal format of data for Tableau, we can look at ways to expand the size of our data model by connecting to multiple tables. We have connected independently to two different tables in this chapter. We will now combine those tables, along with a third table.
There are two ways in which we can add multiple tables to a single data model. The first is by joining data. We join data when we want to add additional columns to our model for additional context. Think of our sales target data. We want to join that data to our SuperStore sales data to add the additional context of sales targets. That will allow analysts to create views to see how the different categories are performing against their targets.
The second way to add additional data to our model is by adding additional rows (with the same columns). The SuperStore data we used earlier in this chapter was for 2022. What if we wanted to add in sales from 2021? That is where a union would be the answer.
For our next exercise, it should be noted that we are creating a relationship to add the new column and not a join. Using a relationship allows Tableau to create the proper join at runtime based on the data being analyzed. You can think of a relationship as a dynamic, smarter join. We will explore the details and differences between relationships and joins in Chapter 7, Chapter 8, and Chapter 9.
To add the additional column of sales target, we will pick up from the end of the previous exercise:
Click on the blue Add hyperlink to the right of Connections:Figure 1.13 – Clicking the Add hyperlink
Select To a file | Text file | Superstore Sales 2022.csv | Open.Under Files, drag Superstore Sales 2022.csv onto the canvas until the orange noodle shows, and then let go.At this point, Tableau tries to create a relationship between these tables. Sometimes, Tableau can figure out how to create this relationship. In this case, we can see Tableau was unable to do this from the exclamation mark in a red triangle:Figure 1.14 – Creating a relationship between two tables
We need to tell Tableau that the two relationships it should create are Date with Order Date and Category with Category. Let’s relate the two category fields first. We do this by selecting Category and Category1 and making sure the operator is set to equals ():Figure 1.15 – Creating a relationship by linking fields
To complete the relationship, we will also need to link Date fields to ensure that the sales targets are aligned with the proper dates for analysis. In this specific case, the dates are not at the same grain (one is at the month level and one is at the day level). This will require us to create a relationship calculation, which we will explore in future chapters.Now that we have joined additional data sources (tables) to add additional columns for analysis, we will explore adding new rows of data through a union.Hover your cursor on the right side of the Superstore Sales 2022.csv file rectangle on the canvas, and click on the down arrow when it appears. Select Convert to Union…:Figure 1.16 – Convert to Union…
Drag the Superstore Sales 2021.csv file from the files section on the canvas into the union dialog box. After clicking OK, all the rows from both 2021 and 2022 will be available for analysis:Figure 1.17 – Dragging files to the union dialog box
We have now covered the two core methods of expanding our data model through relationships and unions.
Congratulations! We are only one chapter in and you have already created a data model in Tableau. We now know how to structure the ideal data model for Tableau. We looked at connecting to data, exploring how Tableau interprets that data, reshaping data for analysis, and adding additional data. With this learning foundation, we are well-positioned for more advanced topics in the following chapters, where we will build security, maintainability, and robustness into our data models.
In the next chapter, we will look at Tableau licensing models and how licensing impacts data modeling. We will also explore the difference between embedded and published data models and the difference between data models that connect live to data and those that extract data for analysis.
Tableau licensing has an impact on how a person can create and maintain data models. First, Tableau has role-based pricing for individual users of the platform. Second, Tableau has different tiers for how organizations license Tableau, which impacts data modeling capabilities.
Tableau data models are used by analysts and developers to create Tableau workbooks