Introduction
Tableau is a popular Business Intelligence tool. It is trusted by numerous organizations to build dashboards monitoring performance indicators. Its main advantages are its ease of use, its capability to share results internally and externally, and the long list of data formats it can ingest (from text files to databases).
Although not originally designed to display geographic data, it has been extended with basic mapping capabilities. This is useful to monitor your performance per area, identify zones with high potential for growing your business, or re-evaluate your territory divisions. Zip code areas are a convenient mapping unit: they are easy to link to address data, yet already provide an aggregated view for your analyses.
In this article, we will see how zip maps can be created in Tableau, exploring 2 routes: leveraging geographic sources already integrated with Tableau or importing custom polygons to build your maps, with full control over all the displayed data. Examples with step-by-step operations will illustrate both options.
Using Tableau-embedded geographic data sources
Tableau includes, by default, geographic sources from several providers (including Mapbox, OpenStreetMap, and Geonames).
Similarly to PowerBI, Tableau matches data fields to pre-defined internal levels named Country/Region, State/Province, County, and City. In addition, it conveniently allows for mapping through widespread international geocode systems such as FIPS, ISO3166-1, and NUTS. For the USA, possibilities are extended to phone area codes, congressional districts, and CBSA/MSA (Core-Based Statistical Areas). Finally, Tableau can link to airport codes (IATA and ICAO).
Most importantly for this article, Tableau also provides zip code mapping for 56 countries.
The list of available Map layers for each country is available here.
Tableau also offers functionality to manually match locations that are not automatically linked to the embedded data, for instance, because of spelling differences or ambiguities (several entities may share the same name, or zip codes may not be formatted the same way).
Finally, you can create groups of objects. Zip codes frequently follow a hierarchical model, so you can group them by prefix if you want to aggregate your data at a higher level.
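To make the prefix idea concrete, here is a minimal Python sketch of the same aggregation done outside Tableau. The sample rows, the function name `aggregate_by_prefix`, and the 3-character prefix length are assumptions chosen for illustration, not values taken from the dataset used later in this article.

```python
from collections import defaultdict

# Hypothetical sample rows: (zip code, value). Zip codes are kept as strings
# so that any leading zeros are preserved.
rows = [
    ("32003", 120.0),
    ("32008", 80.0),
    ("33101", 200.0),
    ("33109", 50.0),
]


def aggregate_by_prefix(rows, prefix_length=3):
    """Sum the values of all zip codes sharing the same leading characters."""
    totals = defaultdict(float)
    for zip_code, value in rows:
        totals[zip_code[:prefix_length]] += value
    return dict(totals)


totals_by_prefix = aggregate_by_prefix(rows)
print(totals_by_prefix)  # {'320': 200.0, '331': 250.0}
```

Inside Tableau, a comparable result can be obtained with a calculated field such as LEFT([Zip], 3), assuming the Zip field is stored as a string.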
Step-by-step example: total income per zip code in Florida
Let’s see how we can create a map of the total income per zip in Florida. You can download the data from GeoPostcodes’ Github directory. It contains an extract of the 2020 tax report data published by the US IRS, providing the total income per postal code in 2020.
Follow these steps to build your map of income per zip in Florida:
1. To start, open Tableau and create a New dashboard. Then click on “Text File” (1) and select the file you have just downloaded (2).
2. From the data sources panel, first, make sure the file has been opened with the correct encoding: click on the arrow next to the file name (3), then on “Text File Properties” (4). In the “Locale” (5) property, select “English (United States)” (6).
3. Ensure the Zip field has the Zip Geographic role: Click on the icon on top of the zip column (7) and select “ZIP Code/Postcode” as Geographic Role.
4. Go to your Worksheet (8), drag and drop the “Zip” field (9) into the center of the worksheet (“Drop field here”).
5. If the Zips are not automatically recognized, click on “… unknown” at the bottom right (10), then “Edit Locations” and make sure the “Country/Region” field (11) is set to the USA (12).
6. You should now see the US states with points inside them. To switch to a choropleth (colored polygons) view, click on “Show Me” (13) and select the “Map” type (14).
7. Convert the “Average Income” field to a measure: click on the arrow on its right (15) and select “Convert to measure” (16).
8. Now drag the “Average Income” field to the “Colour” icon (17).
9. There it is. You have built your map of total income per zip code in Florida. Note that the IRS does not report values for Zip codes with fewer than 100 returns, which is why there are a couple of empty zones on the map.
10. The map’s contrast is not great so far because of the distribution of average incomes (a few outliers with high values, but the vast majority of incomes are within a lower range). To change that, we can color according to the logarithm of the average incomes. Click on the arrow next to “Average Income” (18), then “Create” (19) and “Calculated field” (20).
11. Rename the field to “Log_income” (21) and enter “Log([Average Income])” as the formula (22), then click on OK (23).
12. Drag the “Log_income” field over the “SUM(Average Income)” pill to replace the coloring variable (24).
13. To improve the displayed tooltip when you hover over the zip areas, drag the “Average Income” field to the “Tooltip” icon (25).
14. Click on the Tooltip icon (26), remove the line for log incomes (27), and, because amounts from the IRS field are in thousands of dollars, add “,000$” after the Average income variable (27).
15. Your map is ready now, with a bit more contrast thanks to the logarithmic coloring and a nice tooltip to show you the average incomes per zip when you hover over each polygon.
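The effect of the logarithmic coloring in steps 10 to 12 can be illustrated outside Tableau. The short Python sketch below uses made-up income values (an assumption, not the IRS data) and a hypothetical helper, `normalize`, that rescales values to the 0..1 range the way a linear color ramp would. It uses a base-10 logarithm, which is, to our knowledge, what Tableau's LOG function applies when no base is given.

```python
import math

# Hypothetical average incomes (in thousands of dollars) with one outlier.
incomes = [40.0, 55.0, 70.0, 90.0, 2500.0]


def normalize(values):
    """Rescale values linearly to the 0..1 range, as a color ramp would."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


# Position of each value on the ramp, without and with the log transform.
linear_scale = normalize(incomes)
log_scale = normalize([math.log10(v) for v in incomes])

for income, lin, log in zip(incomes, linear_scale, log_scale):
    print(f"{income:>7.1f}  linear={lin:.2f}  log={log:.2f}")
```

On the linear ramp, the four ordinary values are squeezed into the first few percent of the color range because of the single outlier. After the log transform, they spread over a visibly wider part of the ramp, which is where the extra contrast on the map comes from.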
As you can see, it was extremely easy to create that map, just using the tabular data (2 fields: ZIP and average income) we had downloaded and leveraging Tableau’s embedded geographical data. Unfortunately, setting up your dashboards is not always easy when you want to map data from different countries or at other aggregation levels. Let’s now talk a bit about the challenges you will face.
The challenges with Tableau’s embedded polygons
The main difficulties in using the embedded geographic data in Tableau relate to coverage, quality, and lack of control. Here are a few more details, both on the general cases of leveraging geographic mapping directly from Tableau (for administrative divisions or ZIPs) and the specific cases of Zip code mapping:
- Quality: in several countries, the administrative boundaries are not up-to-date. For example, in Belgium, there is level 3 encoding through the NUTS codes, but these don’t include the changes from 2019. This is shown in Figure Y, where the level 3 subdivisions appear in dark blue, and the missing areas in light blue (Belgium is covered entirely by level 3 subdivisions, but Tableau misses some, as indicated by the 8 “unknown” regions).
For Algeria, Provinces still refer to the 48 Provinces existing before 2019. Since then, there have been 58 Provinces in Algeria, but they are not all available through Tableau. Similarly, in Latvia, Tableau is still referring to the divisions before the administrative reorganization which entered into force in July 2021 (moving from 110 to 43 cities).
For Sweden, Tableau reports 10,006 postal codes available. This is over 500 postal codes short of the country’s total number of active postal codes. That means you will not be able to display about 5% of the postal codes through Tableau, and some of the existing ones have wrong shapes.
Furthermore, the (simplified) polygons are not perfectly aligned, as shown in the following figure, where 2 US zip codes overlap in one area.
Simplifying the shapes is a good idea given the business cases covered by Tableau: most likely, users want to see some high-level maps covering a relatively large area containing numerous polygons. Those polygons’ details are unnecessary and would probably not even be visible at the used scales. Nevertheless, gaps and overlaps between simplified polygons not only spoil the look of the map but can also create issues and confusion when analyzing the data. You can learn how to build an accurate postal code polygon database here.
- Coverage: Postal code boundaries are only available for 56 countries. Administrative boundaries frequently don’t go all the way down the hierarchy. For instance, in France, the mapping stops at level 2 (departments); you can’t link to arrondissements or municipalities. For Belgium, you can go down to arrondissements but not municipalities. For Italy, you can only link to 107 Provinces but are stuck if you want to display more granular administrative data. Going back to Belgium, the PLACES available in Tableau only include relatively (it’s still Belgium, which I may safely joke about as it’s my country) big towns but don’t include smaller localities and villages.
- Disambiguation work: it is very frequent for places and administrative divisions to share their names with other entities elsewhere. For instance, there are 30 counties and 1 Parish named “Washington” in the USA. More than half of the states have one subdivision named after George Washington. When you want to import data about counties, Tableau can’t automatically infer which County (in which State) you are referring to. It will hence indicate the locations are “unknown,” and you can fix it, for instance, by specifying the administrative hierarchy. This means you need to have that hierarchy, in the first place, linked to your original data. The same can happen with postal codes: although they should be unique per country (we know about one exception in Cambodia, but Tableau does not have postal codes for Cambodia anyway), postal codes from different countries can share the same values, so always make sure you also include a country field in your data.
- Lack of control: You can’t know in detail what’s happening behind the scenes. What exactly is available? Where does it come from? Is it correct and up-to-date? What if you need something slightly different, which you can’t easily build by grouping entities in Tableau? When is the underlying data going to be updated next? Note that updating such data can also create issues, as all your working dashboards relying on it may break until you include the new keys in your data (e.g., new Provinces).
- Expertise needed: as you don’t control what’s available in Tableau, and as it may not be accurate or up-to-date, you can’t know if what you see are reliable polygons unless you have expertise in the domain. Getting help, let alone fixes, when you question the served data is not straightforward.
For all those reasons, taking control of the data you use is desirable: you know exactly what is in it (and what’s not) and can update it when you can, want, or have to. Luckily, Tableau offers possibilities to use other geographical data sources. Let’s now explore how custom geographical files can be imported into Tableau.
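The disambiguation point in the list above can be checked on your own data before loading it into Tableau. The following Python sketch is only an illustration: the records and the helper names `ambiguous_names` and `composite_keys` are hypothetical. It flags names that occur more than once and shows that adding the parent level (here the state) to the key makes every record unique.

```python
from collections import Counter

# Hypothetical records: the same county name appears in two states.
records = [
    {"state": "Florida", "county": "Washington", "value": 10},
    {"state": "Georgia", "county": "Washington", "value": 20},
    {"state": "Florida", "county": "Leon", "value": 30},
]


def ambiguous_names(records, name_field):
    """Return the names that occur in more than one record."""
    counts = Counter(record[name_field] for record in records)
    return sorted(name for name, count in counts.items() if count > 1)


def composite_keys(records, *fields):
    """Build one key per record from several fields (e.g. state + county)."""
    return [tuple(record[field] for field in fields) for record in records]


print(ambiguous_names(records, "county"))  # ['Washington']

keys = composite_keys(records, "state", "county")
print(len(keys) == len(set(keys)))  # True: the composite key is unique
```

The same check applies to postal codes from several countries, using the country and the postal code as the composite key.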
How to import custom zip code polygons into Tableau
Available file formats
Tableau can ingest several geographical data formats: Shapefile, GeoJSON, KML, MapInfo, and TopoJSON. Note, however, that it can’t read the “ExtendedData” out of KML files at the time of writing, which hinders linking KML files to other data in the application: you can only leverage the ID field as a key.
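Before importing a file, it can help to check which attribute fields it carries, since one of them will serve as the join key. The Python sketch below does this for a GeoJSON FeatureCollection using only the standard library. The inline document and its "ZIP" and "NAME" properties are invented for illustration, and `property_names` is a hypothetical helper.

```python
import json

# A tiny, made-up GeoJSON document standing in for a real boundaries file.
geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"ZIP": "00001", "NAME": "Area A"},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]}},
    {"type": "Feature",
     "properties": {"ZIP": "00002", "NAME": "Area B"},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[1, 0], [2, 0], [2, 1], [1, 0]]]}}
  ]
}
"""


def property_names(collection):
    """List every attribute name used by the features of a FeatureCollection."""
    names = set()
    for feature in collection["features"]:
        names.update(feature.get("properties") or {})
    return sorted(names)


collection = json.loads(geojson_text)
print(property_names(collection))  # ['NAME', 'ZIP']
```

With a real file, you would replace the inline text with the file's content and pick the listed property that matches the key in your tabular data.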
Linking to other data
Linking geographical data to other sources is extremely easy as Tableau offers a relationship/join interface allowing users to choose which keys should be related to the joined files. Tableau can even perform joins through geographical data (linking datasets based on the relationship between their geometries).
Step-by-step example: showing the Population per postal code in South Korea
In this second example, we will join two files: a geographic file that includes the polygons of the postal codes in Busanjin and a CSV file that gives the population per postal code.
First, download all the data from our Github repository. It contains the simplified postal boundaries for Busanjin County (GPC-BNDR-PST-VIZ-Busanjin.*) and the population per zip code (KR_Busanjin_pop_per_zip.csv).
Then, you can upload the 2 files to Tableau. Start a new project and connect to a new data source. Select “spatial file” and then browse your hard drive to select the .shp file you downloaded. Next, click “Add,” select “Text file” and point to the population CSV. Then, drag the “KR_Busanjin_pop_per_zip.csv” file from the left menu towards the right pane (1).
Then, ensure the zip columns in both files are string data types (2 and 3 for the CSV file). Additionally, check the geographical roles for the Zips (and all fields except the geometry) are set to “None” (4).
Now, you can establish the relationship between the 2 data sources. Click on the link between them (5), then select the key for the first file (6) and choose “ZIP” (7). Repeat the operation for the second file (8), choosing zip, so your relationship reads “ZIP = ZIP.”
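The reason for keeping the keys as strings, and the effect of the “ZIP = ZIP” relationship, can be sketched in a few lines of Python. All values below are hypothetical stand-ins for the two sources, not the content of the downloaded files. The sketch also lists the keys that fail to match on either side, which is a useful check before building the map.

```python
import csv
import io

# Converting a zip code to a number silently drops leading zeros,
# so the key would no longer match its string counterpart.
print(str(int("01234")))  # 1234

# Hypothetical stand-ins: zip keys from the spatial file and a population CSV.
polygon_zips = ["47100", "47101", "47102"]
population_csv = io.StringIO("ZIP,Pop\n47100,1200\n47101,950\n47199,400\n")

# Read the CSV with string keys, as recommended for the Tableau join.
population = {
    row["ZIP"].strip(): int(row["Pop"])
    for row in csv.DictReader(population_csv)
}

# Equivalent of the "ZIP = ZIP" relationship, plus the unmatched keys.
matched = {z: population[z] for z in polygon_zips if z in population}
polygons_without_data = [z for z in polygon_zips if z not in population]
data_without_polygon = sorted(set(population) - set(polygon_zips))

print(matched)                # {'47100': 1200, '47101': 950}
print(polygons_without_data)  # ['47102']
print(data_without_polygon)   # ['47199']
```

Polygons without data would appear empty on the map, while data rows without a polygon would not be drawn at all.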
Now you can access your worksheet (Click “Sheet 1”, 9). Drag the Geometry field to the center of the worksheet (10).
Then, drag the “ZIP” field to the “Details” icon (11).
Finally, convert the Pop sum to a decimal number (12-13) and a measure (14), then drag it on the color icon (15).
You now have your choropleth map of population per postal code, combining the CSV data with the polygons you have imported through the shapefile.
Conclusion
In this article, we have shown how easy it is to create zip-based choropleth maps in Tableau. First, we’ve leveraged the embedded polygons in Tableau and discussed the limitations of that approach. We have seen that the available data have inherent imperfections, likely affecting your project unless you’re happy with limited country coverage and possibly outdated data. The biggest problem is probably the lack of control related to those sources.
Then, we explored the options to import your geographic files into Tableau. Tableau can ingest geographical files in popular file formats and gives you full control over the joins you want to perform, which is a great asset. Although we have limited the geographic scope of these examples for simplicity, this works in international, multi-country setups exactly the same way.
All you need is consistent input data. However, remember that Tableau remains a BI tool: it has limits for the number of data points you can simultaneously display on a map. If you reach the limits and your dashboards become slow or unresponsive, you can use aggregations and/or filters to limit the number of displayed entities at any point. Although Tableau has many good points, other tools are better suited for territory mapping.
If you’re convinced you need to import geographical data to Tableau from files you have full control over, you may be looking for trustworthy geographical data sources. You have already found that GeoPostcodes maintains the most accurate postal database in the world. Our products include postal and administrative boundaries for all countries. Thanks to the internal use of a topological model, we also deliver simplified versions of the polygons that still match perfectly (no gaps or overlaps), like the sample you downloaded for the second exercise.
These are perfect for building choropleth maps in Tableau. You can easily download our boundaries files, explore them, edit them if needed (e.g., a custom grouping of municipalities), upload them to Tableau, and create insightful maps. You have full control over that data and can decide when you update the geometries. We will be happy to assist you in finding the best option for your business, so don’t hesitate to reach out.
If you’re not convinced but still reached the end of this article, thank you for bearing with me. Please don’t hesitate to give me feedback, as I would love to hear your thoughts on the topic.