Create Geographic Bubble Chart from Tabular Data
Geographic bubble charts are a way to visualize data overlaid on a map. For data with geographic characteristics, these charts can provide much-needed context. In this example, you import a file into MATLAB® as a table and create a geographic bubble chart from the table variables (columns). Then you work with the data in the table to visualize aspects of the data, such as population size.
Import File as Table
Load the sample file counties.xlsx, which contains records of population and Lyme disease occurrences by county in New England. Read the data into a table using readtable.
counties = readtable('counties.xlsx');
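Before plotting, it can help to confirm that the import produced the table variables the rest of this example relies on. The following is a minimal sketch; the list of required names is taken from the variables this example uses, and the check itself is an illustration rather than a required step.

```matlab
% Read the sample file (assumed to be on the MATLAB path).
counties = readtable('counties.xlsx');

% Variables used later in this example.
needed = {'Latitude','Longitude','Population2010','Cases2010'};

% Confirm that each one exists in the imported table.
present = ismember(needed, counties.Properties.VariableNames);
assert(all(present), 'Missing table variable(s): %s', ...
    strjoin(needed(~present), ', '));

% Preview the first few rows.
head(counties, 5)
```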
Create Basic Geographic Bubble Chart
Create a geographic bubble chart that shows the locations of counties in New England. Specify the table, counties, as the first argument. The geographic bubble chart stores the table in its SourceTable property. Use the 'Latitude' and 'Longitude' columns of the table to specify locations. The chart automatically sets the latitude and longitude limits of the underlying map, called the basemap, to include only those areas represented by the data. Assign the GeographicBubbleChart object to the variable gb. Use gb to modify the chart after it is created.
figure
gb = geobubble(counties,'Latitude','Longitude');
You can pan and zoom in and out on the basemap displayed by the geobubble function.
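The limits that the chart chooses automatically can also be read back from the chart object, and the zoom can be changed in code as well as interactively. The sketch below assumes the LatitudeLimits, LongitudeLimits, and ZoomLevel properties of the GeographicBubbleChart object; the one-level zoom step is an arbitrary illustration.

```matlab
% Recreate the basic chart (counties.xlsx assumed to be on the path).
counties = readtable('counties.xlsx');
figure
gb = geobubble(counties,'Latitude','Longitude');

% Read back the limits the chart selected from the data.
latlim = gb.LatitudeLimits;
lonlim = gb.LongitudeLimits;
fprintf('Latitude:  %.2f to %.2f\n', latlim(1), latlim(2));
fprintf('Longitude: %.2f to %.2f\n', lonlim(1), lonlim(2));

% Zoom in one level programmatically (arbitrary step for illustration).
gb.ZoomLevel = gb.ZoomLevel + 1;
```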
Visualize County Populations on the Chart
Use bubble size (diameter) to indicate the relative populations of the different counties. Specify the Population2010 variable in the table as the value of the SizeVariable parameter. In the resultant geographic bubble chart, the bubbles have different sizes to indicate population. The chart includes a legend that describes how diameter expresses size. Adjust the limits of the chart using geolimits.
gb = geobubble(counties,'Latitude','Longitude',...
    'SizeVariable','Population2010');
geolimits([39.50 47.17],[-74.94 -65.40])
geobubble scales the bubble diameters linearly between the values specified by the SizeLimits property.
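To make the linear scaling concrete, the following sketch maps a few values to diameters by hand. It is an illustration only: the size limits, the width range (standing in for the chart's BubbleWidthRange property), and the population values are all made up, and the exact mapping that geobubble uses internally is assumed here to be plain linear interpolation.

```matlab
% Hypothetical settings, chosen only for illustration.
sizeLimits = [0 1000000];    % stands in for gb.SizeLimits
widthRange = [5 20];         % stands in for gb.BubbleWidthRange (points)

% Made-up population values to map to diameters.
population = [0 250000 1000000];

% Normalize each value to [0,1] within the size limits, clamping outliers.
t = (population - sizeLimits(1)) ./ (sizeLimits(2) - sizeLimits(1));
t = min(max(t, 0), 1);

% Interpolate linearly between the smallest and largest diameter.
diameter = widthRange(1) + t .* (widthRange(2) - widthRange(1));
disp(diameter)
```

Under these assumptions, a value one quarter of the way between the limits gets a diameter one quarter of the way between the smallest and largest bubble widths.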
Visualize Lyme Disease Cases by County
Use bubble color to show the number of Lyme disease cases in a county for a given year. To display this type of data, the geobubble function requires that the data be categorical. Initially, none of the variables in the table are categorical, but you can create one. For example, you can use the discretize function to create a categorical variable from the data in the Cases2010 variable. The new variable, named Severity, groups the data into three categories: Low, Medium, and High. Use this new variable as the ColorVariable parameter. These changes modify the table stored in the SourceTable property, which is a copy of the original table in the workspace, counties. Making changes to the table stored in the GeographicBubbleChart object avoids affecting the original data.
gb.SourceTable.Severity = discretize(counties.Cases2010,[0 50 100 500],...
    'categorical', {'Low', 'Medium', 'High'});
gb.ColorVariable = 'Severity';
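How discretize assigns values to bins is easiest to see on a small vector. The sketch below uses the same edges and category names as above, but the case counts are made up. Each bin includes its left edge, and the last bin also includes its right edge.

```matlab
% Made-up case counts that land on and between the bin edges.
cases = [10 50 99 100 500];

% Same edges and category names as in the chart example.
edges = [0 50 100 500];
names = {'Low','Medium','High'};

% Low: [0,50), Medium: [50,100), High: [100,500].
severity = discretize(cases, edges, 'categorical', names);
disp(severity)
```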
Handle Undefined Data
When you plot the severity information, a fourth category appears in the color legend: undefined. This category can appear when the data you cast to categorical contains empty values or values that are out of scope for the categories you defined. Determine the cause of the undefined Severity value by hovering your cursor over the undefined bubble. The data tip shows that the bubble represents values in the 33rd row of the Lyme disease table.
Check the value of the variable used for Severity, Cases2010, which is the 12th variable in the 33rd row of the Lyme disease table.
gb.SourceTable(33,12)
ans=table
    Cases2010
    _________
       514
The High category is defined as values between 100 and 500. However, the value of the Cases2010 variable is 514. To eliminate this undefined value, reset the upper limit of the High category to include this value. For example, use 5000.
gb.SourceTable.Severity = discretize(counties.Cases2010,[0 50 100 5000],...
    'categorical', {'Low', 'Medium', 'High'});
Unlike the color variable, when geobubble encounters an undefined number (NaN) in the size, latitude, or longitude variables, it ignores the value.
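As an alternative to hovering over the bubble, undefined categories can be located in code with isundefined. The following sketch uses made-up case counts rather than the county table, and it derives the upper edge from the data instead of using a fixed number; that choice is an assumption for illustration, not part of the original example.

```matlab
% Made-up case counts, including an out-of-range value and a missing value.
cases = [12; 75; 514; NaN];
edges = [0 50 100 500];
names = {'Low','Medium','High'};

% Values outside the edges (and NaN) become <undefined>.
severity = discretize(cases, edges, 'categorical', names);
badRows = find(isundefined(severity));
disp(badRows)

% Widen the top bin so that it covers the largest observed value.
% max ignores NaN by default.
edges(end) = max(edges(end), max(cases));
severity = discretize(cases, edges, 'categorical', names);

% Only the NaN row remains undefined.
stillBad = find(isundefined(severity));
disp(stillBad)
```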
Choose Bubble Colors
Use a color gradient to represent the Low-Medium-High categorization. geobubble stores the colors as an m-by-3 list of RGB values in the BubbleColorList property.
gb.BubbleColorList = autumn(3);
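It can be useful to look at the RGB rows that a colormap function returns before assigning them. The sketch below prints the three autumn colors and builds a hand-written alternative; the custom green, amber, and red values are hypothetical and only show the expected m-by-3 shape.

```matlab
% Three colors from the autumn colormap: red, orange, yellow.
colors = autumn(3);
disp(colors)

% A hypothetical hand-picked list with the same 3-by-3 shape.
customColors = [0.2 0.6 0.2;    % green
                1.0 0.8 0.0;    % amber
                0.8 0.0 0.0];   % red

% With a chart object gb from the steps above, either matrix can be
% assigned, for example: gb.BubbleColorList = customColors;
```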
Reorder Bubble Colors
Change the color indicating high severity to be red rather than yellow. To change the color order, you can change the ordering of either the categories or the colors listed in the BubbleColorList property. For example, initially the categories are ordered Low-Medium-High. Use the reordercats function to change the categories to High-Medium-Low. The categories change in the color legend.
neworder = {'High','Medium','Low'};
gb.SourceTable.Severity = reordercats(gb.SourceTable.Severity,neworder);
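The two options mentioned above, reordering the categories or reordering the colors, can be compared on a small made-up categorical array. The sketch assumes, as the text implies, that the rows of the color list are paired with the categories in category order.

```matlab
% Made-up severity values with the initial Low-Medium-High ordering.
severity = categorical({'Low';'High';'Medium';'Low'}, ...
    {'Low','Medium','High'});
colors = autumn(3);              % rows: red, orange, yellow

% Option A: reorder the categories and keep the colors as they are.
sevA  = reordercats(severity, {'High','Medium','Low'});
catsA = categories(sevA);        % first category is now High

% Option B: keep the categories and flip the color rows instead.
colorsB = flipud(colors);        % rows: yellow, orange, red
catsB   = categories(severity);  % last category is still High

% Either way, High is paired with red ([1 0 0]).
fprintf('A: %s -> [%g %g %g]\n', catsA{1}, colors(1,:));
fprintf('B: %s -> [%g %g %g]\n', catsB{3}, colorsB(3,:));
```

Option A also changes the order of entries in the color legend, while Option B leaves the legend order as Low-Medium-High.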
Add Titles
When you display a geographic bubble chart with size and color variables, the chart displays a size legend and color legend to indicate what the relative sizes and colors mean. When you specify a table as an argument, geobubble automatically uses the table variable names as legend titles, but you can specify other titles using properties.
title 'Lyme Disease in New England, 2010'
gb.SizeLegendTitle = 'County Population';
gb.ColorLegendTitle = 'Lyme Disease Severity';
Refine Chart Data
Looking at the Lyme disease data, the trend appears to be that more cases occur in more densely populated areas. Looking at locations with the most cases per capita might be more interesting. Calculate the cases per 1000 people and display it on the chart.
gb.SourceTable.CasesPer1000 = gb.SourceTable.Cases2010 ./ ...
    gb.SourceTable.Population2010 * 1000;
gb.SizeVariable = 'CasesPer1000';
gb.SizeLegendTitle = 'Cases Per 1000';
The bubble sizes now tell a different story than before. The areas with the largest populations tracked relatively well with the different severity levels. However, when looking at the number of cases normalized by population, it appears that the highest risk per capita has a different geographic distribution.
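The effect of normalizing by population can be seen with a few made-up numbers. In the sketch below, the county names, case counts, and populations are hypothetical; only the per-1000 formula comes from the example above.

```matlab
% Hypothetical counties: A has the most cases but also the most people.
T = table({'A';'B';'C'}, [500; 60; 20], [900000; 40000; 8000], ...
    'VariableNames', {'County','Cases2010','Population2010'});

% Same normalization as in the chart example.
T.CasesPer1000 = T.Cases2010 ./ T.Population2010 * 1000;

% Rank counties by cases per 1000 people, highest first.
T = sortrows(T, 'CasesPer1000', 'descend');
disp(T)
```

With these numbers the ranking reverses: county A has the most cases in absolute terms but the lowest rate per 1000 people.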
See Also
geobubble | table | readtable | reordercats | categorical | discretize | GeographicBubbleChart Properties