The Ins and Outs of Incorporating External Data

In September, we took a hiatus from The Public Library Blueprints to explore the 2023 Public Library Technology Survey Summary Report – an extensive analysis of technology services in public libraries throughout the United States. Although much of the information within this report expands beyond what is collected in the Public Library Annual Report (PLAR), it also gathers a baseline of information similar to Colorado’s PLAR data. Certain data, including that on hotspot lending, were collected in both 2023 surveys. Examining these data sets together can help us understand how technology in Colorado (CO) public libraries compares to the larger landscape of public libraries across the nation. 

Incorporating external data sources into an analysis can illuminate strengths or weaknesses of a data set that were otherwise hidden. There are, however, important practices to keep in mind when doing so. In this post, we’ll discuss a few things to watch out for when comparing data sets from different sources and learn a thing or two about technology in CO libraries along the way.

First things first, let’s define these data types to make sure we’re all on the same page:

  • Internal data: Data that is generated within one’s own organization, such as data collected by your library. In this post, the internal data is the PLAR data because it is collected and shared by Library Research Service (LRS).
  • External data: Available data gathered and shared by an entity outside of one’s own organization. In this post, the Public Library Technology Survey data gathered and shared by the Public Library Association (PLA) is the external data.

Differences in the Data

Now, our first tip for comparing internal and external data is to double-check that the data points are even comparable measures at all! Although this may sound obvious, if we don’t take a moment to consider what the data actually represents, it can be surprisingly easy to end up comparing apples to oranges without realizing it. For example, in the PLAR we ask how many wireless hotspots are in libraries’ collections and also how many times wireless hotspots were circulated. But the Public Library Technology Survey more directly asks libraries whether they circulate internet hotspots for off-site use. When first looking at these two data sets, I tried relating the percentage of CO public libraries that reported internet hotspots in their collection on the PLAR to the percentage of public libraries across the nation that circulate them for off-site use, as shared in the Public Library Technology Survey Summary Report. Then, after a closer look at the data, I discovered that not every CO public library that reported hotspots in their collection also reported circulating them. Admittedly, this discovery did not change my calculation by much (only one library reported hotspots in their collection without circulating them). But this scenario still shows the importance of identifying any differences that might exist between the two data sets, such as the questions asked, how they were worded, or the collection method used.

Even asking a library how many times they circulated hotspots doesn’t directly answer the question of whether a library circulates hotspots for off-site use. In theory, a library could have hotspots available but still have a circulation count of 0. To determine the percentage of CO libraries that make hotspots available, I had to look at both the collection data and the circulation data side-by-side. Only after this closer look at the PLAR data, and carefully considering whether these data points were comparable to the Public Library Technology Survey Summary Report findings, did I finally run the numbers. About 36% of CO libraries check out hotspots, and this is below the national percentage reported in the Public Library Technology Survey Summary Report, which found that about 47% of public libraries lend hotspots for off-site use.
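For readers who like to see this kind of check spelled out, here is a minimal Python sketch of comparing the collection and circulation measures side by side. The records, field names (`hotspots_held`, `hotspot_circ`), and the `share` helper are all hypothetical and made up for illustration; they are not actual PLAR data or field names, and this is not necessarily how the percentages in this post were calculated.

```python
# Hypothetical records for illustration only; not actual PLAR data.
libraries = [
    {"name": "Library A", "hotspots_held": 12, "hotspot_circ": 140},
    {"name": "Library B", "hotspots_held": 5, "hotspot_circ": 0},
    {"name": "Library C", "hotspots_held": 0, "hotspot_circ": 0},
    {"name": "Library D", "hotspots_held": 0, "hotspot_circ": 0},
    {"name": "Library E", "hotspots_held": 3, "hotspot_circ": 27},
]


def share(records, predicate):
    """Return the percentage of records for which predicate is true."""
    return 100 * sum(1 for r in records if predicate(r)) / len(records)


# Two possible definitions of "offers hotspots" that can give different answers.
pct_holding = share(libraries, lambda r: r["hotspots_held"] > 0)
pct_circulating = share(libraries, lambda r: r["hotspot_circ"] > 0)

# Libraries where the two measures disagree deserve a closer look.
mismatches = [
    r["name"]
    for r in libraries
    if (r["hotspots_held"] > 0) != (r["hotspot_circ"] > 0)
]

print(f"Hold hotspots: {pct_holding:.0f}%")
print(f"Circulated hotspots: {pct_circulating:.0f}%")
print("Measures disagree for:", mismatches)
```

Listing the mismatches, rather than only computing the two percentages, makes it easier to decide which definition lines up best with the question the external survey asked.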

Unique Populations

This 11 percentage point difference in hotspot lending between CO and the nation overall led me to investigate further why a smaller proportion of CO libraries lend hotspots. I started by exploring characteristics of Colorado public libraries that may impact their ability to lend hotspots, particularly locale designations (rural, town, suburban, or urban). This led me to another important point to keep in mind when using external data sources: Not only is it important to consider differences in data collection between the data sets, it’s also necessary to understand differences in the survey populations and whether these were taken into consideration during analysis. The Public Library Technology Survey Summary Report’s findings were weighted to take into consideration a variety of factors including a library’s locale designation, and differences in technology services between library locales were shared throughout the report. Considering CO has a large proportion of rural and town libraries, I began to wonder whether this impacted hotspot lending. In fact, 79% of CO’s public libraries are town or rural libraries compared to 69% of the nation’s public libraries overall.

Hotspot Circulation by Locale

Locale        Colorado    Nation
Rural/Town    24%         39%
Suburban      77%         66%
City          90%         69%

When looking at hotspot lending across locales, both in CO and the entire U.S., it becomes clear that locale designation is an influencing factor. In the U.S., 69% of city libraries and 66% of suburban libraries circulate hotspots, but only around 39% of town and rural libraries do. In Colorado, the disparity is even greater, with 90% of city libraries and 77% of suburban libraries circulating hotspots, but only 24% of town and rural libraries doing so. This disparity and the high percentage of town and rural libraries in Colorado both likely play a role in the lower percentage of CO public libraries lending hotspots.
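As a rough consistency check, the locale percentages above can be combined with the share of town and rural libraries (79% in CO, 69% nationally) to see what overall lending rate they imply. The sketch below is only an illustration under some loud assumptions: every library falls into one of the three locale groups, the overall rate is a simple share-weighted average, and the rounded percentages from this post are close enough to work with. Because this post does not give the split between suburban and city libraries, the hypothetical `overall_range` function returns a low and a high bound rather than a single number. PLA's national findings were weighted, so the national result in particular is approximate.

```python
def overall_range(rural_town_share, rural_town_rate, suburban_rate, city_rate):
    """Bound the overall lending rate when only the rural/town share is known.

    The remaining libraries are suburban or city in an unknown mix, so the
    overall rate lies between the case where they all lend at the lower of
    the two rates and the case where they all lend at the higher one.
    """
    other_share = 1 - rural_town_share
    base = rural_town_share * rural_town_rate
    low = base + other_share * min(suburban_rate, city_rate)
    high = base + other_share * max(suburban_rate, city_rate)
    return low, high


# Figures taken from this post, expressed as proportions.
co_low, co_high = overall_range(0.79, 0.24, 0.77, 0.90)
us_low, us_high = overall_range(0.69, 0.39, 0.66, 0.69)

print(f"Colorado: {co_low:.1%} to {co_high:.1%}")
print(f"Nation:   {us_low:.1%} to {us_high:.1%}")
```

Under these assumptions, both ranges land close to the overall figures reported earlier (about 36% for CO and about 47% for the nation), which supports the idea that locale mix accounts for much of the gap.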

Zooming Out or Focusing In?

Our society is inundated with data, and it can be difficult to decide what to pay attention to. Between juggling differences in the collection process, the data points themselves, and the populations surveyed, when should we incorporate external data into our analysis? One way to approach external data is to think of it as the background of a photo giving important context to the subject of the image. In other words, external data can help place a library’s data within the larger landscape of library statistics, but it’s important not to lose focus on the internal data specific to the library itself.

Different scenarios are going to require different levels of focus on external data. Zooming out to relate the PLAR hotspot data to PLA’s findings on hotspot lending could motivate libraries to increase access to hotspots across CO. However, fixating on a single comparison to external data can be misleading and almost never tells the whole story. In other words, we shouldn’t assume that CO libraries aren’t keeping up with libraries across the nation when it comes to technology services just because hotspot lending is below the national percentage. For instance, 45% of CO libraries circulate laptops or tablets, a higher percentage than the roughly 36% that lend hotspots. This data is difficult to compare with the 2023 Public Library Technology Survey Summary Report findings because PLA asked about off-site circulation of laptops and tablets separately, but the two are combined in the PLAR.

Lastly, it’s important to note that even when direct comparisons with external data are not possible, outside data sources can still be an informative and helpful part of an analysis. We can think of finding a balance between internal and external data as focusing a camera lens. Generally, you’ll want to focus on and center the subject (your organization’s data), but having the right background to frame the subject can make a world of difference. When used as a tool alongside internal data, external data can bring key insights, context, and direction to an analysis. That being said, finding external data sources, evaluating the quality of these sources, learning to use them, and incorporating them into your analysis to help inform decisions at your library can all be challenging tasks to take on. LRS is here to help with any external (and internal) data needs at your library!

LRS’s Colorado Public Library Data Users Group (DUG) mailing list provides instructions on data analysis and visualization, LRS news, and PLAR updates. To receive posts via email, please complete this form.