

Earlier this week, I dropped some R code to go along with scraping the Walgreens “hidden” API. It turns out that there's a spiffy, free API that will let you acquire a list of stores and store details.
So, today's WPE is all about helping you sharpen your data journalism skills and set up a way to monitor for Walgreens store closures in the 🇺🇸. There is enough data in the Walgreens API and the U.S. Census corpus for aspiring data scientists to go so far as to predict which stores might actually close. I'll toss in some “prediction caveats” towards the end of this WPE.
Sign Up For The Walgreens API
Walgreens was incredibly cool and savvy when they decided to open up their platform to third-party developers. Somehow, though, I doubt they threat-modeled the use case we're digging into.
Please take a moment and sign up for the API (it's quick and free) and stick the API key into a WALGREENS_API_KEY environment variable (or use a different environment variable name). Any code references here, and all code in a repo I'm linking to in a bit, will assume you've done this already.
Homing In On The API Endpoints We Care About
We only care about store info, so we can focus on the Store Locator API.
The first one we need to use is the one that provides a list of all the stores. Thankfully, it returns them all in one API call, which we can test via curl:
curl -X POST "https://services-qa.walgreens.com/api/util/storenumber/v1" \
--header 'Content-Type: application/json' \
--data "{ \
\"apiKey\":\"${WALGREENS_API_KEY}\", \
\"affId\":\"Self\", \
\"act\": \"storenumber\" \
}"
That returns a JSON array of ~9K elements that are just store numbers. We'll talk about that more in the next section.
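If you'd rather poke at this from R, here's a minimal {httr} sketch of the same call (it assumes WALGREENS_API_KEY is set and just mirrors the payload from the curl example above):
library(httr)
library(jsonlite)
# POST the same JSON payload as the curl example; the response is a JSON
# array of store numbers
res <- POST(
  url = "https://services-qa.walgreens.com/api/util/storenumber/v1",
  content_type_json(),
  body = list(
    apiKey = Sys.getenv("WALGREENS_API_KEY"),
    affId = "Self",
    act = "storenumber"
  ),
  encode = "json"
)
stop_for_status(res)
store_numbers <- fromJSON(content(res, as = "text", encoding = "UTF-8"))
length(store_numbers) # ~9K store numbers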
The next endpoint we care about is the one that gets all the info Walgreens has about a given store. It is equally straightforward to use (there was no rhyme or reason why I chose 2222):
curl -X POST "https://services-qa.walgreens.com/api/stores/details/v1" \
--header 'Content-Type: application/json' \
--data "{ \
\"apiKey\":\"${WALGREENS_API_KEY}\", \
\"affId\":\"Self\", \
\"storeNo\":\"2222\" \
}"
Now that we know what endpoints we need to use, let's discuss one method of keeping track of store closures.
Making Some Assumptions
Until this week, I had never worked with this API, so I do not know if this initial suggestion will pan out. But it seems to me that one way to monitor for closures is to grab and store the list of stores daily and just “diff” them. Put that into a cron or launchd job, or set up a GitHub (et al.) workflow to do it. There are tons of resources on the internet to help you toss that curl command into such things, so I'll leave you to do some of that research/work.
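If you do end up with one saved store-number list per day, the “diff” itself is trivial. Here's a hedged R sketch of it (the dated filenames are a naming scheme I'm assuming your daily job uses, not anything the API dictates):
library(jsonlite)
# Assumed filenames of the form store-numbers-YYYY-MM-DD.json, one per daily run
today_file     <- sprintf("store-numbers-%s.json", Sys.Date())
yesterday_file <- sprintf("store-numbers-%s.json", Sys.Date() - 1)
today     <- fromJSON(today_file)
yesterday <- fromJSON(yesterday_file)
gone    <- setdiff(yesterday, today) # store numbers that disappeared (possible closures)
arrived <- setdiff(today, yesterday) # store numbers that appeared (new stores?)
if (length(gone) > 0) {
  message("Possible closures: ", paste(gone, collapse = ", "))
}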
That won't do you much good if the API doesn't return info on a store number that's closed (I have no idea whether it does or not, yet). So, you'll need to do some work in the next section to maintain data on the stores themselves.
Building A Store Directory
The API has a pretty generous rate limit of ~500 requests per minute. That means you can stay comfortably within it with a sleep 0.15 between requests on most *nix systems (that works out to ~400 per minute, and the headroom means you won't necessarily draw attention to yourself if they monitor usage metrics).
Set up a similar cron/launchd/action job to run the details curl for each store once a day and store the JSON in individual files, using the store number as the basename.
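If you'd rather keep that loop in R than in shell, here's a hedged sketch of it, assuming store_numbers is the vector from the list call earlier and that a stores/ directory (my name; pick your own) is where the per-store JSON lands:
library(httr)
dir.create("stores", showWarnings = FALSE) # illustrative output directory
for (store_no in store_numbers) {
  res <- POST(
    url = "https://services-qa.walgreens.com/api/stores/details/v1",
    content_type_json(),
    body = list(
      apiKey = Sys.getenv("WALGREENS_API_KEY"),
      affId = "Self",
      storeNo = store_no
    ),
    encode = "json"
  )
  if (!http_error(res)) {
    # save the raw JSON; the store number is the basename
    writeLines(
      content(res, as = "text", encoding = "UTF-8"),
      file.path("stores", sprintf("%s.json", store_no))
    )
  }
  Sys.sleep(0.15) # ~400 requests/minute, comfortably under the ~500/minute limit
}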
Getting A Bit Fancier
Maintaining my promise to GitLab's chief data wrangler, I have a GitLab project (vs. an icky GitHub one) that contains some goodies for y'all. While curl works, I do like me some task-specific command line binaries, so I've got both Rust and Golang versions of the store list and individual store data retrieval idioms in that repo. They show how to read environment variables, grab parameters from the command line, and do some JSON unmarshalling of targeted fields, including the full address, county, and lat/lng. You should add as many fields as you like or think/discover you'll need.
There is also some R code in there that shows how to hit the API in R via {httr} and plot the map you see at the beginning of this edition.
Carnac 🔮 Mode
Now, if you pull in U.S. Census data to get the population of various areas, you can compute coverage per “area”, population density, and even fold in some locale-specific economic data. R users should strongly consider using {tidycensus} for that.
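As a hedged sketch of the kind of join I mean, here's how you might pull county-level population with {tidycensus} and compute a people-per-store figure. It assumes you have a Census API key configured (via census_api_key()) and that you've already rolled your scraped store data up into a stores_per_county data frame with a county GEOID column and an n_stores count (something you'd build yourself from the store JSON, not something the Walgreens API hands you):
library(tidycensus)
library(dplyr)
# Total population by county from the 5-year ACS (B01003_001 = total population)
county_pop <- get_acs(
  geography = "county",
  variables = "B01003_001",
  year = 2021
)
# stores_per_county is assumed: one row per county GEOID with an n_stores count
coverage <- county_pop |>
  left_join(stores_per_county, by = "GEOID") |>
  mutate(
    n_stores = coalesce(n_stores, 0L),
    people_per_store = if_else(n_stores > 0, estimate / n_stores, NA_real_)
  )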
A piece of critical information that you cannot find is how profitable each Walgreens store is, since there's no way Walgreens is ever going to share that type of granular data publicly. So, use a bit of caution before claiming you've come up with a solid model for predicting which stores will close, even after you have data for stores they do close over the coming months.
You will also want to dig into the store data a bit, since each Walgreens has the potential to host a plethora of services beyond selling gossip magazines, candy bars, and soda pop. Pharmacy and clinic services, shipping and receiving, and even photo processing are just a few of the other facilities that can be at a given store, so you will almost certainly need to include some of that in your models.
20,000 Meter View
The store map does mirror U.S. population density maps, and one thing that hit me whilst reviewing it was just how insane the logistics have to be for managing the supply chain to each of these stores. Just stop and think about what's involved in getting a single box of band-aids to each and every store. Extend that to all the other medicines, first aid supplies, food, supplements, household items, and more, and a systems thinker's mind can really start spinning.
FIN
I hope y'all have some fun with this project, and please reach out if anything needs clarification.
I included one run of the data scraping in the repo, and I'm hoping my datavis CEO does some cool stuff with it in Observable. ☮