Here’s a Brief Primer on Where to Get Good Data


A reader emails to ask me where I get my data:

I’m curious as to what your process is….Do you usually start with Google? LexisNexis? Something else? You seem to have a preference for citing public sources, but how often do you start with a private aggregator like LexisNexis, and then find a public link from that? I guess what I’m asking with that one is, how much does it help to have access to private sources like LexisNexis? Is it instrumental in this kind of thing, or just nice to have, or not really that big of a deal?

I don’t have access to any private sources. I just have a computer and a web browser. That’s the hub of my data-driven empire.

But what are my favorite sources? Maybe some people would be interested. And it would be kind of fun to list them. So here they are.

IMPORTANT WARNING: Knowing where to find data is very helpful. However, what’s really important is knowing which data is appropriate to your purposes. You have to develop a feel for which sources are trustworthy. You have to know which data you need. (GDP? Real GDP? GDP per capita? GDP at purchasing power parity?) Sometimes you have to be creative. But the bottom line is that access to data doesn’t do any good unless you understand it first. There are no shortcuts to that.

That said, here are the sources I use most often. Since I spend a lot of time writing about the economy, this list is very top-heavy with economic data sites.

  • FRED is by far my most frequently used source. It’s run by the St. Louis Fed, and it aggregates hundreds of thousands of economic data series in a single place. It’s pretty flexible, it produces nice charts, and it lets you download the data to Excel so you can play with it yourself. If you’re looking for US economic data, it’s usually your first stop. It’s got some overseas economic data too.
  • The Bureau of Economic Analysis and the Bureau of Labor Statistics are also good sources. Most of their data is in FRED, but not all of it. The BLS jobs report is usually released on the first Friday of every month, along with all supporting data. The BEA’s first GDP estimate for each quarter is usually released near the end of the following month (e.g., the end of April for the Q1 report). Both agencies post release calendars on their websites, and it’s worth checking them because the dates occasionally shift.
     
  • The Census Bureau collects historical data on household income that isn’t available on FRED. Ditto for trade data, though it’s clunky and frustrating to use. I really wish the trade data were presented more cleanly and made available on FRED.
     
  • The Federal Reserve has a ton of data, some available on FRED but some not. Their Flow of Funds report is basically a balance sheet for the United States.
     
  • For US crime statistics, go to the FBI’s Uniform Crime Reports. Their data delivery tool provides a lot of flexibility, allowing you to get data for specific crimes, specific localities, and specific time periods. Unfortunately, it’s usually two years behind the latest release, so you have to wade through the most recent PDF reports if you want current data. If you need a complete series, start with the data tool and then fill in the most recent couple of years by hand from the relevant reports.
     
  • I almost hate to mention the OECD data portal because it’s such a pain to use. However, it’s gotten better, and it’s your first stop for data about other countries. They only cover OECD countries, of course, which basically means the 35 richest countries in the world. The OECD tries hard to present uniform data for all countries, but that’s a difficult task. When comparing countries, it’s worth being even more careful than usual about what data you use and how different countries account for different things.
     
  • Needless to say, I use Google a lot too. Obviously you need to have some idea of what you’re looking for so you can use the right search terms, and often you have to iterate. That is, do a search, find a word or a reference that seems close to what you want, do another search using that word, rinse and repeat. You’ll usually get to something reliable and relevant eventually. Tips for best results: use Google Advanced Search. Make use of all its fields. Go to Settings and set your results to 50 or 100 per page. After you get results, click on Tools to restrict your search to a specific time period.
     
  • There are also some miscellaneous sites that aren’t technically data portals but still provide a lot of useful information. EIA has good energy data. The White House Office of Management and Budget has tons of historical budget data, but the Trump administration doesn’t have a useful OMB site yet. Go to the archived Obama OMB site instead. Google’s Ngram viewer has pitfalls, but it’s a lot of fun for tracking the rise and fall of words and phrases. The Tax Policy Center has loads of useful data on taxes. The Center on Budget and Policy Priorities has a terrible name but lots of good analysis. Ditto for the Economic Policy Institute. Both are left-wing, so keep that in mind. Gallup has lots of good poll data going back a long way, and Pollster does a good job of poll aggregation. Wikipedia is also great. It’s a genuinely useful site if you want a brief primer on something or other, and every article has lots of links to its sources. I always check its data back to the primary source, but it often points me in a direction I hadn’t considered.
     
  • Finally, this isn’t data per se, but the site I probably use the most often is Thesaurus.com. I head over there something like 20 or 30 times a day. It’s fantastically better than any printed thesaurus because you can quickly hyperlink through synonyms until you find just the right one. I use it so much that I have it set up as one of the standard searches in my browser’s address bar.
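
For the crime statistics case above, where you start with the data tool and then fill in the most recent couple of years by hand, the bookkeeping can be sketched in a few lines of Python. This is only a rough illustration: the function name stitch_series is hypothetical, the numbers are placeholders rather than real FBI figures, and letting the hand-entered values win on overlapping years is my assumption, not anything the FBI prescribes.

```python
def stitch_series(tool_data, report_data):
    """Merge a long series from a data tool with recent hand-entered values.

    Both arguments map year -> value. Where the two overlap, the
    hand-entered report value is kept (an assumption: the report is newer).
    Returns the merged series sorted by year, plus any missing years.
    """
    combined = dict(tool_data)
    combined.update(report_data)  # report values override tool values
    years = sorted(combined)
    # Flag gaps so a missing year is not silently skipped in a chart.
    missing = [y for y in range(years[0], years[-1] + 1) if y not in combined]
    return [(y, combined[y]) for y in years], missing


# Placeholder values for illustration only, not real crime rates.
tool_data = {2010: 100.0, 2011: 98.0, 2012: 97.0}
report_data = {2013: 95.0, 2014: 94.0}

series, missing = stitch_series(tool_data, report_data)
print(series)
print(missing)
```

The gap check is the useful part. If the tool stops at one year and the reports you keyed in start two years later, you want to know before you draw the line.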

I’m probably forgetting a few places that I use a lot. I’ll update this post if any come to mind.
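
To make the earlier warning about GDP variants a little more concrete, here is a small Python sketch of how nominal GDP, real GDP, and real GDP per capita relate to each other. The function names are hypothetical and every number is made up for illustration. The real series, with their own base years and units, should come from a source like FRED or the BEA.

```python
def real_gdp(nominal, deflator, base=100.0):
    """Convert nominal GDP to real GDP using a price index.

    `deflator` is a price index where the base year equals `base`.
    """
    return nominal / deflator * base


def per_capita(total, population):
    """Divide a total by the number of people."""
    return total / population


# Made-up illustrative values, not actual US data.
nominal = 20_000.0   # billions of current dollars
deflator = 125.0     # price index, base year = 100
population = 320.0   # millions of people

real = real_gdp(nominal, deflator)                         # billions of base-year dollars
real_pc = per_capita(real * 1e9, population * 1e6)         # base-year dollars per person

print(real)     # 16000.0
print(real_pc)  # 50000.0
```

The same nominal figure tells three different stories depending on which adjustment you apply, which is why it matters to pick the series that matches your question. GDP at purchasing power parity needs yet another conversion factor that I haven’t tried to sketch here.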