I came across an excellent article from last week on price elasticity for gasoline consumption from Lutz Kilian and Xiaoqing Zhou, two economists in the research department at the Dallas Fed.
See this link for the piece, which is a quick read that really got me thinking. If you are in retail or involved in pricing, I’d encourage you to read this - twice even. I was actually surprised to see that the conventional wisdom on fuel consumption assumed that short-term demand was essentially inelastic to price. For those who are not pricing nerds like me, this basically means that economists say people will buy essentially the same amount of fuel in the short term even if the price changes, and that demand shifts only occur over much longer time scales where folks might decide to buy a more fuel-efficient vehicle or switch to some other mode of transportation. One can see this conventional wisdom rooted in statements like “people have to commute to work, so they will buy fuel even as price changes.” Digging deeper into the assertions the two economists make here brought me some insights that I think may be helpful for other kinds of demand modeling.
So what is the big deal about this discrepancy in demand? As I was reading the article, I thought about being in the car with my dad as a kid during the 70s. My father always had (and still does have) an encyclopedic knowledge of the local gas prices, from the times when a local Spur station had regular at $0.18 USD per gallon (circa $0.05 USD / liter) while the competition might have been at $0.26 USD (circa $0.07 USD / liter, 44% higher). A quick call to him refreshed me on his mental decision tree, timing and shifting fuel purchases so if necessary he could drive a bit further to get cheaper gas (fuel). He also reminisced on the call about his mother making his father stop when traveling only at stations that sold Green Stamps - an interesting testament to the power of loyalty programs that is 75 years old.
While some of my dad's tactical purchasing was probably driven by weekly or monthly budgeting, I recall him saying things like “for 10 cents a gallon, it is worth going 5 miles further” (obviously with the numbers changing depending on fuel price). I remember him only putting $5 in the tank if he had a low fuel level and had to buy at a more expensive location, just to be sure he had enough to fill up later at a less expensive location.
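To make that rule of thumb concrete, here is a minimal sketch of the break-even arithmetic behind a detour. Every number in it (the fill size, the car's mileage, the pump price) is a hypothetical I picked for illustration, the function names are mine, and it ignores the value of the driver's time.

```python
def detour_net_savings(price_gap_per_gal, gallons_bought, extra_miles,
                       miles_per_gal, cheap_price_per_gal):
    """Net dollars saved by driving `extra_miles` (total extra distance)
    to reach the cheaper station. Ignores the value of time."""
    gross_savings = price_gap_per_gal * gallons_bought
    detour_fuel_cost = extra_miles / miles_per_gal * cheap_price_per_gal
    return gross_savings - detour_fuel_cost


def break_even_gallons(price_gap_per_gal, extra_miles,
                       miles_per_gal, cheap_price_per_gal):
    """Smallest purchase (in gallons) at which the detour pays for itself."""
    detour_fuel_cost = extra_miles / miles_per_gal * cheap_price_per_gal
    return detour_fuel_cost / price_gap_per_gal


# Assumed values: 10 cents/gallon cheaper, 5 extra miles, a 15 mpg car,
# a 15 gallon fill, and $0.60/gallon at the cheaper station.
net = detour_net_savings(0.10, 15.0, 5.0, 15.0, 0.60)
threshold = break_even_gallons(0.10, 5.0, 15.0, 0.60)
print(f"net savings on a full fill: ${net:.2f}")
print(f"detour pays off above {threshold:.1f} gallons")
```

Under these assumed numbers the detour pays for itself on anything more than a small top-up, which also fits the habit of putting only a few dollars in the tank at the expensive station.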
I didn’t know what price elasticity was at the time, but it sure seems like the fuel purchase decision was highly elastic to price for my father at least. This turns out to be an important lesson, though I didn’t know it then. As we will see shortly, we can (and should) model demand separately from modeling specific purchase decisions. Doing so can unmask some of what we call “aggregation bias”, where we might draw conclusions on individual behavior based on the results of the group. How, you might ask? What if your model doesn’t consider localized variations in price? Taking a monthly average or using an overly wide geographic estimate might blur the effect of individual purchase decisions.
Consider what happens if one were to use monthly prices instead of daily prices. Maybe I am watching my spending (like my father did in planning those fuel purchases). Maybe I can’t wait two weeks to buy fuel, but I can wait two days. Maybe I will postpone some trips (again for a few days) until I buy fuel later. Or, maybe I drive across town to fuel up at a cheaper location. Over the measurement period, I bought gas, and so we see that demand evident in the data. If the pricing were blurred by averaging over a month, we might not detect that I either waited for prices to drop for a few days (temporal blurring) or drove a bit further (spatial blurring). To a model built on aggregated data, I “bought like the mean” (not precisely correct, but close enough to illustrate the point). At a high level, it looks as though the demand response is relatively insensitive to price changes. This is wrong, and we can learn from the mistakes here to ensure our models are better.
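The temporal blurring described above can be sketched with a deliberately stylized toy. I assume a driver who needs a fixed number of gallons each month but shifts purchases toward the cheaper days within that month. The price levels, the weekly price cycle, and the "timing sensitivity" of 2 are all invented for illustration, and fixing the monthly total is an assumption of the toy, not a finding from the studies.

```python
import math


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical inputs: six 28-day "months" with different price levels,
# each with the same weekly price cycle around its level.
month_levels = [1.00, 1.10, 0.95, 1.20, 1.05, 1.15]
weekly_cycle = [0.94, 0.97, 1.00, 1.03, 1.06, 1.02, 0.98]
MONTHLY_NEED = 40.0        # gallons per month, assumed fixed
TIMING_SENSITIVITY = 2.0   # assumed strength of shifting toward cheap days

within_month_slopes = []
monthly_log_price, monthly_log_gallons = [], []
for level in month_levels:
    prices = [level * weekly_cycle[d % 7] for d in range(28)]
    # Buy more on cheap days, but always the same total for the month.
    weights = [p ** -TIMING_SENSITIVITY for p in prices]
    gallons = [MONTHLY_NEED * w / sum(weights) for w in weights]
    within_month_slopes.append(
        ols_slope([math.log(p) for p in prices],
                  [math.log(g) for g in gallons]))
    monthly_log_price.append(math.log(sum(prices) / len(prices)))
    monthly_log_gallons.append(math.log(sum(gallons)))

daily_elasticity = sum(within_month_slopes) / len(within_month_slopes)
monthly_elasticity = ols_slope(monthly_log_price, monthly_log_gallons)
print(f"daily (within-month) elasticity: {daily_elasticity:.2f}")
print(f"monthly-average elasticity:      {monthly_elasticity:.2f}")
```

In this toy the daily data recover the strong timing response while the monthly averages show essentially none. Real consumers are not this tidy, but the sketch shows how averaging alone can hide a response that is plainly there at the daily level.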
If you work in data science or economics around pricing or demand modeling, I’d encourage you to read the articles referenced in this piece. As an example, a study by Levin et al. referenced from the article is a powerful reminder of the importance of using data that is appropriately granular. The study shows how daily pricing fluctuations are much larger than would be captured by a simple smoothing technique such as a period average. This is not a minor or trivial conclusion, as I suspect anyone charged with pricing can attest. Pricing is tactical as much as strategic. While the movements over time may average out, it does not follow that we can just “zoom out” and look at behavior over longer time periods. Such smoothing or averaging is analytically risky (in my view) wherever someone can easily delay a purchase decision for a short period or alter it (e.g. go to a different place). If we aren’t measuring closely enough, we will miss the actual decisions that consumers make in the blur of averaged prices and purchases made within a larger locale like a city.
In this same study, the authors noted a very strong “within-week” pattern for fuel purchases. I can imagine a headline quoting what they said in the article: “...consumers buy roughly 17 percent more gasoline on Fridays than the daily average and buy 15 percent less on Sundays than the daily average…” On the surface, the conclusion is interesting, but digging deeper alerts me to another potential fallacy relating to data depth. In this case "purchase amount" blurs the transactional frequency and the per-transaction expenditure in favor of only total spend. A consumer might reasonably decide to buy less at a moment in time (like my father only putting a few dollars worth in the tank until he could find a cheaper source later). The Levin et al study concludes that expenditure-per-transaction doesn’t vary nearly as much as the transaction frequency for "within-week" analysis. Put another way, most of the variation by day in total expenditure comes from the frequency of transaction, rather than from buying more or less in a transaction. You might not notice this if you were not careful in your model to analyze both the frequency of transaction and the amount-per-transaction, and in the same line of reasoning you might miss the opposite result for the same reason in different situations. Blurring to “total expenditures” could (and probably would) create another aggregation bias which could mask important insights about demand, hiding the true number of purchase decisions and the intensity or propensity-to-buy for each.
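One way to keep frequency and amount-per-transaction apart is to compute both alongside the total for each day. The sketch below does that on a handful of made-up transactions. The data and the function name are hypothetical and are not taken from the study.

```python
from collections import defaultdict

# Hypothetical transactions: (day_of_week, dollars_spent).
transactions = [
    ("Fri", 30.0), ("Fri", 32.0), ("Fri", 28.0), ("Fri", 30.0),
    ("Sat", 31.0), ("Sat", 29.0), ("Sat", 30.0),
    ("Sun", 29.0), ("Sun", 31.0),
]


def decompose_by_day(rows):
    """Split total spend per day into transaction count and average spend."""
    spend_by_day = defaultdict(list)
    for day, dollars in rows:
        spend_by_day[day].append(dollars)
    summary = {}
    for day, amounts in spend_by_day.items():
        count = len(amounts)
        total = sum(amounts)
        summary[day] = {
            "transactions": count,
            "avg_spend": total / count,
            "total_spend": total,  # equals transactions * avg_spend
        }
    return summary


summary = decompose_by_day(transactions)
for day, stats in summary.items():
    print(day, stats)
```

In this invented sample the Friday total is double the Sunday total, yet the average ticket is identical on every day, so all of the variation comes from how often people buy. Looking only at total spend would not tell you that.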
As if this were not enough, then consider the problem of different grades. The Levin et al study asserts that about 15% of the fuel purchased in the US is for a higher grade than standard. How might the choice of grade vary with price or price delta? Without considering the specifics of each transaction, we'd lose this in the blur of averages.
Maybe consumers (in the US for this study) buy fuel in relatively the same amounts each time, at the same grade. We can’t just assume this to be true for every situation. By considering transactional frequency, basket size, and product choice as separate (if not completely independent) variables, we get a much clearer picture of how, when and where the purchasing happens. How much you care about these insights depends on who you are in the food chain of fuel distribution and marketing. If you are the one selling fuel on the street, I can guarantee you that it is important to understand these variations since you win or lose at a specific location based on what happens each day.
Using the economists’ scaling (which differs from what I see as the typical convention for retailers, though the results are the same in the end), an elasticity of -1 means that for every 1 percent increase in the price of fuel, consumption falls by 1 percent. A value of 0 means that consumption does not change in response to a change in price. The researchers at the Fed who wrote the piece I referenced note several sources that estimate elasticity for fuel in the US at between -0.01 and -0.08, or nearly inelastic demand. Several recent studies quoted by the Dallas Fed researchers, covering the US and Japan and using much more granular data, concluded the elasticity was in a range of -0.27 to -0.37.
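To get a feel for how far apart those two ranges are, here is a small sketch that converts an elasticity into a change in consumption for a given price change. It assumes a constant-elasticity demand curve, which is my simplification and not necessarily the model used in the studies. The function name and the 10 percent price rise are illustrative.

```python
def pct_change_in_quantity(elasticity, pct_price_change):
    """Percent change in quantity for a percent change in price,
    assuming constant-elasticity demand: Q1/Q0 = (P1/P0) ** elasticity."""
    price_ratio = 1.0 + pct_price_change / 100.0
    return (price_ratio ** elasticity - 1.0) * 100.0


# Endpoints of the two elasticity ranges quoted above.
for e in (-0.01, -0.08, -0.27, -0.37):
    change = pct_change_in_quantity(e, 10.0)
    print(f"elasticity {e:+.2f}: 10% price rise -> {change:+.2f}% quantity")
```

Under that assumption, a 10 percent price rise trims consumption by well under 1 percent at the older estimates and by roughly 2.5 to 3.5 percent at the newer ones.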
If you are a pricing or economics wonk, your jaw may be hitting the floor as you read this. If not, let me just say that this is an enormous difference - on the level of the difference between ice cold and pretty warm. From my days serving convenience and fuel retailers, I know how seriously the retailers take pricing of fuel. The frequency and degree of price changes that most retailers make, with high variation across outlets, tell me that they knew already what these studies confirmed - local and day-over-day differences present a very different picture than a broadly summarized model would. Were we to make the same mistaken assumptions in pricing for other goods and services, we would be fighting a trend and probably driving the wrong strategy for those products.
Coming full circle to those trips with my dad when I was learning about price elasticity without even realizing it, I’m making three mental notes for my next demand estimation and modeling exercise. Remembering how he would decide whether to drive a bit further or not, and whether to wait a few days or not, for his next fuel purchase, I can see a real example of aggregation bias which I didn’t even know existed. This lesson of misunderstanding fuel price elasticity should be a reminder to us of the dangers of aggregation bias. If my fuel retailer colleagues had taken these incorrect assumptions to heart, they’d have been losing their shirts to the competition who were (correctly) treating pricing as both tactical and strategic.
Here’s how I am summarizing my takeaways, as good reminders:
- Acquire data at the transactional level wherever possible, and in any case do not aggregate too highly when blending in other data.
For most goods and services, I’d say daily aggregation would be a good tradeoff for comparison with other data sources. Less frequent (i.e. higher interval) purchases can probably use higher aggregations, while commodities and financial assets would likely need intra-day.
- Use high spatial, channel or geographic resolution so you capture the tactics consumers might use to get a better deal by choosing a different source.
Even better, see if it is possible to model the additional effort or cost of using a different source as a possible secondary, differential, or "up-sell" demand indicator. For retail goods, you would want to consider a level below MSA (in the States), postal code or other locale - think about how to understand “intra-city” or “cross channel” travel that might be lost in aggregated purchase statistics. Even “intra-business” differences might be important, such as a customer choosing a different fulfillment method or visiting a different format (hyper vs local store).
- Determine how to model differences in purchase quantity over space and time, especially for goods and services where it is possible to buy more or less because of an ability to store.
We’ve all seen the near hoarding of paper products during COVID-19 as the most extreme example of “pantry loading”. How much our customers buy when they do make a decision is also very important, particularly when they have a lot of control over timing.
If economists can (apparently) estimate fuel price elasticity incorrectly for decades, then I think we can all agree we might have other similar problems in our own respective models. I’m taking an extra look now at aggregation bias in everything I do, thanks to the wisdom of my father and some very good recent work by economists who dug deeper in the data to uncover a rich variability previously hidden by averages.