October 26, 2020
A few weeks ago I wrote the post “Do we really have to lower our Safe Withdrawal Rate to 0.5% now?” about the pretty ridiculous claim that the Safe Withdrawal Rate should go all the way down to just 0.5%, in light of today’s ultra-low interest rates. The claim was transparently false and it was great fun to debunk it. But recently I came across another proclamation of the type “We have to rethink the Safe Withdrawal Rate” – this time proposing to raise it all the way up to 5% and even 5.5%! Well, count me a skeptic on this one, too. Though I’d have to tread a bit more cautiously here because the 5.5% SWR claim doesn’t come from some random internet troll but from the “Father of the 4% Rule” himself, Bill Bengen. He’s been doing the rounds recently advocating for a 5% and even 5.5% Safe Withdrawal Rate:
- In September in a piece he wrote for FA-mag with a recommendation to raise the SWR to 5%.
- On October 1, the same article, reprinted almost verbatim under a different title in the same magazine: “Choosing The Highest Safe Withdrawal Rate At Retirement”
- On October 13 on Michael Kitces’ podcast, Bengen made another explicit SWR recommendation: “[I]n a very low inflation environment like we have now, if we had modest stocks, I wouldn’t be recommending 4.5%, I’d probably be recommending 5.25%, 5.5%” It’s not clear what made him raise the SWR by another 0.25-0.50%, though.
And the whole discussion was quickly picked up in the personal finance and FIRE community:
- On Ben Carlson’s blog a few days ago: “What If The 4% Rule For Retirement Withdrawals is Now the 5% Rule?”
- On the “My Own Advisor” blog: “Weekend Reading – Biggest stocks and ETFs, OAS, 4% or 5% rules”
- Retire by 40: Did FIRE Just Get Much Easier?
- And in a recent article on MarketWatch, although in that article Bengen took the SWR back down to 5.0%.
The main rationale for increasing the SWR: inflation has been really tame recently and will stay subdued over the coming years and even decades. That’s his forecast, not mine! Hence, Bengen makes the case that we’d have to make smaller “cost-of-living adjustments” (COLA) to our withdrawals. Smaller future aggregate withdrawals afford you larger initial withdrawals, according to Bengen. But as you might have guessed, the calculations that justify the significantly higher withdrawal rate don’t appear so convincing once you look at the details…
How does Bill Bengen justify the new 5+% number?
First, let’s take a look again at how Bill Bengen justifies the upward revision. Very simple. The bump from 4% to 5+% comes from two major adjustments proposed by Bengen:
- First, we can increase the SWR from the baseline level of 4% to about 4.5% if we get a bit more adventurous with our asset allocation. An overweight to small-cap stocks would have indeed increased your failsafe withdrawal rate to about 4.5% in historical simulations.
- Second, inflation is low right now. Lower inflation implies lower COLA in your withdrawals. And since the cumulative future nominal withdrawals are smaller we can bump up that initial withdrawal.
So far, so good. The only problem: both adjustments are standing on very, very shaky foundations. Let’s look at some of the problems in that logic…
Problem 1: Small-cap hindsight bias
If you’re an avid ERN blog reader you may know that I’ve written about the small-cap style (as well as the value and small-cap-value styles) before, most prominently in 2019: “My thoughts on Small-Cap and Value Stocks”. Yeah, sure, small-cap stocks would have indeed outperformed large-cap stocks and also the total market index for many decades, see the chart below:
But incidentally and ironically, ever since the small-cap outperformance bias was pointed out in the academic literature, about 40 years ago, small-cap stocks would have had only underwhelming success. And most recently, small-cap stocks would have vastly underperformed during the last few years, including the 2020 recession & bear market. So, pretending that the small-cap outperformance of the past will last indefinitely, almost like some fundamental law of physics, seems like bad advice to me.
Problem 2: Which small-cap index?
There isn’t one single generally accepted small-cap index. Bengen displays data for a portfolio with 30% large-cap stocks, 20% small-cap stocks, and 50% intermediate Treasury bonds. In my own simulations, I rely on the Fama-French SMB (Small-Minus-Big) factor to simulate small-cap stocks. The way to implement this in my Google Sheet (see Part 28 for details) is to set the stock market share to 50%, the bond share to 50%, and then add a 20% style shift to small-cap stocks, see below. The idea here is that 30% of the portfolio stays in large-cap stocks, and the remaining 20% in large-cap stocks plus the 20% SMB factor tilt gives you 20% small-cap stocks, i.e., Big + (Small − Big) = Small! Also notice that for this exercise we have to set all of the supplemental cash flows (see tab “Cash Flow Assist”) to zero!
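To make the Big + (Small − Big) = Small logic concrete, here is a minimal Python sketch. The function name `tilted_portfolio_return` and the three months of returns are made up for illustration. The sketch also simplifies SMB to "small-cap return minus large-cap return", whereas the actual Fama-French factor is built from sorted long-short portfolios, so treat this as a picture of the weighting arithmetic only.

```python
import numpy as np

def tilted_portfolio_return(large_cap, smb, bonds,
                            stock_share=0.50, bond_share=0.50, smb_tilt=0.20):
    """Return of a stock/bond portfolio with a small-cap style shift.

    All stock exposure is held in large caps; the SMB tilt then swaps
    part of that exposure from large-cap into small-cap returns.
    """
    large_cap, smb, bonds = map(np.asarray, (large_cap, smb, bonds))
    return stock_share * large_cap + smb_tilt * smb + bond_share * bonds

# Hypothetical monthly returns (NOT historical data).
large = np.array([0.010, -0.020, 0.015])
small = np.array([0.020, -0.035, 0.010])
bonds = np.array([0.002, 0.004, 0.001])
smb = small - large  # simplified stand-in for the SMB factor

tilted = tilted_portfolio_return(large, smb, bonds)
direct = 0.30 * large + 0.20 * small + 0.50 * bonds  # the 30/20/50 mix

print(np.allclose(tilted, direct))
```

Under that simplification, the 50/50 portfolio with a 20% tilt and the explicit 30/20/50 portfolio give identical returns.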
In the chart below, I plot the monthly safe withdrawal rates between 1925 and 1990 for the Bengen portfolio but also for a standard 60/40 and 50/50 portfolio (i.e., with only large-cap/S&P 500 index, no small-cap tilt). Notice that the 30/20/50 portfolio has asset returns only since July 1926. It appears that small-cap stocks indeed help you with the SWR on some occasions, very prominently in 1929. But small-cap stocks don’t seem like a panacea either. In 1968, your failsafe withdrawal rate dropped to 3.88%. So the claim that small-cap stocks necessarily raise your SWR all the time seems very doubtful, even for some of the cohorts that retired well before 1980!
What exactly explains the discrepancy between my failsafe calculations and Bengen’s SAFEMAX is not clear. Bengen didn’t elaborate on what exact index returns he employed. It’s possible that the S&P500+SMB gives you different results from other Small-Cap indexes, even though I found that the IVV (S&P 500) plus the Fama-French SMB in equal shares will very closely resemble the iShares IWM (Russell 2000 Small-Cap Index) using actual return data since 2000. IVV+SMB even outperformed the IWM ETF a bit between 2000 and 2020. At the very least, the small-cap boost to SWRs is not all that consistent if the alpha depends so much on which exact small-cap flavor you use.
It’s also possible that because my analysis is more granular (monthly frequency of both the start dates and withdrawals), I’d pick up some of the lower monthly SWRs that Bengen (quarterly frequency) or other researchers (often annual frequency) might miss.
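The granularity point can be sketched in a few lines of Python. This is not the code behind the Google Sheet; it is a simplified illustration that assumes capital depletion (the portfolio hits exactly zero after the last withdrawal), withdrawals at the start of each month, and synthetic random returns. The function names `cohort_swr` and `failsafe_swr` are hypothetical, and the sketch varies only the frequency of the start dates, not the withdrawal frequency.

```python
import numpy as np

def cohort_swr(real_returns):
    """Per-period withdrawal rate that exactly exhausts the portfolio.

    One withdrawal happens before each return and one after the last
    return, so len(real_returns) + 1 withdrawals in total.
    """
    r = np.asarray(real_returns, dtype=float)
    growth = np.concatenate(([1.0], np.cumprod(1.0 + r)))  # cumulative growth
    return 1.0 / np.sum(1.0 / growth)

def failsafe_swr(real_returns, horizon):
    """Lowest cohort SWR over all start dates that have a full horizon."""
    r = np.asarray(real_returns, dtype=float)
    n_starts = len(r) - (horizon - 1) + 1
    rates = np.array([cohort_swr(r[s:s + horizon - 1]) for s in range(n_starts)])
    return rates.min(), rates

# Synthetic monthly real returns (NOT historical data).
rng = np.random.default_rng(0)
monthly_real = rng.normal(0.004, 0.03, size=1200)

fail_monthly, rates = failsafe_swr(monthly_real, horizon=360)
fail_quarterly = rates[::3].min()  # only every third start date

print("failsafe, monthly start dates:  ", 12 * fail_monthly)
print("failsafe, quarterly start dates:", 12 * fail_quarterly)
```

Because the quarterly start dates are a subset of the monthly ones, the monthly failsafe can never be higher than the quarterly one, and it will be lower whenever the worst cohort starts in an off-quarter month.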
But just to be clear, for today’s post, I don’t want to delve much more into the small-cap issue. I’d rather focus on the laundry list of problems pertaining to Bengen’s second claim, that low inflation predicts high SWRs. Thus, in the remainder of the post, I’ll simply work with the historical data, with the significant and persistent small-cap alpha built-in, and show that the case for higher SWRs due to lower inflation rates still doesn’t hold water, despite including small-cap stocks. So, let’s move on to the next problem I found…
Problem 3: One-year trailing inflation is not a very good predictor for inflation over the next 30 years!
The rationale for “low inflation => high safe withdrawal rate” doesn’t work quite as well as Bengen wants us to believe, more on that below. But for the sake of the argument, let’s assume that his logic is indeed sound and a lower inflation rate allows you to raise your safe withdrawal rate. Do you notice a hole in his story, though? I do! We don’t know the inflation rate over the next 30 years. All we have right now, and all Bengen uses in his analysis is the final 12-month window before retirement to form the different buckets in Figures 5 through 10. But since it’s the future 30-year inflation rate that matters for my retirement cost-of-living adjustments, the following important question comes to my mind:
How much predictive power does the trailing 12-month inflation number have for the 30-year future inflation rate?
Unfortunately, not much. Let’s look at a scatter plot with the 12-month trailing CPI rate on the x-axis and the subsequent 30-year future realized annualized inflation rate on the y-axis. See the chart below. I call this a pretty spurious relationship with a positive but very low correlation. I display this for two different starting points: 1872 (full sample, top chart) and 1926 (partial sample, used by Bengen due to the small-cap data availability, bottom chart). Notice the final observation is 1990 because that’s where the final 30-year realized inflation window starts.
The R-squared measures are minuscule, between 0.02 and 0.04. In statistics, we sarcastically call them the “Irish R-squareds” (O’One, O’Two, O’Three, O’Four, etc.) and they are the stats lingo for “ya got nothing there, buddy!”
I should also point out that even though the correlation and slope go in the “right” direction, the positive sign of the slope is not statistically significant. If I calculate the Newey-West t-stats, which adjust for heteroskedasticity and autocorrelation (due to overlapping windows we can’t use the naïve OLS t-stats!), I get 1.32 for the full sample and 0.85 for the 1926-1990 sample. Neither makes the threshold for a significantly positive slope.
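For readers who want to see what goes into such a t-stat, below is a minimal NumPy sketch of OLS with Newey-West (Bartlett-kernel) standard errors. It is an illustration, not the calculation behind the numbers above: the inflation series is a synthetic AR(1) process, the function name `ols_newey_west` is made up, there is no small-sample correction, and the lag length of 359 months (the overlap of adjacent 30-year windows) is an assumption.

```python
import numpy as np

def ols_newey_west(y, X, lags):
    """OLS coefficients and Newey-West (Bartlett-kernel) HAC t-stats."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    u = y - X @ beta
    Xu = X * u[:, None]                # row t holds x_t * u_t
    S = Xu.T @ Xu                      # lag-0 (heteroskedasticity) term
    for l in range(1, lags + 1):
        w = 1.0 - l / (lags + 1.0)     # Bartlett weight
        G = Xu[l:].T @ Xu[:-l]         # sum over t of u_t u_{t-l} x_t x_{t-l}'
        S += w * (G + G.T)
    cov = XtX_inv @ S @ XtX_inv        # sandwich covariance of beta
    return beta, beta / np.sqrt(np.diag(cov))

# Synthetic monthly inflation in annualized % (NOT the historical CPI data).
rng = np.random.default_rng(1)
n = 1500
m = np.empty(n)
m[0] = 3.0
for i in range(1, n):
    m[i] = 3.0 + 0.9 * (m[i - 1] - 3.0) + rng.normal(0.0, 1.0)

cs = np.concatenate(([0.0], np.cumsum(m)))
t = np.arange(12, n - 360 + 1)
trailing_12m = (cs[t] - cs[t - 12]) / 12.0   # mean of months t-12 .. t-1
forward_30y = (cs[t + 360] - cs[t]) / 360.0  # mean of months t .. t+359

X = np.column_stack([np.ones(len(t)), trailing_12m])
beta, t_white = ols_newey_west(forward_30y, X, lags=0)
_, t_nw = ols_newey_west(forward_30y, X, lags=359)

print("slope:", beta[1])
print("t-stat without lag terms:", t_white[1])
print("t-stat with Newey-West lags:", t_nw[1])
```

With `lags=0` the estimator only corrects for heteroskedasticity. The lag terms are what account for the serial correlation that overlapping 30-year windows create in the residuals.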
And even if the t-stats were statistically significant, the slope estimates are only 0.0507 for the full sample and 0.0375 for the shorter 1926-1990 sample that Bengen uses. Forget about the weak statistical significance, I call those beta estimates “not economically significant” because that means that if our current 12-month trailing CPI rate (1.4%) is about 0.6 percentage points lower than the multi-decade average, then our estimate for 30-year forward inflation rates should move down by only about 0.02 to 0.03 percentage points. Even if you assume that a lower long-term inflation rate increases your expected real return one-for-one (it might not, more on that below), why would a 0.03 percentage point higher real return increase your safe withdrawal rate by 0.50 to 1.00 percentage points? The numbers don’t add up!
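The arithmetic behind that statement is just slope times inflation gap. A tiny Python check, using only the numbers quoted above (the dictionary keys are labels chosen for this sketch):

```python
# Slope estimates from the two scatter-plot regressions quoted in the text.
slopes = {"full_sample": 0.0507, "since_1926": 0.0375}

# Current trailing CPI is about 0.6 percentage points below the long-run average.
gap = 0.6

# Implied change in the 30-year inflation forecast, in percentage points.
implied = {name: slope * gap for name, slope in slopes.items()}

for name, change in implied.items():
    print(name, round(change, 4))
```

Both implied changes land between 0.02 and 0.03 percentage points, an order of magnitude below the 0.50 to 1.00 point bump in the withdrawal rate.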
Problem 4: Bengen seems to inadvertently agree with my “Problem 3” above!
I was intrigued by the 6 tables (Figures 5 through 10) where Bengen proposes withdrawal rates in different buckets. First, he creates six tables for the six different inflation regimes, and then within each table, he buckets again by the CAPE regime. Consistent with Michael Kitces’ research, different initial CAPE readings had a huge impact on your SAFEMAX, so in each table, the SAFEMAX declines as the CAPE rises.
So far, so good. But if you read along the other direction, different CPI-regimes for a fixed CAPE value, there doesn’t seem to be too much variation in the SAFEMAX values.
For example, let’s compare the two CPI buckets 0%-2.5% and 2.5%-5.0% (two very important buckets as we shall see below). Here are Bengen’s proposed SAFEMAX values for CAPE values of 14, 16, 18, and 20, see the chart below. You will notice that the difference in SAFEMAX values for the two CPI regimes 0-2.5% vs. 2.5-5.0% is minuscule. And for the CAPE=14 and 16 regimes, the SAFEMAX even goes down when we move to the lower CPI regime, exactly contradicting much of the Bengen intuition of lower inflation leading to higher SAFEMAX estimates.
So, moving between those two inflation buckets gives you essentially zero impact on your SAFEMAX (-0.03% on average), even though you move the CPI by a whopping 2.5 percentage points.
Even when considering the entire range of the CPI buckets you rarely get a noticeable difference in the SWR for a fixed CAPE. For example, for CAPE=15 you get a SAFEMAX of 7% in the lowest CPI bucket and 6% in the highest CPI bucket. The mean CPI annual rate in those buckets was -9.3% and +6.5%, respectively. If you spread out that 1 percentage point gain in the SWR over a 15.8 percentage point decline in the CPI you get a “slope” of about 0.063 per percentage point of CPI. That’s in the same ballpark as the 0.04 to 0.05 CPI betas I estimated above in the 1y vs. 30y inflation scatterplots. How can a tiny drop in the inflation rate justify raising the SWR to 5% or even 5.5%?
Problem 5: Ignoring modern statistical data analysis techniques
One problem that immediately jumped at me when I saw all the tables in Bengen’s FA article: When you start with the historical retirement cohorts and you sort them and put them into different bins along two dimensions (inflation and CAPE regime), you might end up with very few observations in each bucket. How many of the findings are actually statistically significant? I took the time to calculate the number of observations in the following buckets:
- 7 CAPE regimes: <12, 12-14, 14-16, 16-18, 18-20, 20-22, >22, roughly in line with most of the bins used in Bengen’s paper
- 6 inflation regimes as in the Bengen paper: <-5, -5 to -2.5, -2.5 to 0, 0-2.5, 2.5-5, >5
The numbers of cohorts starting in the various buckets are in the table below. Do you notice a problem here? Even though we have 768 total observations (cohorts starting retirement between July 1926 and June 1990), the number of observations gets very sparse in most of the buckets. Sometimes down to zero. So, if someone shows me SAFEMAX rates that differ by about a whole percentage point when going from -5% inflation to +5% inflation I wonder if this is just spurious. Can Bengen quantify his confidence in any of his numbers?
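Counting cohorts per bucket is a simple two-way binning exercise. The Python sketch below uses the 7 CAPE and 6 inflation regimes listed above, but the CAPE and CPI values are randomly generated placeholders rather than the historical series, so the counts it prints will not match the table. The function name `bucket_counts` and the treatment of values that sit exactly on a bin edge (assigned to the higher bin) are assumptions.

```python
import numpy as np

# Bin edges for the regimes listed above.
cape_edges = np.array([12, 14, 16, 18, 20, 22])    # gives 7 CAPE regimes
cpi_edges = np.array([-5.0, -2.5, 0.0, 2.5, 5.0])  # gives 6 inflation regimes

def bucket_counts(cape, cpi):
    """Cohort counts by (CAPE regime, inflation regime)."""
    ci = np.digitize(cape, cape_edges)  # 0 means <12, 6 means >=22
    ii = np.digitize(cpi, cpi_edges)    # 0 means <-5, 5 means >=5
    counts = np.zeros((len(cape_edges) + 1, len(cpi_edges) + 1), dtype=int)
    np.add.at(counts, (ci, ii), 1)      # add one per cohort to its cell
    return counts

# Placeholder data for 768 monthly cohorts (NOT the historical values).
rng = np.random.default_rng(42)
cape = rng.normal(15.0, 5.0, size=768)
cpi = rng.normal(3.0, 4.0, size=768)

counts = bucket_counts(cape, cpi)
print(counts)
print("empty buckets:", int((counts == 0).sum()))
print("buckets with fewer than 10 cohorts:", int((counts < 10).sum()))
```

Even with 768 cohorts, splitting them over 42 cells leaves many cells thin, which is the sparsity problem described above.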
But it gets even worse. Someone might argue that even for the high-CAPE regime (>22) we do have 10, 35, and 11 observations in three different inflation regime bins. It’s a low number, but it’s something you can work with if you know your statistics. But there’s a catch. Let’s look at how the 10+35+11=56 observations are distributed over time. In the chart below, I plot the entire SWR time series for the 30/20/50 asset allocation and I mark in blue, green, and maroon the three different inflation regimes for the CAPE>22 buckets. Notice something? Even though we have 56 different observations they all cluster around the three market peaks, 1929, 1937, and the 1960s. These are not all independent observations because most of the return data windows overlap here. So, the number of true observations just went from 10, 35, and 11 in the three bins to 1, 3, and 2 if we kick out the observations that are not truly independent. With such a low number of observations, we’re entering statistical la-la-land if we want to draw sweeping conclusions like raising the safe withdrawal amount by a whole percentage point!
To get a handle on how much of the inflation story in Bengen’s paper actually holds water from a quantifiable, statistical point of view, I propose the following: We run a regression to find out how much of the variation in SWRs is explained by the CAPE and the inflation rate. And is the inflation-beta even statistically significant? Granted, the regression analysis does not exactly generate the SAFEMAX/failsafe we’re interested in. Think of a univariate regression where the regression line goes through the cluster of points, not underneath it. But the slopes that refer to the mean/point forecast for the SWR aren’t materially different from those that apply to the tail estimates. (More details for the stats wonks: the out-of-sample point forecast moves linearly according to the betas, of course. The variance around a forecast for an out-of-sample independent variable x0 is s^2*[1+x0’*inv(X’X)*x0], and will change for different values of x0, but only very little for the CPI values that we’re interested in, say, between 0% and +5%.)
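The forecast-variance formula in the parenthetical can be checked numerically. The sketch below is a hedged illustration under the textbook assumption of i.i.d. errors: the regressors, coefficients, and noise level are all invented, the earnings yield of 3.3% is just a stand-in for a CAPE around 30, and `forecast_variance` is a hypothetical helper name.

```python
import numpy as np

def forecast_variance(X, resid, x0):
    """s^2 * [1 + x0' inv(X'X) x0] for an out-of-sample point x0."""
    n, k = X.shape
    s2 = resid @ resid / (n - k)                  # residual variance
    leverage = x0 @ np.linalg.inv(X.T @ X) @ x0   # distance of x0 from the data
    return s2 * (1.0 + leverage)

# Invented data for 768 cohorts (NOT the historical series).
rng = np.random.default_rng(7)
n = 768
earnings_yield = rng.normal(7.0, 2.5, n)  # in %
cpi = rng.normal(3.0, 4.0, n)             # trailing CPI in %
X = np.column_stack([np.ones(n), earnings_yield, cpi])
swr = 1.0 + 0.5 * earnings_yield + 0.02 * cpi + rng.normal(0.0, 0.8, n)

beta = np.linalg.lstsq(X, swr, rcond=None)[0]
resid = swr - X @ beta

v_cpi0 = forecast_variance(X, resid, np.array([1.0, 3.3, 0.0]))
v_cpi5 = forecast_variance(X, resid, np.array([1.0, 3.3, 5.0]))
print(v_cpi0, v_cpi5, v_cpi5 / v_cpi0)
```

With several hundred observations the x0’*inv(X’X)*x0 term is tiny relative to 1, so the forecast variance barely moves as the CPI input goes from 0% to 5%. That is why the slopes for the point forecast and for a tail estimate end up so similar.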
The regression results are in the table below. Obviously, the earnings yield (=the inverse of the Shiller CAPE) has very meaningful and statistically significant explanatory power for the SWR. But 1-year trailing inflation is again both statistically and economically insignificant! The t-stat is only about 1.25 and the magnitude of the slope is so low that going from one CPI bin to another doesn’t seem to make much of a difference. Going from the 2.5-5.0 bin to the 0.0-2.5 bin would warrant a 2.5×0.019%=0.05% higher SWR. And of course, we knew that already, even from Bengen’s own data!
And by the way, when I say modern data analysis tools, I don’t even mean anything really modern. The concept of regression analysis has been around for about 100 years. The seminal research by Whitney Newey and Ken West to adjust t-stats for heteroskedasticity and autocorrelation was published in 1987 – before some of the readers here were born. So, not exactly rocket science. But it certainly beats bumbling and fumbling around SAFEMAX guesstimates in the 38 different bins when most bins have only between 1 and 3 true, independent observations!
Problem 6: Data-snooping
Initially, I had trouble wrapping my head around some of Bengen’s results. Why is the failsafe withdrawal rate (SAFEMAX in his nomenclature) so high right now? But then I read more carefully and found the explanation. Bengen writes in the FA-mag article:
“I ignored data for which the 30-year price-earnings ratios were 30% more or less than the long-term average of 17.05; in other words, I anticipated approximate reversion to the mean of the Shiller CAPE. I felt that much larger deviations created unrealistically high or low safe withdrawal values. Even so, my permissible 30-year CAPE ratio has a wide range, from approximately 12 to 22.”
Okkkayyy? The only problem with this approach is: We currently do have a CAPE significantly more than 30% above the long-term average. At the current level of about 30, we are a cool 75% above the long-term average of 17. Why would I want to ignore previous market peaks at which we had similarly overvalued CAPE ratios? It’s like you want to estimate the probability of rainfall but you ignore all prior historical data when you had cloudy skies. And then you apply those probabilities to a situation today when you have some dark clouds rolling in and you can hear some thunder in the distance already. What use is that forecast?
But I don’t want to rub in this issue too much. Maybe Bengen just misspoke (miswrote?) on this issue because in Figures 8 and 9 he does have categories “22 or Greater” and “23 or Greater” for the CAPE bins. So, maybe he does factor in the high CAPE ratios. For example, he also seems to concede that a 4.5% SAFEMAX was necessary during the high-CAPE regime in 1968. And again, with my own calculations, I find a 3.88% failsafe (Dec 1, 1968 cohort), even when using the 20% small-cap stocks, much lower than his estimate. But even if Bengen operates with a 4.5% SAFEMAX in 1968 when the CAPE was 22.2 (Nov 30, 1968), how can he then recommend 5% or even 5.5% today when the CAPE is standing at 31, about 40% higher than in 1968?
Bengen seems to suggest that because the previous SAFEMAX low of 4.5% occurred while the CPI was in the 2.5-5.0% range, we can now completely ignore today’s high CAPE ratio because we’re in the 0.0-2.5% CPI range. I find this claim almost as offensive as completely ignoring the CAPE>22 observations. For all the other CAPE ratios (14, 16, 18, and 20), moving between the two inflation bins had essentially zero impact on the SWR (ranging from -0.25% to +0.25% with an average of -0.03%). But for a CAPE just 2 points higher (22 in 1968), the inflation regime suddenly makes a tremendous difference of a whole percentage point? That sounds really odd.
Think about it this way: the fatality rate while climbing K2, one of the deadliest mountains in the world, is almost 30%. But there have been zero fatalities among 6’6″ tall German-Americans from Camas, Washington on that peak so far. So I can just start tackling a summit push on K2 without any risk, right? Uhm, no! You can always slice and dice away previous disasters by inventing artificial and arbitrary additional constraints and eliminating some or even all of the data points that don’t fit your message. But you can’t slice and dice away the underlying fundamental risk. Just like you can’t eliminate the risk of a high CAPE ratio by pointing out that today’s CPI environment is different from 1968. One shouldn’t do that when the CPI has a very spurious relationship with SWRs in the overall sample.
Problem 7: Not even the 30-year “perfect foresight” inflation number would have helped you that much in pinning down the SWR!
As we saw in the plot above, the 1-year trailing CPI inflation number is pretty much uncorrelated with the safe withdrawal rate (SAFEMAX) over the next 30 years. There’s just too much noise in annual CPI numbers in the historical data. But let’s do a thought experiment to really drive home the futility of linking SWRs to inflation. Imagine all of the historical retirement cohorts had known for sure(!) what the realized annualized CPI inflation rate over the next 30 years would be. Then how much would that have helped in pinning down the safe withdrawal rate? Let’s run the same SWR linear regression as above, with the SWR as the dependent variable and a constant, the CAPE earnings yield, and the 30-year ahead perfect foresight CPI inflation rate (annualized) as regressors. Would that be picked up in a statistically meaningful way? Let’s take a look at the regression results. I do this for three different dependent variables, i.e., the historical SWRs for the Bengen 30/20/50 portfolio, but also, without small caps, for the 50/50 and 60/40 Stock/Bond portfolios. The results are in the table below:
So, even if you could have perfectly estimated the future annualized CPI rate over the next 30 years, you can’t raise your SWR one-for-one. You can increase your SWR by between 0.25 and 0.54 percentage points for every 1 percentage point reduction in the annual inflation rate. The R^2 goes from 0.673 in the baseline regression using trailing CPI to only 0.795. The CAPE earnings yield is still the overwhelmingly important factor in accounting for the SWRs. That’s quite amazing considering the CAPE is actually known at the start of retirement, while the 30-year CPI number is not! The punchline here: don’t even attempt to read much into different inflation regimes. Even if the future inflation rate is lower by x%, you can’t take the entire x% to the bank and translate that into an x% higher SWR!
Problem 8: Don’t put the nominal cart in front of the real horse
The logic “lower inflation implies higher withdrawal rates” doesn’t hold up in the data as we saw in the calculations above. That’s odd! Remember, there is the Fisher Equation:
Real Return = Nominal Return – Inflation
Isn’t that the definitive proof that the real return should move one-for-one (or one for minus one) with inflation? Inflation goes down by a percentage point, then the real return has to go up, right? Wrong. It’s only true if the nominal return doesn’t change in response to the lower inflation. But that’s not guaranteed!
Look at the following prominent example where inflation went down and you clearly didn’t get to pocket a nice fat raise in real returns. In the early 1990s, Brazil had double-digit inflation. Every month! Which translated into about 4,000-5,000% annualized inflation. Sure, the equity returns were pretty impressive, around 7,000% year-over-year in 1994, but unfortunately, a lot of that came just from the inflation mirage. And of course, we all know what happened. Brazil got its act together, cleaned up its monetary policy, and achieved single-digit annualized inflation for almost the entire next 25 years (only two short stints with Y/Y inflation slightly above 10%). So, according to Bill Bengen’s logic, could people have retired in 1994 with a 7,000% annual withdrawal rate? Hey, inflation went down, right? Unfortunately, nominal returns went down, too. Who would have thought?!
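At inflation rates like Brazil’s, the simple subtraction form of the Fisher Equation is badly off, and the exact, compounded form is needed. A small Python sketch, taking the round figures above at face value (7,000% nominal equity return, 4,000% to 5,000% inflation); `real_return` is a name chosen for this example:

```python
def real_return(nominal, inflation):
    """Exact Fisher relation: (1 + real) = (1 + nominal) / (1 + inflation)."""
    return (1.0 + nominal) / (1.0 + inflation) - 1.0

nominal = 70.0  # 7,000% expressed as a decimal

for inflation in (40.0, 50.0):  # 4,000% and 5,000%
    exact = real_return(nominal, inflation)
    naive = nominal - inflation  # the subtraction shortcut
    print(f"inflation {inflation:.0%}: exact real {exact:.1%}, shortcut {naive:.0%}")

# For everyday numbers the shortcut is fine: 3% nominal, 2% inflation.
print(f"{real_return(0.03, 0.02):.4%}")
```

On these ballpark inputs the exact real return comes out somewhere between roughly 39% and 73%, a very good year but nowhere near the 2,000% to 3,000% the shortcut suggests.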
So, this case study from Brazil provides a stark example of a simple economic fact: real returns, especially real equity returns are often tied to real economic fundamentals: productivity growth, population growth, etc. If we move to a different inflation regime you don’t necessarily get to pocket the entire difference in inflation as a guaranteed boost in the real return.
It’s also possible that real returns stay roughly the same and nominal returns just move hand-in-hand with the inflation regime. True, Brazil probably noticed a bit of a boost in real productivity in light of getting its macroeconomic policy act together. But I have a hard time rationalizing how a move from a 2% trend inflation to 1.4% inflation provides much of an economic game-changer. If anything, I’d be more concerned about lower real growth prospects in most of the developed world going forward, in light of rising debt levels and the resulting political uncertainty, rising inequality, etc.
But just to be sure, I’m also the first to concede that low and stable inflation is generally a good environment for real bond returns and even stocks. So, I’m not surprised that there’s a slightly positive slope between realized inflation and your SWR, as we saw in the regression above. But again, the relationship is not 1-for-1. The concern I’d have for bond investments, in particular, is that we’ve already had very low and stable 2% inflation for the last 20 years. Significantly below 2% over the last 10 years. A low inflation regime is already baked into today’s asset valuation landscape. How much lower are we going to go to pump up bond returns much more? How about the risk of all that debt blowing up in our face and creating a huge inflation wave over the next 30 years? Then we might even be grateful that the beta is lower than 1 and we don’t have to reduce our SWR one-for-one if future inflation goes through the roof!
Problem 9: Not seeing the forest for the trees.
Independent of the analysis above, the whole idea of a 5.5% safe withdrawal rate in today’s environment just doesn’t pass the smell test. Here’s another way to illustrate how dangerous the 5+% safe withdrawal rate is in today’s financial environment. You have a 50% bond share. Current Treasury yields are well below the inflation rate. For example, the nominal 10-year yield was 0.841% as of Friday, October 23. The most recent 12-month rolling CPI inflation rate was 1.41% (9/2020 vs 9/2019); a real yield of around -0.57% based on trailing inflation. The current TIPS yield is even worse: -0.89%, which seems to indicate that the financial market 10-year inflation forecast is actually a bit higher than the 12-month trailing figure.
If 50% of the portfolio has a negative real yield, your equity portfolio has to do double-duty. How likely is that with a CAPE above 30? So, in other words, you can slice and dice all your historical data any way you want and hide the historical bad Sequence Risk episodes of cohorts that retired during high-CAPE-ratio regimes. But you can’t escape simple math. Today, you’d need significantly above-average real equity returns to make this work, which seems highly unlikely when the CAPE is 75% above its long-term average. I’m not saying that above-average equity returns are impossible, I’m just saying that, historically, that’s how returns and valuations in equity markets have worked out. When I calibrate a Safe Withdrawal Rate, the emphasis is on the word “Safe”. Setting your withdrawal rate to something you “hope” might happen if you get lucky over the next few decades is not a SAFEMAX. It’s a HOPEMAX.
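Here is a back-of-the-envelope Python sketch of that “double-duty” math. It rests on strong simplifying assumptions that are not part of the analysis above: returns are constant every year (so no Sequence Risk at all), withdrawals happen at the start of each year, the portfolio is run down to exactly zero after 30 years, and the 50/50 mix is rebalanced annually. The function names are hypothetical. The yields and the CPI figure are the ones quoted above.

```python
def pv_of_withdrawals(rate, r, years):
    """Present value of `years` start-of-year withdrawals of size `rate`."""
    return rate * sum((1.0 + r) ** -t for t in range(years))

def required_constant_return(rate, years, lo=-0.5, hi=1.0, tol=1e-10):
    """Constant real return at which the withdrawals just exhaust the portfolio."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pv_of_withdrawals(rate, mid, years) > 1.0:
            lo = mid  # withdrawals worth more than the portfolio: need a higher return
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Figures quoted in the text, in percent.
nominal_10y = 0.841
trailing_cpi = 1.41
tips_10y = -0.89

real_yield_trailing = nominal_10y - trailing_cpi  # real yield vs. trailing CPI
breakeven_inflation = nominal_10y - tips_10y      # market-implied 10-year inflation

r_needed = required_constant_return(0.055, 30)
print(f"real yield (trailing CPI): {real_yield_trailing:.2f}%")
print(f"breakeven inflation: {breakeven_inflation:.2f}%")
print(f"portfolio real return needed for 5.5% over 30 years: {r_needed:.2%}")

for bond_real in (real_yield_trailing / 100.0, tips_10y / 100.0):
    equity_needed = (r_needed - 0.5 * bond_real) / 0.5  # solve 0.5*eq + 0.5*bond = r
    print(f"bond real yield {bond_real:.2%} -> equity real return needed {equity_needed:.2%}")
```

Even in this best case without any bad return sequence, the equity half has to earn roughly twice the portfolio-level real return plus the bond shortfall. A truly safe rate has to survive bad sequences on top of that.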
The Bengen Study in the FA-mag and the follow-up interviews on Kitces.com and MarketWatch have not convinced me that we can materially increase our Safe Withdrawal Rate. First, and that’s a bit of a side-issue, small-cap stocks might have given you a bit of a return boost between 1926 and 1980. But you’d be best served not taking that for granted going forward, because a) the small-cap outperformance party might be over now and b) even for some of the earlier retirement cohorts (e.g. Nov/Dec 1968), the Fama-French SMB bias didn’t give you a very consistent boost in your failsafe rate.
Secondly, and most importantly, the low-inflation story doesn’t make any sense. A lot of people reached out and asked me for my opinion and I’d normally give a quick response in line with Problem 9 above. But I thought it’s worthwhile to do a more careful analysis and see where the skeletons are hidden in Bill Bengen’s study. The connection between 12-month CPI and future 30-year safe withdrawal rates is just too spurious.
Of course, as always, I want to point out that in some of the case studies I’ve done, I’ve routinely recommended initial rates of 5% or even 5.5% for early retirees who have to bridge only 10-15 years until generous pension and Social Security benefits start. But as a baseline over 30 years of flat withdrawals, I’d find 5% and especially 5.5% really irresponsible. And the recommendation of 50% bonds and 5.5% SWR would be particularly irresponsible for my friends in the FIRE community with a retirement horizon of 50+ years.
Thanks for stopping by today! Please leave your comments and suggestions below! Also, make sure you check out the other parts of the series, see here for a guide to the different parts so far!