Comparison of Different P2P Lending Investment Strategies

[Editor’s note: This is a guest post by Bryce Mason, the founder of P2P-Picks (go here to read my review of his service). He has done some groundbreaking research here by analyzing various Lending Club investment strategies from different bloggers and calculating their real return based on historical loan data. While Bryce is obviously trying to promote his own service, I believe the analysis he has done here can benefit all investors.]

There are many opinions about P2P portfolio selection strategies, but not much analysis that allows transparent comparisons among them. At least a dozen bloggers have published strategies that aim to increase return on investment (ROI). There are also a number of pay services with proprietary strategies aimed at accredited investors (via investment funds) and regular account holders (via research subscription sites like InterestRadar and P2P-Picks). P2P investors must try to put different strategies on a level playing field and judge their value fairly, such as in the following graphic.

[Figure: P2P-Picks risk vs. reward analysis]

The purpose of this blog post is to provide a framework for individual investors to make more valid comparisons across strategies.

1 Strategies

Strategies for this post include those authored by various people around the web who have published filters—the criteria by which they select loans for investment. We followed their rules and developed lists of all LendingClub loans issued from 2008-Q3 to 2009-Q3 that each strategy would have selected, and then we simulated a $25 investment in each one. Due to the nature of proprietary strategies, we could only include our own, the P2P-Picks Profit Maximizer, and tested its three levels of recommendations: the predicted Top 1%, Top 5%, and Top 10% of all loans. So as not to use information that would have been unknowable at the time, we developed a special P2P-Picks model just for this purpose, based entirely on loans issued prior to 2008-Q3.
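The selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Loan` record, its field names, and `example_filter` are hypothetical stand-ins, not any blogger's actual criteria or the real LendingClub data layout. Only the $25 note size and the 2008-Q3 to 2009-Q3 issue window come from the text.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Dict, Iterable

NOTE_SIZE = 25.00                 # dollars per selected loan, as in the post
WINDOW_START = date(2008, 7, 1)   # first day of 2008-Q3
WINDOW_END = date(2009, 9, 30)    # last day of 2009-Q3


@dataclass(frozen=True)
class Loan:
    # Hypothetical, simplified record; real loan data has many more fields.
    loan_id: int
    issue_date: date
    grade: str
    inquiries_last_6m: int


def example_filter(loan: Loan) -> bool:
    # Hypothetical filter for illustration: grade C or D, zero recent inquiries.
    return loan.grade in {"C", "D"} and loan.inquiries_last_6m == 0


def simulate_portfolio(
    loans: Iterable[Loan], rule: Callable[[Loan], bool]
) -> Dict[int, float]:
    # Buy one fixed-size note in every in-window loan that passes the rule.
    return {
        loan.loan_id: NOTE_SIZE
        for loan in loans
        if WINDOW_START <= loan.issue_date <= WINDOW_END and rule(loan)
    }


loans = [
    Loan(1, date(2008, 8, 15), "C", 0),   # passes the filter
    Loan(2, date(2009, 1, 10), "D", 2),   # rejected: has recent inquiries
    Loan(3, date(2009, 12, 1), "C", 0),   # rejected: issued after the window
    Loan(4, date(2009, 9, 30), "D", 0),   # passes (last day of the window)
]
portfolio = simulate_portfolio(loans, example_filter)
print(portfolio)
```

Because every selected loan gets the same note size, the dollar amount only scales the cash flows; what matters is that it is held constant across strategies.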

2 Outcomes to Consider

At least three outcomes must be considered for an investment strategy.

First is a measure of its ROI. A good measure of ROI should be based on actual performance over the entire course of the portfolio (or monthly performance of portfolios that have an average age well beyond the point at which defaults have peaked). This is important because some strategies advertise high performance on youthful portfolios, or paper over older defaults with an increasing portfolio size. In those cases, it’s easy to hide future losses that will eventually come home to roost. Additional considerations include interpretability (such as using an annualized measure of performance) and specificity (measuring only the performance of the strategy and not confounding it with investor behavior, such as allowing cash drag to eat away at returns under one strategy but not another).

For ROI we used Excel’s XIRR function, which takes a dated schedule of cash flows (investments and repayments on various dates, which need not be evenly spaced) and uses a numeric approximation method to generate an annualized return. Thankfully, we have payment dates and amounts for all LendingClub loans. This is an excellent measure of ROI for all of the reasons above. It is based on actual performance. It covers the entire investment period because loans issued at the tail end of our experiment have already matured and their outcomes are known (it was useful to wait a few additional months past maturity for late payments to trickle in). It is extremely interpretable. And it solely measures the strategy performance because only the dates of investment and repayment are considered (no cash drag is measured and no assumptions about reinvestment need be made). All ROI calculations are net of the 1% LendingClub servicing fee and, for the P2P-Picks models, the approximate 1% subscription fee.
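For readers without Excel, the idea behind XIRR can be sketched in Python. It finds the annual rate at which the discounted value of all dated cash flows sums to zero, with each flow discounted by its actual number of days from the first flow divided by 365. The solver below uses plain bisection, which is a substitute chosen for simplicity and is not Excel's own iteration. The function names, search bounds, and sample cash flows are assumptions for illustration, and fees are not modeled.

```python
from datetime import date
from typing import List, Tuple

CashFlow = Tuple[date, float]  # (date, amount); investments are negative


def xnpv(rate: float, flows: List[CashFlow]) -> float:
    # Net present value with actual-day discounting relative to the first flow.
    t0 = flows[0][0]
    return sum(amt / (1.0 + rate) ** ((d - t0).days / 365.0) for d, amt in flows)


def xirr(flows: List[CashFlow], lo: float = -0.99, hi: float = 10.0,
         tol: float = 1e-9) -> float:
    # Bisection on xnpv; assumes the sign of xnpv changes between lo and hi.
    f_lo, f_hi = xnpv(lo, flows), xnpv(hi, flows)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change in [lo, hi]; cannot bracket a root")
    for _ in range(200):
        mid = (lo + hi) / 2.0
        f_mid = xnpv(mid, flows)
        if abs(f_mid) < tol or (hi - lo) < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2.0


# Hypothetical example: $25 invested, $27.50 returned exactly 365 days later.
flows = [(date(2009, 1, 1), -25.00), (date(2010, 1, 1), 27.50)]
print(round(xirr(flows), 4))
```

For a whole strategy, the flows of every selected note would be pooled into one schedule before solving, so the result reflects only the timing and size of investments and repayments.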

Second is a measure of volume—or how much money can realistically be invested in a given period of time. Deploying $1,000 is easier than $1M, and different investors will have different needs. For this measure, we simply looked at the percentage of all LendingClub loans that each strategy recommended.
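As a rough illustration, the volume measure is just a ratio of counts. The sketch below uses made-up loan IDs and assumes that a loan recommended more than once by the same strategy is counted a single time.

```python
from typing import Iterable


def volume_share(selected_ids: Iterable[int], all_ids: Iterable[int]) -> float:
    # Percentage of all issued loans that a strategy recommended.
    universe = set(all_ids)
    if not universe:
        raise ValueError("the loan universe is empty")
    picked = set(selected_ids) & universe  # de-duplicate, ignore unknown IDs
    return 100.0 * len(picked) / len(universe)


# Hypothetical data: 200 issued loans, 5 distinct recommendations (7 repeats).
all_ids = range(1, 201)
selected_ids = [3, 7, 7, 42, 150, 199]
print(volume_share(selected_ids, all_ids))
```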

Third is a measure of the portfolio’s risk. Few investors consider risk explicitly; many simply assume that interest rates compensate for risk entirely. For the same reasons as for ROI, a good risk measure should look over the entire course of the portfolio and measure variability in the outcome. We chose to construct a simple risk metric based on each loan’s ultimate repayment amounts, and then aggregated them up to the portfolio level. More information about the definition of this metric can be found in the P2P-Picks white paper. Portfolios with more loan losses received a higher risk score.

3 Results

From the above figure, one can see that there is large variation in portfolio performance. In this case, we see that a few strategies appear to provide outsized ROI and less risk, and a few provide worse ROI and more risk than the LendingClub Index (an equal investment in every note). Once performance has been plotted in this way, preferred strategies become obvious. No strategy to the right and below the LendingClub Index would have been a worthwhile investment, as it offered lower returns, more risk, and less volume.
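The comparison against the index can be written as a simple dominance check: a strategy is not worthwhile if the index beats it on return, risk, and volume at once. The sketch below uses invented numbers and a hypothetical `StrategyResult` record; it does not reproduce any result from the figure.

```python
from typing import NamedTuple


class StrategyResult(NamedTuple):
    name: str
    roi: float      # annualized return, percent
    risk: float     # risk score; higher means more risk
    volume: float   # percent of all loans recommended


def dominated_by(candidate: StrategyResult, benchmark: StrategyResult) -> bool:
    # True when the benchmark has higher ROI, lower risk, and higher volume.
    return (
        candidate.roi < benchmark.roi
        and candidate.risk > benchmark.risk
        and candidate.volume < benchmark.volume
    )


# Hypothetical values for illustration only.
index = StrategyResult("LendingClub Index", roi=4.0, risk=1.0, volume=100.0)
filter_a = StrategyResult("Filter A", roi=2.5, risk=1.4, volume=3.0)
filter_b = StrategyResult("Filter B", roi=6.0, risk=0.8, volume=5.0)

for s in (filter_a, filter_b):
    print(s.name, "dominated by index:", dominated_by(s, index))
```

A strategy that beats the index on return or risk but offers less volume is not dominated; whether it is preferable then depends on how much money the investor needs to deploy.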

In terms of loan volume, the following chart demonstrates which strategies were able to offer investors a higher volume of loans, increasing the chances that in a real-world application there would be loans available for investment. Again, there is substantial variation in ROI for every level of loan availability. Any strategy below the LendingClub Index ROI line was strictly inferior.

[Figure: P2P lending performance analysis]

4 Conclusion

Apples-to-apples comparisons of strategy performance can help investors make more informed decisions about how to deploy their funds. In addition to ROI, one should consider portfolio risk as well as volume produced by the strategy.

Further, a credit-model approach to selecting notes, such as P2P-Picks, appears on balance to be superior to filter-based approaches, which from the first figure can be seen to have lower returns and more risk. One reason for this is that filter-based strategies are necessarily too restrictive in terms of how they identify “good” loans. Many borrower attributes are quite fine grained and small movements in them only affect the likelihood of default a little bit. Having an absolute and arbitrary cutoff (e.g., zero inquiries) when many other borrower attributes look positive may reject an excellent investment opportunity.

The above post is a summary of Bryce’s research. You can access the complete report here.

[Update: Based on the feedback in the comments Bryce added two more quarters of data to the Risk vs Reward chart. The ROI picture has changed quite a bit – most likely because the sample size is now larger and one or two charged off loans have a less dramatic impact.]

May. 8, 2013 8:27 pm

Nice research Bryce, and thank you for including my criteria in your statistical analysis. Out of curiosity, which of my two criteria did you utilize? Same question in regards to Peter’s criteria where he has multiple options.

I do know that Brave New Life has seen quite a following with his criteria so it is quite surprising to see how low his ranks in your analysis.

Bryce Mason
May. 8, 2013 8:40 pm

If an author had multiple filters, I used the union of the resultant sets. Originally I had plotted them separately but after an iteration or two, I merged them.

May. 8, 2013 8:50 pm
Reply to  Bryce Mason

What was the reason you merged them? And would you be able to provide any of those separately plotted results? I am just curious how my two filters fared separately, and I am sure Peter would also like to see how his filters would have individually fared.

Bryce Mason
May. 8, 2013 9:09 pm

Feel free to contact me and we can work it out. My recollection is that they were similar.

B. Mason
May. 8, 2013 9:54 pm

1) Multiple filters within author were combined. If the filters identified the same loan multiple times, I only invested in it once.

2) Combining your filters in particular was useful because (a) the performance was not too far apart, and (b) individually they only produced 16 (lo) and 19 (hi) notes each. I felt compelled to aggregate them on a sample size basis.

I have emailed you the spreadsheet with the raw payment data.


May. 9, 2013 6:46 am
Reply to  B. Mason

Thanks Bryce! Combining them obviously made sense given the low number of notes selected.

May. 9, 2013 8:48 am

According to the first graph, you have my filter (Brave New Life) clearly with the highest risk and lowest return, a bad combination.

But where’s your data to support this? I have 22 months of data to show in real life (not based on past theoretical performance) that I’m returning 12.59%. Analysis of my filter on shows similar results. And yet, your graph indicates that my filter will return less than 2%? If you’re trying to be factual and analytical, it seems to me that this is a discrepancy that you need to explain if you’re taking yourself seriously and not just trying to sell a service.

Then again, this post is an advertisement for a paid service while I’m giving away my filter and real results for free. So it’s not surprising that the P2P-Picks results would be lowest risk, highest return with no data to back it up.

Also, the second graph is mostly pointless. My filter has no problem keeping 400+ loans active, and I suspect it could easily be 1000+ loans. (For example, this morning I had $100 to reinvest and 12 loans to choose from). At that point, diversification (the whole reason for having so many different loans) is long past met. If you need to scale higher, then you can invest more per loan. There is no need to continue to invest at $25/loan. Even at $10K, investing at $25/loan is over-diversified.

I appreciate what this article attempted to accomplish, but it seems to me that it’s flawed in that it’s not consistent with reality and makes no attempt to explain that discrepancy.

Investor Junkie
May. 9, 2013 9:49 am
Reply to  BNL

I agree with BNL. If you have more than $10,000 you should consider $50 notes. The only issue then becomes trying to sell larger notes on the secondary market.

Larry Ventura
May. 9, 2013 10:04 am

I keep hearing that people are having trouble selling $100 notes on Folio. I have seen no statistical proof of this in my account, which is about 50% $100 notes, and 50% $25/50. I did see a noticeable reduction in “sellability” when I went to $200 notes.

Investor Junkie
May. 9, 2013 10:10 am
Reply to  Larry Ventura

It is obvious though if you want the most in liquidity, the $25 is still the best note size to use. I don’t know the stats of investment size, but I assume most LC investors only have $1-2k invested, if that. At least when I see anyone mention their account online. Most individuals are still testing the waters with LC and also many individuals don’t have much money to invest.

Bryce M.
May. 9, 2013 10:55 am

The note size would not have made any difference in the ROI or risk calculations. It was arbitrary–the only important thing was to keep it constant.

Bryce M.
May. 9, 2013 10:38 am
Reply to  BNL

I appreciate the critique. All readers should bear in mind that the period studied was the bulk of the Great Recession. Strategies may have performed worse during that time period than today. However, the method is transparent and I would be pleased to share copies of the scripts and original data sets for examination, consistent with APA standards. Or simply a list of the loan IDs that were used based on the published filter criteria. Anyone can reconstruct these results.

May. 10, 2013 11:10 am
Reply to  Bryce M.

I’d be interested in seeing the data. Heck, if I’m missing something I want to know sooner rather than later.

Regarding the period you studied, that does make some sense. However I think the bigger issue with the statistical analysis is that the time period is so short. That’s not a flaw you could control, of course, since P2P is so new. With the stock market it’s easy to backtest 100 years of data when creating a personal investment strategy.

Bryce M.
May. 11, 2013 7:28 am
Reply to  BNL

Please just email me from here, or via personal message on the forum. We can figure out what you want and I’ll send it your way.

May. 9, 2013 9:16 am

One more comment on this Bryce: Why did you only run through 2009 Q3? I am curious why you didn’t utilize data through Q1 2010 given those loans have fully matured at the time of analysis as we are comparing using 36-month notes.

Using NickelSteamroller, there were 4,325 loans issued in the five quarters compared versus 8,335 over the seven quarters I’ve just mentioned. Given that my criteria (combined) only had 35 notes from the original five quarters, it would be interesting to see this data expanded to include the additional six months.

Five quarters:
Seven quarters:

Bryce M.
May. 9, 2013 10:41 am

There are two reasons. First was that the original payment file I had was constructed in December 2012. Second was that when I got an updated payment history file reflective of April 2013, I felt that allowing late payments more time to come in was better reflective of returns. I could have included 1 more quarter of loans, but at the cost of not having as much payment trickle. It wouldn’t be hard to add this quarter in, if people want.

May. 9, 2013 10:49 am
Reply to  Bryce M.

I would encourage you to update the research to minimize the possible problems with small sample size. As you have said previously when discussing P2P-Picks, at minimum 100 notes should be examined.

Bryce M.
May. 9, 2013 10:58 am

With the April file in hand, I’d be happy to include the extra quarter. Every strategy would be under the same disadvantage, I suppose, of not having the extra time for payment trickle.

I 100% agree on the small sample size. During this time period some strategies produced just 30-50 notes (including my own 1%). An extra quarter might bump that up quite a bit. On the to-do.

Investor Junkie
May. 9, 2013 9:42 am

While I like the theme of this post, I too feel the research is flawed. According to the graphs I should have sub 4% returns while my ROI is around 10%. I’ve been doing Lending Club for over 3 years. So I’m not sure I understand the big discrepancy between my real returns and your estimates.

In addition, your graphs put my returns among the highest on the risk measure (not sure I understand the detail of this), and yet the quarterly results I documented showed very little variance in returns. Is variance in returns (beta) what you are defining as risk? If not, what is it?

Investor Junkie
May. 9, 2013 9:54 am

Could it be because in the pool of all loans available at one time, there is additional selection?

Meaning if I have $50 to invest at $25 each I need to pick two loans. If my filter gives me a pool of say 15 to choose from, I’ll pick the ones I believe are the best. This has some subjective qualities, but I do try to be objective in my filtering. The issue with my existing filtering is the limits on Lending Club’s filtering options. Lending Club does not allow me to filter more specifically to my liking.

I assume this research doesn’t take into account either the dripping of new money to invest, or the sub-selection of loans to invest in at any given time.

The estimated and real returns are too far apart to consider this research even close to valid.

Bryce M.
May. 9, 2013 10:52 am

Yes, if one used additional factors to refine based on the choices provided by the initial filter, then the results may differ. I simply used every loan produced by the published strategy.

Investor Junkie
May. 9, 2013 10:32 am

Ok I submitted my latest data to service to get true ROI. Here’s my returns per year:

2009: 7.86%
2010: 8.44%
2011: 9.93%
2012: 10.87%
2013: 14.91%

Overall currently 9.66% ROI. So even just including 2009 in comparison, how is the data almost 400 basis points lower than my real return?

Bryce M.
May. 9, 2013 10:47 am

Here are some reasons why it would vary.

1) Did you invest in every single note that your strategy would have predicted during the time period under study?

2) Did your actions differ from that of the stated filter? All I could go on was what was published on your blog.

3) Your real world returns are including the effects of compounding, which was not considered here.

Investor Junkie
May. 9, 2013 10:52 am
Reply to  Bryce M.

I believe #1 is the biggest flaw in comparing estimated to real returns. No I did not, nor could I.

#2 is possible, as there was some variance and my filtering has been modified since I first started. Though from my back testing new returns should be better not worse.

#3 for 2009 results that shouldn’t be a factor. At least to make it statistically significant.

Bryce M.
May. 9, 2013 10:43 am

Recall that the past returns were reflective of the IRR for the period studied, that is, during the Great Recession. Returns today may very well be much higher.

One can calculate risk in a number of ways. Many finance people use the variance in monthly returns (like you suggest). I chose a simpler method that was easily calculable from the data at hand that should correlate well.

Investor Junkie
May. 9, 2013 10:02 am

Your research paper states:

“For this article, we examine note-selection strategy performance on LendingClub loans issued between 1 July 2008 and 30 September 2009. These
five quarters saw about 4200 3-year loans issued. We did not examine loans
prior to this period in order to have a training data set on which to build
the P2P-Picks risk model, and we did not examine loans after this period
in order to give the chosen loans sufficient time to mature fully and receive
most late payments.”

Yet this is May 2013? You included a time I just started investing in Lending Club, in addition to one of the worst periods in economic history. In addition, while you state loans didn’t have time to fully mature, I already have loans that have matured from a later period. So why not go to May 2010?? Or at least say Jan 2010?

Bryce M.
May. 9, 2013 10:49 am

There are two reasons. First was that the original payment file I had was constructed in December 2012. Second was that when I got an updated payment history file reflective of April 2013, I felt that allowing late payments more time to come in was better reflective of returns. I could have included 1 more quarter of loans, but at the cost of not having as much payment trickle. It wouldn’t be hard to add this quarter in, if people want.

May. 9, 2013 10:26 am
Reply to  Peter Renton

I agree Peter. Citing one’s NAR or XIRR return as evidence that the analysis is wrong doesn’t make sense if that return includes less than fully matured notes. To get a true apples to apples comparison you need to compare fully matured loans, or loans from a similar time period, which is what Bryce has done.

As I’ve expressed above, sample size and length of the testing period to me are more important questions to be examined.

Dan B
May. 9, 2013 3:19 pm

You would think that I’d have a (long winded) comment on an article of this type, but no, as I don’t feel it necessary to comment/analyse every article………………….or semi-infomercial.

All I’m going to add at this point is to reiterate that some of the criticisms here are coming from investors who have long term above average real world returns. In my opinion the shortcomings etc. that are being pointed out by such investors should not be taken too lightly.

Bryce Mason
May. 9, 2013 8:08 pm

For what it’s worth, I’ve recreated the analysis with two extra quarters of data. It about doubled the number of loans. Will compile them into graphics and send Peter’s way. The lowest ROI strategies came up a good deal, thanks to being out of the Great Recession by 2010. With the increased volume, the risk measure became tighter overall.

May. 9, 2013 8:53 pm
Reply to  Bryce Mason

Thanks for the update Bryce, and thank you Peter for updating the post!

May. 9, 2013 9:01 pm
Reply to  Peter Renton

Thanks Peter! I was completely stunned and shocked to see that the results turned out the way they did!

Bryce M.
May. 9, 2013 9:22 pm

The more quarters that are included, the more possible it is that the data are being overfit on the training data sets. To be perfectly comparable, all authors would need to take pre 2008 Q3 loans and devise the best filter they could. But, I’m still pretty satisfied with the analysis. Trying to make picks at higher volume is quite the task.

If people have further comments I’ll keep checking for a few days. The analysis is fairly automated at this point.

May. 10, 2013 12:59 pm


This could be a bit of a messy endeavor, but I’m curious how far forward towards current you can bring your analysis. I am currently invested around 90% in 60 month loans (most common for E-G). Most of my 36s are from the D grades. The default rate drops dramatically after about 25% of the principal is paid off for 60 month notes. If you were able to use all loans where the principal balance is 75% or less, I think we may be able to draw a reasonable conclusion as to where the current strategies are going. I realize the 36 month loans will throw that off because they are much more steady in their default rate throughout the loan life. Also with LC changing up rules, I don’t even have “LEARNING & TRAINING” loans as an option to screw up my returns :-), or factoring in the period without 98%+ utilization loans, etc… Now we have major derogatory in the mix (my stats show these to be on par with those without thus far).

PS: I have no background in statistics, just a guy with Excel & Pivot Tables to do some damage 🙂

Bryce M.
May. 10, 2013 11:13 pm
Reply to  Kowser

One could probably move it forward a bit, but I’m not going to bother until things mature. I might start modeling 60 month loans early just to get a head start on it.

May. 10, 2013 7:34 pm

So…. with the larger testing set the performance of all the models improved except for Bryce’s.

Bryce M.
May. 10, 2013 11:12 pm
Reply to  RJ

Reasonable expectation, given that I used no future information in my choices, and all other filters were built using lendstats (probably) in 2011 to 2012, which included the test period. The fact I’m ahead of any of them says something.

Dan B
May. 11, 2013 5:55 am
Reply to  Bryce M.

Well sure it does say “something”………….I’m just not sure that it is saying the identical thing to all of us. Speaking as a neutral, who frankly couldn’t care less who among you finish top or bottom of any ranking :), ………………the manner in which this article has been presented to readers also “says something” about the presenter. Now on this point, I’m certain that it is not saying the same thing to me as it is to you! Trust me, you’ll want to take my word on that.

Bryce M.
May. 11, 2013 7:26 am
Reply to  Dan B

I’m sure you’re right, Dan! One must do a little promotion.

But I think most reasonable people would also see the openness in the methods and willingness to respond to questions and rerun analyses also says something. There is a point beyond self promotion which ought to resound with people—nobody is comparing these strategies on an equal footing.