Why Number of Inquiries is My Favorite P2P Lending Filter

Longtime readers of this blog will know about my love of the number of inquiries filter. I have talked about it a great deal, but I have never explained in detail why I use it as my primary filter when choosing loans.

Today, I will give you some analysis that will back up my claims about this filter. But first let’s explain what I mean by number of inquiries. This is something that is recorded on everyone’s credit report and is used by Lending Club and Prosper in their underwriting. When you apply for a credit card, a car loan, a home loan or almost any other kind of loan, your credit report gets marked with an inquiry. Any loan application that involves what is called a hard pull of your credit is recorded this way, and the figure Lending Club and Prosper show is the number of inquiries in the past six months.

What is interesting to me is that when you filter loans on the number of inquiries you find that the more inquiries a borrower has the more likely they are to default and the lower your expected ROI is going to be. This rings true for both Lending Club and Prosper borrowers. Here are the numbers for Lending Club. Keep in mind I am only looking at data from July 2009 onwards and, as always, I used data from Lendstats to create these charts.

Lending Club Analysis

Filter            All Loans ROI   All Loans Default Rate   Grade D-G ROI   Grade D-G Default Rate
Inquiries = 0     7.84%           1.34%                    9.21%           2.19%
Inquiries <= 1    7.17%           1.59%                    8.36%           2.36%
Inquiries <= 2    7.00%           1.71%                    7.99%           2.54%
Inquiries <= 3    6.73%           1.85%                    7.63%           2.66%
Inquiries >= 4    3.33%           4.51%                    N/A             N/A

If you choose loans of all grades on Lending Club where the borrower has zero inquiries on their credit report, according to Lendstats, you can expect a 7.84% ROI (as of this writing) on your money. Then as each additional inquiry is allowed in, the ROI goes down and the default rate goes up. This trend gets even more noticeable when inquiries are greater than three.

The number of credit inquiries makes an even bigger difference when you look at the lower grade (higher risk) loans. When taking grades D-G only on Lending Club you can see that the ROI ranges from 9.21% down to 7.63%. I didn’t include the numbers for Grades D-G for inquiries >= 4 because there were only 53 loans in the entire database, so it wasn’t a large enough sample to give meaningful data.
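For anyone who wants to reproduce this kind of breakdown from raw loan records, here is a minimal Python sketch. It is not how Lendstats computes its figures; the record layout (an inquiry count plus a defaulted flag), the `MIN_SAMPLE` cutoff and the loan records themselves are all made up for illustration. It groups loans into isolated inquiry buckets, works out the share of loans that defaulted in each, and reports no figure for a bucket that has too few loans, in the same spirit as the N/A entries above.

```python
from collections import defaultdict

# Hypothetical cutoff: the post only says that 53 loans was too few.
MIN_SAMPLE = 100


def bucket_label(inquiries, cap=4):
    """Map an inquiry count to a bucket label, lumping cap-or-more together."""
    return f">= {cap}" if inquiries >= cap else f"= {inquiries}"


def default_rate_by_inquiries(loans, min_sample=MIN_SAMPLE):
    """Return {bucket label: default rate, or None if the bucket is too small}.

    loans is an iterable of (inquiries, defaulted) pairs.
    """
    counts = defaultdict(int)
    defaults = defaultdict(int)
    for inquiries, defaulted in loans:
        label = bucket_label(inquiries)
        counts[label] += 1
        if defaulted:
            defaults[label] += 1
    return {
        label: (defaults[label] / n if n >= min_sample else None)
        for label, n in counts.items()
    }


# Made-up loan records, purely for illustration (not Lendstats data).
loans = (
    [(0, False)] * 190 + [(0, True)] * 10
    + [(1, False)] * 90 + [(1, True)] * 10
    + [(5, True)] * 3
)
rates = default_rate_by_inquiries(loans)
for label in sorted(rates):
    rate = rates[label]
    print(label, "N/A" if rate is None else f"{rate:.2%}")
```

With these made-up records the ">= 4" bucket holds only three loans, so it is reported as N/A rather than as a 100% default rate.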

Prosper Analysis

Filter            All Loans ROI   All Loans Default Rate   D-HR ROI        D-HR Default Rate
Inquiries = 0     10.12%          2.71%                    15.55%          3.82%
Inquiries <= 1    10.06%          3.11%                    14.70%          4.16%
Inquiries <= 2    9.98%           3.23%                    14.18%          4.27%
Inquiries <= 3    9.80%           3.42%                    13.70%          4.51%
Inquiries >= 4    7.33%           6.01%                    9.02%           6.19%

We find the same trend on Prosper (using data from July 2009 onwards), although admittedly it is not quite as marked. But for every additional inquiry on a borrower’s credit report we see the same linear progression: estimated ROI goes down and default rate goes up. And if you are investing in higher risk loans then again the difference is even more noticeable.

This Makes Logical Sense

When you think about it, this makes logical sense. If someone is out shopping for credit in several places and then they come to Lending Club or Prosper looking for a loan then they are a higher risk borrower. They may have some serious financial problems if they are shopping for a lot of credit. But if someone is looking for a loan and comes to p2p lending first then they are a better credit risk. One inquiry on a credit report doesn’t make a big difference but the numbers show that these borrowers are still a higher risk than borrowers with no inquiries at all in the past six months.

This is why I continue to make the number of inquiries a key part of my investment strategy and I encourage other investors to do the same.

[Update: several people indicated that the above charts gave an unclear indication of the differences between number of inquiries because I wasn’t isolating each inquiry number in my analysis. So in order to give a complete picture for investors I have redone the charts below with the number of inquiries isolated (as in inquiries = 1 instead of inquiries <= 1). The same trends still apply although it is not quite as linear. Please note that the charts below were created on a different day to the original charts which is why the numbers for inquiries = 0 and inquiries >= 4 are slightly different.]
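To see why the two ways of grouping can tell different stories, here is a small Python sketch. The loan counts and default rates in it are invented for illustration and are not Lendstats figures. The point is only that a cumulative "inquiries <= k" number is a loan-count-weighted average of the isolated buckets it contains, so a large group of zero-inquiry loans can pull it well below the rate for the "= k" bucket on its own.

```python
def cumulative_rate(isolated, k):
    """Blend isolated buckets 0..k into one 'inquiries <= k' default rate.

    isolated maps an inquiry count to (loan_count, default_rate).
    The result is the loan-count-weighted average of the included buckets.
    """
    included = [(n, rate) for count, (n, rate) in isolated.items() if count <= k]
    total_loans = sum(n for n, _ in included)
    expected_defaults = sum(n * rate for n, rate in included)
    return expected_defaults / total_loans


# Invented numbers, for illustration only: (loan count, default rate).
isolated = {
    0: (6000, 0.01),
    1: (2500, 0.02),
    2: (1000, 0.03),
    3: (500, 0.05),
}

blended = cumulative_rate(isolated, 3)
print(f"inquiries = 3 alone: {isolated[3][1]:.2%}")
print(f"inquiries <= 3 blended: {blended:.2%}")
```

In this invented example the three-inquiry bucket defaults at 5%, yet the "<= 3" figure comes out at about 1.65%, because the zero-inquiry loans make up most of the group.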

Lending Club Analysis 2

Filter            All Loans ROI   All Loans Default Rate   Grade D-G ROI   Grade D-G Default Rate
Inquiries = 0     7.85%           1.36%                    9.31%           2.22%
Inquiries = 1     5.99%           2.05%                    7.00%           2.36%
Inquiries = 2     6.14%           2.38%                    6.47%           3.34%
Inquiries = 3     3.77%           3.60%                    4.58%           3.72%
Inquiries >= 4    3.38%           4.51%                    N/A             N/A

Prosper Analysis 2

Filter            All Loans ROI   All Loans Default Rate   D-HR ROI        D-HR Default Rate
Inquiries = 0     10.06%          2.72%                    15.32%          3.86%
Inquiries = 1     9.88%           3.87%                    13.52%          4.60%
Inquiries = 2     9.47%           3.94%                    11.80%          4.67%
Inquiries = 3     6.40%           6.12%                    8.45%           6.95%
Inquiries >= 4    7.22%           5.94%                    8.87%           6.10%
Charlie H
Jul. 27, 2011 1:20 pm

First off, I love Lendstats.
The “problem” with the filters is that there are too many levers to play with.
So far the depth of my analysis has been “If I change this, does ROI go up or down?”. So while I have played with each variable to see what happens, that is not good enough.

What is needed is a multivariate analysis or Monte Carlo simulation to find the best combination.

Anyone done anything like this?

Brian B
Jul. 27, 2011 1:42 pm

why do you have the top columns as being =? It is very deceptive in making 4 look much worse than the others, when this is not actually true. Having 4 inquiries is nearly the same as having 3, but you would never recognize that from your charts.

“>=3” gives a result of 3.69% for all loans ROI which is nearly as bad as >=4.

“<=4" gives a result of 6.73% for all loans ROI which is identical to "<=3".

Statistics should all cover similarly directioned ranges in order to avoid this deception. Essentially you are making the 3 inquiries look much better than they are by grouping them with the zeros and making the 4's look much worse by not grouping them with zeros.

Jul. 27, 2011 1:51 pm

Great job on the analysis! Indeed there seems to be a strong correlation between the maximum # of inquiries and Default rate.

Do you think it would be possible to make another chart with the row labels: inquiries =1, inquiries =2, inquiries = 3, etc.? Or is the sample size not big enough?

That way, it becomes incredibly clear that the higher the # of inquiries a borrower has, the higher the chance of default.

Charlie H
Jul. 27, 2011 2:11 pm

@ Brian B
Indeed, by including people with 0, 1 and 2 in the </= 3 category you are giving a lot of weight to the 0, 1 and 2 in your overall ROI calculation.

The stats look different if you look at =0, =1, =2, =3, =4 than if you look at 0, </=1, </=2, </=3, >/=4

Dan B
Jul. 27, 2011 7:55 pm

@Peter………… Considering that last week you actually cited the figure 500 as an example when discussing the Law Of Large Numbers, I’m guessing that your former Stat prof would retroactively lower your grade if he could ! 🙂

Jul. 28, 2011 7:46 am


Thanks for making the new table!

Interesting that the difference in risk of default from 1-inquiry and 0-inquiry borrowers on Lending Club is only .14%. But it’s much greater on Prosper at .74%. (yet, by looking at the ROI differences, Prosper must really compensate well for the risk.)

Jul. 28, 2011 7:47 am

I meant. Sorry, the last message I read was Dan’s and accidentally typed it

Lee B
Jul. 28, 2011 5:32 pm

I agree with Charlie H about the need for multivariate analysis and Monte Carlo simulations. These can be used to better assess financial risk. I strongly suspect the institutional investors who are involved in P2P lending are already doing this.

For those who are interested, here are some good sites discussing the Monte Carlo method:

For those that want to brush up on their statistics, check out:

Any starving graduate students in statistics or finance looking for a great thesis project? 😉

Charlie H
Jul. 29, 2011 8:36 am

@Lee B
I have a medical stats background and am familiar with MVA and MC from that perspective.

I had no idea that they were being applied to finance problems, but it makes sense since there are so many degrees of freedom in finance problems.

Lendstats has so much information to data mine. Doing single-variable analysis (if I change this variable, does ROI go up or down?) is useful, but it only scratches the surface.

Jul. 29, 2011 8:56 pm

Charlie H and Lee B,

I am the creator of Lendstats and I’ve added a new feature which you may find very useful. Now when you change a variable, you can generate a complete analysis report that shows the ROIs of most criteria by segment. So not only will you see how changing one variable affects overall ROI, you can also see how it affects most of the other segmented criteria. This is great for noticing what the interaction effects are. Right now I have almost 200 different segments for which you can see ROIs and I will be adding a few others.

For example this link will show you the analysis report (or as I call it at lendstats, the “Complete Performance Breakdown”).
It’s only ready for LendingClub data, I’m still working on the Prosper version (but there already is a lite version for the Prosper data).