Inferring the outcomes of rejected loans: an application of semisupervised clustering
The Journal of the Royal Statistical Society, Series A (Statistics in Society) (IF 1.5), Pub Date: 2019-11-14, DOI: 10.1111/rssa.12534
Zhiyong Li 1, Xinyi Hu 2, Ke Li 1, Fanyin Zhou 1, Feng Shen 1

Rejection inference aims to reduce sample bias and to improve model performance in credit scoring. We propose a semisupervised clustering approach as a new rejection inference technique. K‐prototype clustering can deal with mixed types of numeric and categorical characteristics, which are common in consumer credit data. We identify homogeneous acceptances and rejections and assign labels to part of the rejections according to the label of acceptances. We test the performance of various rejection inference methods in logit, support vector machine and random‐forests models based on data sets of real consumer loans. The predictions of clustering rejection inference show advantages over other traditional rejection inference methods. Inferring the label of the rejection from semisupervised clustering is found to help to mitigate the sample bias problem and to improve the predictive accuracy.
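
The idea in the abstract (cluster accepted and rejected applicants together on mixed numeric and categorical features, then pass labels from accepted loans to some of the rejected ones) can be illustrated with a toy sketch. This is not the authors' implementation. The abstract does not give the algorithm's details, so everything below is an assumption made for illustration: the data, the weight `gamma`, the initial prototypes, the `min_purity` threshold, and the rule of labelling rejected loans only in clusters whose accepted loans nearly all share one outcome. The clustering step is a plain k-prototypes loop (means for numeric features, modes for categorical ones) written in pure Python.

```python
from collections import Counter

def mixed_distance(x_num, x_cat, c_num, c_cat, gamma):
    """K-prototypes dissimilarity: squared Euclidean + gamma * categorical mismatches."""
    numeric = sum((a - b) ** 2 for a, b in zip(x_num, c_num))
    categorical = sum(a != b for a, b in zip(x_cat, c_cat))
    return numeric + gamma * categorical

def k_prototypes(num, cat, init_idx, gamma=1.0, n_iter=20):
    """Minimal k-prototypes; prototypes start at the rows listed in init_idx (an assumption)."""
    centers = [(list(num[i]), list(cat[i])) for i in init_idx]
    labels = [0] * len(num)
    for _ in range(n_iter):
        # Assignment step: nearest prototype under the mixed dissimilarity.
        labels = [
            min(range(len(centers)),
                key=lambda k: mixed_distance(num[i], cat[i],
                                             centers[k][0], centers[k][1], gamma))
            for i in range(len(num))
        ]
        # Update step: numeric mean and categorical mode per cluster.
        new_centers = []
        for k, (c_num, c_cat) in enumerate(centers):
            members = [i for i, lab in enumerate(labels) if lab == k]
            if not members:
                new_centers.append((c_num, c_cat))  # keep the prototype of an empty cluster
                continue
            mean = [sum(num[i][j] for i in members) / len(members)
                    for j in range(len(c_num))]
            mode = [Counter(cat[i][j] for i in members).most_common(1)[0][0]
                    for j in range(len(c_cat))]
            new_centers.append((mean, mode))
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def infer_rejected_labels(labels, outcomes, min_purity=0.8):
    """Label rejected loans (outcome None) only in clusters whose accepted loans are nearly pure."""
    inferred = {}
    for k in sorted(set(labels)):
        known = [outcomes[i] for i, lab in enumerate(labels)
                 if lab == k and outcomes[i] is not None]
        if not known:
            continue
        majority, count = Counter(known).most_common(1)[0]
        if count / len(known) < min_purity:
            continue  # heterogeneous cluster: leave its rejected loans unlabelled
        for i, lab in enumerate(labels):
            if lab == k and outcomes[i] is None:
                inferred[i] = majority
    return inferred

# Toy data (hypothetical): one standardised numeric feature, one categorical feature.
num = [[0.1], [0.2], [0.15], [2.0], [2.1], [1.9], [0.12], [2.05], [5.0], [5.1], [5.05]]
cat = [["own"], ["own"], ["own"], ["rent"], ["rent"], ["rent"],
       ["own"], ["rent"], ["other"], ["other"], ["other"]]
# 0 = repaid, 1 = defaulted, None = rejected application with unknown outcome.
outcomes = [0, 0, 0, 1, 1, 1, None, None, 0, 1, None]

labels = k_prototypes(num, cat, init_idx=[0, 3, 8])
inferred = infer_rejected_labels(labels, outcomes)
print(inferred)
```

In this toy run, the rejected loans that fall in the two homogeneous clusters receive a label, while the rejected loan in the cluster with mixed outcomes stays unlabelled. That mirrors the abstract's statement that labels are assigned to only part of the rejections. The augmented sample could then be fed to a logit, support vector machine or random-forest model.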

Updated: 2019-11-14