Data-Driven Investment Strategies for Peer-to-Peer Lending: A Case Study for Teaching Data Science.
Big Data (IF 2.6), Pub Date: 2018-09-01, DOI: 10.1089/big.2018.0092
Maxime C Cohen, C Daniel Guetta, Kevin Jiao, Foster Provost

Abstract: We develop several data-driven investment strategies that demonstrate how machine learning and data analytics can guide investments in peer-to-peer loans. We detail the process, from acquiring real data from a peer-to-peer lending platform through developing and evaluating investment strategies based on a variety of approaches. We focus heavily on how to apply and evaluate the data science methods, and the resulting strategies, in a real-world business setting. The material can be used by instructors who teach data science courses at the undergraduate or graduate level. Importantly, we go beyond evaluating the predictive performance of models to assess how well the strategies would actually perform, using real, publicly available data. Our treatment is comprehensive and ranges from qualitative to technical, but it is also modular, which gives instructors the flexibility to focus on specific parts of the case, depending on the topics they want to cover. The learning concepts include data cleaning and ingestion, classification/probability estimation modeling, regression modeling, analytical engineering, calibration curves, data leakage, evaluation of model performance, basic portfolio optimization, evaluation of investment strategies, and using Python for data science.

Updated: 2018-09-01