Feature selection in feature network models: finding predictive subsets of features with the Positive Lasso.
British Journal of Mathematical and Statistical Psychology (IF 1.5), Pub Date: 2008-05-17, DOI: 10.1348/000711006x119365
Laurence E Frank, Willem J Heiser

A set of features is the basis for the network representation of proximity data achieved by feature network models (FNMs). Features are binary variables that characterize the objects in an experiment, with some measure of proximity as the response variable. Sometimes features are provided by theory and play an important role in the construction of the experimental conditions. In other research settings, the features are not known a priori. This paper shows how to generate features in this situation and how to select an adequate subset of features that strikes a good compromise between model fit and model complexity, using the Positive Lasso, a new version of least angle regression that restricts coefficients to be non-negative. It is shown that features can be generated efficiently with Gray codes, which are naturally linked to FNMs. The model selection strategy makes use of the fact that an FNM can be considered a univariate multiple regression model. A simulation study shows that the proposed strategy leads to satisfactory results if the number of objects is less than or equal to 22. If the number of objects is larger than 22, the number of features selected by the method exceeds the true number of features in some conditions.
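
The feature generation step can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `gray_code_features` and `pair_column` are hypothetical, and the column construction assumes the common feature-distance form in which a feature contributes to the distance between two objects exactly when it distinguishes them. The sketch enumerates every candidate binary feature over a small set of objects in reflected Gray code order, so that consecutive candidates differ in the membership of a single object, and builds the regression predictor column for each candidate.

```python
from itertools import combinations


def gray_code_features(n_objects):
    """Yield every binary feature over n_objects in reflected Gray code order.

    Each feature is a tuple of 0/1 memberships, one entry per object.
    Consecutive features differ in exactly one object.
    """
    for k in range(2 ** n_objects):
        g = k ^ (k >> 1)  # standard binary-reflected Gray code of k
        yield tuple((g >> i) & 1 for i in range(n_objects))


def pair_column(feature):
    """Predictor column for one feature (assumed form, see lead-in).

    One entry per unordered pair of objects: 1 if the feature
    distinguishes the two objects, 0 otherwise.
    """
    return [abs(feature[i] - feature[j])
            for i, j in combinations(range(len(feature)), 2)]


# Illustrative run with 4 objects: 16 candidate features, 6 object pairs.
features = list(gray_code_features(4))
design_columns = [pair_column(f) for f in features]
```

Under this assumed form, a feature and its complement produce the same column, and the all-zero feature produces a column of zeros, so a practical candidate set would be smaller than the full enumeration. A non-negative lasso fit over the remaining columns would then play the role that the abstract assigns to the Positive Lasso.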

Updated: 2019-11-01