Leveraging crowd knowledge to curate documentation for agile software industry using deep learning and expert ranking, Multimedia Systems

Leveraging crowd knowledge to curate documentation for agile software industry using deep learning and expert ranking
Multimedia Systems (IF 3.5), Pub Date: 2021-01-25, DOI: 10.1007/s00530-020-00741-x
Akshi Kumar

Agility gives a team the ability to respond quickly to change and to improve the way an agile project is documented. Just-in-time, "just barely good enough" documentation may, however, miss important executable specifications. Interestingly, the frequently asked ‘how-to-do’ questions on popular question answering (Q&A) websites such as Stack Overflow are strong indicators of gaps in documentation, and the answers posted by respondents can complement conventional software documentation practices. Social interaction within these Q&A websites generates partially structured content, commonly referred to as crowd knowledge, that can offer a peer-reviewed re-documentation by integrating the answers to these ‘how-to-do’ concerns. But finding the best, value-added answer to a question, one that can contribute toward an enriched and curated documentation, is computationally difficult. Moreover, duplicate queries can cause seekers to spend more time finding these best answers. As a solution, the research proposes a novel question-answering crowd documentation model (QACDoc), based on a socially mediated documentation mechanism that involves the dynamics of the community-based web. First, duplicate questions are detected using a Siamese neural architecture in which two identical hierarchical attention networks generate vectors for similarity matching. Semantic matching is done using the Manhattan distance function, and a multi-layer perceptron is trained to output the predictions. Next, all respondents of semantically matched questions are grouped to form an intent-based crowd. Lastly, the top-k experts are identified using representative social presence features for expert ranking. The crowd documentation is then filtered to include only the answers of these identified experts.
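The matching and filtering steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the hierarchical attention networks and the trained multi-layer perceptron are omitted, the question vectors are assumed to be given, and the `exp(-distance)` mapping is an assumption borrowed from common Siamese setups rather than something the abstract states. The function names, the `user` and `upvotes` fields, and the use of summed upvotes as a stand-in for "social presence features" are all hypothetical.

```python
import math
from collections import defaultdict


def manhattan_distance(u, v):
    """Return the L1 (Manhattan) distance between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(u, v))


def manhattan_similarity(u, v):
    """Map the L1 distance into (0, 1]; identical vectors score 1.0.

    Assumption: exp(-distance) is one common choice in Siamese models;
    the abstract only says that Manhattan distance is used.
    """
    return math.exp(-manhattan_distance(u, v))


def rank_experts(answers, k):
    """Return the k respondents with the highest presence score.

    Hypothetical scoring: a respondent's score is the sum of upvotes over
    all of their answers in the intent-based crowd.
    """
    totals = defaultdict(int)
    for answer in answers:
        totals[answer["user"]] += answer["upvotes"]
    # Sort by score (descending), then by name so that ties are deterministic.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [user for user, _ in ranked[:k]]


def curate(answers, k):
    """Keep only the answers written by the top-k experts."""
    experts = set(rank_experts(answers, k))
    return [answer for answer in answers if answer["user"] in experts]
```

In this sketch, two questions would be treated as duplicates when `manhattan_similarity` of their vectors exceeds some threshold, which stands in for the prediction of the trained perceptron; the answers to all matched questions would then be pooled and passed to `curate`.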

Updated: 2021-01-28
