Image Privacy Prediction Using Deep Neural Networks
ACM Transactions on the Web ( IF 2.6 ) Pub Date : 2020-04-09 , DOI: 10.1145/3386082
Ashwini Tonge 1 , Cornelia Caragea 2
Images today are increasingly shared online on social networking sites such as Facebook, Flickr, and Instagram. Image sharing occurs not only within a group of friends but also, increasingly, outside a user's social circles for purposes of social discovery. Although current social networking sites allow users to change their privacy preferences, this is often a cumbersome task for the vast majority of users on the Web, who face difficulties in assigning and managing privacy settings. When these privacy settings are used inappropriately, online image sharing can lead to unwanted disclosures and privacy violations. Thus, automatically predicting an image's privacy to warn users about private or sensitive content before it is uploaded to social networking sites has become a necessity in our interconnected world. In this article, we explore learning models that automatically predict an image's privacy as private or public using carefully identified image-specific features. We study deep visual semantic features derived from various layers of Convolutional Neural Networks (CNNs) as well as textual features such as user tags and deep tags generated from deep CNNs. In particular, we extract deep (visual and tag) features from four pre-trained CNN architectures for object recognition, i.e., AlexNet, GoogLeNet, VGG-16, and ResNet, and compare their performance for image privacy prediction. The results of our experiments on a Flickr dataset of 32,000 images show that ResNet yields the best results for this task among the four networks. We also fine-tune the pre-trained CNN architectures on our privacy dataset and compare their performance with models trained on the pre-trained features. The results show that although the overall performance of the fine-tuned networks is comparable to that of the pre-trained networks, the fine-tuned networks perform better on the private class.
The results also show that learning models trained on features extracted from ResNet outperform state-of-the-art models for image privacy prediction. We further investigate the combination of user tags and deep tags derived from CNN architectures in two settings: (1) Support Vector Machines trained on bag-of-tags features and (2) a text-based CNN. We compare these models with models trained on ResNet visual features and show that, although the models trained on the visual features perform better than those trained on the tag features, combining deep visual features with image tags improves performance over either feature set alone. We also compare our models with prior privacy prediction approaches and show that, for the private class, we achieve an improvement of ≈10% over prior CNN-based privacy prediction approaches. Our code, features, and the dataset used in our experiments are available at https://github.com/ashwinitonge/deepprivate.git.
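Setting (1), an SVM over bag-of-tags features, can be sketched with scikit-learn. The tag strings and labels below are toy stand-ins invented for illustration, not examples from the paper's 32,000-image Flickr dataset.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy data: each image's tags joined into one string, with a
# private/public label (hypothetical, not from the paper's dataset).
tags = [
    "family kids birthday home",
    "sunset beach landscape travel",
    "selfie party friends drink",
    "mountain hiking nature sky",
]
labels = ["private", "public", "private", "public"]

# Bag-of-tags features + linear SVM, mirroring setting (1).
model = make_pipeline(CountVectorizer(), LinearSVC())
model.fit(tags, labels)

prediction = model.predict(["beach sunset travel"])[0]
print(prediction)
```

In the paper's setup the same bag-of-tags representation covers both user-supplied tags and "deep tags" (top object labels emitted by a pre-trained CNN), which is what allows the two tag sources to be combined in a single feature vector.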

Updated: 2020-04-09