
A TextCNN and WGAN-gp based deep learning frame for unpaired text style transfer in multimedia services

  • Special Issue Paper
  • Published in: Multimedia Systems

Abstract

With the rapid growth of big multimedia data, multimedia processing techniques face challenges such as knowledge understanding, semantic modeling, and feature representation. Based on TextCNN and WGAN-gp (improved training of Wasserstein GANs), a deep learning framework is therefore proposed to discriminate more effectively between style-specific features and style-independent content features in unpaired text style transfer for multimedia services. To rewrite a sentence in a requested style while preserving its style-independent content, the encoder-decoder framework is usually adopted. However, because no sentence pairs with the same content but different styles are available for training, previous works often fail to preserve the original content or to generate the desired style properties accurately in the transferred sentences. In this paper, we adopt TextCNN to extract the style features of the transferred sentences and train the generator (encoder and decoder) to align these style features with the target style label. Meanwhile, WGAN-gp is used to preserve the content features of the original sentences. Experiments demonstrate that our framework performs considerably better than previous works on both automatic and human evaluation, providing an effective method for unpaired text style transfer in multimedia services.
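To make the two components named above concrete, the following is a minimal PyTorch-style sketch, not the authors' implementation: a TextCNN classifier that extracts style features from (transferred) sentences, and a WGAN-gp gradient penalty applied to a critic over content representations. All class, function, and parameter names (TextCNNStyleClassifier, gradient_penalty, filter counts, kernel sizes) are illustrative assumptions rather than values from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TextCNNStyleClassifier(nn.Module):
    """TextCNN-style classifier: parallel 1-D convolutions over word
    embeddings, max-pooled over time, followed by a linear style layer."""

    def __init__(self, vocab_size, embed_dim=100, num_filters=64,
                 kernel_sizes=(3, 4, 5), num_styles=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes)
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_styles)

    def forward(self, token_ids):
        # (batch, seq_len) -> (batch, embed_dim, seq_len) for Conv1d.
        x = self.embedding(token_ids).transpose(1, 2)
        # One max-pooled feature vector per kernel size.
        feats = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(feats, dim=1))


def gradient_penalty(critic, real_feats, fake_feats):
    """WGAN-gp term: penalize the critic when its gradient norm at random
    interpolations of real and generated content features departs from 1."""
    alpha = torch.rand(real_feats.size(0), 1, device=real_feats.device)
    interp = (alpha * real_feats + (1 - alpha) * fake_feats).requires_grad_(True)
    grads = torch.autograd.grad(critic(interp).sum(), interp,
                                create_graph=True)[0]
    return ((grads.norm(2, dim=1) - 1.0) ** 2).mean()
```

In a full training loop of this kind, the TextCNN would presumably be trained on the style labels and used to push the generator's outputs toward the target style, while the gradient penalty regularizes a critic that compares content representations of original and transferred sentences, following the standard WGAN-gp scheme.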



Acknowledgements

The authors would like to thank the editors and the reviewers for their insightful advice on the paper.

Funding

This work is supported by the Science and Technology Plan of Yunnan Province (No. 2014AB016).

Author information

Corresponding author

Correspondence to Min He.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Hu, M., He, M., Su, W. et al. A TextCNN and WGAN-gp based deep learning frame for unpaired text style transfer in multimedia services. Multimedia Systems 27, 723–732 (2021). https://doi.org/10.1007/s00530-020-00714-0

