Abstract
Style transfer in text changes text written in one style, such as the works of Shakespeare, into another style. Evaluating it currently relies on the cosine similarity between the sentence embeddings of the original and the transferred sentence to determine whether the content of the sentence, its meaning, has been preserved. This assumes, however, that such sentence embeddings are style invariant; if they are not, measurements of content preservation can be inaccurate. To investigate this, we compared the average similarity of multiple styles of text from the Corpus of Diverse Styles using a variety of sentence embedding methods and found that embeddings created from aggregated word embeddings are style invariant, but those produced directly by sentence embedding models are not.
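The content-preservation measure described above can be illustrated with a minimal sketch. It assumes mean pooling as the way word embeddings are aggregated, and it uses hypothetical three-dimensional word vectors invented purely for illustration; neither the vectors nor the helper names `cosine_similarity` and `mean_pooled_embedding` come from the paper, which may use different models and aggregation methods.

```python
import numpy as np

def cosine_similarity(u, v):
    """Return the cosine of the angle between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def mean_pooled_embedding(tokens, word_vectors):
    """Aggregate word vectors into one sentence vector by averaging them."""
    return np.mean([word_vectors[t] for t in tokens], axis=0)

# Hypothetical toy word vectors (assumption: real embeddings would come
# from a trained model and have hundreds of dimensions).
toy_vectors = {
    "you": np.array([1.0, 0.0, 0.0]),
    "thou": np.array([0.8, 0.1, 0.1]),
    "are": np.array([0.0, 1.0, 0.0]),
    "art": np.array([0.1, 0.8, 0.1]),
    "kind": np.array([0.0, 0.0, 1.0]),
}

# The same content expressed in a modern and an archaic style.
modern = mean_pooled_embedding(["you", "are", "kind"], toy_vectors)
archaic = mean_pooled_embedding(["thou", "art", "kind"], toy_vectors)

# A score near 1 is read as "content preserved" by this kind of metric.
score = cosine_similarity(modern, archaic)
print(f"content similarity: {score:.3f}")
```

If the embedding is style invariant, a score like this depends only on meaning; if it is not, a stylistic change alone can lower the score even when the content is unchanged.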
| Original language | English |
|---|---|
| Title of host publication | Data Mining: 20th Australasian Conference, AusDM 2022, Western Sydney, Australia, December 12-15, 2022, Proceedings |
| Editors | Laurence A. F. Park, Heitor Murilo Gomes, Maryam Doborjeh, Yee Ling Boo, Yun Sing Koh, Yanchang Zhao, Graham Williams, Simeon Simoff |
| Place of Publication | Singapore |
| Publisher | Springer |
| Pages | 3-14 |
| Number of pages | 12 |
| ISBN (Electronic) | 9789811987465 |
| ISBN (Print) | 9789811987458 |
| DOIs | |
| Publication status | Published - 2022 |
Bibliographical note
Publisher Copyright: © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.