Getting Meta: A Multimodal Approach for Detecting Unsafe Conversations within Instagram Direct Messages of Youth

Shiza Ali; Afsaneh Razi; Seunghyun Kim; Ashwaq Alsoubai; Chen Ling; Munmun De Choudhury; Pamela J. Wisniewski; Gianluca Stringhini

doi:10.1145/3579608

journal article Apr 14, 2023

Proceedings of the ACM on Human-Computer Interaction Vol. 7 No. CSCW1 pp. 1-30 · Association for Computing Machinery (ACM)

Abstract

Instagram, one of the most popular social media platforms among youth, has recently come under scrutiny for potentially being harmful to the safety and well-being of our younger generations. Automated approaches for risk detection may be one way to help mitigate some of these risks if such algorithms are both accurate and contextual to the types of online harms youth face on social media platforms. However, the imminent switch by Instagram to end-to-end encryption for private conversations will limit the type of data that will be available to the platform to detect and mitigate such risks. In this paper, we investigate which indicators are most helpful in automatically detecting risk in Instagram private conversations, with an eye on high-level metadata, which will still be available in the scenario of end-to-end encryption. Toward this end, we collected Instagram data from 172 youth (ages 13-21) and asked them to identify private message conversations that made them feel uncomfortable or unsafe. Our participants risk-flagged 28,725 conversations that contained 4,181,970 direct messages, including textual posts and images. Based on this rich and multimodal dataset, we tested multiple feature sets (metadata, linguistic cues, and image features) and trained classifiers to detect risky conversations. Overall, we found that the metadata features (e.g., conversation length, a proxy for participant engagement) were the best predictors of risky conversations. However, for distinguishing between risk types, the different linguistic and media cues were the best predictors. Based on our findings, we provide design implications for AI risk detection systems in the presence of end-to-end encryption. More broadly, our work contributes to the literature on adolescent online safety by moving toward more robust solutions for risk detection that directly take into account the lived risk experiences of youth.
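To make the idea of content-free metadata concrete, the following is a minimal sketch of features that could be computed for a conversation without reading any message text or image, which is the kind of signal the abstract says remains available under end-to-end encryption. The abstract names only conversation length as an example; the `Message` fields, the other three features, and the function name `metadata_features` are assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Message:
    # All fields are hypothetical; none requires access to message content.
    sender: str        # opaque account identifier
    timestamp: float   # seconds since epoch
    has_media: bool    # True if the message carries an image or video


def metadata_features(messages: List[Message]) -> Dict[str, float]:
    """Compute content-free features for a single conversation."""
    if not messages:
        return {"num_messages": 0, "num_participants": 0,
                "duration_seconds": 0.0, "media_ratio": 0.0}
    times = [m.timestamp for m in messages]
    return {
        # Conversation length, the example feature named in the abstract.
        "num_messages": len(messages),
        # The remaining features are illustrative assumptions.
        "num_participants": len({m.sender for m in messages}),
        "duration_seconds": max(times) - min(times),
        "media_ratio": sum(m.has_media for m in messages) / len(messages),
    }


conversation = [
    Message("user_a", 0.0, False),
    Message("user_b", 60.0, True),
    Message("user_a", 180.0, False),
    Message("user_b", 240.0, False),
]
features = metadata_features(conversation)
```

A feature dictionary like this could then be fed to any standard classifier; which classifiers and which exact features the authors used is not stated in the abstract.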

Metrics

Citations: 26
References: 134

Details

Published: Apr 14, 2023
Vol/Issue: 7(CSCW1)
Pages: 1-30

Authors

Shiza Ali
Boston University, Boston, MA, USA

Afsaneh Razi
Drexel University, Philadelphia, PA, USA

Seunghyun Kim
Georgia Institute of Technology, Atlanta, GA, USA

Ashwaq Alsoubai
Vanderbilt University, Nashville, TN, USA

Chen Ling
Boston University, Boston, MA, USA

Munmun De Choudhury
Georgia Institute of Technology, Atlanta, GA, USA

Pamela J. Wisniewski
Vanderbilt University, Nashville, TN, USA

Gianluca Stringhini
Boston University, Boston, MA, USA

Funding

National Science Foundation Award: IIP-1827700, IIS-1844881, CNS-1942610

William T. Grant Foundation Award: 187941

Cite This Article

Shiza Ali, Afsaneh Razi, Seunghyun Kim, et al. (2023). Getting Meta: A Multimodal Approach for Detecting Unsafe Conversations within Instagram Direct Messages of Youth. Proceedings of the ACM on Human-Computer Interaction, 7(CSCW1), 1-30. https://doi.org/10.1145/3579608
