Detecting drinking-related contents on social media by classifying heterogeneous data types

Omar ElTayeby, Todd Eaglin, Malak Abdullah, David Burlinson, Wenwen Dou, Lixia Yao

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Citations (Scopus)

Abstract

Binge drinking is a common health problem faced by colleges and universities in the US. College students often post drinking-related texts and images on social media to project a socially desirable identity. Some public health and clinical research scholars have manually surveyed different social media sites to understand students' behavior patterns. In this paper, we investigate the feasibility of mining the heterogeneous data scattered across social media to identify drinking-related content, which is the first step towards using social media to automatically detect users who binge drink. We use state-of-the-art algorithms such as Support Vector Machines and neural networks to separate drinking-related posts from non-drinking posts; the posts contain not only text but also images and videos. Our results show that, by combining heterogeneous data types, we are able to identify drinking-related posts with an overall accuracy of 82%. Prediction models based on text data are more reliable than the other two models, built on image and video data, for predicting drinking-related content.
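The abstract does not say how the text, image, and video models are combined. As a purely illustrative sketch, one common option is late fusion: each modality's classifier outputs a probability that a post is drinking-related, and the available probabilities are averaged with per-modality weights. Everything below is an assumption made for illustration and is not taken from the paper: the function names `fuse_scores` and `is_drinking_related`, the weight values, and the 0.5 decision threshold are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical per-modality weights (not from the paper). Text is weighted
# highest only because the abstract reports the text model as most reliable.
WEIGHTS: Dict[str, float] = {"text": 0.5, "image": 0.3, "video": 0.2}


def fuse_scores(scores: Dict[str, Optional[float]],
                weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted average of the per-modality probabilities that are present.

    `scores` maps a modality name to the probability produced by that
    modality's classifier, or None when the post lacks that modality.
    """
    present = {m: p for m, p in scores.items() if p is not None}
    if not present:
        raise ValueError("post has no scored modality")
    # Renormalize over the modalities the post actually contains.
    total_weight = sum(weights[m] for m in present)
    return sum(weights[m] * p for m, p in present.items()) / total_weight


def is_drinking_related(scores: Dict[str, Optional[float]],
                        threshold: float = 0.5) -> bool:
    """Label a post as drinking-related when the fused score meets the threshold."""
    return fuse_scores(scores) >= threshold


# A post with text and an image but no video (made-up probabilities).
post = {"text": 0.9, "image": 0.4, "video": None}
print(fuse_scores(post), is_drinking_related(post))
```

Renormalizing over the modalities that are present lets a text-only post be scored by the text model alone, rather than being penalized for missing images or video.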

Original language: English (US)
Title of host publication: Advances in Artificial Intelligence
Subtitle of host publication: From Theory to Practice - 30th International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2017, Proceedings
Editors: Moonis Ali, Salem Benferhat, Karim Tabia
Publisher: Springer Verlag
Pages: 364-373
Number of pages: 10
ISBN (Print): 9783319600444
DOI: https://doi.org/10.1007/978-3-319-60045-1_38
State: Published - Jan 1 2017
Event: 30th International Conference on Industrial, Engineering, and Other Applications of Applied Intelligent Systems, IEA/AIE 2017 - Arras, France
Duration: Jun 27 2017 to Jun 30 2017

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10351 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 30th International Conference on Industrial, Engineering, and Other Applications of Applied Intelligent Systems, IEA/AIE 2017
Country: France
City: Arras
Period: 6/27/17 to 6/30/17

Keywords

  • Binge drinking
  • Image classification
  • Machine learning
  • Social media
  • Text classification
  • Video classification

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)

Cite this

ElTayeby, O., Eaglin, T., Abdullah, M., Burlinson, D., Dou, W., & Yao, L. (2017). Detecting drinking-related contents on social media by classifying heterogeneous data types. In M. Ali, S. Benferhat, & K. Tabia (Eds.), Advances in Artificial Intelligence: From Theory to Practice - 30th International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2017, Proceedings (pp. 364-373). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10351 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-319-60045-1_38

@inproceedings{18cf1feb21574b8e8495bd5bb61d36b1,
title = "Detecting drinking-related contents on social media by classifying heterogeneous data types",
keywords = "Binge drinking, Image classification, Machine learning, Social media, Text classification, Video classification",
author = "Omar ElTayeby and Todd Eaglin and Malak Abdullah and David Burlinson and Wenwen Dou and Lixia Yao",
year = "2017",
month = "1",
day = "1",
doi = "10.1007/978-3-319-60045-1_38",
language = "English (US)",
isbn = "9783319600444",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "364--373",
editor = "Moonis Ali and Salem Benferhat and Karim Tabia",
booktitle = "Advances in Artificial Intelligence",
address = "Germany",

}

TY - GEN

T1 - Detecting drinking-related contents on social media by classifying heterogeneous data types

AU - ElTayeby, Omar

AU - Eaglin, Todd

AU - Abdullah, Malak

AU - Burlinson, David

AU - Dou, Wenwen

AU - Yao, Lixia

PY - 2017/1/1

Y1 - 2017/1/1

KW - Binge drinking

KW - Image classification

KW - Machine learning

KW - Social media

KW - Text classification

KW - Video classification

UR - http://www.scopus.com/inward/record.url?scp=85026286042&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85026286042&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-60045-1_38

DO - 10.1007/978-3-319-60045-1_38

M3 - Conference contribution

AN - SCOPUS:85026286042

SN - 9783319600444

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 364

EP - 373

BT - Advances in Artificial Intelligence

A2 - Ali, Moonis

A2 - Benferhat, Salem

A2 - Tabia, Karim

PB - Springer Verlag

ER -