Computation time optimization on hashtag segmentation for social media data

Malka N. Halgamuge, Huseyin Caliskan, Azeem Mohammad

Research output: Book chapter/Published conference paper > Conference paper > peer-review


Sentiment analysis, the contextual mining of text to recognize and extract subjective information from a source, is considered necessary for estimating human behavior. A hashtag is a metadata tag used to classify data into a category. However, there has been little discussion of hashtag segmentation so far. We propose an algorithm that segments hashtags while optimizing computation time. We create candidates according to a given corpus containing 1-gram (unigram) and 2-gram (bigram) data. The proposed algorithm reduces the computation time of generating segments by limiting the candidates in a given corpus: the fewer candidates there are, the shorter the calculation. In this study, we gather food-related unstructured tweets (N = 951,255) from Twitter. Our results demonstrate that the proposed algorithm reduces computation time by 29.7%. If a segment cannot be found with the proposed algorithm, the original hashtag segmentation method, which identifies all possible candidates, is used as a fallback. The proposed approach improves the hashtag segmentation technique by minimizing computation time, which could be useful in real-time tweet analysis. Our results show that the sentiment trends for raw data and segmented data are similar, which also verifies the method's accuracy. These findings indicate that, even though computers are getting faster, computational resources should be used effectively. Our work also provides a data collection model for future surveys, which could shorten the data retrieval process using multi-threaded programming concepts.
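The candidate-limiting idea with a fallback, as described in the abstract, might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the toy `UNIGRAMS` and `BIGRAMS` counts, the scoring function `word_logp`, the unknown-word penalty, and the names `segment` and `segment_hashtag` are all assumptions introduced here. The restricted pass only accepts candidate words that appear in the corpus; if it finds no segmentation, the unrestricted pass considers every possible split.

```python
import math
from functools import lru_cache

# Hypothetical toy counts; a real corpus would supply far larger tables.
UNIGRAMS = {"food": 50, "foodie": 20, "love": 40, "i": 60, "pizza": 30, "night": 25}
BIGRAMS = {("pizza", "night"): 10, ("i", "love"): 15, ("love", "pizza"): 8}
TOTAL = sum(UNIGRAMS.values())


def word_logp(word, prev=None):
    """Log-probability of a word (assumed scoring, not taken from the paper)."""
    if prev is not None and (prev, word) in BIGRAMS:
        return math.log(BIGRAMS[(prev, word)] / UNIGRAMS[prev])
    if word in UNIGRAMS:
        return math.log(UNIGRAMS[word] / TOTAL)
    # Unknown words get a penalty that grows with their length.
    return math.log(1.0 / (TOTAL * 10 ** len(word)))


def segment(text, restricted):
    """Return the best-scoring split of text, or None if no split exists.

    With restricted=True only words present in UNIGRAMS are candidates,
    which shrinks the search space; with restricted=False every
    substring is a candidate.
    """
    @lru_cache(maxsize=None)
    def best(start, prev):
        if start == len(text):
            return 0.0, ()
        options = []
        for end in range(start + 1, len(text) + 1):
            word = text[start:end]
            if restricted and word not in UNIGRAMS:
                continue  # skip candidates that are not in the corpus
            tail = best(end, word)
            if tail is None:
                continue
            options.append((word_logp(word, prev) + tail[0], (word,) + tail[1]))
        return max(options) if options else None

    result = best(0, None)
    return None if result is None else list(result[1])


def segment_hashtag(hashtag):
    """Try the restricted search first; fall back to the full search."""
    text = hashtag.lstrip("#").lower()
    return segment(text, restricted=True) or segment(text, restricted=False)


if __name__ == "__main__":
    print(segment_hashtag("#pizzanight"))
    print(segment_hashtag("#pizzaxyz"))
```

In this sketch, `#pizzanight` is resolved by the restricted pass alone, while `#pizzaxyz` contains a piece that is not in the toy corpus and therefore goes through the fallback pass. The reported 29.7% reduction comes from the paper's own experiments and is not something this toy example reproduces.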

Original language: English
Title of host publication: 2021 IEEE Wireless Communications and Networking Conference, WCNC 2021
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Number of pages: 6
ISBN (Electronic): 9781728195056
ISBN (Print): 9781728195063 (Print on demand)
Publication status: Published - 05 May 2021
Event: 2021 IEEE Wireless Communications and Networking Conference, WCNC 2021 - Nanjing and online, Nanjing, China
Duration: 29 Mar 2021 to 01 Apr 2021

Publication series

Name: IEEE Wireless Communications and Networking Conference, WCNC
ISSN (Print): 1525-3511


Conference: 2021 IEEE Wireless Communications and Networking Conference, WCNC 2021
Abbreviated title: Shaping the wireless future
Other: Welcome to the 2021 IEEE Wireless Communications and Networking Conference (WCNC 2021), an IEEE premier annual event in the wireless research arena, bringing together researchers, academics, and industry professionals from all over the world. IEEE WCNC 2021 will take place from 29 March to 1 April in Nanjing, China. Due to the ongoing COVID-19 pandemic and the resulting travel restrictions, the conference will be held in a hybrid format, with authors and participants from China attending in person in Nanjing and all others attending remotely. The technical program will include physical sessions with on-site presentations for local participants and virtual (online) sessions for remote participants.

