Lexicon-Based Financial Sentiment Analysis

The document discusses analyzing sentiment from financial social media data using lexicon-based approaches. It reviews related works applying machine learning and sentiment analysis to Twitter data. The paper then examines a labeled StockTwits dataset to determine if lexicon-based sentiment analysis methods can effectively classify the data.

2018 12th IEEE International Conference on Semantic Computing

Financial Sentiment Lexicon Analysis


Sahar Sohangir, Nicholas Petty, Dingding Wang
Department of Computer and Electrical Engineering and Computer Science
Boca Raton, Florida, USA
Email: ssohangir2014@[Link], npetty2014@[Link], wangd@[Link]

Abstract: The modern stock market is a popular place to increase wealth and generate income, but the fundamental problem of when to buy or sell shares, or which stocks to buy, has not been solved. With the availability of the Internet and its financial social networks, such as StockTwits and SeekingAlpha, investors around the world have new opportunities to gather and share their experiences. Individual experts can predict the movement of the stock market in financial social networks with reasonable accuracy, but how accurate is a large group of such experts in aggregate? One way to answer this question is by examining the sentiment of a massive group of these authors towards various stocks. By extracting the sentiment of the whole group, a collective prediction can be observed. Although sentiment extraction is a major technical challenge, the lexicon-based approach is an effective method of determining how positive or negative the content of a text document is. In this paper, we investigate whether we can improve the performance of sentiment extraction from financial social media data by using lexicon-based approaches.

Keywords: Sentiment analysis; opinion retrieval; natural language processing; sentiment lexicon

I. INTRODUCTION

The Internet has become a tool of open communication for billions of people around the world, allowing interaction between individuals who may have never been able to connect previously. Crowdsourcing uses the collective wisdom of a large group of people to achieve a specific goal and has brought about a social revolution.

One website which brings these opportunities to its users is StockTwits. By leveraging Twitter's 140 character tweet system, StockTwits aggregates market analyses from the Twitter social media platform and condenses them into a focused, curated stream of data. If this stream were examined in full, it would be possible to determine the crowd's collective sentiment towards the market and make predictions from it. What makes StockTwits special is its users' ability to add a tag to their tweets to indicate whether their post is "Bullish" (they think the stock or market will improve) or "Bearish" (they think the stock or market will get worse).

In this paper, we will examine a labeled dataset from StockTwits and determine whether lexicon-based sentiment analysis methods are effective for classification.

We will begin by reviewing a selection of works related to the application of machine learning and sentiment analysis on financial social media data. The next section covers our methodology. We will discuss the significance of our dataset, then compare sentiment analysis approaches through machine learning and sentiment lexicons. The following section will provide our experimental results, which show that lexicon-based approaches can offer improved performance over machine learning methods. In the last section, we summarize our conclusions and recommend the VADER system of lexicon-based sentiment analysis for classification of StockTwits tweets.

II. RELATED WORK

Early work on Twitter and sentiment analysis comes from Bollen et al. [1], with their use of OpinionFinder and Google Profile of Mood States (GPOMS). These tools took tweet input and produced the author's sentiment, which was then compared against the performance of a stock market index. The authors showed that sentiment analysis of a large Twitter dataset regarding stock movement is possible. Additionally, they found that this analysis can be used for market predictions, with an accuracy of around 87%.

Expanding on the work of Bollen et al., Mittal and Goel [2] looked further into sentiment analysis when applied to Twitter data. They realized that having a good sentiment analysis system was extremely important for their task, and evaluated multiple analyzers, including OpinionFinder and SentiWordNet. By stressing the importance of sentiment analysis on financial tweets, this work also leads us to examine the topic more closely. One of the most popular works in this field is by Loughran and McDonald [3]. They used the U.S. Securities and Exchange Commission portal from 1994 to 2008 to build a financial lexicon and manually create six word lists: positive, negative, litigious, uncertainty, modal strong, and modal weak.

Supervised classification methods, such as Support Vector Machines, Naïve Bayes, or ensembles [4], [5], have been deployed to perform sentiment analysis in multiple research projects. Machine learning techniques mainly use the bag-of-words [6] model. In the bag-of-words model, a text is represented as the collection of its words, disregarding the order of those words in their sentences. In addition, machine learning methods require feature engineering.

Wang et al. [7] applied machine learning approaches, including Support Vector Machine, Naïve Bayes, and Decision Tree, to classify StockTwits tweets as "bullish" or "bearish." They found that the SVM model was the most accurate at

0-7695-6360-0/18/$31.00 ©2018 IEEE 286


DOI 10.1109/ICSC.2018.00052
Authorized licensed use limited to: Konya Teknik Universitesi. Downloaded on May 05,2024 at [Link] UTC from IEEE Xplore. Restrictions apply.
76.2%. Our research builds on this work by re-evaluating various machine learning models and then investigating lexicon-based sentiment analyzers to see if better accuracy can be attained. With an improved method of determining the overall feelings of StockTwits users, more accurate predictions can be made from their aggregate data.

III. METHODOLOGY

A. Dataset

StockTwits is a financial social network which was established in 2009. Information about the stock market, such as the latest stock prices, price movement, stock exchange history, and buying or selling recommendations, is available to StockTwits users. In addition, as a social network, it provides the opportunity for sharing experience among traders in the stock market. Through the StockTwits website, investors, analysts, and others interested in the market can contribute a tweet, a short message about the stock market limited to 140 characters. This message will be posted to a public stream visible to all site visitors. Moreover, messages can be labeled Bullish or Bearish by the authors to specify their sentiment towards their chosen stocks.

In our experiment, we used messages which were posted in the whole year of 2015 and the first six months of 2016. Each message includes a messageID, a userID, the author's number of followers, a timestamp, the current price of the stock, and other record-keeping attributes.

If the sentiment of top authors is known, we can predict stock prices with an accuracy of 75%. Unfortunately, only 10% of messages in StockTwits are labeled, so we cannot rely on self-reported sentiment alone. To increase the accuracy of stock price prediction, we need a powerful method to determine the sentiment of top authors. Therefore, a sentiment lexicon is adopted to do sentiment analysis on StockTwits messages. We believe that using a sentiment lexicon can vastly improve correct classification in sentiment analysis regarding various stock picks and thus exceed the current accuracy of stock price prediction.

B. Sentiment Analysis

Following the early work in sentiment analysis done in [8], [9], we examine source materials and apply natural language processing techniques to determine the attitude of the writer towards a subject. Generally speaking, the main goal in sentiment analysis is determining the attitude of a writer with respect to some topic, or the overall contextual polarity of a document.

There are different methods in sentiment analysis that can help us to measure sentiment, including lexicon-based approaches and supervised machine learning.

Machine learning requires training data, which may be difficult to acquire. In addition, the training process is time-consuming and computationally expensive in terms of CPU and memory requirements. Moreover, machine learning depends only on the training set to find features, and this selection may be incomplete.

With the growing popularity of social media, huge datasets of reviews, blogs, and social network feeds are being generated continuously. Concepts and methods from sentiment analysis that can help us to extract information from these areas have become increasingly important as businesses, organizations, and individuals seek to make better use of their data.

In the following section, we investigate the performance of sentiment lexicons in extracting the sentiment of users in financial social media.

C. Sentiment Lexicon

A sentiment lexicon is a list of lexical features which are generally labeled according to their semantic orientation as either positive or negative [10]. Due to the challenge of creating a lexicon, most research in sentiment analysis relies heavily on preexisting manually constructed lexicons. The three most common lexicons in use are LIWC¹, GI², and Hu-Liu04³. In the following section, we briefly provide an overview of two widely used sentiment lexicons, VADER and SentiWordNet. VADER uses a combination of qualitative and quantitative methods, and SentiWordNet is an extension of WordNet [11].

1) VADER: Valence Aware Dictionary for sEntiment Reasoning: VADER, as a parsimonious rule-based model for sentiment analysis, can be used in multiple domains. It is constructed from a generalized, valence-based, human-curated gold standard sentiment lexicon. In addition, the impact of grammatical and syntactical rules, including punctuation, capitalization, contrastive conjunction, etc., on the sentiment of text is considered. VADER is fast enough to use online with streaming data, and it does not suffer from a speed-performance trade-off. These features make VADER one of the popular methods for sentiment analysis, especially on social media-related data.

In VADER, a group of well-established sentiment lexicons, like LIWC, ANEW, and GI, are used to construct a list. Incorporation of this list with lexical features common to sentiment expression in microblogs, including Western-style emoticons⁴, sentiment-related acronyms and initialisms⁵, and commonly used slang⁶ with sentiment value, provides over 9000 lexical feature candidates.

The wisdom of the crowd is used to find an estimate for the sentiment valence of each candidate feature. Ten independent humans rate each of the features on a scale from -4 for extremely negative to 4 for extremely positive, with 0 for neutral. Only a lexical feature that has a non-zero mean rating, and whose standard deviation is less than 2.5, as determined by the aggregate of ten independent raters, is kept. These processes provide a set of 7,500 lexical features with valence

¹ [Link]
² [Link] inquirer
³ [Link] liub/FBS/[Link]
⁴ [Link]
⁵ [Link]
⁶ [Link]

TABLE I
PERFORMANCE OF THE MACHINE LEARNING MODELS ON SENTIMENT ANALYSIS IN THE STOCKTWITS DATASET

                     Accuracy  Precision  Recall  F-measure  AUC
Logistic Regression  0.814     0.822      0.981   0.894      0.716
Naïve Bayes          0.808     0.809      0.996   0.893      0.714
Linear SVM           0.814     0.820      0.984   0.895      0.716
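
As a quick consistency check on Table I, the F-measure column can be recomputed from the precision and recall columns. The sketch below assumes that "F-measure" means the balanced F1 score (the harmonic mean of precision and recall), which the paper does not state explicitly; the function and variable names are illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure) copied from Table I.
TABLE_I = {
    "Logistic Regression": (0.822, 0.981, 0.894),
    "Naive Bayes": (0.809, 0.996, 0.893),
    "Linear SVM": (0.820, 0.984, 0.895),
}


def recompute(rows, tolerance=0.001):
    """Return {model: (recomputed F, reported F, agrees within tolerance)}."""
    result = {}
    for model, (precision, recall, reported) in rows.items():
        value = f_measure(precision, recall)
        result[model] = (value, reported, abs(value - reported) < tolerance)
    return result


if __name__ == "__main__":
    for model, (value, reported, ok) in recompute(TABLE_I).items():
        print(f"{model}: recomputed={value:.4f} reported={reported} agrees={ok}")
```

Under that assumption, all three rows agree with the reported values to within rounding.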

TABLE II
PERFORMANCE OF TEXTBLOB ON SENTIMENT ANALYSIS IN THE STOCKTWITS DATASET

          Accuracy  Precision  Recall  F-measure  AUC
TextBlob  0.810     0.842      0.726   0.780      0.804
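
The evaluation behind Table II treats a positive polarity as Bullish and a negative polarity as Bearish, and drops neutral messages before scoring. The following is a minimal sketch of that mapping, assuming that "neutral" means a polarity of exactly 0 (the paper does not give the threshold). The helper names and the toy scores are hypothetical and do not come from the StockTwits dataset.

```python
from typing import Iterable, List, Optional, Tuple


def polarity_to_label(polarity: float) -> Optional[str]:
    """Map a polarity in [-1, 1] to a StockTwits-style label, or None if neutral."""
    if polarity > 0:
        return "Bullish"
    if polarity < 0:
        return "Bearish"
    return None  # assumption: exactly 0.0 counts as neutral


def evaluate(scored: Iterable[Tuple[float, str]]) -> Tuple[float, int]:
    """Return (accuracy on non-neutral messages, number of neutral messages dropped)."""
    pairs: List[Tuple[str, str]] = []
    neutral = 0
    for polarity, actual in scored:
        predicted = polarity_to_label(polarity)
        if predicted is None:
            neutral += 1
        else:
            pairs.append((predicted, actual))
    if not pairs:
        raise ValueError("no non-neutral messages to evaluate")
    correct = sum(1 for predicted, actual in pairs if predicted == actual)
    return correct / len(pairs), neutral


# Toy (polarity, author label) pairs for illustration only.
toy = [(0.6, "Bullish"), (-0.4, "Bearish"), (0.0, "Bullish"), (0.2, "Bearish")]
accuracy, dropped = evaluate(toy)
```

Because neutral messages are excluded from the denominator, an analyzer that abstains often is scored on a smaller subset of the data, which is why the neutral counts matter when comparing analyzers.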

TABLE III
PERFORMANCE OF SENTIWORDNET ON SENTIMENT ANALYSIS IN THE STOCKTWITS DATASET

              Accuracy  Precision  Recall  F-measure  AUC
SentiWordNet  0.870     0.837      0.661   0.739      0.806
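
For context on the analyzer evaluated in Table III, Section III-C describes SentiWordNet scores as coming from a committee of ternary classifiers: a unanimous label is assigned outright, and otherwise each label receives a score proportional to the number of classifiers that chose it. The sketch below illustrates only that voting rule. It assumes the scores are normalized to sum to 1, and the function name, label strings, and vote lists are hypothetical rather than part of any SentiWordNet API.

```python
from collections import Counter
from typing import Dict, Sequence

LABELS = ("Positive", "Negative", "Objective")


def synset_scores(votes: Sequence[str]) -> Dict[str, float]:
    """Turn one vote per classifier into Pos/Neg/Obj scores for a synset.

    Each score is the fraction of classifiers that chose the label, so a
    unanimous committee yields 1.0 for that label and 0.0 for the others.
    """
    if not votes:
        raise ValueError("at least one classifier vote is required")
    counts = Counter(votes)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    total = len(votes)
    return {label: counts[label] / total for label in LABELS}


# Hypothetical committee of eight classifiers voting on one synset.
mixed = synset_scores(["Positive"] * 4 + ["Negative"] * 2 + ["Objective"] * 2)
unanimous = synset_scores(["Negative"] * 8)
```

A message-level polarity would still need a rule for combining synset scores across the words in a message; the paper does not describe that step, so it is left out here.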

TABLE IV
PERFORMANCE OF VADER ON SENTIMENT ANALYSIS IN THE STOCKTWITS DATASET

       Accuracy  Precision  Recall  F-measure  AUC
VADER  0.944     0.847      0.745   0.793      0.861

scores, which indicate the sentiment polarity and the sentiment intensity on a scale from -4 to +4 [12].

2) SentiWordNet: SentiWordNet is a lexical resource which uses sets of synonyms, or synsets, instead of individual terms. The reasoning for this switch is that different senses of the same term may have different opinion-related properties. SentiWordNet assigns three numerical scores, Obj(s), Pos(s), and Neg(s), to each synset of WordNet (version 2.0). These scores describe how Objective, Positive, and Negative the terms contained in the synset are.

SentiWordNet works by training a set of ternary classifiers. These classifiers produce different results because each is trained with a different training set and semi-supervised learning method. If all the ternary classifiers agree to assign the same label to a synset, that label will be assigned to that synset. Otherwise, each label will have a score proportional to the number of classifiers that have assigned it [13].

IV. EXPERIMENTS

In this section, we will describe how our experiment applies machine learning and lexicon-based approaches to the StockTwits dataset.

Our experiment investigates if there is any relation between Bullish tweets and positive polarity, or Bearish tweets and negative polarity. In the following section, we seek to determine whether lexicon-based models improve the accuracy of sentiment analysis of StockTwits data compared to machine learning approaches.

A. Machine Learning Approaches

As we mentioned before, 10% of messages in our dataset are labeled. In our experiment, we use these messages and supervised machine learning methods to classify StockTwits users' messages into either Bullish or Bearish sentiment. Unigrams are used as features, and infrequent unigrams that occur fewer than 300 times over all messages have been removed. In Table I, we provide the performance of Naïve Bayes, Linear Support Vector Machine (SVM), and Logistic Regression on StockTwits data based on different performance metrics. Based on Table I, the performance of logistic regression, linear SVM, and Naïve Bayes in classifying messages as Bullish or Bearish is very close. The accuracy of prediction is around 80%, the F-measure is around 90%, and the Area Under the Curve is around 70%. In the following section, we try to see if we can adopt lexicons to improve the performance of the prediction.

B. Lexicon Based Approaches

1) TextBlob: The first method we used to extract the sentiment of messages in StockTwits data was TextBlob [14]. It uses a sentiment lexicon and the [Link] sentiment analysis engine. [Link] leverages WordNet to score sentiment according to the English adjectives used in the text. When TextBlob runs sentiment analysis on text, it returns a tuple of the form (polarity, subjectivity), where polarity is a float within the range [-1, 1]. We first establish if there is any correlation between positive polarity and Bullish, and then between negative polarity and Bearish. In order to compare the result of the machine learning approach to the lexicon-based approach, we apply TextBlob to the 2,522,557 messages that we used in the machine learning methods (around 500,000 Bearish and more than 2,000,000 Bullish). From this set, TextBlob found 1,125,130 neutral messages. We remove all of the neutral messages and provide the result of comparing TextBlob sentiment on StockTwits data with the actual label of messages in Table II. Based on the results shown in Table II, TextBlob is not an effective method for extracting sentiment from StockTwits data. TextBlob's ineffectiveness is due to it labeling too many messages as neutral, and to its performance metrics not being considerably improved in comparison to machine learning approaches.

2) SentiWordNet: Again, we consider a positive message as Bullish and a negative message as Bearish. Among all

of 2,522,557 messages, SentiWordNet found 214,972 neutral messages. All such neutral messages were removed, and then the result of SentiWordNet sentiment for each message was compared with the actual label of that message.

The result of applying SentiWordNet on StockTwits data is provided in Table III. Comparing Tables III and I, it is clear that SentiWordNet can improve accuracy, precision, and area under the curve values in comparison to machine learning models, but the difference is still not considerable. Although accuracy and AUC increase by around 9%, the F-measure drops by more than 10%.

3) VADER: Among all of 2,522,557 messages, VADER found 899,503 neutral messages and labeled them with zero. We removed all of the messages that VADER found as neutral and then compared VADER's determined sentiment with the actual label of each message.

Our results are shown in Table IV. We found that using VADER to predict the sentiment of StockTwits users can improve accuracy and area under the curve when compared to machine learning methods (Table I), TextBlob (Table II), and SentiWordNet (Table III).

4) Combined Results: Figure 1 compares the ROC curves between machine learning methods and sentiment lexicon methods, including VADER, SentiWordNet, and TextBlob. Sentiment lexicons outperform machine learning methods based on these ROC curves. In Table V, we provide the number of messages that were labeled as neutral by TextBlob, SentiWordNet, and VADER. Fewer neutral messages indicate better performance from an analyzer, and so SentiWordNet clearly gives the best results here. However, Tables II, III, and IV reveal that, among the sentiment lexicon methods studied, VADER's higher performance metrics make it the best method for use in predicting StockTwits users' sentiment.

TABLE V
NUMBER OF NEUTRAL MESSAGES

         TextBlob   SentiWordNet  VADER
neutral  1,125,130  214,972       899,503

[Figure: Receiver operating characteristic; True Positive Rate versus False Positive Rate. Legend: VADER (area = 0.86), SentiWordNet (area = 0.81), TextBlob (area = 0.80), Logistic Regression (area = 0.72), MultinomialNB (area = 0.71).]
Fig. 1. Comparative Area Under the ROC curve for Lexicon versus Machine Learning based sentiment analysis

V. CONCLUSIONS

Knowing the sentiment of top authors, we can predict stock prices with an accuracy of 75%, but unfortunately only 10% of messages in StockTwits are labeled. To increase the accuracy of stock price prediction, we need a powerful method for the sentiment analysis of top authors. Sentiment analysis has two main approaches: lexicon-based and machine learning. The primary drawback to machine learning is the training process, which is very time-consuming and computationally expensive. On the other hand, the lexicon-based approach does not need training data, and so it is favorable, particularly in tasks that involve high-dimensional data. There are a variety of lexicon-based methods that can be used to perform sentiment analysis. In this paper, we applied VADER, SentiWordNet, and TextBlob on StockTwits data to see if they can increase the accuracy of sentiment analysis. Logistic regression, linear SVM, and Naïve Bayes classifiers were used as our baseline, and their results were compared with those of the lexicon-based models. Based on our results, not only does VADER outperform machine learning methods in extracting sentiment from financial social media like StockTwits, it is also faster.

REFERENCES

[1] J. Bollen, H. Mao, and X. Zeng, "Twitter mood predicts the stock market," Journal of Computational Science, vol. 2, pp. 1–8, March 2011.
[2] A. Mittal and A. Goel, "Stock prediction using Twitter sentiment analysis," 2011.
[3] T. Loughran and B. McDonald, "When is a liability not a liability? Textual analysis, dictionaries, and 10-Ks," The Journal of Finance, vol. LXVI, no. 1, 2011.
[4] E. H. Nadia Silva and E. Hruschka, "Tweet sentiment analysis with classifier ensembles," Journal of Computational Science, vol. 66, pp. 170–179, 2014.
[5] E. M. E. Fersini and F. Pozzi, "Automatic construction of financial semantic orientation lexicon from large scale Chinese news corpus," Decision Support Systems, vol. 68, pp. 26–38, 2014.
[6] P. D. Turney and P. Pantel, "From frequency to meaning: Vector space models of semantics," Journal of Artificial Intelligence Research, vol. 37, pp. 141–188, 2010.
[7] G. Wang, T. Wang, B. Wang, D. Sambasivan, Z. Zhang, H. Zheng, and B. Y. Zhao, "Crowds on Wall Street: Extracting value from collaborative investing platforms," The 18th ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW), March 2015.
[8] B. Pang, L. Lee, and S. Vaithyanathan, "Thumbs up? Sentiment classification using machine learning techniques," Proceedings of the Conference on Empirical Methods in Natural Language Processing, vol. 66, pp. 79–86, 2002.
[9] P. D. Turney, "Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews," Proceedings of the Association for Computational Linguistics, vol. 66, pp. 417–424, 2002.
[10] B. Liu, "Sentiment analysis and subjectivity," in Handbook of Natural Language Processing, Second Edition. Chapman and Hall/CRC, 2010, pp. 627–666.
[11] C. Fellbaum, WordNet. Wiley Online Library, 1998.
[12] T. Mitra, C. J. Hutto, and E. Gilbert, "Comparing person- and process-centric strategies for obtaining quality data on Amazon Mechanical Turk," in Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems. ACM, 2015, pp. 1345–1354.
[13] A. Esuli and F. Sebastiani, "SentiWordNet: A high-coverage lexical resource for opinion mining," Evaluation, pp. 1–26, 2007.
[14] S. Loria. (2017) TextBlob: Simplified text processing. [Online]. Available: [Link]

