End-to-End Multimodal Sentiment Analysis

The paper presents a novel end-to-end model for Multimodal Aspect-Based Sentiment Analysis (MABSA) that integrates text and images using the Boosting technique (RoBERTa-LGBM) to improve sentiment analysis accuracy. The model demonstrates superior performance on Twitter datasets, achieving higher accuracy compared to existing methods. The study emphasizes the importance of combining textual and visual data to gain comprehensive insights into customer opinions and preferences.

2023 Seventh International Conference on Image Information Processing (ICIIP)

A Transformer Model for End-to-End Image and Text Aspect-Based Sentiment Analysis
2023 Seventh International Conference on Image Information Processing (ICIIP) | 979-8-3503-7140-6/23/$31.00 ©2023 IEEE | DOI: 10.1109/ICIIP61524.2023.10537622

Amit Chauhan
Department of Computer Science, JUIT, Waknaghat, Solan, India
chauhanamit37@[Link]

Aman Sharma
Department of Computer Science, JUIT, Waknaghat, Solan, India
[Link]@[Link]

Rajni Mohana
Department of Computer Science, Amity School of Engineering, Mohali, Punjab, India
[Link]@[Link]

Abstract: Opinion mining is an increasingly important field with tremendous potential, including End-to-End Multimodal Aspect-Based Sentiment Analysis (MABSA). MABSA aims to identify aspect-sentiment pairs from a combination of text and images. However, many MABSA methods do not incorporate aspect and sentiment information in their textual and visual representations, which limits their ability to account for the distinct effects of visual elements on each word or aspect. In this paper, we propose an end-to-end model for multimodal tasks that combines text and images using the Boosting technique (RoBERTa-LGBM). We achieve state-of-the-art results on both datasets, improving accuracy over the strongest baseline by 1.55 percentage points on the Twitter 2015 dataset and by 1.78 percentage points on the Twitter 2017 dataset.

Index Terms: Aspect Based Sentiment Analysis, LGBM, BERT, Ensemble

I. INTRODUCTION

Sentiment analysis (SA) involves identifying and extracting subjective information from text data and classifying opinions as positive, negative, or neutral [1]. SA has gained immense significance in recent years owing to the vast amount of user-generated content on platforms such as social media and product review sites. Aspect-based sentiment analysis (ABSA) is an advanced form of sentiment analysis that considers the different aspects or features of a product or service about which a user may express an opinion [2]. As an extension of SA, ABSA measures more than how a product or service is perceived overall: it aims to identify the sentiment attached to each distinct component or attribute of the product or service. Such attributes may include cost, quality, customer support, and timeliness. By examining these particular facets, ABSA offers a more in-depth understanding of consumer preferences and attitudes. Businesses looking to improve their goods or services may find great value in this information [3]. ABSA identifies opinions and emotions in text data, and businesses use it to understand customer feedback and reviews.

In ABSA, the text is analyzed at a more detailed level to identify the sentiment polarity associated with each aspect or feature of a product or service that a user may express an opinion about. For instance, in a restaurant review, the aspects could be the quality of the food, the service, the ambience, and the price. ABSA draws on rule-based, unsupervised, and supervised methods to gain insight into customer opinions and preferences. Rule-based methods use a set of predefined rules to identify the aspects and their corresponding sentiments in the text [4]. Clustering algorithms group similar aspects and sentiments, while supervised machine learning algorithms predict the aspects and sentiments of a text from labelled data.

Fig. 1. Steps Involved in Aspect Based Sentiment Analysis

Figure 1 shows the steps involved in ABSA. State-of-the-art ABSA models use deep learning techniques such as RNNs, CNNs, and Transformer models. They can process complex and variable-length inputs and capture context and dependencies between different aspects and sentiments in the text.
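
To make the rule-based approach mentioned in this section concrete, the following Python sketch applies two hand-written rules to the restaurant example. It is only an illustration: the lexicons `ASPECT_TERMS`, `POSITIVE`, and `NEGATIVE` and the function `rule_based_absa` are hypothetical and are not taken from the paper or from [4].

```python
import re

# Hypothetical hand-written lexicons for a restaurant review (illustrative
# only); a practical system would rely on much larger resources.
ASPECT_TERMS = {
    "food": {"food", "pizza", "pasta", "dessert"},
    "service": {"service", "waiter", "staff"},
    "ambience": {"ambience", "atmosphere", "decor"},
    "price": {"price", "prices", "bill"},
}
POSITIVE = {"great", "delicious", "friendly", "cozy", "cheap"}
NEGATIVE = {"slow", "rude", "bland", "noisy", "expensive"}


def rule_based_absa(review):
    """Return {aspect: polarity} for the aspects mentioned in a review."""
    results = {}
    # Rule 1: split the review into clauses at punctuation, "but" and "and".
    clauses = re.split(r"[.,;!?]|\bbut\b|\band\b", review.lower())
    for clause in clauses:
        words = set(re.findall(r"[a-z']+", clause))
        # Clause polarity = positive opinion words minus negative ones.
        score = len(words & POSITIVE) - len(words & NEGATIVE)
        if score > 0:
            polarity = "positive"
        elif score < 0:
            polarity = "negative"
        else:
            polarity = "neutral"
        # Rule 2: every aspect named in the clause receives its polarity.
        for aspect, terms in ASPECT_TERMS.items():
            if words & terms:
                results[aspect] = polarity
    return results


review = "The food was delicious, but the service was slow and the prices are expensive."
print(rule_based_absa(review))
```

Such rules are easy to inspect but brittle, since any aspect or opinion word missing from the lexicons is ignored. This is one reason the supervised and deep learning approaches discussed in this section are preferred in practice.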
Generally, ABSA is used in areas including e-commerce, healthcare, and online reviews. It enables businesses to monitor customer feedback and improve product development.

Multimodal Aspect-Based Sentiment Analysis (MABSA) is a newly popular field in sentiment analysis. By fusing textual and visual data, MABSA aims to determine the sentiment polarity of the various characteristics or qualities of a product or service. Because it incorporates data from several modalities, MABSA is more complicated than regular ABSA. However, it may also give businesses a more accurate and comprehensive view of the preferences and opinions of their customers, which could enable them to enhance their products or services.

MABSA offers information that is not contained in the text alone. This can include information about a product’s appearance and quality or a service provider’s behaviour, which can be used to deduce the sentiment behind a particular feature. Furthermore, visual data can help clarify the polarity of ambiguous words or phrases in the text, increasing the precision of sentiment analysis.

A. Motivation

The application of MABSA has attracted much interest lately, and researchers are conducting many investigations to create more efficient MABSA models. Despite these efforts, several difficulties remain, for example accurately aligning text and visual data, recognising relevant elements and emotions, and combining data from multiple sources. Effective MABSA models can benefit industries such as e-commerce, hospitality, and entertainment by providing valuable insights into customer opinions, feedback, and preferences. Further research and development is therefore necessary to create more accurate and efficient MABSA models.

B. Paper Organisation

The remainder of this paper is organised as follows. Section II reviews prior research. Section III introduces the key terms used in this work. Section IV describes the proposed work, including the research procedure and the datasets. Section V presents the results and their analysis. Section VI summarises the findings and offers possible directions for further investigation.

II. RELATED WORK

Yang et al. [5] introduced a new task named Multimodal Entity-Category-Sentiment Triple Extraction (MECSTE), which aims to extract entities, their corresponding fine-grained categories, and sentiment polarities from text simultaneously. They created two datasets for this task from two existing Twitter corpora and developed a generative multimodal approach using a pre-trained sequence-to-sequence model. The authors propose transforming entity-category-sentiment triples into natural language sentences, framing MECSTE as a paraphrase generation problem. They demonstrate the superiority of their method through extensive experiments on annotated Twitter datasets.

Yang et al. [6] introduced a dataset called Multimodal Aspect-Category Sentiment Analysis (MACSA). It contains over 21,000 text and image pairs with detailed annotations for both textual and visual content, and it uses the aspect category as a pivot to align the elements of the two modalities. Using this dataset, the authors proposed the Multimodal ACSA task and a graph-based aligned model called the Multimodal Graph-based Aligned Model (MGAM). The model employs a fine-grained cross-modal fusion method to achieve excellent results, and the experimental outcomes suggest that it can serve as a baseline for future research on this dataset.

Wang et al. [7] proposed an interactive fusion network with recurrent attention for multimodal aspect-based sentiment analysis. First, two encoders encode the text and image data. Then, an attention mechanism obtains the semantic information of the image at the token level. After that, a GRU filters out the noise in the image and fuses information from the different modalities. Finally, a decoder with recurrent attention progressively learns aspect-specific sentiment features for classification. Results on two Twitter datasets show that the proposed method outperforms all baselines.

De Bruyne et al. [8] presented a new dataset for Aspect-Based Emotion Analysis (ABEA) and explored the potential of multimodal co-reference resolution within an ABEA framework. The dataset contains 4,900 comments on 175 images, annotated with aspect and emotion categories along with the emotional dimensions of valence and arousal. The initial experiments indicate that ABEA does not benefit from multimodal co-reference resolution and that aspect and emotion classification require only textual information. However, image recognition could be crucial when more specific information about aspects is required.

Ektefaie et al. [9] employed graph artificial intelligence methods to combine various modalities by leveraging cross-modal dependencies through geometric relationships. The researchers combined different datasets using graphs and fed them into advanced multimodal architectures, which they classified as image-focused, knowledge-based, or language-oriented models. The paper also presents a road map for multimodal graph learning, which can be used to explore existing techniques and develop new models.

III. PRELIMINARIES

A. Ensemble

Ensemble learning is an effective method for increasing model accuracy in machine learning and deep learning. In essence, it combines the predictions of several models, each with its own advantages and disadvantages, to produce a prediction that is more trustworthy and accurate. It can be pictured as a team of specialists working together to solve a problem: because each expert has a unique set of abilities and expertise, when they collaborate, they can produce a better solution than any
one of them could alone. Ensemble learning can be applied in a variety of ways, including bagging, boosting, and stacking. Bagging trains several models on distinct portions of the data, whereas boosting concentrates on the samples that were previously misclassified. Stacking combines the predictions of several models into a final prediction.

Ensemble learning is a flexible method that works well for a wide range of machine learning problems. In this work we apply boosting, a popular and practical technique for raising the accuracy of machine learning models.

B. LGBM

LGBM, which stands for Light Gradient Boosting Machine [10], is a well-known machine learning method for classification, regression, and ranking tasks, noted for its speed and accuracy. Because it can handle large datasets with many features while remaining efficient in training and prediction, data scientists use it in competitions and real-world applications.

One of LGBM’s strongest points is the ease with which it can be tuned and optimised, particularly when dealing with missing data and categorical features. It is therefore a good option for those who wish to get the most out of their data analysis.

LGBM can also handle imbalanced data. For instance, in fraud or anomaly detection the data may be highly imbalanced. Because LGBM’s objective function is based on gradient boosting, which can be tailored to accommodate various data imbalances, it can handle such cases.

C. RoBERTa

RoBERTa is a natural language processing model developed by Facebook AI. Its outstanding performance in a range of Natural Language Processing (NLP) tasks has made it very popular. RoBERTa is an enhanced version of BERT, the original Bidirectional Encoder Representations from Transformers model, and it attained state-of-the-art results.

Compared to BERT, RoBERTa was trained for a substantially longer period of time and on a much larger dataset. These factors contribute to its strong NLP performance [10] and allowed it to surpass its predecessor. It also brought several significant changes to the training procedure, such as removing the next sentence prediction task and introducing dynamic masking, which diversified the training data. One of RoBERTa’s primary benefits is its adaptability across NLP tasks, including named entity recognition, text classification, and question answering. Because of its strong performance, it is frequently chosen for NLP applications, and the NLP community continues to research it actively.

IV. PROPOSED WORK

Our study involved several steps, as shown in Figure 2. First, we collected the Twitter 2015 and Twitter 2017 datasets and preprocessed both. Next, we generated captions and paired them with the corresponding images. Following this, we paired aspects with their respective sentiments. In the final stage, we evaluated the model using accuracy and F1-measure.

Fig. 2. The Architecture of the study

A. Pre-processing

The preprocessing pipeline is shown in Figure 3. Various techniques were used to clean the study’s data [11]. We first divided the text into smaller, meaningful units called tokens, a step known as tokenization. Then, we removed common words such as "and," "the," "in," "of," "to," "is," and "a," because they carry little useful information. Next, we converted all letters to lowercase and removed punctuation. Finally, we used GloVe word embeddings to create the word vectors used in the subsequent analysis [12].

B. Caption Generation

Caption generation is the task of creating a written description that precisely represents the contents of an image or video. With the advancements in deep learning and computer vision technologies, caption generation has emerged as a
critical research area in artificial intelligence [13]. The ultimate goal of caption generation is to create image and video descriptions that are similar to those written by humans. This can be especially beneficial for people with visual impairments or those who seek more information and context while browsing through vast amounts of visual content. Caption generation can also be valuable in a variety of applications, such as image search, product recommendations, and social media platforms, where images and videos are frequently shared and viewed.

Fig. 3. Steps involved in Preprocessing

C. Image-Text Pairing

An important component of multimodal ABSA is matching pertinent images with textual content. The goal of this step is to give users a thorough grasp of the material that is delivered, which increases engagement with the content and enhances the user’s cognitive experience.

Multimodal ABSA does this by analysing the content of the image and matching it with pertinent text. This helps users obtain relevant and accurate information. For example, if a picture shows someone holding a coffee cup, the accompanying text might describe the different kinds of coffee that the store sells [13]. Pairing text with images is also a useful way to improve content accessibility for people with impairments: users with visual impairments can still comprehend the information and context if an image description is provided. Image-text matching is thus a powerful tool that effectively combines textual and visual data, enhancing the overall user experience.

D. Dataset

Twitter 15 and Twitter 17 are widely used multimodal datasets in ABSA research [14]. Because they pair text with image data, researchers have access to a wider range of data, which helps them better understand the comments and viewpoints of customers.

In 2018, Wang et al. presented the multimodal Twitter 15 dataset. It includes more than 14,000 tweets with aspect categories, sentiment polarities, and pertinent photos. The photographs were gathered with Google Image Search, using the aspect categories listed in the tweets as a guide. The dataset is divided into training, validation, and test sets: eighty percent of the data are in the training set, and ten percent are in each of the validation and test sets.

Fig. 4. Sample of dataset

The Multimodal Twitter 17 dataset was presented by Huang et al. in 2019. It comprises more than 6,000 tweets annotated with aspect categories, sentiment polarities, and photos. Based on the aspect categories specified in the tweets, these photographs were collected via image search engines and Twitter’s APIs. Eighty percent of the dataset is in the training set, ten percent in the validation set, and ten percent in the test set.

Because these multimodal datasets capture both the textual and visual parts of the data, they offer researchers a more thorough understanding of customer reviews and opinions. They can also help improve ABSA models by offering further data that can be used to define aspect categories and sentiment polarities more accurately.

ABSA research can greatly benefit from the multimodal data provided by the Twitter 15 and Twitter 17 datasets. These datasets offer a realistic and varied set of data that can be used to build sophisticated models for analysing customer opinions and feedback on social media platforms.
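
Two of the data-handling steps described in this section, the text cleaning of Section IV-A and the 80/10/10 partition of Section IV-D, can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions rather than the authors' code: the function names `preprocess` and `split_80_10_10` are hypothetical, the stop-word list contains only the seven words named in Section IV-A, tokens are normalised before stop words are filtered so that "The" matches "the", and the GloVe embedding step is omitted.

```python
import random
import string

# Only the stop words named in Section IV-A; a real list would be longer.
STOP_WORDS = {"and", "the", "in", "of", "to", "is", "a"}


def preprocess(text):
    """Tokenise on whitespace, lowercase, strip punctuation, drop stop words."""
    cleaned = []
    for token in text.split():
        # Lowercase and remove leading/trailing punctuation (e.g. "cold!").
        token = token.lower().strip(string.punctuation)
        if token and token not in STOP_WORDS:
            cleaned.append(token)
    return cleaned


def split_80_10_10(items, seed=0):
    """Shuffle items and split them into train/validation/test (80/10/10)."""
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded for reproducibility
    n_train = len(items) * 8 // 10
    n_val = len(items) // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test


print(preprocess("The service is great, and the FOOD was cold!"))
train, val, test = split_80_10_10(range(100))
print(len(train), len(val), len(test))
```

A fixed seed keeps the partition reproducible across runs. Whether the original splits were drawn randomly is not stated in the text, so the shuffle is an assumption of this sketch.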

V. RESULTS

This section discusses the analysis and results of our proposed framework. We compared our model with existing models in terms of accuracy and F1 score.

Our analysis relies on boosting, implemented with the Light Gradient Boosting Machine (LGBM). First, we used RoBERTa, a pre-trained neural network model for natural language processing, to train our model. We then applied the RoBERTa results to LGBM, whose boosting strategy was used to raise the accuracy and performance of the model. By combining these two methods, we were able to analyse the data efficiently and derive trustworthy conclusions.

We obtained state-of-the-art results on both the Twitter 2015 and Twitter 2017 datasets. In particular, we increased the accuracy by 1.55 percentage points on the Twitter 2015 dataset and by 1.78 percentage points on the Twitter 2017 dataset. A comparison with the baseline models on both datasets is given in Table I.

TABLE I
COMPARISON WITH BASELINE MODELS (ALL VALUES IN %)

                               Twitter-2015         Twitter-2017
Models                         Accuracy   F1-M      Accuracy   F1-M
TomBERT (ResNet) [15]          76.60      71.57     69.42      67.70
TomBERT (Faster R-CNN) [15]    77.03      72.85     69.77      67.59
LGBM+RoBERTa (Proposed)        78.58      72.88     71.55      69.20

Figures 5 and 6 show the results on Twitter 2015 and Twitter 2017, respectively, comparing the baseline models with our proposed model.

Fig. 5. Twitter 2015 Dataset Results

Fig. 6. Twitter 2017 Dataset Results

VI. CONCLUSION

In this study, we performed end-to-end aspect-based sentiment analysis using a transformer model. The main conclusion is that a boosting strategy allows the proposed model to outperform earlier models. We achieved state-of-the-art results on both the Twitter 2015 and Twitter 2017 datasets, improving accuracy by 1.55 percentage points on Twitter 2015 and by 1.78 percentage points on Twitter 2017. At present, emoticon analysis is restricted to a particular set of data. We nevertheless acknowledge the importance of understanding how these symbols are used in practice. Consequently, our goal is to extend the study to a greater variety of datasets so that we can learn more about the subtleties of emoticon usage.

REFERENCES

[1] Y. Wang, G. Huang, J. Li, H. Li, Y. Zhou, and H. Jiang, “Refined global word embeddings based on sentiment concept for sentiment analysis,” IEEE Access, vol. 9, pp. 37075–37085, 2021.
[2] H. Silva, E. Andrade, D. Araújo, and J. Dantas, “Sentiment analysis of tweets related to SUS before and during COVID-19 pandemic,” IEEE Latin America Transactions, vol. 20, no. 1, pp. 6–13, 2022.
[3] J. He, A. Wumaier, Z. Kadeer, W. Sun, X. Xin, and L. Zheng, “A local and global context focus multilingual learning model for aspect-based sentiment analysis,” IEEE Access, vol. 10, pp. 84135–84146, 2022.
[4] Y. Bie and Y. Yang, “A multitask multiview neural network for end-to-end aspect-based sentiment analysis,” Big Data Mining and Analytics, vol. 4, no. 3, pp. 195–207, 2021.
[5] L. Yang, J. Wang, J.-C. Na, and J. Yu, “Generating paraphrase sentences for multimodal entity-category-sentiment triple extraction,” Knowledge-Based Systems, vol. 278, p. 110823, 2023.
[6] H. Yang, Y. Zhao, J. Liu, Y. Wu, and B. Qin, “MACSA: A multimodal aspect-category sentiment analysis dataset with multimodal fine-grained aligned annotations,” arXiv preprint arXiv:2206.13969, 2022.
[7] J. Wang, Q. Wang, Z. Wen, X. Liang, and R. Xu, “Interactive fusion network with recurrent attention for multimodal aspect-based sentiment analysis,” in CAAI International Conference on Artificial Intelligence. Springer, 2022, pp. 298–309.
[8] L. De Bruyne, A. Karimi, O. De Clercq, A. Prati, and V. Hoste, “Aspect-based emotion analysis and multimodal coreference: A case study of customer comments on Adidas Instagram posts,” in Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022, pp. 574–580.
[9] Y. Ektefaie, G. Dasoulas, A. Noori, M. Farhat, and M. Zitnik, “Multimodal learning with graphs,” Nature Machine Intelligence, vol. 5, no. 4, pp. 340–350, 2023.
[10] P. Thiengburanathum and P. Charoenkwan, “SETAR: Stacking ensemble learning for Thai sentiment analysis using RoBERTa and hybrid feature representation,” IEEE Access, vol. 11, pp. 92822–92837, 2023.
[11] J. Khan, A. Alam, and Y. Lee, “Intelligent hybrid feature selection for textual sentiment classification,” IEEE Access, vol. 9, pp. 140590–140608, 2021.
[12] B. Jabir, I. De La Torre Díez, E. F. B. Thompson, D. L. R. Vargas, and G. K. Castilla, “Ensemble partition sampling (EPS) for improved multi-class classification,” IEEE Access, vol. 11, pp. 48221–48235, 2023.
[13] B. L. V. S. Aditya and S. N. Mohanty, “Heterogenous social media analysis for efficient deep learning fake-profile identification,” IEEE Access, vol. 11, pp. 99339–99351, 2023.
[14] L. Xu and W. Wang, “Improving aspect-based sentiment analysis with contrastive learning,” Natural Language Processing Journal, vol. 3, p. 100009, 2023.
[15] J. Yu and J. Jiang, “Adapting BERT for target-oriented multimodal sentiment classification,” IJCAI, 2019.
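
As an arithmetic check on the gains quoted in the abstract, Section V, and the conclusion, the short Python sketch below recomputes the difference between the proposed model and the strongest baseline for each column of Table I. The scores are transcribed from the table on a common percent scale. The dictionary layout, the key names, and the helper `gain_over_best_baseline` are illustrative choices and do not come from the paper.

```python
# Scores transcribed from Table I, all expressed in percent.
TABLE_I = {
    "TomBERT (ResNet)": {
        "tw15_acc": 76.60, "tw15_f1": 71.57, "tw17_acc": 69.42, "tw17_f1": 67.70,
    },
    "TomBERT (Faster R-CNN)": {
        "tw15_acc": 77.03, "tw15_f1": 72.85, "tw17_acc": 69.77, "tw17_f1": 67.59,
    },
    "LGBM+RoBERTa (Proposed)": {
        "tw15_acc": 78.58, "tw15_f1": 72.88, "tw17_acc": 71.55, "tw17_f1": 69.20,
    },
}
PROPOSED = "LGBM+RoBERTa (Proposed)"


def gain_over_best_baseline(metric):
    """Proposed score minus the best baseline score, in percentage points."""
    best_baseline = max(
        scores[metric] for name, scores in TABLE_I.items() if name != PROPOSED
    )
    return round(TABLE_I[PROPOSED][metric] - best_baseline, 2)


gains = {metric: gain_over_best_baseline(metric) for metric in TABLE_I[PROPOSED]}
print(gains)
```

The accuracy differences come out at 1.55 points on Twitter 2015 and 1.78 points on Twitter 2017, matching the figures quoted in the text. The F1-M margins are 0.03 and 1.50 points respectively, so the Twitter 2015 F1-M improvement is very small.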

Aspect-Oriented Sentiment Analysis in E-Commerce
No ratings yet
Aspect-Oriented Sentiment Analysis in E-Commerce
13 pages
Deep Context BERT for Aspect Sentiment Analysis
No ratings yet
Deep Context BERT for Aspect Sentiment Analysis
12 pages
Aspect-Level Sentiment Analysis with BERT
No ratings yet
Aspect-Level Sentiment Analysis with BERT
13 pages
Aspect Based Elsvierj - Eswa.2018.10.003
No ratings yet
Aspect Based Elsvierj - Eswa.2018.10.003
45 pages
Learning Implicit Sentiment in Aspect-Based Sentiment Analysis With Supervised Contrastive Pretraining
No ratings yet
Learning Implicit Sentiment in Aspect-Based Sentiment Analysis With Supervised Contrastive Pretraining
11 pages
Aspect-Based Sentiment Analysis in Education
No ratings yet
Aspect-Based Sentiment Analysis in Education
8 pages
Aspect Based Sentiment Analysis
No ratings yet
Aspect Based Sentiment Analysis
5 pages
Aspect Based Sentiment Analysis in Music: A Case Study With Spotify
No ratings yet
Aspect Based Sentiment Analysis in Music: A Case Study With Spotify
8 pages
Enhancing Aspect-Based Sentiment Analysis
No ratings yet
Enhancing Aspect-Based Sentiment Analysis
8 pages
BERT-Enhanced Sentiment Analysis for Student Feedback
No ratings yet
BERT-Enhanced Sentiment Analysis for Student Feedback
28 pages
Aspect-Based Sentiment Analysis with Deep Learning
No ratings yet
Aspect-Based Sentiment Analysis with Deep Learning
11 pages
Cold-Start Deep Memory Network for ME-ABSA
No ratings yet
Cold-Start Deep Memory Network for ME-ABSA
7 pages
1091 Ijmlc 1518
No ratings yet
1091 Ijmlc 1518
10 pages
Enhancing ABSA with Aspect Embedding
No ratings yet
Enhancing ABSA with Aspect Embedding
12 pages
Aspect-Based Sentiment Analysis For Hospitality in
No ratings yet
Aspect-Based Sentiment Analysis For Hospitality in
15 pages
Aspect-Based Sentiment Analysis Guide
0% (1)
Aspect-Based Sentiment Analysis Guide
38 pages
Approaches to Aspect-Based Sentiment Analysis
No ratings yet
Approaches to Aspect-Based Sentiment Analysis
4 pages
Multi Task Solution For Aspect Category Sentiment Analysis On Vietnamese Datasets
No ratings yet
Multi Task Solution For Aspect Category Sentiment Analysis On Vietnamese Datasets
6 pages
SSD-GCN for Aspect-Level Sentiment Analysis
No ratings yet
SSD-GCN for Aspect-Level Sentiment Analysis
25 pages
Comprehensive Test Plan for E-commerce Apps
No ratings yet
Comprehensive Test Plan for E-commerce Apps
10 pages
Deep Learning Enhancing Big Data Analysis
No ratings yet
Deep Learning Enhancing Big Data Analysis
4 pages
AI Chatbot Test Plan Guide
No ratings yet
AI Chatbot Test Plan Guide
8 pages
SQL Constraints Overview and Examples
No ratings yet
SQL Constraints Overview and Examples
21 pages
IoT Design Methodology Overview
No ratings yet
IoT Design Methodology Overview
13 pages
Overview of Computer Storage Devices
No ratings yet
Overview of Computer Storage Devices
16 pages
Bioinformatics & Clinical Data Resume
100% (5)
Bioinformatics & Clinical Data Resume
2 pages
NLP Applications and Pipeline Overview
No ratings yet
NLP Applications and Pipeline Overview
39 pages
PIXIU: Financial LLM and Benchmark
No ratings yet
PIXIU: Financial LLM and Benchmark
16 pages
Insurance Management System Project Report
No ratings yet
Insurance Management System Project Report
34 pages
Indexes and Views Implementation Guide
No ratings yet
Indexes and Views Implementation Guide
2 pages
Documentum Architecture White Paper
No ratings yet
Documentum Architecture White Paper
47 pages
Databricks Data Engineer Exam Prep Guide
No ratings yet
Databricks Data Engineer Exam Prep Guide
55 pages
SOA and Cloud Computing Overview
No ratings yet
SOA and Cloud Computing Overview
13 pages
Health Information Systems Glossary
No ratings yet
Health Information Systems Glossary
68 pages
Cryptography and Network Security Behrouz ch01 Slides
No ratings yet
Cryptography and Network Security Behrouz ch01 Slides
22 pages
API Health-Check Platform Overview
No ratings yet
API Health-Check Platform Overview
36 pages
Cloud-to-Cloud Data Integrity Analysis
No ratings yet
Cloud-to-Cloud Data Integrity Analysis
4 pages
Lab6 (Advanced Queries)
No ratings yet
Lab6 (Advanced Queries)
5 pages
Deep Learning for Imbalanced IDS
No ratings yet
Deep Learning for Imbalanced IDS
4 pages
Object Relational and NoSQL Databases Guide
No ratings yet
Object Relational and NoSQL Databases Guide
3 pages
Information Management for Decision-Making
No ratings yet
Information Management for Decision-Making
21 pages
Mobile App Project Management WBS
No ratings yet
Mobile App Project Management WBS
1 page
Implementing SharePoint Corporate Intranet
No ratings yet
Implementing SharePoint Corporate Intranet
3 pages
Understanding Relational Data Models
No ratings yet
Understanding Relational Data Models
71 pages
Understanding ACID Properties in DBMS
No ratings yet
Understanding ACID Properties in DBMS
5 pages
Defining Health Informatics Services
No ratings yet
Defining Health Informatics Services
30 pages
5610 RM-359 Schematics
No ratings yet
5610 RM-359 Schematics
11 pages
AI-Powered Health Insights Agent
100% (1)
AI-Powered Health Insights Agent
10 pages
Judiciary Information System SRS
No ratings yet
Judiciary Information System SRS
11 pages



