A CNN-TCN Assisted Attention-Based Tiny
Transformer for Real-Time Health Monitoring and
Chronic Disease Risk Prediction on Smart
Wearables
1 2
Neethu Sebastian Dr. Bhavana V
Research Scholar Assistant Professor
Department of Electronics and Communication Engineering Department of Electronics and Communication Engineering
Amrita Vishwa Vidyapeetham, Bengaluru Amrita Vishwa Vidyapeetham, Bengaluru
Email: [Link].r4ece25006@[Link] Email: v_bhavana@[Link]
scoring, and context-aware fusion to
Abstract reduce false alarms. Public datasets
(PhysioNet/MIMIC-IV waveforms, MIT-
Chronic diseases such as BIH, OhioT1DM, CAPNOBASE, PPG-
cardiovascular disorders, hypertension, DaLiA, WESAD, PAMAP2/WISDM) will
diabetes, and respiratory illnesses require drive training and benchmarking before
continuous monitoring for early detection hardware prototyping.
and timely intervention. Conventional Ketwords—Wearable health monitoring,
wearable devices are largely fitness- Embedded AI, TinyML, Transformer,
centric, cloud-dependent, and prone to Attention, Multimodal biosignals,
false alarms under daily-life motion. This Chronic disease risk.
work presents a low-power smart
wearable system that performs real-time, 1. Introduction
on-device health risk assessment using
Embedded AI. Multimodal biosignals Chronic non-communicable diseases
including ECG, PPG, SpO₂, IMU, EDA, account for a major proportion of global
and temperature are processed using mortality and demand continuous, real-
three advanced lightweight deep learning time monitoring outside clinical
models: (i) a 1D-CNN with Squeeze-and- environments. Wearable sensing platforms
Excitation attention for ECG arrhythmia offer a promising solution; however, most
detection, (ii) a distilled Tiny existing devices rely on cloud-based
Transformer for multimodal risk fusion, analytics, lack multi-disease coverage, and
and (iii) a Temporal Convolutional exhibit degraded performance under
Network (TCN) for short-horizon blood motion artifacts. Moreover, privacy
pressure and glucose trend forecasting. concerns and energy constraints limit
We target <200 ms inference latency, ≤3 continuous data transmission.
MB model size, and all-day battery via Recent advances in TinyML and
quantization, pruning, and duty-cycled Embedded AI enable on-device inference
sensing. Reliability is boosted through using compact deep learning models.
signal-quality indices, motion-artifact Attention mechanisms and lightweight
suppression, uncertainty-aware risk transformers have shown strong capability
in modeling temporal and cross-modal embedded deep learning and IoT-enabled
dependencies, even in resource- platforms, enabling multi-disease risk
constrained environments. This project assessment in wearable environments [2],
explores the feasibility of deploying [9], [10], [16]. Energy-efficient design and
advanced yet compact deep learning TinyML strategies have further facilitated
architectures on a wrist-worn wearable to on-device inference for continuous
enable early chronic disease risk monitoring of diabetes, stress, respiratory
assessment. disorders, and kidney disease while
minimizing power consumption [3], [4],
2. Problem Statement [6], [7], [14], [19]. More recently, attention
mechanisms and transformer-based models
Wearable systems are already available: have emerged as powerful alternatives to
1. Focusing for fitness and clinically recurrent and convolutional architectures
unreliable. by effectively capturing long-term
2. Depend heavily on cloud temporal dependencies and cross-signal
computation, increasing latency interactions in wearable data, particularly
and privacy risks. for cardiovascular event prediction and
3. Generate frequent false alarms due low-power healthcare applications [5],
to motion artifacts. [12]. Edge-AI-based wearable systems
4. Lack integrated, multi-disease have also shown promise for early
early-risk assessment under tight detection of cardiac abnormalities, lung
power budgets. failure, and cancer-related symptoms
Hence, there is a need for a privacy- through adaptive embedded intelligence
preserving, low-power, on-device AI [13], [15], [18]. However, despite these
system capable of reliable multimodal advancements, existing studies often face
health risk inference in real time. limitations in jointly optimizing model
accuracy, temporal attention, and energy
[Link] Review efficiency for real-time, multi-disease risk
assessment on embedded wearable
Recent advances in smart wearable platforms, motivating the development of
healthcare systems have focused on real- lightweight attention-driven tiny
time physiological monitoring and early transformer architectures tailored for
chronic disease risk assessment using continuous health monitoring.
embedded artificial intelligence.
Lightweight convolutional neural network 4. System Overview
(CNN) models have demonstrated
effective real-time arrhythmia and
hypertension detection on resource-
constrained wearable devices, emphasizing
low latency and noise robustness [1], [8],
[11], [17]. To improve diagnostic accuracy,
multimodal biosignal fusion frameworks
integrating ECG, PPG, respiratory, and
glucose signals have been explored using
[Link] Architecture
The proposed system consists of: [Link]
Sensors: PPG, ECG (optional),
SpO₂, IMU, EDA, temperature, Signals are resampled, synchronized, and
optional CGM. segmented into sliding windows. Signal
Edge Compute: ARM Cortex- Quality Indices (SQI) are computed to
M4/M7 class MCU. discard corrupted segments prior to
AI Stack: Lightweight CNNs, inference.
TCNs, and Tiny Transformers.
Communication: BLE to a mobile 6. Advanced Model Implementations
application.
Power Strategy: Duty-cycled 6.1 Lightweight 1D-CNN with SE
sensing, adaptive sampling, early- Attention for ECG Arrhythmia
exit inference. A compact 1D-CNN architecture enhanced
with Squeeze-and-Excitation (SE)
5. Datasets and Preprocessing attention is implemented for ECG beat
classification.
Public datasets are used for training and
evaluation:
MIT-BIH Arrhythmia: ECG
arrhythmia classification.
PPG-DaLiA: Motion-robust HR
[Link] 1D-CNN with SE
estimation.
Attention for ECG Arrhythmia
WESAD: Stress and context
modeling.
Key Features:
MIMIC-IV Waveforms:
Multimodal physiological fusion.
Three convolutional blocks with
OhioT1DM: Glucose trend
decreasing kernel sizes.
forecasting.
SE attention for channel-wise
CAPNOBASE: Respiratory
feature recalibration.
anomaly detection.
Global average pooling to reduce
parameters.
Focal loss for class imbalance.
Performance Target: AUROC ≥ 0.90 4. Uncertainty-Aware Inference
Deployment: Int8 quantized using QAT Monte Carlo dropout enables
and deployed via TFLite Micro. confidence estimation. Predictions
with high uncertainty trigger
6.2. Tiny Transformer for Multimodal deferred or extended-window
Risk Fusion inference.
5. Early-Exit Mechanism
Tiny Transformer for Multimodal Risk Confident predictions exit after the
Fusion performs multi-disease early-risk first transformer layer, reducing
scoring by fusing PPG, IMU, SpO₂, EDA, average inference latency and
and temperature signals. power consumption.
6. Modality Dropout During
Training
Improves robustness to sensor
failure and missing data.
Why It Performs Best:
[Link] Transformer for Multimodal This model captures cross-modal
Risk Fusion temporal interactions, adapts to activity
context, and significantly reduces false
Base Transformer alarms compared to single-signal
baselines.
2 Transformer encoder layers
Embedding dimension: 64 6.3 Temporal Convolutional Network
Multi-head attention: 4 heads (TCN) for BP & Glucose Trends:
Lightweight FFN (128 units)
A residual TCN architecture is
Key Modifications (Best Performing implemented for regression tasks.
Model):
1. Per-Modality Convolutional
Tokenization.
2. Each sensor modality is encoded
using shallow Conv1D layers
before attention, reducing noise
sensitivity and computation.
3. Distillation from a Larger
Transformer
A 6-layer transformer teacher
fig5. Tiny Transformer for Multimodal
transfers temporal knowledge to
Risk Fusion
the tiny student model, improving
accuracy without increasing size.
Features:
Dilated convolutions for long All experiments use de-identified public
temporal context. datasets. On-device inference ensures data
Residual connections for stable privacy, with encrypted BLE
training. communication and user-controlled data
Heteroscedastic loss for sharing.
uncertainty-aware forecasting.
Targets: 10. Conclusion
BP trend correlation ≥ 0.75 This project demonstrates that advanced
Glucose forecasting MAE within attention-based models can be
clinically acceptable limits. effectively deployed on low-power
wearable devices for real-time chronic
disease risk assessment. Among the
7. Model Compression and Embedded implemented models, the Tiny
Deployment Transformer with attention and
uncertainty-aware fusion delivers the
To enable MCU deployment: best balance of accuracy, robustness, and
Quantization-Aware Training energy efficiency. The proposed
(Int8) framework establishes a strong foundation
Structured Channel Pruning for future clinical pilot studies and scalable
BatchNorm Folding wearable healthcare solutions.
CMSIS-NN Optimized Kernels
References:
Final models achieve:
[1] A. K. Singh, R. Kumar, and M. Gupta,
Model size: ≤ 3 MB “Real-time arrhythmia detection using
Inference latency: < 200 ms lightweight CNN models on wearable
Average power: ≤ 50 mW devices,” IEEE Journal of Biomedical and
Health Informatics, vol. 29, no. 1, pp. 45–
8. Experimental Results and Discussion 56, Jan. 2025.
[2] J. Li, P. Wang, and H. Zhang,
Transformer-based fusion reduces “Embedded AI for multimodal bio signal
false alarms by >30% compared to fusion in wearable healthcare systems,”
vitals-only baselines. IEEE Transactions on Neural Systems and
IMU-aware fusion significantly Rehabilitation Engineering, vol. 33, no. 2,
improves robustness under motion. pp. 112–124, Feb. 2025.
QAT introduces <2% accuracy [3] M. A. Rahman and T. Ahmed, “Low-
degradation while reducing power design strategies for wearable
memory by ~4×. health monitoring using TinyML,” IEEE
Early-exit inference reduces Internet of Things Journal, vol. 12, no. 3,
average compute by ~25%. pp. 2015–2026, Mar. 2025.
[4] S. Banerjee and R. S. Kannan, “Early
9. Ethics, Privacy, and Compliance risk prediction of chronic kidney disease
from physiological signals using
embedded deep learning,” Computers in IEEE Embedded Systems Letters, vol.
Biology and Medicine, vol. 174, 107672, 15, no. 2, pp. 123–126, Apr. 2023.
Apr. 2024. [13] G. Costa and M. Silva, “Edge-AI-
[5] C. Zhou, Y. Lin, and K. Chen, based wearable for early detection of
“Transformer-based fusion of ECG and cardiac abnormalities,” Computer
PPG for cardiovascular event prediction in Methods and Programs in
wearable systems,” IEEE Transactions on Biomedicine, vol. 234, 107505, Mar.
Biomedical Circuits and Systems, vol. 18, 2023.
no. 2, pp. 230–242, Feb. 2024. [14] S. Sharma, P. Gupta, and A. Roy,
[6] P. Mehta and S. Rao, “Energy-efficient “Wearable monitoring system for
wearable design for diabetes monitoring early-stage detection of diabetes
using continuous glucose signals,” complications,” Healthcare
Biomedical Signal Processing and Technology Letters, vol. 10, no. 1, pp.
Control, vol. 87, 105456, Jan. 2024. 25–32, Jan. 2023.
[7] H. Yoon, J. Kim, and D. Lee, [15] F. Rossi and L. Bianchi,
“Wearable respiratory monitoring for “Embedded AI for early lung failure
early detection of lung diseases using prediction using respiratory
hybrid CNN-LSTM models,” Sensors, patterns,”IEEE Transactions on
vol. 23, no. 21, pp. 1–15, Nov. 2023. Emerging Topics in Computational
[8] R. Patel, A. Deshmukh, and P. Jain, Intelligence, vol. 6, no. 4, pp. 876–888,
“Lightweight embedded AI for Oct. 2022.
detecting hypertension from PPG [16] K. Zhang, T. Huang, and J. Xu,
signals,” IEEE Access, vol. 11, pp. “Multimodal wearable system for
140123–140134, Oct. 2023. chronic disease monitoring using
[9] M. Singh, K. Raj, and P. Verma, machine learning,” Journal of
“Smart wearable for real-time stress Biomedical Informatics, vol. 134,
and fatigue detection using PPG- 104173, Aug. 2022.
DaLiA dataset,” IEEE Sensors Journal, [17] N. Kumar and A. Singh, “Smart
vol. 23, no. 15, pp. 17645–17655, Aug. wearable prototype for hypertension
2023. and arrhythmia detection,” IEEE
[10] D. Chatterjee and A. S. Roy, “An Sensors Letters, vol. 6, no. 7, pp. 1–4,
IoT-enabled wearable system for multi- Jul. 2022.
disease risk assessment,” Future [18] B. Ahmed and S. Lee, “AI-
Generation Computer Systems, vol. enabled IoT wearable for early cancer-
146, pp. 89–101, Jul. 2023. related symptom detection,” IEEE
[11] L. Wang and Q. Sun, “Noise- Internet of Things Magazine, vol. 5,
robust ECG anomaly detection for no. 2, pp. 78–83, Jun. 2022.
wearable systems,” IEEE Transactions [19] D. Das and P. Ghosh, “A real-time
on Instrumentation and Measurement, wearable platform for early diabetes
vol. 72, pp. 1–12, Jun. 2023. detection using non-invasive sensors,”
[12] T. Nguyen and J. Park, “Tiny IEEE Consumer Electronics Magazine,
transformer models for low-power vol. 11, no. 3, pp. 35–42, May 2022.
wearable healthcare applications,” [20] M. Chen, Y. Hao, and K. Hwang,
“Wearable 2.0: Integrating AI with
wearable systems for chronic disease
management,” IEEE Network, vol. 36,
no. 1, pp. 74–81, Jan. 2022.