Machine Learning · Signal Processing · Real-Time Mobile

Human Activity Recognition

Classifying 6 daily activities from smartphone accelerometer and gyroscope signals. UCI HAR Dataset — 30 subjects, Samsung Galaxy S II, 50 Hz, 561 pre-extracted features. Best model: 95.52% accuracy. Live phone demo runs the ML model entirely in your browser.

30 Subjects · 561 Features · 10K Samples · 95.5% Accuracy
Dataset

Experiment Overview

30 volunteers aged 19–48 performed 6 activities while wearing a Samsung Galaxy S II at the waist. Signals were captured at 50 Hz and segmented into 2.56 s windows with 50% overlap.
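
The windowing arithmetic above works out to 128 samples per window (2.56 s × 50 Hz) with a stride of 64 samples. A minimal NumPy sketch, assuming a single-channel signal held in a 1-D array; `sliding_windows` is an illustrative helper, not part of the project's code:

```python
import numpy as np

def sliding_windows(signal, fs=50, window_s=2.56, overlap=0.5):
    """Split a 1-D signal into fixed-length, overlapping windows."""
    size = int(round(window_s * fs))          # 128 samples at 50 Hz
    step = int(round(size * (1 - overlap)))   # 64 samples for 50% overlap
    n = (len(signal) - size) // step + 1      # number of complete windows
    if n <= 0:
        return np.empty((0, size))
    return np.stack([signal[i * step : i * step + size] for i in range(n)])

# Ten seconds of synthetic data stands in for one accelerometer axis.
signal = np.arange(500, dtype=float)
windows = sliding_windows(signal)
print(windows.shape)
```

Trailing samples that do not fill a complete window are dropped in this sketch; whether the original recordings were handled the same way is not stated.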

🚶
WALKING
Train: 1,226 · Test: 496
⬆️
WALKING UPSTAIRS
Train: 1,073 · Test: 471
⬇️
WALKING DOWNSTAIRS
Train: 986 · Test: 420
🪑
SITTING
Train: 1,286 · Test: 491
🧍
STANDING
Train: 1,374 · Test: 532
🛌
LAYING
Train: 1,407 · Test: 537
Signal Channels
Body acceleration X / Y / Z
Body gyroscope X / Y / Z
Total acceleration X / Y / Z
128 timesteps per window · 50 Hz
Feature Engineering (561)
Time-domain: mean, std, MAD, energy, IQR, entropy, AR coefficients
Frequency-domain (FFT): meanFreq, skewness, kurtosis, bandEnergy
17 signal types × multiple statistics
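
A few of the time-domain statistics listed above can be sketched for a single window as follows. This is an illustration only: the formulas (median absolute deviation for MAD, mean of squares for energy) are assumptions about the definitions, and `window_features` is a hypothetical helper rather than the dataset's actual extraction code.

```python
import numpy as np

def window_features(window):
    """Compute a handful of time-domain statistics for one 1-D window."""
    window = np.asarray(window, dtype=float)
    q75, q25 = np.percentile(window, [75, 25])
    return {
        "mean": float(np.mean(window)),
        "std": float(np.std(window)),
        # Median absolute deviation around the median (assumed definition).
        "mad": float(np.median(np.abs(window - np.median(window)))),
        # Mean of squared samples (assumed definition of "energy").
        "energy": float(np.mean(window ** 2)),
        "iqr": float(q75 - q25),
    }

window = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
features = window_features(window)
print(features)
```

Applying the same function to every signal and axis is how a small set of statistics grows into a wide feature vector.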
Model Evaluation

Results — 5 Models Compared

All models were evaluated on held-out test subjects (9 of the 30 subjects were never seen during training).
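
Holding out whole subjects, rather than random windows, is what keeps one person's data from appearing on both sides of the split. A minimal sketch of that idea, assuming one subject ID per window; the IDs placed in the test set here are arbitrary placeholders, not the dataset's official partition:

```python
import numpy as np

def split_by_subject(subject_ids, test_subjects):
    """Return boolean train/test masks with no subject on both sides."""
    subject_ids = np.asarray(subject_ids)
    test_mask = np.isin(subject_ids, list(test_subjects))
    return ~test_mask, test_mask

# Toy example: 30 subjects with 4 windows each.
subject_ids = np.repeat(np.arange(1, 31), 4)
test_subjects = set(range(22, 31))  # 9 placeholder subjects, chosen arbitrarily
train_mask, test_mask = split_by_subject(subject_ids, test_subjects)

train_subjects = set(subject_ids[train_mask].tolist())
held_out = set(subject_ids[test_mask].tolist())
print(len(train_subjects), len(held_out), train_subjects & held_out)
```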

Logistic Regression
95.52% · BEST
Linear baseline; strong thanks to the 561 hand-crafted features
SVM (RBF, C=10)
95.49%
Near-identical to LR; RBF kernel captures non-linear boundaries
LightGBM
93.45%
200 estimators, learning rate 0.1; fast gradient boosting
XGBoost
93.42%
200 trees, max depth=6
Random Forest
92.57%
200 trees — best for feature importance analysis; gravity features dominate

Per-Activity Performance (Best Model — Logistic Regression)

🚶
WALKING
96%
⬆️
WALKING UPSTAIRS
95%
⬇️
WALKING DOWNSTAIRS
94%
🪑
SITTING
93%
🧍
STANDING
94%
🛌
LAYING
100%
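
Per-activity scores like these are typically read off a confusion matrix. The sketch below shows how per-class precision and recall follow from one, using a small made-up 3-class matrix; the counts are invented for illustration and are not the project's results.

```python
import numpy as np

def per_class_precision_recall(cm):
    """cm[i, j] counts samples of true class i predicted as class j."""
    cm = np.asarray(cm, dtype=float)
    true_pos = np.diag(cm)
    precision = true_pos / cm.sum(axis=0)  # column sums: everything predicted as j
    recall = true_pos / cm.sum(axis=1)     # row sums: everything truly class i
    return precision, recall

# Made-up counts for three classes (rows: true, columns: predicted).
cm = [[90, 10, 0],
      [5, 95, 0],
      [0, 0, 50]]
precision, recall = per_class_precision_recall(cm)
accuracy = float(np.trace(np.asarray(cm)) / np.sum(cm))
print(precision.round(3), recall.round(3), round(accuracy, 3))
```

The sketch assumes every class is predicted at least once; a class with an empty column would need a guard against division by zero.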
Key Findings

Analysis & Insights

What the data reveals about activity recognition using inertial sensors.

🎯 LAYING — Perfect Classification

When the wearer lies down, gravity acts along a different sensor axis than in the upright postures. The gravity component along Z becomes dominant, which cleanly separates LAYING from every other activity: 100% precision and recall on the test set.

⚠️ SITTING vs STANDING — Hardest Pair

Both are static postures with very similar gravity readings; the main difference is a subtle change in trunk angle. About 48 SITTING windows are misclassified as STANDING in the confusion matrix, making this the hardest pair to separate.

📊 Top 10 Features — Random Forest Importance
tGravityAcc-mean()-X · 4.18%
tGravityAcc-min()-X · 3.19%
tGravityAcc-mean()-Y · 2.98%
angle(X,gravityMean) · 2.60%
tGravityAcc-energy()-X · 2.41%
angle(Y,gravityMean) · 2.32%
tGravityAcc-max()-X · 2.25%
tGravityAcc-min()-Y · 2.21%
tGravityAcc-max()-Y · 1.91%
tGravityAcc-energy()-Y · 1.62%

Gravity features dominate — they encode body orientation and effectively separate static from dynamic activities.
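
To see why orientation features separate postures so well, consider the angle between the mean gravity vector and a sensor axis, which is the idea behind names such as `angle(X,gravityMean)`. A minimal sketch using only the standard library; the gravity vectors are hypothetical values in units of g, chosen for illustration rather than taken from the dataset:

```python
import math

def angle_to_axis(gravity, axis):
    """Angle in degrees between a gravity vector and a sensor axis."""
    dot = sum(g * a for g, a in zip(gravity, axis))
    norm = math.hypot(*gravity) * math.hypot(*axis)
    # Clamp to guard against rounding just outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

X_AXIS = (1.0, 0.0, 0.0)

# Hypothetical mean gravity vectors (in g) for two postures.
upright = (0.98, -0.15, 0.10)   # gravity mostly along X
lying = (0.05, 0.30, 0.95)      # gravity mostly along Z

print(round(angle_to_axis(upright, X_AXIS), 1))
print(round(angle_to_axis(lying, X_AXIS), 1))
```

With vectors like these, the two postures land tens of degrees apart on a single feature, which is the kind of gap a classifier can exploit with a simple threshold.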

Why linear models win: The 561 features were hand-crafted by domain experts (means, standard deviations, FFT-derived statistics, and so on). By the time the data reaches the classifier, the activities are already nearly linearly separable. Logistic Regression (95.52%) matches the RBF SVM (95.49%) because the feature engineering does the heavy lifting. The boosting methods reach about 93.4%, which is still strong, but on these features the extra non-linearity brings no gain over the linear baseline.
Implementation

ML Pipeline

End-to-end flow from raw sensor signals to activity prediction.

📡 Raw Sensor Signals
50 Hz · 9 channels
🔧 Pre-processing
Butterworth filter · 2.56 s windows
📐 Feature Extraction
561 features · time + freq domain
⚖️ StandardScaler
zero mean · unit variance
🤖 Classifier
LR / SVM / RF / XGB / LGB
🎯 Activity Label
95.52% accuracy
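
The last three stages (scaling, classification, prediction) map naturally onto a scikit-learn `Pipeline`. A minimal sketch under stated assumptions: the data is synthetic, standing in for the 561-feature matrix, and the hyperparameters are illustrative defaults rather than the notebook's settings.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the feature matrix: 3 well-separated classes, 20 features.
n_per_class, n_features = 100, 20
centers = np.array([-3.0, 0.0, 3.0])
X = np.vstack([rng.normal(c, 1.0, size=(n_per_class, n_features)) for c in centers])
y = np.repeat(np.arange(3), n_per_class)

# Shuffle, then hold out the last quarter for evaluation.
order = rng.permutation(len(y))
X, y = X[order], y[order]
split = int(0.75 * len(y))

# Fitting the scaler inside the pipeline keeps test statistics out of training.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X[:split], y[:split])
accuracy = model.score(X[split:], y[split:])
print(f"held-out accuracy: {accuracy:.3f}")
```

The random split is only for this toy data; on the real dataset the hold-out would follow subjects, as in the evaluation described earlier.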
📓 Notebook Contents
① Data loading & class distribution
② PCA 2D projection (coloured by activity)
③ t-SNE embedding (2,000 samples)
④ Raw body acceleration signal plots
⑤ Top feature mean comparison
⑥ 5 classifiers trained & evaluated
⑦ Confusion matrices (SVM + LR)
⑧ Feature importance (Random Forest)
⑨ Static vs dynamic activity scatter
⑩ Model save (joblib)
⚙️ Tech Stack
Python 3.10 · Scikit-learn · XGBoost · LightGBM · NumPy · Pandas · Matplotlib · Seaborn · Streamlit · t-SNE · PCA · StandardScaler · joblib
📁 Repository
har_notebook.ipynb
app.py  (Streamlit)
UCI HAR Dataset/
requirements.txt
Reference

Davide Anguita, Alessandro Ghio, Luca Oneto, Xavier Parra and Jorge L. Reyes-Ortiz. A Public Domain Dataset for Human Activity Recognition Using Smartphones. ESANN 2013, European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning.

↗ UCI ML Repository — Human Activity Recognition
📱

Try It Live on Your Phone

Open the Streamlit app on your phone. The ML model runs entirely in your browser — no data leaves your device. Tap Start, move naturally, see your activity predicted in real time.

🚀 Open Live Demo · View Source Code
✅ iPhone Safari
✅ Android Chrome
⚡ No install needed
🔒 No data sent to server