Biological Tissue Stress Response Classification from Raman Spectroscopy Data
R&D · BioTech

We developed an ML pipeline for non-contact detection of biological tissue stress response from Raman spectroscopy data — classifying HSP70 protein expression into three classes with an interpretable ensemble approach.

Client: NDA
Period: 1 week
Format: ML research

Python · XGBoost · Random Forest · scikit-learn · NumPy · Pandas

About the project

The pipeline classifies HSP70 protein expression from Raman spectra into three classes: healthy tissue, endogenous stress response, and exogenous stress response. Two spectral ranges were examined, 1500 cm⁻¹ and 2900 cm⁻¹; the 1500 cm⁻¹ range proved to carry the key diagnostic value because of its association with protein markers.

The Challenge

We needed an interpretable ML system that determines tissue stress response from spectral data without overfitting to noise and hardware artifacts. To avoid the black-box effect, it also had to show which spectral features and biomarkers influence the diagnosis.

Our Solution

  • Built a spectral preprocessing pipeline: axis interpolation, median filtering, Savitzky-Golay smoothing, and SNV (standard normal variate) normalization
  • Rejected training on raw data because of hardware artifacts and physically uninformative noise
  • Built a soft-voting ensemble of XGBoost and Random Forest
  • Trained on the full interpolated spectral feature vector, without forced dimensionality reduction
  • Implemented strict patient-level GroupKFold cross-validation to prevent data leakage
  • Split inference into per-spectrum prediction and patient-level aggregation via majority voting
  • Added an explainable-AI module based on Gini feature importance for biomarker localization
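
The four preprocessing steps in the list above could be chained roughly as follows. This is a minimal sketch, not the project's actual code: the use of SciPy, the function name `preprocess_spectrum`, and every parameter value (kernel size, window length, polynomial order, target grid) are assumptions chosen for illustration.

```python
import numpy as np
from scipy.signal import medfilt, savgol_filter


def preprocess_spectrum(wavenumbers, intensities, grid,
                        median_kernel=5, sg_window=11, sg_polyorder=3):
    """Resample one spectrum onto `grid`, denoise it, and apply SNV.

    All default values are illustrative assumptions, not project settings.
    """
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    intensities = np.asarray(intensities, dtype=float)

    # 1. Axis interpolation: put every spectrum on one shared wavenumber grid
    #    so all feature vectors have the same length. np.interp needs an
    #    increasing x-axis, so sort by wavenumber first.
    order = np.argsort(wavenumbers)
    resampled = np.interp(grid, wavenumbers[order], intensities[order])

    # 2. Median filter: suppress narrow, isolated spikes (kernel must be odd).
    despiked = medfilt(resampled, kernel_size=median_kernel)

    # 3. Savitzky-Golay smoothing: local polynomial fit that reduces noise
    #    while largely preserving peak shape.
    smoothed = savgol_filter(despiked, window_length=sg_window,
                             polyorder=sg_polyorder)

    # 4. SNV normalization: centre and scale each spectrum by its own mean
    #    and standard deviation (assumes the spectrum is not perfectly flat).
    return (smoothed - smoothed.mean()) / smoothed.std()
```

Applying the function to every spectrum with the same `grid` would yield the fixed-length feature matrix that the classifiers are trained on.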

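To make the modelling steps above concrete, here is a hedged sketch of a soft-voting ensemble evaluated with patient-level GroupKFold and aggregated per patient by majority vote. The helper names (`make_ensemble`, `patient_level_cv`), the hyperparameters, and the tie-breaking rule are assumptions for illustration. The scikit-learn gradient-boosting fallback is not part of the project; it is there only so the sketch still runs where `xgboost` is not installed.

```python
import numpy as np
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, VotingClassifier)
from sklearn.model_selection import GroupKFold

try:
    from xgboost import XGBClassifier
except ImportError:  # stand-in only; the project itself names XGBoost
    XGBClassifier = None


def make_ensemble(n_estimators=100, seed=0):
    """Soft voting: average the class probabilities of both models."""
    if XGBClassifier is not None:
        boosted = XGBClassifier(n_estimators=n_estimators, random_state=seed)
    else:
        boosted = GradientBoostingClassifier(n_estimators=n_estimators,
                                             random_state=seed)
    forest = RandomForestClassifier(n_estimators=n_estimators,
                                    random_state=seed)
    return VotingClassifier([("boosted", boosted), ("rf", forest)],
                            voting="soft")


def patient_level_cv(X, y, patient_ids, n_splits=3):
    """Return (per-spectrum accuracy, per-patient accuracy) over grouped folds."""
    spectrum_hits, patient_hits = [], []
    splitter = GroupKFold(n_splits=n_splits)
    # Grouping by patient keeps all spectra of one patient in the same fold,
    # so no patient contributes to both training and test data.
    for train_idx, test_idx in splitter.split(X, y, groups=patient_ids):
        model = make_ensemble().fit(X[train_idx], y[train_idx])
        pred = model.predict(X[test_idx])
        spectrum_hits.extend(pred == y[test_idx])

        test_patients = patient_ids[test_idx]
        for pid in np.unique(test_patients):
            mask = test_patients == pid
            # Majority vote over this patient's spectra; on a tie the
            # smallest label wins (an arbitrary choice in this sketch).
            labels, counts = np.unique(pred[mask], return_counts=True)
            majority = labels[np.argmax(counts)]
            # Assumes all spectra of a patient share one true label.
            patient_hits.append(majority == y[test_idx][mask][0])
    return float(np.mean(spectrum_hits)), float(np.mean(patient_hits))
```

Reporting both numbers separates how well single spectra are classified from how reliable the aggregated patient-level call is.
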
Results

  • Delivered an ML pipeline for non-contact classification of tissue stress response
  • Confirmed high model accuracy on the informative spectral range (1500 cm⁻¹)
  • Demonstrated that the ensemble approach is applicable to high-noise Raman spectroscopy data
  • Ensured interpretable results by identifying significant spectral biomarkers
  • Showed potential for automated optical biopsy and in vitro tissue diagnostics
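
Biomarker identification from Gini feature importance can be illustrated as follows. This is a sketch under assumptions: it reads the impurity-based `feature_importances_` of a fitted scikit-learn Random Forest only (how the project combines importances across the ensemble is not stated), and the helper name `top_wavenumbers` and the value of `k` are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def top_wavenumbers(model, grid, k=5):
    """Return the k (wavenumber, importance) pairs with the highest scores.

    `model` is a fitted tree ensemble exposing `feature_importances_`;
    `grid` is the wavenumber axis the feature columns were sampled on.
    """
    # For RandomForestClassifier with the default Gini criterion these are
    # mean-decrease-in-impurity scores, normalized to sum to 1.
    importances = model.feature_importances_
    top = np.argsort(importances)[::-1][:k]  # indices, most important first
    return [(float(grid[i]), float(importances[i])) for i in top]
```

Because the model is trained on the full interpolated spectrum, each feature index maps directly to one wavenumber on the grid, which is what lets high-importance features be read as candidate spectral bands.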
