Client-Side Scanning (CSS), as proposed under the EU Chat Control regulation, would require message content to be scanned on a user's device before encryption — placing the detection mechanism outside the encrypted channel. Technical analysis focused on two failure modes: high false positive rates in content matching, and the feasibility of deploying hidden dual-purpose models capable of facial recognition. The critical finding was structural: the legislation delegates scanning authority to private entities, which makes mass surveillance architecturally possible independent of stated legislative intent. This constitutes a violation of Articles 7 and 8 of the EU Charter of Fundamental Rights, which guarantee privacy and data protection.
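The false positive concern can be made concrete with base-rate arithmetic. The sketch below is illustrative only: the message volume, prevalence, and error rates are hypothetical placeholders, not figures from the analysis or from any deployed scanner.

```python
def expected_flags(messages_per_day, prevalence, true_positive_rate, false_positive_rate):
    """Return (true_flags, false_flags, precision) for one day of scanning."""
    illicit = messages_per_day * prevalence
    benign = messages_per_day - illicit
    true_flags = illicit * true_positive_rate      # illicit messages correctly flagged
    false_flags = benign * false_positive_rate     # benign messages wrongly flagged
    total = true_flags + false_flags
    precision = true_flags / total if total else 0.0
    return true_flags, false_flags, precision


# Hypothetical inputs, chosen only to show the shape of the problem.
true_flags, false_flags, precision = expected_flags(
    messages_per_day=1_000_000_000,
    prevalence=1e-6,
    true_positive_rate=0.9,
    false_positive_rate=0.001,
)
```

Under these invented rates, roughly 900 correct flags would sit among about a million false ones, so fewer than one flag in a thousand would be correct. The point is the ratio, not the specific numbers.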
Nonlinear Inequality System for Doubly Stochastic Matrices
Doubly stochastic matrices have known eigenvalue characterisations in low dimensions, but for n ≥ 6 the question of which eigenvalue regions are achievable remains open, a problem unresolved for roughly sixty years. The work investigated whether a nonlinear inequality system could characterise those regions. Because no analytical solution exists and there is no ground truth to validate against, the approach relied on heuristic methods: simulated annealing, neural networks, interval arithmetic, and grid search. The result was negative: the solution space grows in complexity with dimension in a way that makes generalisations from smaller cases infeasible.
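As a small illustration of what exploring an eigenvalue region by sampling looks like, the sketch below draws random 3×3 doubly stochastic matrices as convex combinations of permutation matrices (Birkhoff-von Neumann) and recovers the two non-trivial eigenvalues from the trace and determinant. It is not the inequality system or the search procedure used in the project; the function names are hypothetical, and the n = 3 shortcut does not extend to the open higher-dimensional cases.

```python
import cmath
import itertools
import random


def random_doubly_stochastic_3x3(rng):
    """Random convex combination of the six 3x3 permutation matrices."""
    perms = list(itertools.permutations(range(3)))
    weights = [rng.random() for _ in perms]
    total = sum(weights)
    matrix = [[0.0] * 3 for _ in range(3)]
    for weight, perm in zip(weights, perms):
        for row, col in enumerate(perm):
            matrix[row][col] += weight / total
    return matrix


def nontrivial_eigenvalues(m):
    """Eigenvalues other than 1 of a 3x3 doubly stochastic matrix.

    Since 1 is always an eigenvalue, the other two sum to trace - 1
    and multiply to the determinant.
    """
    trace = m[0][0] + m[1][1] + m[2][2]
    det = (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
           - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
           + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    s = trace - 1.0
    root = cmath.sqrt(s * s - 4.0 * det)
    return (s + root) / 2.0, (s - root) / 2.0


rng = random.Random(0)  # fixed seed so the sample is reproducible
samples = []
for _ in range(1000):
    samples.extend(nontrivial_eigenvalues(random_doubly_stochastic_3x3(rng)))
max_modulus = max(abs(z) for z in samples)  # never exceeds 1 for stochastic matrices
```

Plotting `samples` in the complex plane would fill in part of the known n = 3 region. For larger n a general eigenvalue solver is needed.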
Cardiovascular Disease Prediction with Explainable AI
Cardiovascular disease is a secondary risk for patients with Chronic Myeloid Leukaemia (CML), and predicting that risk requires working with data where individual predictions carry direct clinical consequences. A random forest model was trained for risk stratification, then integrated with SHAP for global feature attribution, LIME for local approximation, and DiCE for counterfactual generation. Explainability matters here for a specific reason: a clinician cannot act on a score they cannot interrogate. A knowledge graph built from clinical guidelines added a layer of structured reasoning to support treatment suggestions alongside the model's output.
An ensemble classifier combining XGBoost and LightGBM was trained to detect glaucoma and diabetic retinopathy from fundus images. The model reached 99% accuracy on the dataset, outperforming CNN and SVM baselines on the same evaluation. Published in Springer Proceedings.
Experience
CareData Infomatics
Intern — Chennai, –
COVID-19 patient records were analysed across variables including age, vaccination status, and comorbidities to identify which factors predicted higher-risk outcomes. K-means clustering produced risk-level segments; logistic regression was then applied to assign probabilistic scores to new records. The clearest result from the data was the predictive weight of cardiovascular disease: patients with pre-existing cardiovascular conditions were consistently classified at elevated risk across both methods.
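The two-stage idea (cluster first, then score) can be sketched in plain Python. Everything specific here is an assumption made for illustration: the records are synthetic, the three features are placeholders, and the logistic model is fitted to the cluster labels, which is only one plausible reading of how the two methods were combined.

```python
import math

# Synthetic records, invented for illustration only:
# (age in decades, number of comorbidities, vaccinated 0/1)
records = [
    (3.0, 0.0, 1.0), (4.0, 0.0, 1.0), (3.5, 1.0, 1.0), (5.0, 1.0, 0.0),
    (6.5, 2.0, 0.0), (7.0, 3.0, 0.0), (7.5, 2.0, 1.0), (8.0, 3.0, 0.0),
]


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points]


def kmeans(points, initial_centroids, iterations=20):
    """Plain Lloyd's algorithm with caller-supplied starting centroids."""
    centroids = list(initial_centroids)
    for _ in range(iterations):
        labels = assign(points, centroids)
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # keep the old centroid if a cluster empties
                centroids[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assign(points, centroids)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, epochs=2000):
    """Batch gradient descent on the logistic log-loss."""
    weights = [0.0] * len(xs[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            grad_b += error
            for j, v in enumerate(x):
                grad_w[j] += error * v
        bias -= learning_rate * grad_b / len(xs)
        weights = [w - learning_rate * g / len(xs) for w, g in zip(weights, grad_w)]
    return weights, bias


def risk_score(x, weights, bias):
    """Probability-like score in (0, 1) for a new record."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


# Cluster 1 is seeded at the oldest, most comorbid record, so it is treated
# as the higher-risk segment (an assumption of this sketch).
centroids, segments = kmeans(records, [records[0], records[-1]])
weights, bias = fit_logistic(records, segments)
score = risk_score((7.0, 2.0, 0.0), weights, bias)  # hypothetical new patient
```

In practice a library such as scikit-learn would replace both hand-written routines, and features would be scaled before clustering.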
Education
Maastricht University–present
MSc Data Science for Decision Making
SRM Institute of Science and Technology
BTech Computer Science and Engineering
Specialisation: Big Data Analytics
Skills
Languages: Python, Java, C++, MATLAB, SQL
ML / Data: Scikit-learn, TensorFlow, Pandas, NLTK
Tools: Tableau
Extracurricular
UM Chess–present
Vice President & Secretary
A student chess club, open to anyone regardless of level. The goal is to find people who like chess, build something around that, and make sure everyone who shows up leaves a bit better than when they came. We've organised multiple tournaments and thematic events, brought in International Master Dr. Christian Seel for a simultaneous exhibition, and hold biweekly lessons with Candidate Master Michal Bodicky.