Research

Research Interests

My research interests lie at the intersection of Graph Neural Networks, Agentic AI Systems, and Data Engineering. I’m drawn to problems that require bridging structured data processing with intelligent reasoning systems, with the goal of enabling more robust, personalized, and explainable learning experiences. My current work ranges from graph-based retrieval systems to data automation pipelines for clinical research.

Research Trajectory

My long-term research vision focuses on building graph-based intelligent systems capable of structured reasoning, multi-hop retrieval, and personalized interaction. I’m especially excited by opportunities that bridge machine learning, data engineering, and human-centered design — whether in education, research, or healthcare contexts.

U Lab — Graph Neural Networks & Intelligent Systems

Directed by Professor Jiaxuan You, U Lab focuses on leveraging Graph Neural Networks for reasoning, representation learning, and multi-hop knowledge access. Our work explores how graph structure can enable models to move beyond static prompts and towards structured, explainable intelligence.

TinyTutor (Ideation)

TinyTutor is an early-stage research project that explores how graph-based reasoning can support adaptive tutoring systems. The core idea is to create a graph-enhanced backbone that links concepts, dependencies, and skill progressions — enabling a small language model to act as an interactive, concept-aware tutor. I’m currently designing the system architecture, retrieval pipeline, and evaluation framework for multi-hop question answering and content personalization.
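To make the idea of a concept-aware backbone concrete, here is a minimal sketch of multi-hop prerequisite lookup over a concept graph. It is an illustration only and not the TinyTutor design: the `PREREQS` graph, its contents, and the `prerequisites_within` helper are hypothetical names chosen for this example.

```python
from collections import deque

# Hypothetical concept graph: each concept maps to its direct prerequisites.
PREREQS = {
    "backpropagation": ["chain rule", "gradient descent"],
    "gradient descent": ["derivatives"],
    "chain rule": ["derivatives"],
    "derivatives": [],
}


def prerequisites_within(graph, concept, max_hops):
    """Return {prerequisite: hop distance} for concepts reachable in max_hops.

    Breadth-first search guarantees each prerequisite is recorded with its
    shortest hop distance from the starting concept.
    """
    distances = {}
    queue = deque([(concept, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop budget
        for prereq in graph.get(node, []):
            if prereq not in distances:
                distances[prereq] = hops + 1
                queue.append((prereq, hops + 1))
    return distances


if __name__ == "__main__":
    print(prerequisites_within(PREREQS, "backpropagation", 2))
```

In a tutoring setting, the hop distance could be used to decide how much background to retrieve for a learner before answering a question; that policy is an assumption of this sketch rather than a settled part of the project.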

Center for Psychedelic & Consciousness Research (CPCR)

At Johns Hopkins CPCR, my work centers on automating and scaling clinical research data workflows — from ingestion to reporting. CPCR conducts world-leading psychedelic clinical trials, aiming to uncover therapeutic mechanisms and expand responsible psychedelic science. My contributions focus on building reliable, reproducible systems to support researchers and clinicians.

QA Library

A centralized Python library that automates scoring for more than 100 questionnaires and psychometric measures used in CPCR studies. It supports reproducible, version-controlled scoring and harmonizes data across multiple clinical protocols.
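As a rough illustration of what automated, versioned scoring can look like, the sketch below sums a Likert-style scale with reverse-scored items. It is not the QA Library's actual API: `ScaleSpec`, `score_sum`, and the example instrument are hypothetical, and real measures often have subscales and missing-data rules that are omitted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaleSpec:
    """Hypothetical, versioned description of one questionnaire."""
    name: str
    version: str
    items: tuple              # item IDs in administration order
    reverse_items: frozenset  # item IDs that are reverse-scored
    min_value: int
    max_value: int


def score_sum(spec, responses):
    """Return the total score, reverse-scoring items where the spec says so."""
    total = 0
    for item in spec.items:
        value = responses[item]
        if not spec.min_value <= value <= spec.max_value:
            raise ValueError(f"{item}: value {value} is out of range")
        if item in spec.reverse_items:
            # Mirror the value around the midpoint of the response scale.
            value = spec.min_value + spec.max_value - value
        total += value
    return total


# Made-up three-item instrument on a 1-5 scale, with q2 reverse-scored.
EXAMPLE_SPEC = ScaleSpec(
    name="example_scale",
    version="1.0",
    items=("q1", "q2", "q3"),
    reverse_items=frozenset({"q2"}),
    min_value=1,
    max_value=5,
)

if __name__ == "__main__":
    print(score_sum(EXAMPLE_SPEC, {"q1": 4, "q2": 2, "q3": 5}))
```

Keeping the scoring rules in a frozen, versioned spec is one way to make a score traceable to the exact rules that produced it.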

PHI Redactor

A pipeline that automatically detects and redacts Protected Health Information (PHI) from unstructured text and tabular datasets. This enables compliant downstream processing and model training on de-identified data.
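The simplest layer of such a pipeline can be sketched with pattern-based redaction. The example below is an assumption-laden illustration, not the PHI Redactor itself: the `PATTERNS` table and `redact` function are hypothetical, and regular expressions alone cannot catch names, addresses, or free-form dates, so they would not be sufficient for de-identification on their own.

```python
import re

# Hypothetical patterns covering a few easily recognized identifier formats.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def redact(text):
    """Replace every match of each pattern with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


if __name__ == "__main__":
    print(redact("Call 410-555-0123 or email jane.doe@example.org"))
```

For tabular data, the same function could be applied cell by cell to free-text columns, while columns known to hold identifiers are dropped outright.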

Data Catalog System

An internal data management platform for organizing, distributing, and versioning scored study datasets. This tool integrates tightly with the QA Library, AE/PD report generator, and other automation scripts to reduce manual overhead in research workflows.

Protocol Deviations + Adverse Events (AE/PD) Automation

A REDCap-based automation pipeline that ingests adverse event and protocol deviation data, parses and harmonizes it, and generates human-readable HTML reports for IRB and clinical teams. This reduced manual reporting time and improved standardization across studies.
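The reporting step can be sketched as follows, assuming the records have already been exported and harmonized into plain dictionaries. The field names (`study`, `type`, `date`, `description`) and the `render_report` function are hypothetical and do not reflect the real REDCap project schema or the actual report layout.

```python
from collections import defaultdict
from html import escape

COLUMNS = ("type", "date", "description")  # hypothetical harmonized fields


def render_report(records):
    """Group records by study and render one HTML table per study."""
    by_study = defaultdict(list)
    for rec in records:
        by_study[rec["study"]].append(rec)

    parts = ["<html><body>"]
    for study in sorted(by_study):
        parts.append(f"<h2>{escape(study)}</h2>")
        parts.append("<table>")
        parts.append("<tr><th>Type</th><th>Date</th><th>Description</th></tr>")
        # ISO-formatted dates sort chronologically as plain strings.
        for rec in sorted(by_study[study], key=lambda r: r["date"]):
            cells = "".join(f"<td>{escape(rec[key])}</td>" for key in COLUMNS)
            parts.append(f"<tr>{cells}</tr>")
        parts.append("</table>")
    parts.append("</body></html>")
    return "\n".join(parts)


if __name__ == "__main__":
    demo = [
        {"study": "STUDY-A", "type": "AE", "date": "2024-02-01",
         "description": "Mild headache"},
        {"study": "STUDY-A", "type": "PD", "date": "2024-01-15",
         "description": "Visit window missed by <2 days"},
    ]
    print(render_report(demo))
```

Escaping every value with `html.escape` keeps free-text descriptions from breaking the markup.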

Toscano Lab — Reactive Sulfur Species Chemistry

In the Toscano Lab, I contributed to experimental research investigating the chemistry and biological roles of hydropersulfides. Our work focused on developing reagents and methods to understand protein persulfidation and its biochemical effects.

Lab Work

• Developed and executed wet-lab protocols for hydropersulfide generation.
• Contributed to characterization and analysis of reactive sulfur species.
• Collaborated with interdisciplinary researchers to align experimental workflows with downstream data processing and visualization.