Data Science Lead with seven years at Penn Medicine, building data systems for clinical research, behavioral health SaaS platforms, and operational analytics across 40+ projects and 18 publications. Before that: physics teacher, BI analyst.
Now building in the open toward roles in data science and AI engineering.
paprika-agent: An MCP server that connects Claude to the Paprika recipe manager app. Built as a deliberate learning project for modern AI tooling: Python 3.13, FastMCP, async-first architecture, 30 tests, GitHub Actions CI/CD, conventional commits. Stage 1 complete. 🔧
Next: A causal inference pipeline on Yelp reviews and SAMHSA treatment facility data, covering entity resolution, NLP sentiment analysis, and heterogeneous treatment effects. (I've worked with both datasets before; this is the updated analysis with modern tools and a sharper research question.)
On the horizon: A Memory Diffusion MCP server (persistent AI memory via typed knowledge graphs and Bayesian confidence decay). And beyond MCP: autonomous agents and orchestration workflows.
The pinned repos below are production work from my Penn Medicine years. Some highlights: an employee wellness measurement program we designed from scratch, which surfaced an equipment failure and led to tens of thousands in monthly savings at a single facility; a behavioral health SaaS platform that grew to 55k+ users; and a six-year AWS data curation pipeline that ran without a single day of data loss. Older stack; the work was real.
Data science: pandas, scikit-learn, TensorFlow, Keras, PyTorch, statsmodels
NLP: NLTK, spaCy, Gensim
AI engineering: FastMCP, uv, asyncio, pytest, GitHub Actions
Infrastructure: AWS, SQL, Docker
Methods: causal inference, Bayesian methods, experimental design