apelullo/README.md

Arthur Pelullo

Data Science Lead with seven years at Penn Medicine, building data systems for clinical research, behavioral health SaaS platforms, and operational analytics across 40+ projects and 18 publications. Before that: physics teacher, BI analyst.

Now building in the open toward roles in data science and AI engineering.


Currently building

paprika-agent: An MCP server that connects Claude to the Paprika recipe manager app. Built as a deliberate learning project for modern AI tooling: Python 3.13, FastMCP, async-first architecture, 30 tests, GitHub Actions CI/CD, conventional commits. Stage 1 complete. 🔧

Next: A causal inference pipeline on Yelp reviews and SAMHSA treatment facility data, covering entity resolution, NLP sentiment analysis, and heterogeneous treatment effects. (I've worked with both datasets before; this is the updated analysis with modern tools and a sharper research question.)

On the horizon: A Memory Diffusion MCP server (persistent AI memory via typed knowledge graphs and Bayesian confidence decay). And beyond MCP: autonomous agents and orchestration workflows.


Background

The pinned repos below are production work from my Penn Medicine years. Some highlights: an employee wellness measurement program we designed from scratch, which surfaced an equipment failure whose discovery led to tens of thousands in monthly savings at a single facility; a behavioral health SaaS platform that grew to 55k+ users; and a six-year AWS data curation pipeline that ran without a single day of data loss. Older stack; the work was real.


Stack

Data science: pandas, scikit-learn, TensorFlow, Keras, PyTorch, statsmodels
NLP: NLTK, spaCy, Gensim
AI engineering: FastMCP, uv, AsyncIO, pytest, GitHub Actions
Infrastructure: AWS, SQL, Docker
Methods: causal inference, Bayesian methods, experimental design


LinkedIn · Portfolio · ORCID

Pinned

  1. paprika-agent Public

    MCP server connecting Claude Desktop to the Paprika recipe manager via its unofficial API to remotely access and interact with recipe data and meal planning

    Python

  2. cobalt_health_wellness_platform_ops Public

    Cobalt is a mental health and wellness platform created for Penn Medicine employees that serves as a hub for support services such as therapy, wellness coaching, topic- and population-specific grou…

    Jupyter Notebook

  3. bluecoats_measurement_response_program_ops Public

    Bluecoats is a closed-loop, human-centric measurement and response program coordinating training, resources, and operational mechanisms to empower health system staff and management to systematical…

    Jupyter Notebook

  4. yelp_health_data_curation_ops Public

    An AWS-based data pipeline to extract, process, store, and monitor Yelp "health-related" facility data in support of ongoing health system initiatives.

    Jupyter Notebook

  5. twitter_covid_stream_processing_ops Public

    An AWS-based data pipeline to collect, process, store, and monitor Twitter streaming data throughout the COVID-19 pandemic in support of local, regional, and national public health initiatives.

    Jupyter Notebook

  6. yelp_aha_covid_nlp_research Public

    A foundational research initiative conducting an in-depth analysis of Yelp health-related facilities, facility categories, and facility review content "before" and "after" onset of COVID-19. The an…

    Jupyter Notebook