HomenShum/README.md

👋 Hi, I'm Homen Shum

Building NodeRoom — a live room where humans and AI agents do high-trust research together, without clobbering each other.

A career, compiled: banking/finance → data engineering → agentic AI, converging on human-agent collaboration systems where the agent leaves receipts.

Meta · agentic QA (PQX)  ·  JPMorgan · 3.5 yrs, credit + agentic-RAG over 100k+ docs  ·  Ideaflow  ·  Founder, NodeBench AI  ·  UC Santa Barbara  ·  full history ↗


TypeScript · Convex · Next.js · React · Python · TipTap · Playwright


NodeRoom — Live Startup Diligence War Room: humans + NodeAgents run shared research on a live diligence sheet — lock → draft → smart-merge with no-clobber proof


🗺️ The system map — one lineage, not scattered repos

| Repo | Layer | What it is |
| --- | --- | --- |
| noderoom | Current flagship | Live multi-panel room where humans + NodeAgents edit a shared spreadsheet, note, and post-it wall through one versioned concurrency model: lock → draft → smart-merge, no-clobber, per-element CAS. |
| nodebench-ai | Research engine | Entity intelligence for any company, market, or question. Searches and synthesizes with sources, turns each run into a reusable artifact, and ships a hosted public-research MCP. |
| NodeAgent | Agent kernel | The distilled core of NodeBench. One loop, four tool UIs: live context, grounded/cited search, a versioned spreadsheet delta, and a TipTap notebook memo. |
| feature-walkthrough-gif | Proof / media | Playwright → Remotion → ffmpeg turns any feature into an annotated walkthrough GIF. Because it's scripted, the GIFs double as an integration smoke-test. |
| parity-studio | Visual QA | Image (or live app route) → verified componentized ui_kit, self-judged on a 16-check deterministic rubric with honest score drift before any agent touches production. |
| LLM-Prior-Authorization… | Regulated workflow | LLMs auto-fill prior-auth forms from patient notes: structured extraction, a validation pass, and an LLM-as-a-Judge eval that scores on clinical knowledge, not string match. |

Productivity infra: gmail-workspace-public (large inbox → one queue, one decision; private data stays local, public research delegated to NodeBench) · agent-workspace-template (reusable Convex/Next agent-workspace runtime).


🧬 The lineage

```mermaid
flowchart LR
    NB["NodeBench AI<br/>research / diligence engine<br/>sourced dossiers · MCP"]
    NA["NodeAgent<br/>distilled agent kernel<br/>one loop · four tool UIs"]
    NR["NodeRoom<br/>CURRENT FLAGSHIP<br/>live room · lock→draft→merge"]
    PF["Proof<br/>reproducible walkthroughs<br/>that double as smoke-tests"]

    NB -->|distill the core| NA
    NA -->|put humans + agents in one room| NR
    NR -->|ship review-ready artifacts| PF

    style NR fill:#111,color:#fff,stroke:#111
```

🛠️ A career, compiled

Five capability buckets, each load-bearing in the work above:

  • Banking & diligence — 3.5 yrs at JPMorgan: credit analysis (72 deals, ~$800M, 270 models) plus "LLMsuite," an agentic-RAG diligence tool over 100k+ documents. Turning messy research into structured, cited sheets and risk models — the reason NodeRoom is a War Room, not a toy.
  • Data engineering — pipelines, schemas, reactive runtimes (Convex), durable streaming. The plumbing under every live room and report.
  • Agentic AI & evals — agentic QA at Meta (PQX) and eval pipelines at Ideaflow: grounded search, tool loops, versioned model deltas, LLM-as-a-Judge scoring, scenario-based tests. Agents that get checked, not trusted — the harness matters more than the model.
  • Healthcare / regulated workflows — prior-auth auto-fill with validation + eval: structured extraction where being wrong has consequences.
  • Product engineering — Next.js / React / TS surfaces, UI parity harnesses, reproducible demos. The artifacts people actually click.

🎯 Current flagship demo — NodeRoom: Live Startup Diligence War Room

Multiple humans and multiple NodeAgents research companies in one live room and enrich a shared diligence sheet together:

  • Agents claim an affected-range lock (still readable as context), a blocked agent drafts around it, and on unlock the draft smart-merges — committed human edits are never clobbered. Every edit carries a per-element version (CAS).
  • Findings stream into the sheet, the note panel, and the post-it wall with no refresh; server-led agent work reaches every client (e.g. in the live Q3DEMO room, /ask reconcile Q3 revenue fills a variance column).
  • Runs in two modes from the same code: a deterministic, no-key, in-memory engine with scripted agents (npm run demo), and Live, with a real Convex reactive backend and a model-routed LLM agent (routes promoted by ladder evidence, not provider brand).
  • Ends with downstream-ready review artifacts: company brief, runway chart, open-questions list.
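
The lock → draft → smart-merge flow with per-element CAS can be illustrated with a minimal in-memory sketch. Everything named here (`Sheet`, `Draft`, `commit`, `smartMerge`, the cell IDs, the owner strings) is hypothetical and chosen for illustration; none of it is taken from the NodeRoom codebase, and the real merge logic is likely richer than a plain version check.

```typescript
// Illustrative sketch only: all names are assumptions, not NodeRoom's API.
type Cell = { value: string; version: number };
type Draft = { cellId: string; baseVersion: number; value: string };
type MergeResult = { applied: string[]; conflicts: string[] };

class Sheet {
  private cells = new Map<string, Cell>();
  private locks = new Map<string, string>(); // cellId -> lock owner

  // Reads never check locks: a locked range stays readable as context.
  read(cellId: string): Cell {
    return this.cells.get(cellId) ?? { value: "", version: 0 };
  }

  // All-or-nothing claim over the affected range.
  lock(cellIds: string[], owner: string): boolean {
    const taken = cellIds.some((id) => {
      const holder = this.locks.get(id);
      return holder !== undefined && holder !== owner;
    });
    if (taken) return false;
    for (const id of cellIds) this.locks.set(id, owner);
    return true;
  }

  unlock(cellIds: string[], owner: string): void {
    for (const id of cellIds) {
      if (this.locks.get(id) === owner) this.locks.delete(id);
    }
  }

  // Compare-and-swap per element: the write lands only if the caller saw
  // the current version and nobody else holds the lock.
  commit(cellId: string, expectedVersion: number, value: string, owner: string): boolean {
    const holder = this.locks.get(cellId);
    if (holder !== undefined && holder !== owner) return false;
    const current = this.read(cellId);
    if (current.version !== expectedVersion) return false;
    this.cells.set(cellId, { value, version: current.version + 1 });
    return true;
  }

  // Apply each draft through CAS; stale drafts are reported, never forced.
  smartMerge(drafts: Draft[], owner: string): MergeResult {
    const result: MergeResult = { applied: [], conflicts: [] };
    for (const d of drafts) {
      const ok = this.commit(d.cellId, d.baseVersion, d.value, owner);
      (ok ? result.applied : result.conflicts).push(d.cellId);
    }
    return result;
  }
}

const sheet = new Sheet();
sheet.lock(["B2"], "agent-a");

// agent-b is blocked on B2, so it drafts against the versions it can read.
const blocked = sheet.commit("B2", 0, "agent-b direct write", "agent-b");
const drafts: Draft[] = ["B2", "C2", "D2"].map((cellId) => ({
  cellId,
  baseVersion: sheet.read(cellId).version,
  value: `agent-b draft for ${cellId}`,
}));

sheet.commit("B2", 0, "agent-a result", "agent-a"); // lock holder writes
sheet.commit("C2", 0, "human-verified", "human");   // human edit lands meanwhile
sheet.unlock(["B2"], "agent-a");

const merged = sheet.smartMerge(drafts, "agent-b");
console.log(blocked, merged);
```

In this sketch only the untouched cell D2 is applied; the drafts for B2 and C2 come back as conflicts because their versions moved, so the committed human edit in C2 survives. A fuller merge would presumably try to reconcile conflicting drafts rather than just report them.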

People + agents + artifacts + evidence + review + shareability.


📚 Selected earlier work — the arc that compiled into the systems above
| Project | Signal |
| --- | --- |
| Banking assistant | Finance-document assistant for company/PDF analysis: the diligence reflex, pre-NodeBench. |
| openai-agent-eval-framework | Agent evaluation for classification, context verification, and pruning: the eval discipline, early. |
| CosmaNeura med billing | ICD/CPT recommendation from physician dictation: regulated extraction before the prior-auth system. |
| FluencyMed | Early healthcare AI workflow prototype. |
| voice_email_agent | Email ingestion, summarization, embeddings, voice query: the seed of the Gmail workspace. |

💡 What I care about

The agent should leave receipts. Sources on every claim, a version on every edit, an eval on every answer, and a demo anyone can reproduce. High-trust work doesn't get faster by trusting the model more — it gets faster by making the model checkable.

📫 LinkedIn · hshum2018@gmail.com

Pinned

  1. noderoom (Public)

    Live room where humans and NodeAgents edit a shared spreadsheet, note, and post-it wall together — never clobbering each other. lock → draft → smart-merge with per-element CAS versioning. Runs dete…

    TypeScript · 2 stars

  2. nodebench-ai (Public)

    Entity intelligence for any company, market, or question — not a chatbot that answers once, but a system that synthesizes with sources, turns each run into a reusable artifact, and watches for chan…

    TypeScript · 14 stars · 3 forks

  3. NodeAgent (Public)

    The distilled, portfolio-grade core of NodeBench AI. A cross-collaborative agent that gathers live room context, finds the right doc, updates a versioned spreadsheet, and writes a TipTap notebook m…

    TypeScript

  4. feature-walkthrough-gif (Public)

    Turn any web feature into a polished, annotated walkthrough GIF — every state, the click, the loading, the result. SPEC → Playwright capture → Remotion overlay → ffmpeg two-pass palette. Scripted a…

    JavaScript

  5. parity-studio (Public)

    Image (or live app route) → verified, componentized ui_kit. Self-judged with a 16-check deterministic parity rubric and honest score drift every iteration, so an agent stages only approved design d…

    TypeScript · 1 star

  6. LLM-Prior-Authorization-Form-Auto-Fill-System-With-Eval (Public)

    LLMs auto-fill regulated prior-authorization forms from patient notes — with a validation pass and an LLM-as-a-Judge eval that scores correctness on clinical knowledge, not string matching. FastAPI…

    Python · 1 star · 1 fork