<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Clinical | Jie He</title><link>https://saster-he.github.io/tags/clinical/</link><atom:link href="https://saster-he.github.io/tags/clinical/index.xml" rel="self" type="application/rss+xml"/><description>Clinical</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Fri, 01 Aug 2025 00:00:00 +0000</lastBuildDate><image><url>https://saster-he.github.io/media/icon_hu7729264130191091259.png</url><title>Clinical</title><link>https://saster-he.github.io/tags/clinical/</link></image><item><title>TFL Automation at Vertex Pharmaceuticals</title><link>https://saster-he.github.io/project/vertex-llm-automation/</link><pubDate>Fri, 01 Aug 2025 00:00:00 +0000</pubDate><guid>https://saster-he.github.io/project/vertex-llm-automation/</guid><description>&lt;h2 id="overview">Overview&lt;/h2>
&lt;p>Every drug approval requires submission-ready clinical study deliverables: standardized datasets (SDTM, ADaM) and a Clinical Study Report (CSR) containing Tables, Figures, and Listings (TFL) that document trial results. Producing these deliverables is largely manual statistical programming work that has to be done correctly and consistently for every study.&lt;/p>
&lt;p>This project builds a multi-agent automation system that takes over the routine parts of that work while keeping statistical programmers in control of the results.&lt;/p>
&lt;h2 id="architecture">Architecture&lt;/h2>
&lt;p>The system is organized around a &lt;strong>human-in-the-loop agent pattern&lt;/strong>:&lt;/p>
&lt;ol>
&lt;li>
&lt;p>&lt;strong>Shiny front-end&lt;/strong>: Statistical programmers interact with the system through an R Shiny interface. They provide analysis specifications, review agent-generated outputs, and approve or revise them before submission.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>LLM agent layer&lt;/strong>: Agents interpret the provided specifications and generate statistical programming code for each requested output. The agents handle SDTM domain mapping, ADaM dataset construction logic, and TFL output generation.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Validation layer&lt;/strong>: Automated checks run against CDISC standards and the provided analysis specifications before any output is surfaced to the programmer for review.&lt;/p>
&lt;/li>
&lt;/ol>
&lt;p>The Shiny interface keeps programmers in control while eliminating the repetitive parts of routine TFL work. Programmers drive the agents rather than writing boilerplate code by hand.&lt;/p>
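&lt;p>The flow through the three layers can be illustrated with a minimal Python sketch. It is not the project's actual code: every name in it (&lt;code>Draft&lt;/code>, &lt;code>run_pipeline&lt;/code>, the stand-in agent, check, and reviewer, and the example specification) is a hypothetical placeholder chosen for illustration. It only shows the ordering described above: generate, validate, then hand the result to a human for the final decision.&lt;/p>
&lt;pre>&lt;code class="language-python">from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical names throughout; none of these come from the actual system.


@dataclass
class Draft:
    output_id: str  # identifier of the requested output, taken from the spec
    code: str       # agent-generated program text
    issues: List[str] = field(default_factory=list)
    approved: bool = False


def run_pipeline(spec: dict,
                 generate: Callable[[dict], str],
                 checks: List[Callable[[str], List[str]]],
                 review: Callable[[Draft], bool]) -> Draft:
    """Generate code for one output, validate it, then ask a human to approve."""
    draft = Draft(output_id=spec["output_id"], code=generate(spec))
    for check in checks:
        draft.issues.extend(check(draft.code))
    # The reviewer always makes the final call; automated checks only inform it.
    draft.approved = review(draft)
    return draft


# Stand-ins for the agent, one validation check, and the human reviewer.
def fake_agent(spec: dict) -> str:
    return "proc freq data=adsl; tables " + spec["variable"] + "; run;"


def requires_adam_input(code: str) -> List[str]:
    return [] if "adsl" in code else ["expected an ADaM input dataset"]


def reviewer(draft: Draft) -> bool:
    return not draft.issues


result = run_pipeline({"output_id": "T-14.1.1", "variable": "trt01p"},
                      fake_agent, [requires_adam_input], reviewer)
print(result.output_id, result.approved, result.issues)
&lt;/code>&lt;/pre>
&lt;p>Passing the agent, the checks, and the reviewer in as plain callables keeps the sketch independent of any particular LLM client or user interface. In a real deployment the reviewer step would presumably be a person acting through the Shiny front-end rather than a function.&lt;/p>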
&lt;h2 id="technical-stack">Technical Stack&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>R Shiny&lt;/strong>: Front-end interface and human-in-the-loop control panel&lt;/li>
&lt;li>&lt;strong>Python&lt;/strong>: LLM agent orchestration and pipeline automation&lt;/li>
&lt;li>&lt;strong>R and SAS&lt;/strong>: Statistical programming and output generation&lt;/li>
&lt;li>&lt;strong>Shell scripting&lt;/strong>: Workflow automation and environment management&lt;/li>
&lt;/ul>
&lt;h2 id="why-it-matters">Why It Matters&lt;/h2>
&lt;p>TFL generation is a prerequisite for every regulatory submission, and it is time-consuming to do manually. Automating routine outputs means statistical programmers can spend their effort on the judgment calls that require expertise: methodology, specification review, and interpretation of results.&lt;/p>
&lt;p>&lt;em>This project is ongoing (Aug 2025 to Present). Details are limited due to confidentiality.&lt;/em>&lt;/p></description></item></channel></rss>