<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Bioinformatics | Jie He</title><link>https://saster-he.github.io/tags/bioinformatics/</link><atom:link href="https://saster-he.github.io/tags/bioinformatics/index.xml" rel="self" type="application/rss+xml"/><description>Bioinformatics</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Wed, 01 May 2019 00:00:00 +0000</lastBuildDate><image><url>https://saster-he.github.io/media/icon_hu7729264130191091259.png</url><title>Bioinformatics</title><link>https://saster-he.github.io/tags/bioinformatics/</link></image><item><title>Gene Expression Variability in Single-Cell RNA-seq</title><link>https://saster-he.github.io/project/scrna-seq-variability/</link><pubDate>Wed, 01 May 2019 00:00:00 +0000</pubDate><guid>https://saster-he.github.io/project/scrna-seq-variability/</guid><description>&lt;h2 id="motivation">Motivation&lt;/h2>
&lt;p>Most differential expression analysis in single-cell RNA-seq focuses on differences in &lt;em>mean&lt;/em> expression between cell populations. But gene expression variability (the variance of expression across cells) carries its own biological signal. Genes that are more variable in one condition than in another can indicate heterogeneous cell states, developmental plasticity, or disease-associated dysregulation.&lt;/p>
&lt;p>The challenge: zero-inflated count data in scRNA-seq show a strong mean-variance dependency that biases naive variability tests. A gene with higher mean expression tends to have higher variance as a property of the count distribution itself, so it can appear more variable without any underlying biological difference.&lt;/p>
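&lt;p>As a sketch of where this dependency comes from, assume a standard zero-inflated negative binomial parameterization (the symbols here are illustrative and not taken from the thesis): a count $Y$ is a structural zero with probability $\pi$ and is otherwise negative binomial with mean $\mu$ and dispersion $\phi$. Then&lt;/p>
&lt;p>$$\mathrm{E}[Y] = (1-\pi)\,\mu, \qquad \mathrm{Var}(Y) = (1-\pi)\,\mu\,\bigl(1 + \mu\,(\pi + \phi)\bigr).$$&lt;/p>
&lt;p>Under these assumptions the variance grows quadratically in $\mu$ even when $\pi$ and $\phi$ are held fixed, so comparing raw variances between two populations with different means confounds a shift in mean with a change in variability.&lt;/p>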
&lt;h2 id="methods">Methods&lt;/h2>
&lt;p>Developed a weighted hypothesis testing framework that:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Accounts for mean-variance dependency&lt;/strong> in zero-inflated negative binomial count data&lt;/li>
&lt;li>&lt;strong>Tests for differential variability&lt;/strong> between cell populations using a weighted statistic that stabilizes variance estimates across the expression range&lt;/li>
&lt;li>&lt;strong>Scales to large datasets&lt;/strong>, validated on 32,738 genes across 2,692 single cells&lt;/li>
&lt;/ul>
&lt;h2 id="implementation">Implementation&lt;/h2>
&lt;p>All methods are implemented as R functions that build on the &lt;code>MAST&lt;/code> and &lt;code>edgeR&lt;/code> packages. The weighting scheme was derived analytically and validated via simulation.&lt;/p>
&lt;h2 id="recognition">Recognition&lt;/h2>
&lt;p>Awarded &lt;strong>Honors Thesis with Highest Distinction&lt;/strong> by the University of North Carolina at Chapel Hill, 2019.&lt;/p>
&lt;p>&lt;em>Advisor: &lt;a href="https://sph.unc.edu/adv_profile/di-wu-phd/" target="_blank" rel="noopener">Prof. Di Wu&lt;/a>, Department of Biostatistics, UNC Gillings School of Global Public Health.&lt;/em>&lt;/p></description></item></channel></rss>