<?xml version="1.0" encoding="UTF-8" ?>
<rss version="0.91">
<channel>
    <title>trevorbedford</title>
    <link>http://www.trevorbedford.com</link>
    <description>RSS feed for www.trevorbedford.com</description>
    <language>en-us</language>
    <copyright>Copyright 2009-2012 Trevor Bedford</copyright>
    
	<lastBuildDate>Mon, 20 Feb 2012 12:42:28 GMT</lastBuildDate>

	<item>
		<pubDate>Mon, 20 Feb 2012 12:42:00 GMT</pubDate>
		<title>Some thoughts on a GitHub of Science</title>
		<link>http://www.trevorbedford.com/archive/feb_20_2012.html</link>
		<description><![CDATA[
		
		<p class="margin">
		<img class="offset" src="/images/github_large.png">			
		
		<p>
		Lately, I've been thinking more about issues surrounding Open Science and scientific publishing.  This post is in part a response to posts by <a href="http://schamberlain.github.com/scott/2012/02/13/a-github-publishing-model/">Scott Chamberlain</a> and <a href="http://marciovm.com/i-want-a-github-of-science/">Marcio von Muhlen</a>.  Marcio's idea represents a major call-to-arms for innovation in how science is conducted and communicated.  He states that "we need a social network of science, meaning scientific bundles of knowledge must be structured and accessible by API, with the connections among those bundles and appropriate utility metrics being what connects and prioritizes scientists."  I would completely agree here.  Making small steps, this is why I chose to post my latest paper to the <a href="http://arxiv.org/abs/1111.4579">arXiv</a> and to <a href="http://trvrb.github.com/canalization/">GitHub itself</a>.
		
		<p>
		Scott questions whether GitHub could be useful as a scientific publishing platform, which I think is a very different thing from Marcio's GitHub of Science.  Here, as a publishing platform, I think the primary advantage of GitHub is the versioning system at its heart.  This would allow an audience to follow a scientific story as it progresses, but would also allow the history of a project to be queried and individual contributions to be easily assessed (at least in terms of writing and coding).  If we want to move towards a <a href="http://www.michaeleisen.org/blog/?p=694">system of post publication peer review</a> there needs to be a good way of continually updating a manuscript and making it obvious what each new version brings.  A nice open source analogy here (that Scott originally mentioned on Twitter) is the idea of peer review as opening <i>issues</i>.  Right now, in Google Code or GitHub, you can open an issue with a project documenting a bug or other sort of problem.  Developers can then respond to this issue and make the appropriate changes to their project (that are then linked to the issue, making tracking of specific revisions straightforward).  Peer review acts in a very similar fashion, documenting inadequacies with the approach taken in a scientific manuscript.  So, please, please, <a href="http://github.com/trvrb/canalization/issues">open an issue with the canalization paper</a>.  I would be happy to try to attend to it.
		
		<p>
		However, I think the potential for something like a GitHub of Science goes much farther than just a publishing platform.  In the current paradigm, manuscripts are built on top of manuscripts, but there is a lot of replicated effort.  Let's say someone thinks of a small, but highly relevant, addition to a paper.  For example, in the case of the canalization paper, what is the effect of vaccination on the antigenic evolution of influenza?  This could be a one figure addition to the present paper.  However, in the current system, doing this research would entail rerunning a lot of the model basics, writing a new paper, with a new introduction and a new discussion, all centered around vaccination.  This one-figure vaccination addendum may not make a paper by itself, but it would be great if it could somehow be integrated into the literature.
		
		<p>
		The basic paradigm of GitHub is the <i>forking</i> of a software project.  I write some code, you take what I've done and make some additions.  I then have the option of folding your changes back into my version, or if I'm not happy with your changes, the two versions continue on their separate ways.  With something like a GitHub of Science, someone else could fork my canalization paper, code and all, and append a short section and figure on vaccination.  I could choose to <i>pull</i> this addition, integrating it as part of the paper, or the forked version could exist on its own.  Here, I'm imagining a scenario where most collaboration manifests as a network of fork and pull requests between co-authors, where a story emerges by combining a number of individual contributions.
		
		<p>
		In a conversation with <a href="http://benfry.com/">Ben Fry</a> about this, he commented that the most beneficial aspect of peer review is that it forces scientists to work in such a way that their research can be reviewed, and, at least in theory, replicated.  Working in such a way that research could be <i>forked</i> would be a much higher, better bar in terms of documentation and reproducibility.  There is continual innovation in terms of models for Open Science (<a href="http://stackexchange.com/">Stack Exchange</a>, <a href="http://arxiv.org/">arXiv</a>, <a href="http://github.com/">GitHub</a>, etc.).  I'm hopeful that we can eventually come up with something that gains some traction.  However, I'm sure that whatever we start with, it has to produce a publishable end product, so that both old and new systems could continue forward, existing side-by-side.
		
		]]></description>
	</item>

	<item>
		<pubDate>Mon, 30 Jan 2012 01:57:00 GMT</pubDate>
		<title>Estimating global flu diversity</title>
		<link>http://www.trevorbedford.com/archive/jan_30_2012.html</link>
		<description><![CDATA[
		
		<p class="margin">
		<img class="offset" src="/images/flu_turnover.png">		
		
		<p>
		How many strains of flu are circulating at any given moment?  And how much sampling is necessary to capture this diversity?  This came up in a conversation with <a href="http://tree.bio.ed.ac.uk/people/arambaut/">Andrew Rambaut</a> and <a href="http://www.erikvolz.info/">Erik Volz</a> last week.  Fortunately, we can get a back-of-the-envelope estimate using standard population genetic theory.  Here, I've downloaded all the amino acid sequences for the HA1 region of the H3N2 hemagglutinin protein that exist in Genbank between January 2002 and June 2009.  This figure is looking at 10 week windows, with each colored region representing the frequency of a particular sequence in that window's sample.  You can see that there are a few common sequences and many rare sequences, and that sequence diversity rapidly changes over time.  The HA1 region is the region of the influenza genome most responsible for antigenic variation.  Evolution of HA1 is what allows the virus to infect people that have built up immunity to previous strains of flu.  Looking at amino acid diversity of HA1 will give an under-estimate of total genomic diversity of flu, but should be a decent proxy for functional diversity.

		<p class="margin">
		<img class="offset" src="/images/samples_vs_types.png">	
		
		<p>
		We can use <a href="http://en.wikipedia.org/wiki/Ewens's_sampling_formula">Ewens's sampling formula</a> to calculate the probability that we observe <i>k</i> distinct sequences (or alleles) in a sample of <i>n</i> sequences.  In this case, the expected number of alleles in a sample of <i>n</i> sequences is <img align="center" src="/images/ewens.png">, where <i>&theta;</i> represents the level of mutational input into the population.  This formula assumes neutral demography, no geographic subdivision and an infinite alleles mutation model, where every mutation creates a new allele.  I fit this formula to the windows from Genbank comparing the number of sequences sampled each month to the number of distinct sequences observed.  Doing so, I get an estimate for <i>&theta;</i> of 28.8, shown in red.
		
		<p>
		With this number in hand, it's possible to estimate the number of distinct alleles that one would find in a very large sample.  We expect to find 104 alleles in a sample of 1000 sequences and 169 alleles in a sample of 10k sequences.  Estimated global prevalence of influenza is around 70 million (more during the northern hemisphere winter, but this should be good enough for our purposes).  A sample of 70 million sequences is expected to have 358 distinct sequences.  However, most of these are at very low frequency.  We would only expect to see around 30 alleles at greater than 1% frequency, 86 alleles present at >0.1% frequency and 164 alleles present at >0.01% frequency in the population.  I'm not sure exactly where to draw the line in terms of "important" variation, but I would think that 1 in 1000 is a good ballpark.  Thus, it seems to me that a sample of around 500 sequences (with an expected 84 unique alleles) would be sufficient to capture all the possibly important diversity in the HA1 protein.
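
		<p>
		For anyone who wants to check these numbers, here is a minimal sketch in Python.  It assumes the standard form of the expected allele count under Ewens's sampling formula, namely the sum of <i>&theta;</i> / (<i>&theta;</i> + <i>i</i>) for <i>i</i> from 0 to <i>n</i> - 1; the function name <code>expected_alleles</code> is just an illustrative choice, not something from the original analysis.

```python
def expected_alleles(n, theta):
    """Expected number of distinct alleles in a sample of n sequences,
    assuming the standard Ewens result: sum over i of theta / (theta + i)."""
    return sum(theta / (theta + i) for i in range(n))


theta = 28.8  # estimate of theta quoted in the text

# Sample sizes of 500 and 10k, as discussed above
for n in (500, 10_000):
    print(n, round(expected_alleles(n, theta)))
```

		<p>
		With <i>&theta;</i> = 28.8 this should give roughly 84 alleles for a sample of 500 and 169 for a sample of 10k, in line with the figures above.  The direct sum is slow for very large <i>n</i>, so a closed form based on the digamma function would be the better choice there.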
				
		
		]]></description>
	</item>

	<item>
		<pubDate>Mon, 16 Jan 2012 11:10:00 GMT</pubDate>
		<title>LaTeX manuscript template with web display</title>
		<link>http://www.trevorbedford.com/archive/jan_16_2012.html</link>
		<description><![CDATA[
		<p>
		I've spent a bit more time cleaning up my LaTeX template to make for fully automated web display.  You can find it <a href="http://github.com/trvrb/canalization">over on GitHub</a>. This is currently set up for the canalization paper, but it should make a good basis for any sort of scientific manuscript.  I've provided style sheets and a Ruby script to clean up output from TeX4ht into a presentable web version.  This web version is <a href="http://trvrb.github.com/canalization/">hosted automatically through GitHub pages</a>.  Thus, by running a single script, the LaTeX source is compiled to HTML and with a GitHub push you can update the public web version of the manuscript.  I hope this sort of approach will prove useful for collaborative writing.  
		
		<p>
		It was easy to run this on my previous manuscripts written in LaTeX.  I now have web versions of the <a href="/tree_topology/">tree topology</a> and the <a href="/migration_dynamics/">global migration dynamics</a> papers up.
		
		]]></description>
	</item>

	<item>
		<pubDate>Tue, 03 Jan 2012 13:10:00 GMT</pubDate>
		<title>Interactive visualization of the Serengeti food web</title>
		<link>http://edbaskerville.com/static/research/serengeti-food-web/groups-figure3-interactive/</link>
		<description><![CDATA[
		<p>
		In the <a href="/pdfs/baskerville-serengeti-2011.pdf">Serengeti food web paper</a>, we present a network diagram of predator-prey relationships, illustrating network structure (<a href="/images/food_web_full.png">Figure 3</a>).  In getting with the times, we've also made an <a href="http://edbaskerville.com/static/research/serengeti-food-web/groups-figure3-interactive/">interactive version of this figure, presenting the network in a force-directed layout</a>.  Ed coded this up in <a href="http://mbostock.github.com/d3/">d3.js</a> based on a version I did in <a href="http://processing.org/">Processing</a>.  Green nodes represent plants, blue nodes represent herbivores and red nodes represent carnivores.  Edges connecting nodes pull them toward each other following Hooke's Law, while nodes are repelled from each other according to Coulomb's Law.  We add an additional force pulling nodes belonging to the same group toward each other.
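
		<p>
		To make the two main forces concrete, here is a rough sketch in Python of a single update step of a force-directed layout.  It is not the d3.js or Processing code behind the figure.  The function name <code>layout_step</code>, the constants (<code>k_spring</code>, <code>rest_length</code>, <code>k_repel</code>, <code>dt</code>) and the plain Euler update are all assumptions made for illustration, and the extra within-group attraction is left out.

```python
import math


def layout_step(pos, edges, k_spring=0.1, rest_length=1.0, k_repel=0.5, dt=0.1):
    """One Euler step of a toy force-directed layout (illustrative only).

    pos   -- dict mapping node name to an (x, y) tuple
    edges -- list of (node_a, node_b) pairs
    """
    force = {n: [0.0, 0.0] for n in pos}
    nodes = list(pos)

    def unit_and_dist(a, b):
        # Unit vector pointing from a to b, plus the distance between them.
        dx = pos[b][0] - pos[a][0]
        dy = pos[b][1] - pos[a][1]
        d = math.hypot(dx, dy) or 1e-9  # coincident nodes get no push here
        return dx / d, dy / d, d

    # Coulomb-like repulsion between every pair of nodes: magnitude ~ 1 / d^2
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            ux, uy, d = unit_and_dist(a, b)
            f = k_repel / d ** 2
            force[a][0] -= f * ux
            force[a][1] -= f * uy
            force[b][0] += f * ux
            force[b][1] += f * uy

    # Hooke-like spring along each edge: magnitude ~ (d - rest_length)
    for a, b in edges:
        ux, uy, d = unit_and_dist(a, b)
        f = k_spring * (d - rest_length)
        force[a][0] += f * ux
        force[a][1] += f * uy
        force[b][0] -= f * ux
        force[b][1] -= f * uy

    return {n: (pos[n][0] + dt * force[n][0], pos[n][1] + dt * force[n][1])
            for n in pos}


pos = {"plant": (0.0, 0.0), "herbivore": (4.0, 0.0)}
print(layout_step(pos, [("plant", "herbivore")]))  # connected: drawn together
print(layout_step(pos, []))                        # unconnected: pushed apart
```

		<p>
		Repeating this step many times lets the layout settle.  A real implementation would normally add velocity damping as well, which this sketch omits.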
		
		<p>
		My favorite part of the visualization is the concept of <i>focus</i>.  If you click on a node, the spring forces applied to the edges of this node are magnified, pulling its connections closer.  This makes it easier to explore relationships in the network.  A double-click removes focus.
		
		]]></description>
	</item>

	<item>
		<pubDate>Mon, 02 Jan 2012 14:20:00 GMT</pubDate>
		<title>Spatial guilds in the Serengeti food web revealed by a Bayesian group model</title>
		<link>http://www.trevorbedford.com/pdfs/baskerville-serengeti-2011.pdf</link>
		<description><![CDATA[
		
		<p class="margin">
		<a href="/pdfs/baskerville-serengeti-2011.pdf"><img class="offset" src="/images/serengeti.jpg"></a>
		
		<p>
		Our <a href="/pdfs/baskerville-serengeti-2011.pdf">paper on modeling food webs</a> was just published in PLoS Computational Biology.  Here, I was happy to bring the statistics I've learned from phylogenetic analysis to an entirely different field.  I advised <a href="http://edbaskerville.com/">Ed Baskerville</a> in implementing MCMC and marginal likelihood estimation for network data.  In this case, the data is a matrix of predator-prey relationships, which can be thought of as a network of directed edges specifying who-eats-whom.  We investigated <i>structure</i> in the Serengeti food web through a model in which groups of species behave similarly to one another in terms of what species they eat and what species they are eaten by.  The inferred model shows a high degree of trophic and spatial clustering in which a number of spatially distinct plant groups are fed upon by a few wider-ranging herbivore groups, which are in turn fed upon by just a couple of predator groups.  
		
		<p>
		Also of possible interest, <a href="/pdfs/baskerville-serengeti-supp.pdf">the supporting appendix</a> provides a nice overview of the use of Bayesian methods for inference on network data.  The model we present here really should be useful in a variety of biological contexts; genetic regulatory networks and protein interaction networks immediately come to mind. 
		<i>Photo by <a href="http://www.princeton.edu/~dobber/">Andy Dobson</a></i>.
		
		]]></description>
	</item>

	<item>
		<pubDate>Wed, 14 Dec 2011 01:27:00 GMT</pubDate>
		<title>Reproducible peer review</title>
		<link>http://www.trevorbedford.com/archive/dec_14_2011.html</link>
		<description><![CDATA[

		<p class="margin">		
		<table style="float:right; text-align:center; padding:10px 30px 15px 30px;" width=250>
			<tr><td></td><td colspan=2>Journal A</td></tr>
			<tr><td>&nbsp;</td><td>Accept</td><td>Reject</td></tr>
			<tr><td>Accept</td><td>71 (67)</td><td>35 (43)</td></tr>
			<tr><td>Reject</td><td>42 (43)</td><td>31 (27)</td></tr>
			<tr><td colspan=2>&nbsp;</td></tr>
			<tr><td></td><td colspan=2>Journal B</td></tr>
			<tr><td>&nbsp;</td><td>Accept</td><td>Reject</td></tr>
			<tr><td>Accept</td><td>57 (47)</td><td>18 (27)</td></tr>
			<tr><td>Reject</td><td>16 (27)</td><td>25 (15)</td></tr>
		</table>	

		<p>
		I discovered this <a href="http://brain.oxfordjournals.org/content/123/9/1964">paper by Peter Rothwell and Christopher Martyn</a> through an excellent <a href="http://blogs.scientificamerican.com/guest-blog/2011/11/02/what-is-peer-review-for/">blog post by Bradley Voytek</a>.  In the paper, the authors show that reviews of the same paper by two independent reviewers show a level of agreement little better than expected by chance alone.  The authors repeat their experiment across two neuroscience journals.  For the first journal, they have 179 pairs of reviews, with 219 of the 358 votes (61%) recommending acceptance or acceptance with revision.  If votes between reviewers were distributed entirely by chance, we would expect 67 accept-accept pairs, 43 accept-reject pairs, 43 reject-accept pairs and 27 reject-reject pairs.  However, if the reviewers are coming to some sort of scientific consensus we would see an overabundance of accept-accept and reject-reject pairs.  
		
		<p>
		Here, I've shown their findings, with observed and (expected) counts for each scenario.  In journal A, there appears to be little or no difference from the chance expectation, while journal B shows a very modest improvement over the chance expectation.  A simple Fisher's exact test gives a <i>P</i> value of 0.285 on the results of journal A and a <i>P</i> value of less than 0.0001 on the results of journal B.  Additionally, Rothwell and Martyn find little correspondence in reviewers' assessments of priority of publication.
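
		<p>
		The chance expectations in the table can be reproduced with a few lines of Python.  This is a sketch under the assumption that each reviewer independently votes to accept with the pooled acceptance probability; the function name <code>expected_pair_counts</code> is hypothetical, and the journal B totals (116 pairs, 148 accept votes) are simply summed from the observed counts in the table.

```python
def expected_pair_counts(n_pairs, accept_votes):
    """Expected counts of reviewer-pair outcomes if the two votes are independent.

    n_pairs      -- number of manuscripts, each with two reviews
    accept_votes -- total accept votes across all 2 * n_pairs reviews
    """
    p = accept_votes / (2 * n_pairs)  # pooled probability of an accept vote
    q = 1 - p
    return {
        "accept-accept": n_pairs * p * p,
        "accept-reject": n_pairs * p * q,
        "reject-accept": n_pairs * q * p,
        "reject-reject": n_pairs * q * q,
    }


# Journal A: 179 pairs with 219 accept votes, as given in the text.
# Journal B: 116 pairs with 148 accept votes, summed from the table.
for name, n_pairs, accepts in (("A", 179, 219), ("B", 116, 148)):
    counts = expected_pair_counts(n_pairs, accepts)
    print(name, {k: round(v) for k, v in counts.items()})
```

		<p>
		Rounded to whole numbers, this should give 67, 43, 43 and 27 for journal A and 47, 27, 27 and 15 for journal B, matching the values in parentheses.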
		
		<p>
		Interestingly, the authors studied reproducibility of abstract acceptance at two different scientific conferences.  Here, each abstract was reviewed and rated on a 1 to 6 scale by a panel of 14 or 16 reviewers.  In this case, variance across abstracts can be assessed, but also variance across reviewers (we expect some reviewers to be tougher than others in their assessments).  Rothwell and Martyn find a very modest <i>R</i><sup>2</sup> across abstracts of 0.11–0.15, indicating very little reviewer agreement.  However, <i>R</i><sup>2</sup> across reviewers was a more respectable 0.27–0.32, suggesting more variation in reviewer "toughness".
		
		<p>
		Thus, it appears that in small samples of two or three reviewers, noise from positive/negative reviewer bias may swamp the signal of a particular manuscript.  This fits with my own anecdotal experiences.  Usually (but by no means always) reviewers seem to agree on what's lacking in a manuscript, but will often disagree on how terrible a particular failing is to the manuscript's prospects.  Perhaps if each reviewer's overall positive/negative rating bias were taken into account, we could arrive at a measure of manuscript quality that is more repeatable between independent reviewers.  In turn, this could make authors less beholden to the roll of the reviewer die.
			
		]]></description>
	</item>

	<item>
		<pubDate>Wed, 23 Nov 2011 12:23:00 GMT</pubDate>
		<title>Canalization of the evolutionary trajectory of the human influenza virus</title>
		<link>http://www.trevorbedford.com/archive/nov_23_2011.html</link>
		<description><![CDATA[
		
		<p class="margin">
		<a href="/canalization/index.html"><img class="offset" src="/images/canal_map.png"></a>
		
		<p>
		In an ongoing effort to be more open in my scientific dealings, I've posted a preprint of my latest paper <a href="http://arxiv.org/abs/1111.4579">to the arXiv</a> and here on my website, as both <a href="/pdfs/bedford-canalization-2011.pdf">PDF</a> and <a href="/canalization/index.html">HTML</a>.  This represents my first attempt at a straight-up modeling study.  There's a lot going on with the epidemiology and evolution of influenza; I've made a model that attempts to capture all the salient details.  This includes things like the yearly attack rates, rate of antigenic evolution, genetic diversity, and geographic spread.  At its core, the model assumes that the antigenic phenotype of the virus can be adequately explained as a point in a Euclidean space.  Mutation serves to jostle the location of the virus in this space and infection by one virus confers immunity to subsequent infection by nearby viruses in this antigenic space.  The geometric basis of the model stems from empirical studies of influenza's antigenic phenotype (see <a href="http://www.sciencemag.org/content/305/5682/371.short">Smith et al. 2004</a>).  In this study, I find that evolution in such a space results in a "canalized" trajectory.  The best move for a virus is to move as far away from its past as possible, resulting in linear antigenic movement and a distinctive single-trunked phylogenetic tree.
		
		<p>
		I'm especially proud of my <a href="/canalization/index.html">HTML version of the manuscript</a>, which, through the magic of LaTeX, has all sorts of hyperlinking between figures and references.  In addition, I've done my best to make something that's highly readable on the screen.  Almost everything is taken care of by <a href="http://tug.org/tex4ht/">TeX4ht</a> conversion from my LaTeX source and with a CSS stylesheet, so with only a little more work I should be able to fully automate the process.
		
		<p>
		I'm working now to put the source code for the simulations behind this online.  In the meantime, I would very much welcome any feedback you might have on the manuscript.  It's good to get feedback before publication, when there's still an opportunity to incorporate it.
		
		]]></description>
	</item>

	<item>
		<pubDate>Mon, 31 Oct 2011 11:50:00 GMT</pubDate>
		<title>Visualizing mortality data</title>
		<link>http://www.trevorbedford.com/mortality/index.html</link>
		<description><![CDATA[
		
		<p class="margin">
		<a href="/mortality/index.html"><img class="offset" src="/images/mortality_small.png"></a>
		
		<p>
		I came across a simple <a href="http://www.guardian.co.uk/news/datablog/2011/oct/28/mortality-statistics-causes-death-england-wales-2010#_">visualization of England and Wales mortality data in the Guardian</a>.  And because I couldn't deal with the network-y display of hierarchical count data, I decided to redesign the graphic as a tree map.  In googling for "treemap", I found <a href="http://mbostock.github.com/d3/">d3.js</a>, which makes extremely attractive Javascript graphics, with a number of rather fancy built-in figure types.  It seems a little harder to get into than <a href="http://processing.org/">Processing</a>, as it exposes more of the raw Javascript, but the results are beautiful and it provides full SVG support.  Here's the <a href="/mortality/index.html">mortality data laid out with d3's treemap algorithm</a>.
		
		]]></description>
	</item>

	<item>
		<pubDate>Tue, 25 Oct 2011 00:00:00 GMT</pubDate>
		<title>Estimating the effective population size of swine flu</title>
		<link>http://www.trevorbedford.com/archive/oct_25_2011.html</link>
		<description><![CDATA[
		
		<p class="margin">
		<a href="/images/measles_swine_human_large.png"><img class="offset" src="/images/measles_swine_human.png"></a>
		
		<p>
		In my paper on <a href="/tree_topology/">selection in viral phylogenies</a>, I compared the effective population size of measles virus to the effective population size of human influenza virus.  The concept of effective population size <i>N<sub>e</sub></i> is central to population genetics.  It measures the timescale of population turnover, or, looking backwards in time, it measures how long it takes for individuals in the population to find a common ancestor.  Genetic diversity is a combination of this timescale and mutation rate.  
		
		<p>
		This is just a small addendum to that paper.  I had wanted to include swine influenza in with the comparison of measles virus and human influenza virus, but decided that this would detract from the paper's focus.  Here, the sequences of swine influenza come from <a href="http://jvi.asm.org/cgi/content/short/81/8/4315">de Jong et al. (2007)</a>.
		
		<p>
		The scaled effective population size <i>N<sub>e</sub>&tau;</i> of measles is estimated at 124.6 years, <i>N<sub>e</sub>&tau;</i> of global H3N2 human influenza is estimated at 7.2 years, and <i>N<sub>e</sub>&tau;</i> of European H3N2 swine influenza is estimated at 24.1 years.
		
		<p>
		This fits nicely with the observed patterns of antigenic evolution.  Infection with measles confers life-long immunity; evolution of the measles genome does not change its antigenic phenotype.  This results in neutral population dynamics.  However, human influenza evolves in antigenic phenotype very rapidly, causing strong selective pressures that reduce effective population size.  Swine influenza presents a nice example between these two extremes.  In comparing rates of antigenic evolution, de Jong et al. find that "while human H3N2 viruses have evolved at a rate of about 2.0 antigenic units per year since 1982, swine H3N2 viruses have evolved more than six times more slowly, about 0.3 antigenic units per year."  In this case, selective pressures still reduce effective population size, but not to the degree seen in human influenza.
		
		]]></description>
	</item>

	<item>
		<pubDate>Mon, 10 Oct 2011 00:00:00 GMT</pubDate>
		<title>Illustrating Darwin's Principle of Divergence</title>
		<link>http://www.trevorbedford.com/divergence/</link>
		<description><![CDATA[
		
		<p class="margin">
		<a href="/divergence/index.html"><img class="offset" src="/images/darwin_tree_closeup.png"></a>
		
		<p>
		In my work on flu I've been trying to build joint evolutionary and epidemiological models, where natural selection emerges dynamically from influenza strains competing for susceptible hosts.  In speaking on this, I found it useful to broaden the context a bit. Here, you can think very generally of genetic / ecological variants competing with one another in some sort of ecological space.  Variants that are close together in this space strongly compete, while more distant variants exist more-or-less independently.
		
		<p>
		This is exactly the model that Darwin used to illustrate the <i>Origin of Species</i>.  Here, I've described this idea in a bit more detail and built a <a href="/divergence/index.html">visualization of the model</a>.
		]]></description>
	</item>

	<item>
		<pubDate>Tue, 23 Aug 2011 00:00:00 GMT</pubDate>
		<title>Wright-Fisher population genetic simulation with selection</title>
		<link>http://www.trevorbedford.com/selsim/</link>
		<description><![CDATA[
		<p>I put the source code to the simulations in last month's tree topology paper <a href="/selsim/index.html">online</a>.
		]]></description>
	</item>

	<item>
		<pubDate>Mon, 25 Jul 2011 00:00:00 GMT</pubDate>
		<title>Strength and tempo of selection revealed in viral gene genealogies</title>
		<link>http://www.trevorbedford.com/pdfs/bedford-tree-topology-2011.pdf</link>
		<description><![CDATA[
		
		<p class="margin">
		<a href="/pdfs/bedford-tree-topology-2011.pdf"><img style="float:right; padding:8px;" src="/images/topology.png"></a>
		
		<p>
		I just had <a href="http://www.biomedcentral.com/1471-2148/11/220">a paper published online</a> in BMC Evolutionary Biology.  I've hosted a <a href="/pdfs/bedford-tree-topology-2011.pdf">compiled PDF</a> with all the figures inline, while their production office gets the official version together.  
		
		<p>This was a fun project.  It's essentially showing how the basic models of selection in population genetics play out in detailed phylogenies, where the passage of time is clearly evident.  I think a visual / phylogenetic approach really helps to understand the processes at work.  In this case, rather than describing an allele that reaches 100% frequency, the process of <i>fixation</i> describes a lineage that outcompetes its contemporaries and comes to be the progenitor of the entire future population.  A fundamental finding in population genetics is that selection reduces <i>effective population size</i>; that is, when there is heritable variation for fitness, the patterns of ancestry connecting individuals will resemble those of a smaller population.  Essentially, only a few fitter individuals have a chance of contributing their genetic legacy to the future population.  This paper explores the effects of selection on phylogeny shape, with particular attention to uncovering selective dynamics through time.
		]]></description>
	</item>

	<item>
		<pubDate>Tue, 14 Jun 2011 00:00:00 GMT</pubDate>
		<title>Matching R0 to cumulative prevalence in the H1N1 influenza pandemic</title>
		<link>http://www.plosone.org/article/info:doi/10.1371/journal.pone.0020358</link>
		<description><![CDATA[
		<p>It's a reassuring thing when science works the way it's supposed to and the pieces mesh together.  A paper by McLeish and colleagues just came out looking at <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0020358">sero-prevalence of pandemic H1N1 from 2009 to winter 2010 in Scotland</a>.  This comes from measuring antibodies specific to pandemic H1N1 in a large sample of the population.  McLeish et al. find that through March 2010, approximately 35% of people in Scotland were infected with pandemic H1N1 influenza.  This number apparently came as a surprise to the media and the story has been getting a lot of play.
		
		<p>However, what's cool is that this 35% number is exactly what we expected from our epidemiological models.  The basic reproductive number <i>R</i><sub>0</sub> measures the number of secondary infections expected to result from a given infection in a naive host population (a population with no previous exposure).  This number was quite small for the H1N1 pandemic, estimated <a href="http://www.sciencemag.org/content/324/5934/1557.abstract">at around 1.2 to 1.3 from the early upswing of the pandemic in 2009</a>.  The original paper detailing the SIR model (Kermack and McKendrick 1927) gives the formula: <img align="center" src="/images/finalsize.png">, where <i>Z</i> represents the final size of an epidemic (in terms of proportion of the population infected).  Numerically solving this for these values of <i>R</i><sub>0</sub> gives an expected final epidemic size of 31% to 42%.  This is amazingly on target.
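
		<p>The numerical solution is easy to sketch in Python.  I'm assuming here the standard form of the final size relation, <i>Z</i> = 1 - exp(-<i>R</i><sub>0</sub><i>Z</i>), and solving it by simple bisection; the function name <code>final_size</code> and the tolerance are illustrative choices rather than anything from the original calculation.

```python
import math


def final_size(r0, tol=1e-12):
    """Non-trivial root Z of Z = 1 - exp(-r0 * Z), found by bisection.

    Returns 0.0 when r0 <= 1, where no epidemic takes off.
    """
    if r0 <= 1.0:
        return 0.0
    lo, hi = 1e-9, 1.0  # the residual is positive near 0 and negative at 1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        residual = -math.expm1(-r0 * mid) - mid  # 1 - exp(-r0 * Z) - Z
        if residual > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for r0 in (1.2, 1.3):
    print(r0, round(final_size(r0), 2))
```

		<p>For <i>R</i><sub>0</sub> of 1.2 and 1.3 this should return about 0.31 and 0.42, the range quoted above.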
		]]></description>
	</item>

	<item>
		<pubDate>Mon, 04 Apr 2011 00:00:00 GMT</pubDate>
		<title>Counting parameters in a phylogeny</title>
		<link>http://www.trevorbedford.com/writeups/effective_parameters.html</link>
		<description><![CDATA[
		<p>
		I did a <a href="/writeups/effective_parameters.html">small write-up</a> on the scaling of the effective number of parameters in phylogenetic inference.  I was surprised by how nicely it worked out.  Basically, each additional taxon included in the model contributes one additional parameter.
		]]></description>
	</item>

	<item>
		<pubDate>Tue, 08 Mar 2011 00:00:00 GMT</pubDate>
		<title>Multi-strain multi-deme model with SIRS dynamics</title>
		<link>http://www.trevorbedford.com/spatialflu/</link>
		<description><![CDATA[
		<p>I finally got the simulation code associated with the <a href="/pdfs/bedford-flu-mig-2010.pdf">"Global migration dynamics"</a> paper up and online.  Hopefully this will prove useful to other groups working on the evolutionary dynamics of viral pathogens.  Source code and analysis can be found <a href="/spatialflu/index.html">here</a>.
		]]></description>
	</item>

	<item>
		<pubDate>Thu, 02 Sep 2010 00:00:00 GMT</pubDate>
		<title>Updating coaltrace to use Javascript</title>
		<link>http://www.trevorbedford.com/coaltracejs/</link>
		<description><![CDATA[
		<p>I've been trying to bring myself somewhat up to date with current web technologies.  I got my coalescent visualization <a href="/coaltracejs/index.html">ported over</a> to HTML5 / Javascript by using <a href="http://processingjs.org">Processing.js</a>.  If you're running Chrome or Safari, then this definitely makes for the better version.
		]]></description>
	</item>

	<item>
		<pubDate>Thu, 10 Jun 2010 00:00:00 GMT</pubDate>
		<title>Adaptive impact of the chimeric gene Quetzalcoatl in Drosophila melanogaster</title>
		<link>http://www.trevorbedford.com/pdfs/rogers-qtzl-2010.pdf</link>
		<description><![CDATA[
		
		<p class="margin">	
		<a href="/pdfs/rogers-qtzl-2010.pdf"><img class="offset" src="/images/qtzl_large.png"></a>

		<p>Another <a href="/pdfs/rogers-qtzl-2010.pdf">paper to share</a>.  Here, I helped with the population genetic side of things in <a href="http://www.oeb.harvard.edu/faculty/hartl/lab/people/rebekah.html">Rebekah Rogers's</a> analysis of a new gene in <i>Drosophila</i>.  Before I left the Hartl lab, I worked with Rebekah on a bioinformatic analysis of <em>chimeric</em> genes.  These are strange accidents of evolution where two functioning genes are spliced together to create a new gene with bits from both parental genes.  We found 14 such genes in <i>Drosophila</i>, one of which is the focus of this current work.  This new gene, which we're calling <i>Quetzalcoatl</i>, appears to be fantastic for the flies that possess it.
		
		<p>From my perspective, it's good to see the bioinformatic work pay dividends.
		]]></description>
	</item>

	<item>
		<pubDate>Fri, 28 May 2010 00:00:00 GMT</pubDate>
		<title>Reuters: "Who to blame for flu? Maybe the US, study finds"</title>
		<link>http://www.reuters.com/article/idUSN2713330520100527</link>
		<description><![CDATA[
		<p class="emph"><a href="http://www.reuters.com/article/idUSN2713330520100527">Reuters: "Who to blame for flu? Maybe the US, study finds."</a>
		
		<p>This is hilarious.  Of course it has to be someone's fault.  Wow.  
		
		<p>Ph.D. comics did a <a href="http://www.phdcomics.com/comics.php?f=1174">piece on this</a>.  Fortunately, the articles produced by science writers were actually pretty good: <a href="http://www.eurekalert.org/pub_releases/2010-05/uom-fdd052010.php">U of M News Service</a>, <a href="http://www.hhmi.org/news/pascual20100527.html">HHMI News</a>,  <a href="http://www.cidrap.umn.edu/cidrap/content/influenza/general/news/may2710strains.html">CIDRAP News</a>.
		]]></description>
	</item>

	<item>
		<pubDate>Thu, 27 May 2010 00:00:00 GMT</pubDate>
		<title>Global migration dynamics underlie evolution and persistence of human influenza A (H3N2)</title>
		<link>http://www.trevorbedford.com/pdfs/bedford-flu-mig-2010.pdf</link>
		<description><![CDATA[
		
		<p class="margin">			
		<a href="/pdfs/bedford-flu-mig-2010.pdf"><img class="offset" src="/images/flumap.png"></a>
		
		<p>
		Today, my paper on <a href="/pdfs/bedford-flu-mig-2010.pdf">migration patterns in the flu virus</a> was published in PLoS Pathogens.  This was fun work to do, requiring approaches from multiple disciplines.  While the basics of the migration model came from population genetics and coalescent theory, fitting this model to sequence data required a lot of heavy-lifting computation implemented by Peter Beerli in the program <a href="http://popgen.sc.fsu.edu/Migrate-n.html">Migrate</a>.  I originally wrote my program <a href="/pact/index.html">PACT</a> to deal with the enormous (2000+ tips) phylogenetic trees produced by this analysis.  Additionally, a lot of epidemiology went into making realistic simulations on which to hone the methods.
		
		<p>The common ancestor of all contemporaneous H3N2 flu can be traced back to a single infection occurring somewhere in the world approximately 2-5 years beforehand.  This infection, by luck and by virtue of its genotype, becomes the progenitor of the entire worldwide flu population.  The main goal of this analysis was to trace this progenitor lineage through time.  We found that this lineage existed primarily in China and Southeast Asia, but also, surprisingly, in the USA.  The occasional presence of this progenitor lineage in the USA has important public health implications.
		
		<p>I'm not terribly happy with the PLoS presentation.  Rather than keeping figures as line art, they were converted to low-quality bitmaps.  Also, I don't like the splitting of the supporting information into 10 different files.  So, in addition to the paper, I'm hosting high-quality PDFs of the figures and a single PDF of the entire supporting appendix.  Go <a href="/papers.html">here</a>.
		]]></description>
	</item>

	<item>
		<pubDate>Sat, 01 May 2010 00:00:00 GMT</pubDate>
		<title>Presentation on phyloseminar.org</title>
		<link>http://phyloseminar.org</link>
		<description><![CDATA[
		<p>About a week ago I gave an internet seminar on phylogenetics of the influenza virus over at <a href="http://phyloseminar.org">phyloseminar.org</a>.  I'm very happy someone stepped up to organize something like this (thanks Erick!).  You can watch me talk to my computer for an hour if you'd like, and the other seminars are really good as well.  
		]]></description>
	</item>

	<item>
		<pubDate>Thu, 11 Mar 2010 00:00:00 GMT</pubDate>
		<title>Population genetic simulation on a mutational landscape</title>
		<link>http://www.trevorbedford.com/poptrace/</link>
		<description><![CDATA[
		
		<p class="margin">	
		<a href="/poptrace/index.html"><img class="offset" src="/images/poptrace_large.jpg"></a>
		
		<p>
		I've posted another Processing app. This one is <a href="/poptrace/index.html">a basic population genetic simulation</a>.  There are multiple variants within a population of reproducing individuals.  Variants can mutate into other variants, and the frequencies of each change over time due to genetic drift and natural selection.  
		
		<p>There are a number of basic results that are immediately obvious here, such as the conditions required for persistent variation in the population and the conditions required for the evolution of mutational robustness. 
		]]></description>
	</item>

	<item>
		<pubDate>Wed, 13 Jan 2010 00:00:00 GMT</pubDate>
		<title>Compilation of most interesting Wikipedia articles</title>
		<link>http://www.trevorbedford.com/writeups/wikipedia.html</link>
		<description><![CDATA[
		<p>I often find myself getting lost in Wikipedia.  There are so many amazing things in this world.  More recently, I've started keeping track of some of the more interesting / outlandish articles I come across.  You can find the list <a href="/writeups/wikipedia.html">here</a>.
		]]></description>
	</item>

	<item>
		<pubDate>Mon, 12 Oct 2009 00:00:00 GMT</pubDate>
		<title>PACT: Posterior Analysis of Coalescent Trees</title>
		<link>http://www.trevorbedford.com/pact/</link>
		<description><![CDATA[
		
		<p class="margin">		
		<a href="/pact/index.html"><img class="offset" src="/images/single_tree.png"></a>
		
		<p>
		I wrote a program called <a href="/pact/index.html">PACT (Posterior Analysis of Coalescent Trees)</a> this spring to properly analyze the genealogical trees produced by <a href="http://popgen.scs.fsu.edu/Migrate-n.html">Migrate</a>.  I finally put in the extra effort to write documentation and make it easy for other people to use the software.  It's now available for <a href="/pact/index.html">download</a>.
		
		<p>I had originally wanted to estimate the relative contribution of various geographic regions to the evolution of the influenza virus.  Trees produced by Migrate contain an explicit description of which geographic region branches reside in.  It was just a matter of extracting, displaying and summarizing this information.  The program can do a variety of things beyond this, and hopefully should prove a useful accessory to any sort of coalescent inference.
		]]></description>
	</item>

	<item>
		<pubDate>Wed, 23 Sep 2009 00:00:00 GMT</pubDate>
		<title>Basic coalescent simulation with physics-based layout</title>
		<link>http://www.trevorbedford.com/coaltrace/</link>
		<description><![CDATA[
		
		<p class="margin">	
		<a href="/coaltrace/index.html"><img class="offset" src="/images/coaltrace_large.jpg"></a>
		
		<p>
		I've written a small Processing <a href="/coaltrace/index.html">app to visualize the genealogical process</a>.  I've seen a lot of evolutionary trees drawn quite nicely.  However, this is the first example that I've seen that presents trees in a dynamic fashion, showing how they evolve over time.  It also allows for interactivity.  For instance, you can see how adding more individuals to an evolving population causes their evolutionary tree to deepen.
		
		<p>Probably the best part about writing this in <a href="http://processing.org">Processing</a> is how nicely object-oriented things are.  Each individual in the simulation follows a simple physics simulation, repelling away from other individuals.  This takes care of layout without having to worry about high-level control.
		
		<p>I'm planning on writing more apps in this vein.  I think it might be a very useful framework for data analysis, rather than just simulation. 
		]]></description>
	</item>

	<item>
		<pubDate>Thu, 17 Sep 2009 00:00:00 GMT</pubDate>
		<title>Welcome</title>
		<link>http://www.trevorbedford.com</link>
		<description><![CDATA[
		<p>Welcome.  I created this site to host my work, both large and small.  Large projects have a natural home as articles  in scientific journals.  However, I'll often spend an afternoon following up on some small thing that's of passing interest to me.  I would like to keep a journal of these small creations.  Not planning a blog, but something involving a bit more novelty.  We'll see what happens...
		]]></description>
	</item>

	</channel></rss>

