Commit 553b68e

Use tskit 0.5.1 so we don't have to call SVG() explicitly
1 parent 7753653 commit 553b68e
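
The commit message refers to a change in how drawing output is displayed in notebooks. The following is a minimal sketch of the mechanism Jupyter uses, assuming that `draw_svg()` in tskit 0.5.1 returns a `str` subclass exposing the `_repr_svg_` hook. The class name `SvgText` and the helper `draw_square` are hypothetical and are not part of tskit.

```python
class SvgText(str):
    """Hypothetical stand-in for a string that Jupyter renders as an image."""

    def _repr_svg_(self):
        # IPython's rich display looks for this hook on the value being
        # displayed and renders the returned markup instead of repr().
        return str(self)


def draw_square(size=20):
    # Illustrative only: builds a tiny SVG document by hand.
    markup = (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{size}" height="{size}">'
        f'<rect width="{size}" height="{size}" fill="blue"/></svg>'
    )
    return SvgText(markup)


svg = draw_square()
```

Because such an object is still a `str`, it can be written to a file unchanged. In a notebook it is rendered only when it is the last expression of a cell or is passed to `display()`, which is presumably why the diff keeps `display(...)` in a cell that goes on to evaluate another expression.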

12 files changed: +130 -118 lines changed

analysing_tree_sequences.md

+3-3
Original file line numberDiff line numberDiff line change
@@ -224,10 +224,10 @@ tskit provides a flexible and powerful approach to computing such spectra.
224224
Suppose we have simulated the following tree sequence:
225225

226226
```{code-cell} ipython3
227-
from IPython.display import display, SVG
227+
from IPython.display import display
228228
ts = tskit.load("data/afs.trees")
229229
tree = ts.first()
230-
display(SVG(tree.draw_svg()))
230+
display(tree.draw_svg())
231231
ts.tables.sites
232232
```
233233

@@ -268,7 +268,7 @@ the *joint allele frequency spectra*.
268268
```{code-cell} ipython3
269269
node_colours = {0: "blue", 2: "blue", 3: "blue", 1: "green", 4: "green", 5: "green"}
270270
styles = [f".n{k} > .sym {{fill: {v}}}" for k, v in node_colours.items()]
271-
SVG(tree.draw_svg(style = "".join(styles)))
271+
tree.draw_svg(style = "".join(styles))
272272
```
273273

274274
Here we've marked the samples as either blue or green (we can imagine

analysing_trees.md

+46-25
@@ -103,18 +103,21 @@ places mutations ("characters" in phylogenetic terminology) on a given tree.
103103
## Tree traversals
104104

105105
Given a single {class}`Tree`, traversals in various orders are possible using the
106-
{meth}`~Tree.nodes` iterator. For example, in the following tree we can visit the
107-
nodes in different orders:
106+
{meth}`~Tree.nodes` iterator. Take the following tree:
108107

109108

110109
```{code-cell} ipython3
111110
import tskit
112-
from IPython.display import SVG, display
113111
114112
ts = tskit.load("data/tree_traversals.trees")
115113
tree = ts.first()
116-
display(SVG(tree.draw_svg()))
114+
tree.draw_svg()
115+
```
116+
117+
We can visit the nodes in different orders by providing an `order` parameter to
118+
the {meth}`Tree.nodes` iterator:
117119

120+
```{code-cell} ipython3
118121
for order in ["preorder", "inorder", "postorder"]:
119122
print(f"{order}:\t", list(tree.nodes(order=order)))
120123
```
@@ -225,7 +228,7 @@ different time points:
225228
```{code-cell} ipython3
226229
ts = tskit.load("data/different_time_samples.trees")
227230
tree = ts.first()
228-
SVG(tree.draw_svg(y_axis=True, time_scale="rank"))
231+
tree.draw_svg(y_axis=True, time_scale="rank")
229232
```
230233

231234
The generation times for these nodes are as follows:
@@ -280,42 +283,55 @@ print(dict(zip(range(3), nearest_neighbor_of)))
280283

281284
## Parsimony
282285

283-
The {meth}`Tree.map_mutations` method finds a parsimonious explanation for a
284-
set of discrete character observations on the samples in a tree using classical
285-
phylogenetic algorithms.
286+
Take a site on the following tree with three allelic states, where the
287+
samples are coloured by the allele they possess, but where we don't know
288+
the position of the mutations that caused this variation:
286289

287290
```{code-cell} ipython3
288291
tree = tskit.load("data/parsimony_simple.trees").first()
289292
alleles = ["red", "blue", "green"]
290293
genotypes = [0, 0, 0, 0, 1, 2]
291294
styles = [f".n{j} > .sym {{fill: {alleles[g]}}}" for j, g in enumerate(genotypes)]
292-
display(SVG(tree.draw_svg(style="".join(styles))))
295+
tree.draw_svg(style="".join(styles))
296+
```
293297

298+
The {meth}`Tree.map_mutations` method finds a parsimonious explanation for a
299+
set of discrete character observations on the samples in a tree using classical
300+
phylogenetic algorithms:
301+
302+
```{code-cell} ipython3
294303
ancestral_state, mutations = tree.map_mutations(genotypes, alleles)
295304
print("Ancestral state = ", ancestral_state)
296305
for mut in mutations:
297306
print(f"Mutation: node = {mut.node} derived_state = {mut.derived_state}")
298307
```
299308

300-
So, the algorithm has concluded, quite reasonably, that the most parsimonious
309+
In this case, the algorithm has concluded, quite reasonably, that the most parsimonious
301310
description of this state is that the ancestral state is red and there was
302311
a mutation to blue and green over nodes 4 and 5.
303312

304313
### Building tables
305314

306-
One of the main uses of {meth}`Tree.map_mutations` is to position mutations on a tree
307-
to encode observed data. In the following example we show how a set
308-
of tables can be updated using the {ref}`Tables API<tskit:sec_tables_api>`; here we
309-
infer the location of mutations in an simulated tree sequence of one tree,
310-
given a set of site positions with their genotypes and allelic states:
315+
Below we show how a set of tables can be updated using the
316+
{ref}`Tables API<tskit:sec_tables_api>`, taking advantage of the
317+
{meth}`Tree.map_mutations` method to identify parsimonious positions
318+
for mutations on a tree. Here's the tree we'll use:
311319

312320
```{code-cell} ipython3
313321
import pickle
314322
ts = tskit.load("data/parsimony_map.trees")
323+
ts.draw_svg(size=(500, 300), time_scale="rank")
324+
```
325+
326+
Now we can modify the tables by adding mutations. To find the location of mutations,
327+
we infer them from some observed data (some site positions with associated genotypes
328+
and allelic states, in the conventional {class}`tskit encoding <Variant>`):
329+
330+
331+
```{code-cell} ipython3
315332
with open("data/parsimony_map.pickle", "rb") as file:
316333
data = pickle.load(file) # Load saved variant data from a file
317-
display(SVG(ts.draw_svg(size=(500, 300), time_scale="rank")))
318-
print("Variant data: pos, genotypes & alleles as described by the ts.variants() iterator:")
334+
print("Variant data: each site has a position, allele list, and genotypes array:")
319335
for i, v in enumerate(data):
320336
print(f"Site {i} (pos {v['pos']:7.4f}): alleles: {v['alleles']}, genotypes: {v['genotypes']}")
321337
print()
@@ -333,11 +349,13 @@ for variant in data:
333349
parent += parent_offset
334350
mut_id = tables.mutations.add_row(
335351
site_id, node=mut.node, parent=parent, derived_state=mut.derived_state)
336-
info += f", and places mutation {mut.id} to {mut.derived_state} above node {mut.node}"
352+
info += f", and places mutation {mut_id} to {mut.derived_state} above node {mut.node}"
337353
print(info)
338354
new_ts = tables.tree_sequence()
339355
```
340356

357+
And here are the parsimoniously positioned mutations on the tree:
358+
341359
```{code-cell} ipython3
342360
mut_labels = {} # An array of labels for the mutations
343361
for mut in new_ts.mutations(): # Make pretty labels showing the change in state
@@ -346,34 +364,37 @@ for mut in new_ts.mutations(): # Make pretty labels showing the change in state
346364
prev = new_ts.mutation(mut.parent).derived_state if older_mut else site.ancestral_state
347365
mut_labels[site.id] = f"{mut.id}: {prev}→{mut.derived_state}"
348366
349-
display(SVG(new_ts.draw_svg(size=(500, 300), mutation_labels=mut_labels, time_scale="rank")))
367+
new_ts.draw_svg(size=(500, 300), mutation_labels=mut_labels, time_scale="rank")
350368
```
351369

352370

353371
### Parsimony and missing data
354372

355-
The Hartigan parsimony algorithm in {meth}`Tree.map_mutations` can also take missing data
373+
We can also take missing data
356374
into account when finding a set of parsimonious state transitions. We do this by
357375
specifying the special value {data}`tskit.MISSING_DATA` (-1) as the state, which is
358376
treated by the algorithm as "could be anything".
359377

360-
For example, here we state that sample 0 is missing, and use the colour white to indicate
361-
this:
378+
For example, here we state that sample 0 is missing, indicated by the colour white:
362379

363380
```{code-cell} ipython3
364381
tree = tskit.load("data/parsimony_simple.trees").first()
365382
alleles = ["red", "blue", "green", "white"]
366383
genotypes = [tskit.MISSING_DATA, 0, 0, 0, 1, 2]
367384
styles = [f".n{j} > .sym {{fill: {alleles[g]}}}" for j, g in enumerate(genotypes)]
368-
display(SVG(tree.draw_svg(style="".join(styles))))
385+
tree.draw_svg(style="".join(styles))
386+
```
387+
388+
Now we run the {meth}`Tree.map_mutations` method, which applies the Hartigan parsimony
389+
algorithm:
369390

391+
```{code-cell} ipython3
370392
ancestral_state, mutations = tree.map_mutations(genotypes, alleles)
371393
print("Ancestral state = ", ancestral_state)
372394
for mut in mutations:
373395
print(f"Mutation: node = {mut.node} derived_state = {mut.derived_state}")
374396
```
375397

376-
377398
The algorithm decided, again, quite reasonably, that the most parsimonious explanation
378399
for the input data is the same as before. Thus, if we used this information to fill
379400
out mutation table as above, we would impute the missing value for 0 as red.
@@ -389,7 +410,7 @@ ancestral_state, mutations = tree.map_mutations(genotypes, alleles)
389410
print("Ancestral state = ", ancestral_state)
390411
for mut in mutations:
391412
print(f"Mutation: node = {mut.node} derived_state = {mut.derived_state}")
392-
SVG(tree.draw(node_colours=node_colours))
413+
tree.draw(node_colours=node_colours)
393414
```
394415

395416

bottlenecks.md

+1-1
@@ -45,7 +45,7 @@ bottB = 2
4545
4646
ts = run_bott_sims(12, time=bottT, strength=bottB)
4747
48-
SVG(ts.draw_svg(y_axis=True, size=(400, 400)))
48+
ts.draw_svg(y_axis=True, size=(400, 400))
4949
```
5050

5151
The genealogy shows several simultaneous coalescence events at the time of the bottleneck

completing-forward-sims.md

+4-6
@@ -34,7 +34,6 @@ import tskit
3434
import msprime
3535
import random
3636
import numpy as np
37-
from IPython.display import SVG
3837
3938
4039
def wright_fisher(N, T, L=100, random_seed=None):
@@ -86,7 +85,7 @@ for 5 generations, and print out the resulting trees:
8685
num_loci = 2
8786
N = 10
8887
wf_ts = wright_fisher(N, 5, L=num_loci, random_seed=3)
89-
SVG(wf_ts.draw_svg())
88+
wf_ts.draw_svg()
9089
```
9190

9291
Because our Wright Fisher simulation ran for only 5 generations, there has not
@@ -136,7 +135,7 @@ coalesced_ts = msprime.sim_ancestry(
136135
recombination_rate=1 / num_loci,
137136
ploidy=1,
138137
random_seed=7)
139-
SVG(coalesced_ts.draw_svg())
138+
coalesced_ts.draw_svg()
140139
```
141140

142141
The trees have fully coalesced and we've successfully combined a forwards-time
@@ -192,9 +191,8 @@ computed tree sequence which is easily done using the
192191
{meth}`simplify <tskit.TreeSequence.simplify>` method:
193192

194193
```{code-cell} ipython3
195-
196-
final_ts = coalesced_ts.simplify()
197-
SVG(coalesced_ts.draw_svg())
194+
final_ts = coalesced_ts.simplify()
195+
coalesced_ts.draw_svg()
198196
```
199197

200198
This final tree sequence is topologically identical to the original tree sequence,

demography.md

+4-7
@@ -25,7 +25,6 @@ models of demography and population history with a simple Python API.
2525
```{code-cell}
2626
import msprime
2727
import numpy as np
28-
from IPython.display import SVG
2928
```
3029

3130
## Population structure
@@ -141,11 +140,9 @@ with nodes coloured by population label using SVG:
141140

142141
```{code-cell}
143142
colour_map = {0:"red", 1:"blue"}
144-
node_colours = {u.id: colour_map[u.population] for u in ts.nodes()}
145-
for tree in ts.trees():
146-
print("Tree on interval:", tree.interval)
147-
# The code below will only work in a Jupyter notebook with SVG output enabled.
148-
display(SVG(tree.draw(node_colours=node_colours)))
143+
styles = [f".node.p{p} > .sym {{fill: {col} }}" for p, col in colour_map.items()]
144+
# The code below will only work in a Jupyter notebook with SVG output enabled.
145+
ts.draw_svg(style="".join(styles))
149146
```
150147

151148
More coalescences are happening in population 1 than population 0.
@@ -394,7 +391,7 @@ The effect of the census is to add nodes onto each branch of the tree sequence a
394391
```{code-cell}
395392
print("IDs of census nodes:")
396393
print([u.id for u in ts.nodes() if u.flags==msprime.NODE_IS_CEN_EVENT])
397-
SVG(ts.draw_svg())
394+
ts.draw_svg()
398395
```
399396

400397
By extracting these node IDs, you can perform further analyses using the ancestral haplotypes.

getting_started.md

+5-6
@@ -172,13 +172,11 @@ $5\ 000\ 000$ --- the position of the sweep --- drawn using the
172172
{meth}`Tree.draw_svg` method.
173173

174174
```{code-cell} ipython3
175-
from IPython.display import SVG
176-
177175
swept_tree = ts.at(5_000_000) # or you can get e.g. the nth tree using ts.at_index(n)
178176
intvl = swept_tree.interval
179177
print(f"Tree number {swept_tree.index}, which runs from position {intvl.left} to {intvl.right}:")
180178
# Draw it at a wide size, to make room for all 40 tips
181-
SVG(swept_tree.draw_svg(size=(1000, 200)))
179+
swept_tree.draw_svg(size=(1000, 200))
182180
```
183181
:::{margin}
184182
The {ref}`visualization tutorial <sec_tskit_viz>` gives more drawing possibilities
@@ -203,7 +201,7 @@ more than one tree: either the entire tree sequence, or
203201
```{code-cell} ipython3
204202
reduced_ts = ts.simplify([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) # simplify to the first 10 samples
205203
print("Genealogy of the first 10 samples for the first 5kb of the genome")
206-
display(SVG(reduced_ts.draw_svg(x_lim=(0, 5000))))
204+
reduced_ts.draw_svg(x_lim=(0, 5000))
207205
```
208206

209207
These are much more standard-looking coalescent trees, with far longer branches higher
@@ -514,8 +512,9 @@ in rough order of importance:
514512
* {meth}`~TreeSequence.keep_intervals()` (or its complement,
515513
{meth}`~TreeSequence.delete_intervals()`) removes genetic information from
516514
specific regions of the genome
517-
* {meth}`~TreeSequence.draw_svg()` plots tree sequences (and {meth}`Tree.draw_svg()`
518-
plots trees)
515+
* {meth}`~TreeSequence.draw_svg()` returns an SVG representation of a tree sequence
516+
(and plots it if in a Jupyter notebook). Similarly, {meth}`Tree.draw_svg()`
517+
plots individual trees.
519518
* {meth}`~TreeSequence.at()` returns a tree at a particular genomic position
520519
(but using {meth}`~TreeSequence.trees` is usually preferable)
521520
* Various population genetic statistics can be calculated using methods on a tree

introgression.md

+3-3
@@ -36,7 +36,6 @@ import collections
3636
import matplotlib.pyplot as plt
3737
import msprime
3838
import numpy as np
39-
from IPython.display import SVG
4039
4140
def run_simulation(sequence_length, random_seed=None):
4241
time_units = 1000 / 25 # Conversion factor for kya to generations
@@ -260,7 +259,7 @@ css += "".join([
260259
for k, v in enumerate(mutation_population)])
261260
y_ticks = {0: "0", 30: "30", 50: "Introgress", 70: "Eur origin", 300: "Nea origin", 1000: "1000"}
262261
y_ticks = {y * time_units: lab for y, lab in y_ticks.items()}
263-
SVG(ts.draw_svg(
262+
ts.draw_svg(
264263
size=(1200, 500),
265264
x_lim=(0, 25_000),
266265
time_scale="log_time",
@@ -270,7 +269,8 @@ SVG(ts.draw_svg(
270269
x_label="Genomic position (bp)",
271270
y_ticks=y_ticks,
272271
y_gridlines=True,
273-
style=css))
272+
style=css,
273+
)
274274
```
275275

276276
The depth of the trees indicates that most coalescences occur well before the origin of

requirements.txt

+1-1
@@ -7,5 +7,5 @@ scikit-allel
77
tqdm
88
numpy
99
networkx
10-
tskit>=0.5
10+
tskit>=0.5.1
1111
msprime>=1.0
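
To confirm at runtime that an environment satisfies the new `tskit>=0.5.1` pin, a sketch such as the following may help. It uses only the standard library. The helper names `version_tuple` and `meets_minimum` are hypothetical, and the parsing assumes plain numeric release versions, so pre-release suffixes are ignored rather than ordered correctly.

```python
from importlib.metadata import PackageNotFoundError, version


def version_tuple(text):
    # Keep only the leading numeric components, e.g. "0.5.1" -> (0, 5, 1).
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(package, minimum):
    # Returns False when the package is not installed at all.
    try:
        installed = version(package)
    except PackageNotFoundError:
        return False
    return version_tuple(installed) >= version_tuple(minimum)


if __name__ == "__main__":
    print("tskit OK:", meets_minimum("tskit", "0.5.1"))
```

Comparing tuples rather than raw strings avoids the usual pitfall where `"0.10"` sorts before `"0.5"` lexically.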

tables_and_editing.md

+5-6
@@ -166,8 +166,7 @@ ts = tskit.load("data/tables_example.trees")
166166

167167
```{code-cell} ipython3
168168
:"tags": ["hide-input"]
169-
from IPython.display import SVG
170-
SVG(ts.draw_svg(y_axis=True))
169+
ts.draw_svg(y_axis=True)
171170
```
172171

173172
Ancestral recombination events have produced three different trees
@@ -238,7 +237,7 @@ for mut in ts.mutations(): # This entire loop is just to make pretty labels
238237
older_mut = mut.parent >= 0 # is there an older mutation at the same position?
239238
prev = ts.mutation(mut.parent).derived_state if older_mut else site.ancestral_state
240239
mut_labels[mut.id] = "{}→{} @{:g}".format(prev, mut.derived_state, site.position)
241-
SVG(ts.draw_svg(y_axis=True, mutation_labels=mut_labels))
240+
ts.draw_svg(y_axis=True, mutation_labels=mut_labels)
242241
```
243242

244243
There are four mutations in the depiction above,
@@ -553,7 +552,7 @@ Now that it had been turned into a tree sequence, we can plot it:
553552

554553
```{code-cell} ipython3
555554
# Plot without mutation labels, for clarity
556-
SVG(altered_ts.draw_svg(y_axis=True, y_gridlines=True, mutation_labels={}))
555+
altered_ts.draw_svg(y_axis=True, y_gridlines=True, mutation_labels={})
557556
```
558557

559558
You can see that the new tree sequence has been modified as expected: there is a new
@@ -570,7 +569,7 @@ understand how tables work. We'll build an extremely simple tree sequence, consi
570569
a single tree that looks like this:
571570

572571
```{code-cell} ipython3
573-
SVG(tskit.load("data/construction_example.trees").draw_svg(y_axis=True))
572+
tskit.load("data/construction_example.trees").draw_svg(y_axis=True)
574573
```
575574

576575
Starting with an empty set of tables, we can fill, say, the node information by using
@@ -612,7 +611,7 @@ site_id = tables.sites.add_row(position=500.0, ancestral_state='0')
612611
tables.mutations.add_row(site=site_id, node=2, derived_state='1')
613612
ts = tables.tree_sequence()
614613
print("A hand-built tree sequence!")
615-
SVG(ts.draw_svg(y_axis=True))
614+
ts.draw_svg(y_axis=True)
616615
```
617616

618617
:::{note}
