|
109 | 109 | "\n", |
110 | 110 | "The [_tskit_arg_visualizer_](https://github.com/kitchensjn/tskit_arg_visualizer) software uses the [D3js library](https://d3js.org) to visualise ARGs and other tree sequences interactively, in a browser or Jupyter notebook. As is conventional, the oldest nodes are drawn at the top, with the youngest, usually at time 0, at the bottom.\n", |
111 | 111 | "\n", |
112 | | - "It works by creating a new [`D3ARG`](https://github.com/kitchensjn/tskit_arg_visualizer/blob/main/docs/tutorial.md#what-is-a-d3arg) object from the _tskit_ ARG. This `D3ARG` object can then be plotted using `.draw()`.\n", |
| 112 | + "The visualiser creates a [`D3ARG`](https://github.com/kitchensjn/tskit_arg_visualizer/blob/main/docs/tutorial.md#what-is-a-d3arg) object from the _tskit_ ARG. This object can then be plotted using `.draw()`.\n", |
113 | 113 | "\n", |
114 | 114 | "<div class=\"alert alert-block alert-info\"><b>Note:</b> You'll see that some nodes in this plot have two IDs. Don't worry about this: as we'll see later, it's because the simulator has represented recombination using two nodes, which have been overlaid in the visualiser.</div>\n", |
115 | 115 | "\n", |
|
212 | 212 | "source": [ |
213 | 213 | "### Visualising local trees\n", |
214 | 214 | "\n", |
215 | | - "By default, `tskit` displays each local tree as a summary table, as above. To draw the tree out, you can use the [`.draw_svg()`](https://tskit.dev/tutorials/viz.html#svg-format) method, suitable for small trees of tens or hundreds of nodes each." |
| 215 | + "By default, `tskit` displays each local tree as a summary table, as above. To draw the tree, you can use the [`.draw_svg()`](https://tskit.dev/tutorials/viz.html#svg-format) method, suitable for small trees of tens or hundreds of nodes each." |
216 | 216 | ] |
217 | 217 | }, |
218 | 218 | { |
|
294 | 294 | "source": [ |
295 | 295 | "## Coalescent and non-coalescent regions\n", |
296 | 296 | "\n", |
297 | | - "Looking at the tree-by-tree plot, it should be clear that some of the nodes in a local tree have one child in some trees, and two children in others. There are even some nodes that have only one child in every tree in which they appear (e.g. node 26). We can classify nodes into\n", |
| 297 | + "Looking at the tree-by-tree plot, it should be clear that some of the nodes in a local tree have one child in some trees, and two children in others. There are even some nodes that have only one child in every tree in which they appear (e.g. node 26). We can classify nodes into:\n", |
298 | 298 | "\n", |
299 | 299 | "0. **non-coalescent**, sometimes called _always unary_ (i.e. one child in all local trees, e.g. node 26)\n", |
300 | 300 | "1. **part-coalescent**, sometimes called _locally unary_ (i.e. one child in some local trees, coalescent in others, e.g. node 18)\n", |
|
403 | 403 | "id": "f0876be3-5bfb-42ee-884d-c844b4c19743", |
404 | 404 | "metadata": {}, |
405 | 405 | "source": [ |
406 | | - "The ARG was actually simulated using a model of human evolution that reflects the Out of Africa event. As well as having a value denoting the <code>individual</code>, each node also has a value indicating a <code>population</code> it belongs to.\n", |
| 406 | + "The ARG was actually simulated using a model of human evolution that reflects the Out of Africa event. As well as having a value denoting the <code>individual</code>, each node also has a value indicating a <code>population</code> to which it belongs.\n", |
407 | 407 | "\n", |
408 | 408 | "<dl class=\"exercise\"><dt>Exercise E</dt>\n", |
409 | 409 | " <dd>Change the code above to colour by <code>node.population</code> ID rather than <code>node.individual</code> ID. You could also stop colouring the recombination nodes as black if you like.</dd>\n", |
|
853 | 853 | "source": [ |
854 | 854 | "It should be reasonably obvious how this works. E.g. edge 0 connects parent node 10 to child node 6 in the part of the genome that spans 0 to 930 bp. For further information see [https://tskit.dev/tskit/docs/stable/data-model.html](https://tskit.dev/tskit/docs/stable/data-model.html), and for a tutorial approach, see [https://tskit.dev/tutorials/tables_and_editing.html](https://tskit.dev/tutorials/tables_and_editing.html).\n", |
855 | 855 | "\n", |
856 | | - "As a brief introduction, you can access particular edges, nodes, sites, etc. as Python objects using `arg.edge(i)`, `arg.node(i)`, `arg.site(i)`, and so on." |
| 856 | + "We previously used [`arg.nodes()`](https://tskit.dev/tskit/docs/stable/python-api.html#tskit.TreeSequence.nodes), [`arg.individuals()`](https://tskit.dev/tskit/docs/stable/python-api.html#tskit.TreeSequence.individuals), and [`arg.populations()`](https://tskit.dev/tskit/docs/stable/python-api.html#tskit.TreeSequence.populations) to return Python objects, created by iterating over all the rows in a table. Similarly, methods exist for [`arg.edges()`](https://tskit.dev/tskit/docs/stable/python-api.html#tskit.TreeSequence.edges), [`arg.sites()`](https://tskit.dev/tskit/docs/stable/python-api.html#tskit.TreeSequence.sites), and [`arg.mutations()`](https://tskit.dev/tskit/docs/stable/python-api.html#tskit.TreeSequence.mutations). To access a specific edge, node, site, etc. as a Python object you can also use [`arg.edge(i)`](https://tskit.dev/tskit/docs/stable/python-api.html#tskit.TreeSequence.edge), [`arg.node(i)`](https://tskit.dev/tskit/docs/stable/python-api.html#tskit.TreeSequence.node), [`arg.site(i)`](https://tskit.dev/tskit/docs/stable/python-api.html#tskit.TreeSequence.site), and so on. " |
857 | 857 | ] |
858 | 858 | }, |
859 | 859 | { |
|
915 | 915 | "source": [ |
916 | 916 | "#### High performance data access\n", |
917 | 917 | "\n", |
918 | | - "However, the most performant way to access the underlying data is to use the [efficient column accessors](https://tskit.dev/tskit/docs/stable/python-api.html#efficient-table-column-access), which provide _numpy_ arrays that are a direct view into memory. For example, to find all the site positions along the genome, you can use `arg.tables.sites.position` (or the shortcut `arg.sites_position`). This is particularly relevant when dealing with ARGs containing large tables (e.g. millions of rows)." |
| 918 | + "Using Python objects is convenient, but can be inefficient for large ARGs. The most performant way to access the underlying data is to use the [efficient column accessors](https://tskit.dev/tskit/docs/stable/python-api.html#efficient-table-column-access), which provide _numpy_ arrays that are a direct view into memory. For example, to find all the site positions along the genome, you can use `arg.tables.sites.position` (or the shortcut [`arg.sites_position`](https://tskit.dev/tskit/docs/stable/python-api.html#tskit.TreeSequence.sites_position)). This is particularly relevant when dealing with ARGs containing large tables (e.g. millions of rows)." |
919 | 919 | ] |
920 | 920 | }, |
921 | 921 | { |
|
977 | 977 | "source": [ |
978 | 978 | "#### High performance trees\n", |
979 | 979 | "\n", |
980 | | - "There are also [fast array access methods](https://tskit.dev/tskit/docs/stable/python-api.html#array-access) for local trees in a tree sequence. \n" |
| 980 | + "Local trees in a tree sequence are not stored in a table, but iteratively constructed on the fly using the `arg.trees()` method. However, a tree object has a set of [fast array access methods](https://tskit.dev/tskit/docs/stable/python-api.html#array-access) to provide efficient access to tree-based information, such as the [parents of nodes](https://tskit.dev/tskit/docs/stable/python-api.html#tskit.Tree.parent_array) in a tree, the [number of children](https://tskit.dev/tskit/docs/stable/python-api.html#tskit.Tree.num_children_array) of tree nodes, or the [edge above each node](https://tskit.dev/tskit/docs/stable/python-api.html#tskit.Tree.edge_array).\n" |
981 | 981 | ] |
982 | 982 | }, |
983 | 983 | { |
|
987 | 987 | "metadata": {}, |
988 | 988 | "outputs": [], |
989 | 989 | "source": [ |
990 | | - "tree = arg.first()\n", |
| 990 | + "tree = arg.first()\n", |
991 | 991 | "\n", |
992 | 992 | "# Simple access to the parent of node 0 in the tree\n", |
993 | 993 | "print(\"Parent of node 0 in the first tree is\", tree.parent(0))\n", |
|