Skip to content

Adding variants iterator and other functions#132

Open
LynxJinyangii wants to merge 13 commits intoHighlanderLab:mainfrom
LynxJinyangii:add-multiple-functions-on-pr-131
Open

Adding variants iterator and other functions#132
LynxJinyangii wants to merge 13 commits intoHighlanderLab:mainfrom
LynxJinyangii:add-multiple-functions-on-pr-131

Conversation

@LynxJinyangii
Copy link
Copy Markdown
Collaborator

No description provided.

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 1, 2026

Codecov Report

❌ Patch coverage is 99.83740% with 1 line in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
RcppTskit/src/RcppTskit.cpp 99.69% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

@gregorgorjanc gregorgorjanc changed the title Add multiple functions on pr 131 Adding variants iterator and other functions Apr 10, 2026
@gregorgorjanc gregorgorjanc force-pushed the add-multiple-functions-on-pr-131 branch from c443f64 to c309e3a Compare April 10, 2026 10:24
@gregorgorjanc
Copy link
Copy Markdown
Member

I have rebased this to the most recent main branch that contains previous PRs.

@gregorgorjanc
Copy link
Copy Markdown
Member

gregorgorjanc commented Apr 11, 2026

There is quite a bit to review in this PR, so best to make a list of new things! @LynxJinyangii I would appreciate such communication so it's easier to follow your work and thoughts etc. Best to err on the slight "over communication" that "no communication";) Even just copying NEWS.md entries or linking to them in the PR message would be fab.

From what I gather, this is the list of additions we should review:

  • rtsk_table_collection_sort and TableCollection$sort related to Do we have to add TableCollection$sort() method? #99
  • rtsk_treeseq_get_samples and TreeSeqeunce$samples (do we need samples for TableCollection then too? - best to discuss; edit - there is no such method for TableCollection)
  • rtsk_node_table_get_row and TableCollection$node_table_get_row (do we need node_table_get_row for TreeSequence too? - best to discuss)
  • rtsk_const_tsk_no_check_integrity
  • rtsk_variant_iterator_init, rtsk_variant_iterator_next, and TreeSequence$variants

Comment thread RcppTskit/R/Class-TableCollection.R Outdated
@gregorgorjanc
Copy link
Copy Markdown
Member

@LynxJinyangii I have pushed changes to this branch/PR (rebased to main, which required sorting some merge conflicts; and come comments on sort), so best if you first pull before you start changes on your end.

@gregorgorjanc
Copy link
Copy Markdown
Member

I have done some polishing of the code, docs, and terms in my recent commit&push. @LynxJinyangii make sure you git pull on your end so that you get these changes. I have reviewed all but variant/iteration stuff - that will take me more time - but you can work on the comments I provided above in the meantime and then we iterate.

@gregorgorjanc
Copy link
Copy Markdown
Member

@bryo-han do you have capacity to go over this PR while I focus on teaching/course?

Copy link
Copy Markdown
Member

@gregorgorjanc gregorgorjanc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LynxJinyangii I have tagged you in a bunch of places where I don't see changes when I last reviewed certain parts. Remember to click on the > or v symbol next to file names to fold and unfold the file so you can see comments as you scroll through the files. Also, when the diff is large GitHub will not render the diff so you have to click on the file display to see diffs and comment sections.

Comment thread RcppTskit/R/Class-TableCollection.R Outdated
Comment thread RcppTskit/R/Class-TableCollection.R Outdated
#' @description Get sample node IDs in this tree sequence.
#' @return An integer vector with sample node IDs (0-based).
#' @details See the \code{tskit Python} equivalent at
#' \url{https://tskit.dev/tskit/docs/latest/python-api.html#tskit.TreeSequence.samples}.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Python method has population, population_id, and time arguments, so if we implement the samples method in R we should mirror what the Python method does.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this implementation was driven by the C method, which only accepts TreeSequence, but let's ensure R API matches Python API - looking at the source of https://tskit.dev/tskit/docs/latest/_modules/tskit/trees.html#TreeSequence.samples I see that these args are all handled on the Python side (first all sample IDs are obtained by calling the low-level C function and then subsetting is done in Python - suggesting we should handle these also on R side).

Looking at the docs I now also see that population_id is deprecated, so best skipping it in our implementation.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LynxJinyangii this one also has TODO that is not addressed

}

// PUBLIC, wrapper for tsk_treeseq_get_samples
// @describeIn rtsk_treeseq_summary Get sample node IDs.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any documentation about rtsk_treeseq_get_samples in rtsk_treeseq_summary, so we should address this. The C method is described at https://tskit.dev/tskit/docs/latest/c-api.html#c.tsk_treeseq_get_samples. Since it only accepts tsk_treeseq_t (tree sequence type in C) it does make sense to describe rtsk_treeseq_get_samples in rtsk_treeseq_summary, but we should follow the convention established there for other methods (provide a link etc.).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread RcppTskit/R/RcppExports.R
.Call(`_RcppTskit_tskit_version`)
}

rtsk_const_tsk_no_check_integrity <- function() {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since Python sort method does not have options I assume we do not need rtsk_const_tsk_no_check_integrity at all?

// @title Add a row to the migration table in a table collection
// @param tc an external pointer to table collection as a
// \code{tsk_table_collection_t} object.
// @param left numeric scalar left coordinate for the new migration.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mention inclusive/exclusive

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// @title Add a row to the edge table in a table collection
// @param tc an external pointer to table collection as a
// \code{tsk_table_collection_t} object.
// @param left numeric scalar left coordinate for the new edge.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mention inclusive/exclusive

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// (n_before <- RcppTskit:::rtsk_table_collection_get_num_migrations(tc_xptr))
// (m_before <-
// RcppTskit:::rtsk_table_collection_metadata_length(tc_xptr)$migrations) new_id
// <- RcppTskit:::rtsk_migration_table_add_row(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in R side - let's do a meaningful example

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread RcppTskit/NEWS.md
`pointer` to `xptr`.
- Ensured `TableCollection$tree_sequence()` matches `tskit Python` API:
it now builds indexes on the `TableCollection`, if indexes are not present.
- Improved table row getter APIs and documentation for better `tskit C`/Python
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LynxJinyangii I think we can remove 78-80 since getters have been introduced as part of 0.3.0 anyway.

)
},

# TODO: how should we handle useR's experience with numeric&integer in getters?
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LynxJinyangii remove these TODOs please - we have now decided we support both integers and numerics and that we will work with 0-based indexing (I assume)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants