Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -50,4 +50,4 @@ Encoding: UTF-8
Language: en-US
LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.2
RoxygenNote: 7.3.3
2 changes: 2 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
# dunlin 0.1.11.9000

* New `mode` argument in `log_filter` determining whether all tables should be filtered to retain only the rows whose identifiers are present in the filtered data set selected with the `table` argument. (Previously, the behavior was triggered automatically when the `table` name was `adsl`)

# dunlin 0.1.11

* Added `subject_level_flag()` function.
Expand Down
37 changes: 27 additions & 10 deletions R/filter.R
Original file line number Diff line number Diff line change
Expand Up @@ -5,11 +5,11 @@
#' @param ... further arguments to be passed to or from other methods.
#' @returns a `data.frame` or `list` of `data.frame` filtered for the provided conditions.
#' @details
Comment on lines 5 to 7

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#' @param ... further arguments to be passed to or from other methods.
#' @returns a `data.frame` or `list` of `data.frame` filtered for the provided conditions.
#' @details
#' @param ... further arguments to be passed to or from other methods.
#'
#' @returns a `data.frame` or `list` of `data.frame` filtered for the provided conditions.
#'
#' @details

#' `log_filter` will filter the data/named list of data according to the `condition`.
#' `log_filter` will filter the `data.frame` /named list of `data.frame` according to the `condition`.
#' All the variables in `condition` must exist in the data (as variables) or in the parent
#' frame(e.g., in global environment).
#' For named list of data, if `ADSL` is available, `log_filter` will also try to subset all
#' other datasets with `USUBJID`.
#' For a named list of `data.frame`, set `mode = "all"`` to filter other tables by keys retained
#' in table (using by), or `mode = "unique"` to leave other tables unchanged.
#' @export
Comment on lines +12 to 13
Copy link

@Melkiades Melkiades Oct 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#' in table (using by), or `mode = "unique"` to leave other tables unchanged.
#' @export
#' in table (using by), or `mode = "unique"` to leave other tables unchanged.
#'
#' @export

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just for readability

log_filter <- function(data, condition, ...) {
UseMethod("log_filter")
Expand Down Expand Up @@ -44,29 +44,46 @@ log_filter.data.frame <- function(data, condition, suffix = NULL, ...) {

#' @rdname log_filter
#' @param table (`string`) table name.
#' @param by (`character`) variable names shared by `adsl` and other datasets for filtering.
#' @param by (`character`) variable names shared by `table` and other datasets for filtering when `mode == "all"`.
#' @param verbose (`flag`) whether to print a report about the filtering.
#' @param mode (`string`) one of `all` or `unique` whether the other tables should be filtered based on the rows
#' retained in `table`.
#' Default value is `"all"` is `table == "adsl"` and `"unique"` otherwise.
#' @export
#' @examples
#' log_filter(list(iris = iris), Sepal.Length >= 7, "iris", character(0))
log_filter.list <- function(data, condition, table, by = c("USUBJID", "STUDYID"), suffix = NULL, verbose = FALSE, ...) {
log_filter.list <- function(data,
condition,
table,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be adsl as default maybe?

by = c("USUBJID", "STUDYID"),
suffix = NULL,
verbose = FALSE,
mode = ifelse(table == "adsl", "all", "unique"),
...) {
checkmate::assert_list(data, types = "data.frame", names = "unique")
assert_all_tablenames(data, table)
checkmate::assert_names(colnames(data[[table]]), must.include = by)
checkmate::assert_string(suffix, null.ok = TRUE)
checkmate::assert_flag(verbose)
checkmate::assert_choice(mode, c("all", "unique"))

condition <- match.call()$condition
data[[table]] <- eval(bquote(log_filter(data[[table]], .(condition), .(suffix))))
if (identical(table, "adsl")) {
for (k in setdiff(names(data), "adsl")) {

if (mode == "all") {
checkmate::assert_character(by, null.ok = TRUE)

for (k in setdiff(names(data), table)) {
if (all(by %in% names(data[[k]]))) {
if (length(by) == 0) by <- intersect(names(data[[k]]), names(data$adsl))
if (length(by) == 0) by <- intersect(names(data[[k]]), names(data[[table]]))

ori_n <- nrow(data[[k]])
ori_att <- attr(data[[k]], "rows")

data[[k]] <- dplyr::semi_join(data[[k]], data$adsl, by = by)
data[[k]] <- dplyr::semi_join(data[[k]], data[[table]], by = by)

rows <- list(list(init = ori_n, final = nrow(data[[k]]), suffix = suffix))
names(rows) <- paste0("Filtered by adsl: ", deparse(condition), collapse = "")
names(rows) <- paste0(sprintf("Filtered by %s: ", table), deparse(condition), collapse = "")
attr(data[[k]], "rows") <- c(ori_att, rows)
}
}
Expand Down
4 changes: 4 additions & 0 deletions R/reformat.R
Original file line number Diff line number Diff line change
@@ -1,4 +1,8 @@
#' Reformat Values
#'
#' Replaces values in `vectors` or `list` of `data.frame` using used-defined [`rule`] or list of [`rule`].
#' See `vignette("Reformatting", package = "dunlin")` for a detailed guide on using this function.
#'
#' @param obj (`character`, `factor` or `list of data.frame`) to reformat.
#' @param format (`rule`) or (`list`) of `rule` depending on the class of obj.
#' @param ... for compatibility between methods and pass additional special mapping to transform rules.
Expand Down
2 changes: 1 addition & 1 deletion man/get_arg.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

12 changes: 8 additions & 4 deletions man/log_filter.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 2 additions & 1 deletion man/reformat.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

17 changes: 17 additions & 0 deletions tests/testthat/test-filter.R
Original file line number Diff line number Diff line change
Expand Up @@ -145,6 +145,23 @@ test_that("log_filter works with long conditions", {
expect_identical(df1, df2, ignore_attr = TRUE)
})

test_that("log_filter works with custom `mode` argument", {
dfa <- data.frame(USUBJID = letters[5:14], b = 1:10)
dfb <- data.frame(USUBJID = letters[1:10], c = 1:10)

attr(dfa$USUBJID, "label") <- "usubjid_dfa"
attr(dfb$USUBJID, "label") <- "usubjid_dfb"

df_raw <- list(adsl = dfa, dfb = dfb)
res <- expect_silent(log_filter(df_raw, c >= 7, "dfb", by = "USUBJID", mode = "all"))
expect_equal(nrow(res$dfb), 4)
expect_equal(nrow(res$adsl), 4)

res <- expect_silent(log_filter(df_raw, c >= 7, "dfb", by = "USUBJID", mode = "unique"))
expect_equal(nrow(res$dfb), 4)
expect_equal(nrow(res$adsl), 10)
Comment on lines +157 to +162

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe you could show the attribute that was filtered out/in?

})

# get_log ----

test_that("get_log works as expected", {
Expand Down
Loading