Skip to content
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -50,4 +50,4 @@ Encoding: UTF-8
Language: en-US
LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.2
RoxygenNote: 7.3.3
2 changes: 2 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
# dunlin 0.1.11.9000

* New `mode` argument in `log_filter` determining whether all tables should be filtered to retain only the rows whose identifiers are present in the filtered data set selected with the `table` argument. (Previously, the behavior was triggered automatically when the `table` name was `adsl`)

# dunlin 0.1.11

* Added `subject_level_flag()` function.
Expand Down
26 changes: 16 additions & 10 deletions R/filter.R
Original file line number Diff line number Diff line change
Expand Up @@ -5,11 +5,10 @@
#' @param ... further arguments to be passed to or from other methods.
#' @returns a `data.frame` or `list` of `data.frame` filtered for the provided conditions.
#' @details
#' `log_filter` will filter the data/named list of data according to the `condition`.
#' `log_filter` will filter the `data.frame` /named list of `data.frame` according to the `condition`.
#' All the variables in `condition` must exist in the data (as variables) or in the parent
#' frame(e.g., in global environment).
#' For named list of data, if `ADSL` is available, `log_filter` will also try to subset all
#' other datasets with `USUBJID`.
#' For named list of `data.frame`,
#' @export
log_filter <- function(data, condition, ...) {
UseMethod("log_filter")
Expand Down Expand Up @@ -44,29 +43,36 @@

#' @rdname log_filter
#' @param table (`string`) table name.
#' @param by (`character`) variable names shared by `adsl` and other datasets for filtering.
#' @param by (`character`) variable names shared by `table` and other datasets for filtering when `mode == "all"`.
#' @param verbose (`flag`) whether to print a report about the filtering.
#' @param mode (`string`) one of `all` or `unique` whether the other tables should be filtered based on the rows retained in `table`. Default value is `"all"` is `table == "adsl"` and `"unique"` otherwise.

Check warning on line 48 in R/filter.R

View workflow job for this annotation

GitHub Actions / SuperLinter 🦸‍♀️ / Lint R code 🧶

file=R/filter.R,line=48,col=121,[line_length_linter] Lines should not be more than 120 characters. This line is 205 characters.
#' @export
#' @examples
#' log_filter(list(iris = iris), Sepal.Length >= 7, "iris", character(0))
log_filter.list <- function(data, condition, table, by = c("USUBJID", "STUDYID"), suffix = NULL, verbose = FALSE, ...) {
log_filter.list <- function(data, condition, table, by = c("USUBJID", "STUDYID"), suffix = NULL, verbose = FALSE, mode = ifelse(table == "adsl", "all", "unique"), ...) {

Check warning on line 52 in R/filter.R

View workflow job for this annotation

GitHub Actions / SuperLinter 🦸‍♀️ / Lint R code 🧶

file=R/filter.R,line=52,col=121,[line_length_linter] Lines should not be more than 120 characters. This line is 169 characters.
checkmate::assert_list(data, types = "data.frame", names = "unique")
assert_all_tablenames(data, table)
checkmate::assert_names(colnames(data[[table]]), must.include = by)
checkmate::assert_character(by, null.ok = TRUE)
checkmate::assert_string(suffix, null.ok = TRUE)
checkmate::assert_flag(verbose)
checkmate::assert_subset(mode, c("all", "unique"))

condition <- match.call()$condition
data[[table]] <- eval(bquote(log_filter(data[[table]], .(condition), .(suffix))))
if (identical(table, "adsl")) {
for (k in setdiff(names(data), "adsl")) {

if (mode == "all") {
for (k in setdiff(names(data), table)) {
if (all(by %in% names(data[[k]]))) {
if (length(by) == 0) by <- intersect(names(data[[k]]), names(data$adsl))
if (length(by) == 0) by <- intersect(names(data[[k]]), names(data[[table]]))

ori_n <- nrow(data[[k]])
ori_att <- attr(data[[k]], "rows")

data[[k]] <- dplyr::semi_join(data[[k]], data$adsl, by = by)
data[[k]] <- dplyr::semi_join(data[[k]], data[[table]], by = by)

rows <- list(list(init = ori_n, final = nrow(data[[k]]), suffix = suffix))
names(rows) <- paste0("Filtered by adsl: ", deparse(condition), collapse = "")
names(rows) <- paste0(sprintf("Filtered by %s: ", table), deparse(condition), collapse = "")
attr(data[[k]], "rows") <- c(ori_att, rows)
}
}
Expand Down
4 changes: 4 additions & 0 deletions R/reformat.R
Original file line number Diff line number Diff line change
@@ -1,4 +1,8 @@
#' Reformat Values
#'
#' Replaces substitute values in `vectors` or `list` of `data.frame` using used defined [`rule`].
#' See `vignette("Reformatting", package = "dunlin")` for a detailed guide on using this function.
#'
#' @param obj (`character`, `factor` or `list of data.frame`) to reformat.
#' @param format (`rule`) or (`list`) of `rule` depending on the class of obj.
#' @param ... for compatibility between methods and pass additional special mapping to transform rules.
Expand Down
2 changes: 1 addition & 1 deletion man/get_arg.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

10 changes: 6 additions & 4 deletions man/log_filter.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 2 additions & 1 deletion man/reformat.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

17 changes: 17 additions & 0 deletions tests/testthat/test-filter.R
Original file line number Diff line number Diff line change
Expand Up @@ -145,6 +145,23 @@ test_that("log_filter works with long conditions", {
expect_identical(df1, df2, ignore_attr = TRUE)
})

test_that("log_filters works with custom `mode` argument", {
dfa <- data.frame(USUBJID = letters[5:14], b = 1:10)
dfb <- data.frame(USUBJID = letters[1:10], c = 1:10)

attr(dfa$USUBJID, "label") <- "usubjid_dfa"
attr(dfb$USUBJID, "label") <- "usubjid_dfb"

df_raw <- list(adsl = dfa, dfb = dfb)
res <- expect_silent(log_filter(df_raw, c >= 7, "dfb", by = "USUBJID", mode = "all"))
expect_equal(nrow(res$dfb), 4)
expect_equal(nrow(res$adsl), 4)

res <- expect_silent(log_filter(df_raw, c >= 7, "dfb", by = "USUBJID", mode = "unique"))
expect_equal(nrow(res$dfb), 4)
expect_equal(nrow(res$adsl), 10)
})

# get_log ----

test_that("get_log works as expected", {
Expand Down
Loading