Write a small vignette on .complete and slide_period_*() family #202

@wallyxie

Hi @DavisVaughan,

Per this Stack Overflow question, slide_period_dfr() gives me the same output, partial-period calculations included, whether .complete is set to TRUE or FALSE. At least one other user was able to replicate this.

The issue can be replicated as follows:

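library(lubridate)  # ymd(), days()
library(dplyr)      # add_count(), summarise()
library(slider)     # slide_period_dfr()

set.seed(1)

# 200 consecutive days ending on 2023-12-31
dates <- ymd("2023-12-31") - days(0:199)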

colors <- c('red', 'blue')
sample_colors <- sample(colors, 200, replace = TRUE)
objects <- c('pen', 'marker', 'brush')
sample_objects <- sample(objects, 200, replace = TRUE)

test_df <- data.frame(dates, sample_colors, sample_objects)

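# Summarise one period: its date range and the smallest color/object count
period_count <- function(dat) {
    dat |>
        # number of rows per color/object combination within the period
        add_count(sample_colors, sample_objects, name = "sub_total") |>
        summarise(
            earliest_day_of_period = min(dates),
            latest_day_of_period = max(dates),
            day_span = latest_day_of_period - earliest_day_of_period,
            min_object_n = min(sub_total)
        )
}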

slider::slide_period_dfr(
    test_df,
    .i = test_df$dates,
    .period = "day",
    .f = period_count,
    .every = 60,
    .complete = TRUE,
    .origin = max(test_df$dates) + 1
)

Running

test_df_period_counts <- slide_period_dfr(
  test_df,
  .i = test_df$dates,
  .period = "day",
  .f = period_count,
  .every = 60,
  .complete = TRUE,
  .origin = max(test_df$dates) + 1
)

then produces

#   earliest_day_of_period latest_day_of_period day_span min_object_n
# 1             2023-08-15           2023-09-03  19 days            1
# 2             2023-09-04           2023-11-02  59 days            5
# 3             2023-11-03           2024-01-01  59 days            6
# 4             2024-01-02           2024-03-01  59 days            6

as does

test_df_period_counts <- slide_period_dfr(
  test_df,
  .i = test_df$dates,
  .period = "day",
  .f = period_count,
  .every = 60,
  .complete = FALSE,
  .origin = max(test_df$dates) + 1
)

In both cases the partial first period (row 1, which spans 19 days instead of 59) and its .f result are included in the output.
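
The same comparison can be boiled down to a few lines. This is only a sketch on a hypothetical toy vector, not the data above: `i`, `n_rows`, `.every = 4`, and `.origin = i[1]` are all chosen for illustration, so that the last group holds fewer days than the others. It assumes only that slider is installed.

```r
library(slider)

# Hypothetical toy index: 10 consecutive days. Grouping by 4 days from the
# first date leaves a short final group.
i <- as.Date("2023-01-01") + 0:9

# Illustrative .f: number of elements in each period group, as a double
n_rows <- function(d) as.numeric(length(d))

complete_true <- slide_period_dbl(
  i, i, "day", n_rows,
  .every = 4, .origin = i[1], .complete = TRUE
)
complete_false <- slide_period_dbl(
  i, i, "day", n_rows,
  .every = 4, .origin = i[1], .complete = FALSE
)

# TRUE when .complete made no difference to the result
identical(complete_true, complete_false)
```

If the final line prints TRUE, the short group was kept under both settings, which matches what I see with the larger example.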

Is this a bug, or does slide_period_dfr ignore the .complete argument?

Thank you for your time and attention!
