Debug: mutate with ifelse overwrites a non-NA value with NA
Debugging a weird problem
library(dplyr)
df = tibble::tribble(
  ~data_field_name, ~type,
  "vehicle_id",     NA,
  "type",           "TEXT"
)
df
#> # A tibble: 2 x 2
#>   data_field_name type
#>   <chr>           <chr>
#> 1 vehicle_id      <NA>
#> 2 type            TEXT
After mutate(), the second row’s type field becomes NA. This is unexpected behaviour: only the first row was NA, so only that row should have changed.
df %>%
  dplyr::mutate(type = ifelse(rutils::is_na(type), "TEXT", type))
#> # A tibble: 2 x 2
#>   data_field_name type
#>   <chr>           <chr>
#> 1 vehicle_id      TEXT
#> 2 type            <NA>
Using base::is.na() instead of rutils::is_na() solves the problem.
df %>%
  dplyr::mutate(type = ifelse(is.na(type), "TEXT", type))
#> # A tibble: 2 x 2
#>   data_field_name type
#>   <chr>           <chr>
#> 1 vehicle_id      TEXT
#> 2 type            TEXT
Here is the definition of rutils::is_na():
is_na = function(x) is.na(x) | all(ifelse( class(x) == "character", x == "NA", FALSE))
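To see where the NA comes from, the pieces of this definition can be evaluated by hand on the example column. This is a base R sketch: x stands in for the type column, and the intermediate names inner and result are only for illustration.

```r
# x stands in for the `type` column of df
x = c(NA, "TEXT")

is.na(x)
#> [1]  TRUE FALSE

# class(x) == "character" is a single TRUE, so ifelse() returns a
# length-one result: the first element of x == "NA", which is NA
inner = ifelse(class(x) == "character", x == "NA", FALSE)
inner
#> [1] NA

# all() of a single NA is NA
all(inner)
#> [1] NA

# TRUE | NA is TRUE, but FALSE | NA is NA
result = is.na(x) | all(inner)
result
#> [1] TRUE   NA

# an NA test makes ifelse() return NA for that element
ifelse(result, "TEXT", x)
#> [1] "TEXT" NA
```

The second element of the test is NA rather than FALSE, and that is what turns the existing "TEXT" value into NA.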
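The unexpected behaviour is caused by mixing element-wise and whole-column operations. It is tempting to think of mutate() as working on each row separately, but by default it evaluates its expressions on whole column vectors, so the x argument is bound to the entire type column, not to a single element of it. Inside is_na(), all() then collapses that column to a single value. Here that value is NA, and combining it with is.na(x) through | leaves an NA in the test for every row that is not itself NA, which ifelse() passes straight through to the result.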
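If the goal of is_na() is to treat the literal string "NA" as missing element by element, one possible vectorised variant is sketched below. That goal is an assumption, since the all() in the original may have been meant to do something else. The name is_na_elementwise is hypothetical and is not part of rutils.

```r
# Hypothetical replacement, not part of rutils: flags real NAs and,
# for character vectors, the literal string "NA", element by element
is_na_elementwise = function(x) {
  # %in% never returns NA, so every element of the result is TRUE or FALSE
  is.na(x) | (is.character(x) & x %in% "NA")
}

is_na_elementwise(c(NA, "TEXT", "NA"))
#> [1]  TRUE FALSE  TRUE

# the same ifelse() call as in the mutate() example, on a plain vector
type = c(NA, "TEXT")
ifelse(is_na_elementwise(type), "TEXT", type)
#> [1] "TEXT" "TEXT"
```

Because nothing in this version collapses the column to a single value, the test passed to ifelse() has one TRUE or FALSE per row and existing values are left alone.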