Debug: mutate with ifelse overwrites a non-NA value with NA

library(dplyr)
df = tibble::tribble(
  ~data_field_name,   ~type,
      "vehicle_id",      NA,
            "type",  "TEXT"
  )
df
#> # A tibble: 2 x 2
#>   data_field_name  type
#>             <chr> <chr>
#> 1      vehicle_id  <NA>
#> 2            type  TEXT

After mutate(), the second row's type field becomes NA, even though it held "TEXT" before. This is unexpected behaviour.

df %>%
  dplyr::mutate( type = ifelse( rutils::is_na(type), "TEXT", type)) 
#> # A tibble: 2 x 2
#>   data_field_name  type
#>             <chr> <chr>
#> 1      vehicle_id  TEXT
#> 2            type  <NA>

Using base::is.na() instead of rutils::is_na() solves the problem.

df %>%
  dplyr::mutate( type = ifelse( is.na(type), "TEXT", type)) 
#> # A tibble: 2 x 2
#>   data_field_name  type
#>             <chr> <chr>
#> 1      vehicle_id  TEXT
#> 2            type  TEXT

Here is the definition of rutils::is_na():

is_na = function(x) is.na(x) | all(ifelse( class(x) == "character", x == "NA", FALSE))
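Evaluating the pieces of this definition by hand on the type column shows where the NA comes from. This is only a base R sketch: the intermediate variable names (cls_test, inner, collapsed, test) are introduced here for illustration and are not part of rutils.

```r
# The same values as the type column of df above
type <- c(NA, "TEXT")

# Element-wise part of is_na(): one result per element
elementwise <- is.na(type)                # TRUE FALSE

# class(x) == "character" is a single TRUE, so ifelse() returns a
# length-1 result: only the first element of (type == "NA"), which is NA
cls_test <- class(type) == "character"
inner <- ifelse(cls_test, type == "NA", FALSE)

# all() of a single NA is NA
collapsed <- all(inner)

# TRUE | NA is TRUE, but FALSE | NA is NA
test <- elementwise | collapsed           # TRUE NA

# ifelse() returns NA wherever its test is NA
result <- ifelse(test, "TEXT", type)      # "TEXT" NA
```

The last line reproduces the output of the mutate() call: the first element is filled in, and the second is overwritten with NA.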

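The unexpected behaviour is caused by mixing column-level and element-level values. We might expect mutate() to work on each row separately, but it evaluates the expression once and binds x to the whole column vector, not to a single element of it. is.na(x) returns one value per element, whereas all(...) collapses to a single value. Here that value is NA: class(x) == "character" is a single TRUE, so ifelse() returns only the first element of x == "NA", and that is NA because the first element of x is NA. For the second row the test becomes FALSE | NA, which is NA, and ifelse() returns NA wherever its test is NA.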
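If the intent of rutils::is_na() was to treat the string "NA" as missing for each element, one possible element-wise variant is sketched below. That intent is an assumption, since the original uses all(). The name is_na_elementwise is hypothetical and not part of rutils.

```r
# Hypothetical element-wise variant; assumes the string "NA" should
# count as missing for each element of a character vector
is_na_elementwise <- function(x) {
  if (is.character(x)) {
    # %in% never returns NA, so a missing value cannot leak into the test
    is.na(x) | x %in% "NA"
  } else {
    is.na(x)
  }
}

type <- c(NA, "TEXT", "NA")
flags <- is_na_elementwise(type)          # TRUE FALSE TRUE
filled <- ifelse(flags, "TEXT", type)     # "TEXT" "TEXT" "TEXT"
```

Because the result has the same length as x and contains no NA, it can be used as the test of ifelse() inside mutate() without overwriting existing values.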

 Tech    22 Dec, 2017

Any work (images, writings, presentations, ideas or whatever) which I own is always provided under the Creative Commons Attribution-Share Alike 3.0 License.

Mert Nuhoglu is a Trabzon-born programmer and data scientist.
