You are a data scientist living your best life. “I have the sexiest job of the 21st century,” you mutter under your breath as you work. Then tragedy strikes. That new dataset you just received? It has date data. And not just any date data: date data collected through a free-text web form. You scroll through the data and find dates in many different formats: day of the month first, year first, - or / as separators, months given as abbreviations or in full. “This is going to be a nightmare to standardise,” you think to yourself. This is decidedly unsexy.

Fortunately, this is where the datefixR R package comes in. In a single function call, all of these dates can be standardised into one format. datefixR’s fix_dates() function takes a data frame (or tibble) and the names of any columns containing date data, and returns a data frame in which the selected columns have been standardised and converted to R’s built-in Date class (which prints in the ISO 8601 yyyy-mm-dd format). If any date cannot be resolved, an error is raised reporting the offending date and its associated ID (taken from the first column unless an ID column is explicitly provided).
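As a minimal sketch of the call described above: the data frame, its column names, and its values are invented for illustration, and the code assumes an installed version of datefixR that exports fix_dates() as described in this post. If your installed version does not export a function with that name, check its documentation for the current equivalent.

```r
library(datefixR)

# Hypothetical messy input: an ID column followed by a free-text date column.
# Mixed separators and mixed day-first / year-first ordering are deliberate.
messy <- data.frame(
  id = 1:4,
  some.dates = c("02/05/92", "01-04-2020", "1996/05/01", "2020-05-01")
)

# Standardise the named column; the first column (id) is used to
# identify rows in any error message about an unresolvable date.
fixed <- fix_dates(messy, "some.dates")

# The selected column should now be of R's Date class.
class(fixed$some.dates)
```

Several columns can be handled in the same call by passing a character vector of column names as the second argument.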

Congratulations! You have potentially just saved yourself a whole lot of time. Now you can run the analysis you actually want to do!

If you are interested in learning how to make the most of datefixR, check its getting started vignette.

datefixR is available on CRAN.


FAQs

  • Q: Is the US date standard (month-first) supported?
    A: Yes! Just provide the argument format = "mdy" when calling fix_dates().

  • Q: Can I just standardise a single date?
    A: Yes! Just use the fix_date() function. Please note that calling it repeatedly in a for loop might be slightly slower than a single call to fix_dates().
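The two answers above can be sketched as follows. The example data are hypothetical, and the code again assumes a datefixR version that exports fix_dates() and fix_date() under the names used in this post.

```r
library(datefixR)

# Hypothetical month-first (US style) dates.
us <- data.frame(
  id = 1:2,
  joined = c("07/04/1999", "12/25/2005")
)

# format = "mdy" tells fix_dates() to read the month before the day.
us_fixed <- fix_dates(us, "joined", format = "mdy")

# A single date string can be standardised with fix_date().
single <- fix_date("02/05/92")
class(single)
```

For a whole column, a single fix_dates() call is the better fit; fix_date() is aimed at one-off values.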