The R ecosystem knows a vast number of time series classes: ts, xts, zoo, tsibble, tibbletime or timeSeries. The plethora of standards causes confusion. As different packages rely on different classes, it is hard to use them in the same analysis. tsbox provides a set of tools that make it easy to switch between these classes. It also allows the user to treat time series as plain data frames, facilitating the use with tools that assume rectangular data.
The tsbox package is built around a set of functions that convert time series of different classes to each other. They are frequency-agnostic, and allow the user to combine multiple non-standard and irregular frequencies. Because coercion works reliably, it is easy to write functions that work identically for all classes. So whether we want to smooth, scale, differentiate, chain-link, forecast, regularize or seasonally adjust a time series, we can use the same tsbox-command for any time series class.
Version 0.1, now on CRAN, brings a large number of bug fixes and improvements. A substantial change involves the treatment of NA values in data frames. Previously, all NAs in data frames were treated as implicit, and were only made explicit by a call to ts_regular. This has changed now. If you convert a ts object to a data frame, NA values will be preserved. To replicate previous behavior, apply ts_na_omit:
```r
library(tsbox)
x.ts <- ts_c(mdeaths, austres)
x.ts
ts_df(x.ts)              # NA values are preserved
ts_na_omit(ts_df(x.ts))  # previous behavior: NA values are dropped
```
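The difference between implicit and explicit NA values can be illustrated with a small sketch. It assumes that ts_regular and ts_na_omit behave as described above; the object names x.gap, x.reg and x.omit are made up for illustration.

```r
library(tsbox)

# Monthly series as a data frame, with one observation dropped.
# The missing month is now an implicit NA: the row simply does not exist.
x.df <- ts_df(mdeaths)
x.gap <- x.df[-3, ]

# ts_regular() makes the gap explicit by adding a row with an NA value
x.reg <- ts_regular(x.gap)
nrow(x.gap)
nrow(x.reg)
sum(is.na(x.reg$value))

# ts_na_omit() drops the explicit NA again
x.omit <- ts_na_omit(x.reg)
nrow(x.omit)
```

Comparing the row counts should show that the regularized data frame has one more row than the versions with the gap.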
ts_span extends outside of series span
This lays the groundwork for ts_span to be extensible. With extend = TRUE, ts_span extends a regular series with NA values, up to the specified limits, similar to window() in base R. Like all functions in tsbox, this is frequency-agnostic. For example, in the following, the monthly series mdeaths is extended by monthly NA values, while the quarterly series austres is extended by quarterly NA values:
```r
x.df <- ts_df(ts_c(mdeaths, austres))
ts_span(x.df, end = "1999-12-01", extend = TRUE)
```
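To make the analogy with window() concrete, here is a minimal sketch that extends the same monthly ts object both ways. The end date is chosen arbitrarily, and the assumption is that both calls treat the end point as inclusive.

```r
library(tsbox)

# Base R: extend a ts object beyond its span, padding with NA
x.base <- window(mdeaths, end = c(1981, 12), extend = TRUE)

# tsbox: the same idea, with the end given as a date string
x.tsbox <- ts_span(mdeaths, end = "1981-12-01", extend = TRUE)

# Both results should cover the same period
length(x.base)
length(x.tsbox)
sum(is.na(x.tsbox))
```

Unlike window(), the ts_span call is not tied to ts objects and can be applied to any ts-boxable object.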
ts_default standardizes column names in a data frame
In rectangular data structures, i.e., in a data.frame, a data.table, or a tibble, tsbox stores one or multiple time series in the ‘long’ format. By default, tsbox detects a value, a time and zero, one or several id columns. Alternatively, the time column and the value column can be explicitly named time and value. If explicit names are used, the column order will be ignored.
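A short sketch of the explicit naming convention follows. It relies on ts_df returning columns named id, time and value; the object name x.shuffled is illustrative.

```r
library(tsbox)

# ts_df() returns the columns id, time and value
x.df <- ts_df(ts_c(mdeaths, fdeaths))
names(x.df)

# With explicit names, the column order should not matter
x.shuffled <- x.df[, c("value", "id", "time")]

# Both data frames are expected to describe the same series
all.equal(ts_ts(x.shuffled), ts_ts(x.df))
```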
While automatic column name detection is useful in interactive mode, it produces unnecessary overhead in longer workflows. The helper function ts_default detects and renames the time and the value column, so that auto-detection will be turned off in subsequent steps (note that the names of the id columns are not affected):
```r
x.df <- ts_df(ts_c(mdeaths, austres))
names(x.df) <- c("a fancy id name", "date", "count")
ts_plot(x.df)  # tsbox is fine with that
ts_default(x.df)
```
ts_summary summarizes time series
ts_summary provides a frequency-agnostic summary of a ts-boxable object:
```r
ts_summary(ts_c(mdeaths, austres))
#>        id obs    diff freq      start        end
#> 1 mdeaths  72 1 month   12 1974-01-01 1979-12-01
#> 2 austres  89 3 month    4 1971-04-01 1993-04-01
```
ts_summary returns a plain data frame that can be used for any purpose. It is also recommended for the extraction of various time series properties, such as the id or the start date of a series:
```r
ts_summary(austres)$id
#> [1] "austres"
ts_summary(austres)$start
#> [1] "1971-04-01"
```
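As a sketch of how these properties might be used in a workflow, the following selects all monthly series from a mixed-frequency data frame. It assumes the id and freq columns shown in the summary output above; the object names smry, monthly.ids and x.monthly are illustrative.

```r
library(tsbox)

# Three series with two different frequencies, in long format
x.df <- ts_df(ts_c(mdeaths, fdeaths, austres))

# One row per series, with its frequency
smry <- ts_summary(x.df)

# Ids of all series with 12 observations per year
monthly.ids <- smry$id[round(smry$freq) == 12]

# Plain data frame subsetting is enough to keep the monthly series
x.monthly <- x.df[x.df$id %in% monthly.ids, ]
unique(x.monthly$id)
```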
Finally, we created a tsbox cheat sheet that summarizes most of the functionality. Print it and enjoy working with time series.
We are hiring a data scientist/engineer! You are enthusiastic about R and know a bit about Git – everything else is negotiable. We offer interesting projects around the R ecosystem and lots of freedom, with the option to work remotely.
Very good knowledge of R
Basic knowledge of Git
Experience in Machine Learning, Statistics, Programming, Experience with Databases, Teaching, shiny, tidyverse, etc., is a plus.
40%-100% commitment, can be adapted throughout the year
Ability and desire to learn and improve on the job
Good working knowledge of written English
Very good command of spoken English or German
Interesting projects around consulting and open source software development
Nice private office in Zurich close to ETH Hönggerberg, free coffee during office hours
Remote work possible
Please submit your application via firstname.lastname@example.org, or share a private GitHub repository with us. Get in touch with us if you have further questions.
cynkra is a Zurich-based data consulting company with a strong focus on R. We use R and the tidyverse in the vast majority of our projects. We support businesses and organizations by helping them pick the right tools, implementing solutions, and providing training and code review. We are enthusiastic about open source software and contribute to it, too. Learn more at www.cynkra.com.
The R ecosystem knows a ridiculous number of time series classes. So, I decided to create a new universal standard that finally covers everyone’s use case…
tsbox, now freshly on CRAN, provides a set of tools that are agnostic towards existing time series classes. It is built around a set of converters, which convert time series stored as ts, xts, data.frame, data.table, tibble, zoo, tsibble or timeSeries to each other.
To install the stable version from CRAN:
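Assuming the package is published on CRAN under the name tsbox, as used throughout this post, the standard installation call should do:

```r
# Install the released version of tsbox from CRAN
install.packages("tsbox")
```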
To get an idea how easy it is to switch from one class to another, consider this:
```r
library(tsbox)
x.ts <- ts_c(mdeaths, fdeaths)
x.xts <- ts_xts(x.ts)
x.df <- ts_df(x.xts)
x.tbl <- ts_tbl(x.df)
x.dt <- ts_dt(x.tbl)  # data.table, hence ts_dt() rather than ts_tbl()
x.zoo <- ts_zoo(x.dt)
x.tsibble <- ts_tsibble(x.zoo)
x.timeSeries <- ts_timeSeries(x.tsibble)
```
We jump from good old ts objects to xts, store our time series in various data frames and convert them to some highly specialized time series formats.
Because these converters work nicely, we can use them to make functions class-agnostic. If a class-agnostic function works for one class, it works for all:
```r
ts_scale(x.ts)
ts_scale(x.xts)
ts_scale(x.df)
ts_scale(x.dt)
ts_scale(x.tbl)
```
ts_scale normalizes one or multiple series by subtracting the mean and dividing by the standard deviation. It works like a ‘generic’ function: You can apply it to any time series object, and it will return an object of the same class as its input.
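The following sketch checks both claims on a small example: the class of the output follows the class of the input, and a scaled series should have a mean of about zero and a standard deviation of about one. The object name z is illustrative.

```r
library(tsbox)

x.ts <- ts_c(mdeaths, fdeaths)
x.df <- ts_df(x.ts)

# The output class follows the input class
class(ts_scale(x.ts))
class(ts_scale(x.df))

# A single scaled series: mean close to 0, standard deviation close to 1
z <- ts_scale(mdeaths)
c(mean = mean(z), sd = sd(z))
```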
So, whether we want to smooth, scale, differentiate, chain-link, forecast, regularize or seasonally adjust a series, we can use the same commands on whatever time series is at hand. tsbox offers a comprehensive toolkit for the basics of time series manipulation. Here are some additional operations:
```r
ts_pc(x.ts)                 # percentage change rates
ts_forecast(x.xts)          # forecast, by exponential smoothing
ts_seas(x.df)               # seasonal adjustment, by X-13
ts_frequency(x.dt, "year")  # convert to annual frequency
ts_span(x.tbl, "-1 year")   # limit time span to final year
```
There are many more. Because they all start with
ts_, you can use
auto-complete to see what’s around. Most conveniently, there is a time series
plot function that works for all classes and frequencies:
```r
ts_plot(
  `Airline Passengers` = AirPassengers,
  `Lynx trappings` = ts_df(lynx),
  `Deaths from Lung Diseases` = ts_xts(fdeaths),
  title = "Airlines, trappings, and deaths",
  subtitle = "Monthly passengers, annual trappings, monthly deaths"
)
```
There is also a version that uses ggplot2 and has the same syntax.
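A minimal sketch of the ggplot2 variant, assuming ggplot2 is installed and that ts_ggplot accepts named series and a title in the same way as ts_plot. The series labels and the title are made up for illustration.

```r
library(tsbox)

# Same calling convention as ts_plot(), but the result is a ggplot object
p <- ts_ggplot(
  `Male` = mdeaths,
  `Female` = ts_df(fdeaths),
  title = "Deaths from lung diseases"
)

# Printing the object draws the plot
print(p)
```

Because the result is a regular ggplot object, it can be extended with further ggplot2 layers or themes.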
You may have wondered why we treated data frames as a time series class. The spread of dplyr and data.table has given data frames a boost and made them one of the most popular data structures in R. So, storing time series in a data frame is an obvious consequence. And even if you don’t intend to keep time series in data frames, this is still the format in which you import and export your data. tsbox makes it easy to switch from data frames to time series and back.
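As a sketch of that round trip, the following converts a ts object to a data frame and back, and then builds a ts object from a data frame that was constructed by hand, as it might come out of a CSV import. The object names and the values in raw are invented for illustration.

```r
library(tsbox)

# ts -> data frame -> ts should give back the original series
x.df <- ts_df(mdeaths)
x.back <- ts_ts(x.df)
all.equal(x.back, mdeaths)

# A hand-made data frame with a time and a value column
raw <- data.frame(
  time = seq(as.Date("2020-01-01"), by = "month", length.out = 24),
  value = seq_len(24)
)

# The monthly frequency is detected from the dates
x.new <- ts_ts(raw)
frequency(x.new)
start(x.new)
```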
tsbox includes tools to make existing functions class-agnostic. To do so, the ts_ function can be used to wrap any function that works with time series. For a function that works on "ts" objects, it is as simple as this:
```r
ts_rowsums <- ts_(rowSums)
ts_rowsums(ts_c(mdeaths, fdeaths))
```
ts_ returns a function, which can be used with or without a name.
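Both styles can be sketched as follows. The wrapper name ts_dlog and the wrapped function are invented for illustration; the example assumes that the wrapper converts its input to a ts object, applies the function and converts the result back to the class of the input.

```r
library(tsbox)

# Named: wrap a function written for ts objects and keep the wrapper
ts_dlog <- ts_(function(x) diff(log(x)))

# The wrapper can now be applied to a data frame
x.df <- ts_df(ts_c(mdeaths, fdeaths))
y.df <- ts_dlog(x.df)
class(y.df)

# Unnamed: create the wrapper and call it in one go
y.ts <- ts_(rowSums)(ts_c(mdeaths, fdeaths))
```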
In case you are wondering, tsbox uses data.table as a backend, and makes use of its incredibly efficient reshaping facilities, its joins and rolling joins. And thanks to anytime, tsbox will be able to recognize almost any date format without manual intervention.
So, enjoy some relief in R’s time series class struggle.