Extends the mlr3 package with a data backend to transparently work with databases. Two additional backends are currently implemented:
DataBackendDplyr
: Relies internally on the abstraction of dplyr and dbplyr.
DataBackendDuckDB
: Connector to duckdb.

You can install the released version of mlr3db from CRAN with:
And the development version from GitHub with:
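A sketch of both installation routes. The CRAN call is the standard one; the GitHub repository name `mlr-org/mlr3db` and the use of the remotes package as installer are assumptions, not stated above:

```r
# Released version from CRAN
install.packages("mlr3db")

# Development version from GitHub
# (assumption: repository "mlr-org/mlr3db", installed via the remotes package)
# install.packages("remotes")
remotes::install_github("mlr-org/mlr3db")
```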
library(mlr3)
library(mlr3db)
# Create a classification task:
task = tsk("spam")
# Convert the task backend from a data.table backend to a DuckDB backend.
# By default, a temporary directory is used to store the database files.
# Note that the in-memory data is not used anymore; its memory will be freed
# by the garbage collector.
task$backend = as_duckdb_backend(task$backend)
# The requested data will be queried from the database in the background:
learner = lrn("classif.rpart")
ids = sample(task$row_ids, 3000)
learner$train(task, row_ids = ids)
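The same pattern should carry over to DataBackendDplyr. The following is a minimal sketch, assuming that `as_sqlite_backend()` from mlr3db is available and that RSQLite, DBI, dplyr and dbplyr are installed. The held-out prediction step and the names `train_ids` and `test_ids` are illustrative additions:

```r
library(mlr3)
library(mlr3db)

# Same task as above, but stored in a temporary SQLite database file
# (assumption: RSQLite, DBI, dplyr and dbplyr are installed).
task = tsk("spam")
task$backend = as_sqlite_backend(task$backend, path = tempfile())

# The backend is now accessed through dplyr/dbplyr; the task API is unchanged.
class(task$backend)[1]

# Train on a random subset and predict on the remaining rows.
train_ids = sample(task$row_ids, 3000)
test_ids = setdiff(task$row_ids, train_ids)

learner = lrn("classif.rpart")
learner$train(task, row_ids = train_ids)

prediction = learner$predict(task, row_ids = test_ids)
prediction$score(msr("classif.ce"))
```

Because only the backend is swapped, the learner code is identical to the DuckDB example. Which backend is preferable depends on the database system already in use.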