step_embed() now correctly defaults to have a random id with the word “embed”. (#102)
step_feature_hash() is soft-deprecated in embed in favor of step_dummy_hash() in textrecipes. (#95)
Steps now have a dedicated subsection detailing what happens when tidy() is applied. (#105)
Reorganized documentation for all recipe step tidy methods. (#115)
Fixed a bug where woe_table() and step_woe() didn’t respect the factor levels of the outcome. (#109)
Re-licensed package from GPL-2 to MIT, with consent from the copyright holders.
The tunable parameter ranges for step_umap() were changed for neighbors, num_comp, and min_dist to prevent uwot segmentation faults. The step also checks that the data dimensions are consistent with the argument values.
Two new PCA steps were added, each using sparse techniques for estimation: step_pca_sparse() and step_pca_sparse_bayes().
Updated to use recipes_eval_select() from recipes 0.1.17. (#85)
Added prefix argument to step_umap() to harmonize with other recipes steps. (#93)
All embed recipe steps now officially support empty selections to be more aligned with recipes, dplyr and other packages that use tidyselect.
step_woe() no longer warns about high-cardinality predictors when the recipe is estimated. Instead, it warns when categories have fewer than 10 data points in the training set. (#74)
Minor release with changes to test for cases when CRAN cannot get xgboost to work on their Solaris configuration.
lme4 and rstanarm are now in the Suggests list so they are not automatically installed with embed. A message is written to the console if those packages are missing and their associated step functions are invoked.
Changes to tests to get out of archive jail.
Updated the plumbing behind step_woe().
Due to a bug in tensorflow, added a “warm start” to instigate a TF session if one does not currently exist.
dplyr 1.0.0
step_discretize_xgb() and step_discretize_cart() can be used to convert numeric predictors to categorical using supervised binning methods based on tree models. Thanks to Konrad Semsch for the contribution.
Added step_feature_hash() for creating dummy variables using feature hashing.
tidy.step_woe() now has column names consistent with other recipe steps.
stringsAsFactors change.
embed 0.0.5
The example data are now in the modeldata package.
Small TF updates to step_embed().
embed 0.0.4
Methods were added for a future generic called tunable(). This outlines which parameters in a step can/could be tuned.
Small updates to work with different versions of tidyr.
embed 0.0.3
step_umap() was added for both supervised and unsupervised encodings.
step_woe() was added to create weight of evidence encodings.
embed 0.0.2
A mostly maintenance release to be compatible with version 0.1.3 of recipes.
The package now depends on the generics package to get the broom tidy methods.
Karim Lahrichi added the ability to use callbacks when fitting tensorflow models. PR
embed 0.0.1
First CRAN version