% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/functions.R
\name{cat_new_class}
\alias{cat_new_class}
\title{Club low-population classes of a categorical variable with a class of similar event rate}
\usage{
cat_new_class(base, target, cat_var_name, threshold, event = 1)
}
\arguments{
\item{base}{input dataframe}

\item{target}{name of the target variable column, passed as a string (the column must be binary, coded 0/1)}

\item{cat_var_name}{name, or character vector of names, of the categorical variable(s) to process, passed as string(s)}

\item{threshold}{population share below which a class is clubbed with another class, given as a decimal fraction (e.g. 0.1 for 10 percent)}

\item{event}{(optional) the event class, to be passed as 0 or 1 (default is 1)}
}
\value{
The function returns an object of class "cat_new_class", which is a list containing the following components:

\item{base_new}{a dataframe in which each low-percentage class has been clubbed with a class of the same event rate or, if none exists, with the class having the closest higher event rate}

\item{cat_class_new}{a dataframe with mapping between original classes and new clubbed classes (if any)}
}
\description{
The function clubs each class of a categorical variable whose population percentage is below a threshold with another class of similar event rate. If no class with exactly the same event rate is available, the class is clubbed with the one whose event rate is higher and closest to its own.
}
\examples{
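# Subset of iris in which virginica is a low-population class (10 of 110 rows)
data <- iris[1:110, ]
data$Species <- as.character(data$Species)
# Random binary target, for illustration only
data$Y <- sample(0:1, size = nrow(data), replace = TRUE)
data_newclass <- cat_new_class(base = data, target = "Y", cat_var_name = "Species", threshold = 0.1)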
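# A minimal sketch of inspecting the result. It assumes only the two list
# components named under Value (base_new and cat_class_new).
str(data_newclass$base_new)     # data after clubbing
data_newclass$cat_class_new     # mapping of original classes to new classes
# Share of rows per class in the input, for comparison with the threshold
prop.table(table(data$Species))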
}
\author{
Arya Poddar <aryapoddar290990@gmail.com>

Kanishk Dogar <Kanishkd4@gmail.com>
}
