% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/fexport.R
\name{fclust}
\alias{fclust}
\title{Build a functional clustering for one or more performances}
\usage{
fclust(dat, nbElt,
       weight     = rep(1, dim(dat)[2] - nbElt - 1),
       opt.na     = FALSE,
       opt.repeat = FALSE,
       opt.method = "divisive",
       affectElt  = rep(1, nbElt),
       opt.mean   = "amean",
       opt.model  = "byelt",
       opt.jack   = FALSE,   jack = c(3,4) )
}
\arguments{
\item{dat}{a data.frame or matrix that brings together:
a vector of assemblage identity,
a matrix of occurrence of components within the system,
one or more vectors of observed performances.
Consequently, the data.frame or matrix dimensions are:
\code{dim(dat)[1]=} the number of observed assemblages,
\code{dim(dat)[2]=} 1 + number of system components +
number of observed performances.
The first line (colnames) contains: the assemblage identity,
a list of components identified by their names,
a list of performances identified by their names.
Each following line (one line per assemblage) contains:
the name of the assemblage (read as character),
a sequence of 0 (absence) and 1 (presence) for each component
within the assemblage
(this is the matrix of occurrence of components within the system),
a sequence of numeric values, one for each observed performance
(this is the set of observed performances).}

\item{nbElt}{an integer that specifies the number of components
belonging to the interactive system.
\code{nbElt} is used to determine the dimensions of the matrix of occurrence.}

\item{weight}{a vector of numerics,
that specifies the weight of each performance.
By default, each performance is equally weighted.
If \code{weight} is specified, it must have the same length
as the number of observed performances.}

\item{opt.na}{a logical.
The records for each assemblage can have \code{NA}
in the matrix of occurrence or in the observed assemblage performances.
If \code{opt.na = FALSE} (by default), an error is returned.
If \code{opt.na = TRUE}, the records with \code{NA} are ignored.}

\item{opt.repeat}{a logical.
In any case, the function looks for
different assemblages with identical elemental composition.
Messages indicate these identical assemblages.
If \code{opt.repeat = FALSE} (by default),
their performances are averaged.
If \code{opt.repeat = TRUE}, nothing is done,
and the data are processed as they are.}

\item{opt.method}{a string that specifies the method to use.
\code{opt.method = c("divisive", "agglomerative", "apriori")}.
The three methods generate hierarchical trees.
Each tree is complete, running from a unique trunk
to as many leaves as components. \cr

If \code{opt.method = "divisive"}, the components are clustered
by using a divisive method,
from the trivial cluster where all components are together,
towards the clustering where each component is a cluster.
This method gives the best result for several reasons,
explained in detail in the accompanying vignettes (see "The options of fclust").

If \code{opt.method = "agglomerative"}, the components are clustered
by using an agglomerative method,
from the trivial clustering where each component is a cluster,
towards the cluster where all components are brought together.
If not all possible assemblages are observed
(which is generally the case in practice),
the first clustering of a few components can have no effect
on the convergence criterion, inducing a non-optimum result.

If \code{opt.method = "apriori"}, the user knows and gives
an "a priori" partitioning of the system components under study.
The partition is arbitrary, in any number of clusters of components,
but it must be specified (see following option \code{affectElt}).
The tree is then built:
\emph{(i)} by using \code{opt.method = "divisive"}
from the defined component clustering towards as many leaves as components;
\emph{(ii)} by using \code{opt.method = "agglomerative"}
from the defined component clustering towards the trunk of the tree.}

\item{affectElt}{a vector of characters or integers,
as long as the number of components \code{nbElt},
that indicates the labels of different functional clusters
to which each component belongs.
Each functional cluster is labelled as a character or an integer, and
each component must be identified by its name in \code{names(affectElt)}.
The number of functional clusters defined in \code{affectElt}
determines an \emph{a priori} level of component clustering
(\code{level <- length(unique(affectElt))}).\cr

If \code{affectElt = NULL} (by default),
the option \code{opt.method} must be specified.
If \code{affectElt} is specified,
the option \code{opt.method} switches to \code{"apriori"}.}

\item{opt.mean}{a character string, equal to \code{"amean"} or \code{"gmean"}.
If \code{opt.mean = "amean"},
means are computed using an arithmetic formula;
if \code{opt.mean = "gmean"},
means are computed using a geometric formula.}

\item{opt.model}{a character string, equal to \code{"bymot"} or \code{"byelt"}.
If \code{opt.model = "bymot"},
the modelled performances are the means
of performances of all assemblages
that share the same assembly motif. \cr

If \code{opt.model = "byelt"},
the modelled performances are the average
of mean performances of assemblages
that share the same assembly motif
and that contain the same components
as the assemblage to predict.
This procedure corresponds to a linear model within each assembly motif
based on the component occurrence in each assemblage.
If no assemblage contains a component belonging to the assemblage to predict,
the performance is the mean performance of all assemblages,
as in \code{opt.model = "bymot"}.}

\item{opt.jack}{a logical
that selects the cross-validation method.

If \code{opt.jack = FALSE} (by default), a Leave-One-Out method is used:
predicted performances are computed
as the mean of performances of assemblages
that share the same assembly motif,
experiment by experiment,
excluding only the assemblage to predict. \cr

If \code{opt.jack = TRUE}, a jackknife method is used:
the set of assemblages belonging to the same assembly motif is divided
into \code{jack[2]} subsets of \code{jack[1]} assemblages.
Predicted performances of each subset of \code{jack[1]} assemblages
are computed, experiment by experiment,
by using the other (\code{jack[2] - 1}) subsets of assemblages.
If the total number of assemblages belonging
to the assembly motif is lower than \code{jack[1]*jack[2]},
predictions are computed by the Leave-One-Out method.}

\item{jack}{an integer vector of length \code{2}.
The vector specifies the parameters for the jackknife method.
The first integer \code{jack[1]} specifies the size of each subset,
the second integer \code{jack[2]} specifies the number of subsets.}
}
\value{
Returns a list containing the primary tree of component clustering,
predictions of assembly performances
and statistics computed by using the primary and secondary trees
of component clustering.

 Recall of inputs:
 \itemize{
 \item \code{nbElt, nbAss, nbXpr}:
 the number of components that belong to the interactive system,
 the number of assemblages and the number of performances observed,
 respectively.

 \item \code{opt.method, opt.mean, opt.model, opt.jack, jack, opt.na,
 opt.repeat, affectElt}: the options used
 for computing the resulting clustering trees,
 respectively.

 \item \code{fobs, mOccur, xpr}:
 the vector or matrix of observed performances of assemblages,
 the binary matrix of occurrence of components, and
 the vector of weights of the different performances,
 respectively.
 }

 Primary and secondary, fitted and validated trees,
 of component clustering and associated statistics:
 \itemize{
 \item \code{tree.I, tree.II, nbOpt}:
 the primary tree of component clustering,
 the validated secondary tree of component clustering,
 and the optimum number of functional clusters,
 respectively.
 A tree is a list made of a square matrix of dimensions
 \code{nbLev * nbElt} (with \code{nbLev = nbElt}),
 and of a vector of coefficients of determination (of length \code{nbLev}).

 \item \code{mCal, mPrd, tCal, tPrd}:
 the numeric matrix of modelled values,
 and of values predicted by cross-validation,
 using the primary tree (\code{mCal} and \code{mPrd})
 or the secondary tree (\code{tCal} and \code{tPrd}), respectively.
 All matrices have the same dimension \code{nbLev * nbAss}.
 \code{rownames} contains the number of component clusters,
 that is from \code{1} to \code{nbElt} clusters.
 \code{colnames} contains the names of assemblages.

 \item \code{mMotifs, tNbcl}: the matrix
 of assignment of assemblages to different assembly motifs,
 coded as integers, and the matrices of the last tree levels
 used for predicting assemblage performances.
 All matrices have the same dimension \code{nbLev * nbAss}.
 \code{rownames} contains the number of component clusters,
 that is from \code{1} to \code{nbElt} clusters.
 \code{colnames} contains the names of assemblages.

 \item \code{mStats, tStats}: the matrices of associated statistics.
 \code{rownames} contains the number of component clusters,
 that is from \code{1} to \code{nbElt} clusters.
 \code{colnames = c("missing", "R2cal", "R2prd", "AIC", "AICc")}.
 }
}
\description{
Fit a primary tree of component clustering
to observed assemblage performances,
then prune the primary tree for its predictive ability and its parsimony,
and finally retain a validated secondary tree
and the corresponding predictions, statistics and other information.
}
\details{
See the vignette "The options of fclust".
}
\examples{

# Enable verbose messages
oldOption <- getOption("verbose")
if (!oldOption) options(verbose = TRUE)
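
# A minimal sketch of the layout expected for 'dat', built with base R only.
# The names 'toy.nbElt' and 'toy.dat', the component names A, B, C and the
# values are hypothetical; they only illustrate the column order:
# identity, occurrence of components (0/1), then observed performances.
toy.nbElt <- 3
toy.dat <- data.frame(
  assemblage = c("a1", "a2", "a3", "a4"),
  A    = c(1, 0, 1, 1),
  B    = c(0, 1, 1, 1),
  C    = c(0, 0, 0, 1),
  perf = c(2.1, 1.4, 3.0, 3.6),
  stringsAsFactors = FALSE
)
# Number of columns = 1 identity + toy.nbElt components + 1 performance
dim(toy.dat)
# Length of the default 'weight' vector, as written in the usage section
length(rep(1, dim(toy.dat)[2] - toy.nbElt - 1))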

nbElt <- 16   # number of components
# index = Identity, Occurrence of components, a Performance
index <- c(1, 1 + 1:nbElt, 1 + nbElt + 1)
dat.2004 <- CedarCreek.2004.2006.dat[ , index]
res <- fclust(dat.2004, nbElt)
names(res)
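
# The list elements described in the Value section can be inspected directly.
# This is only a sketch: it assumes that 'res' holds the elements 'nbOpt'
# and 'mStats' exactly as documented in the Value section.
res$nbOpt          # optimum number of functional clusters
head(res$mStats)   # statistics by number of component clusters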
res$tree.II
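
# A hedged sketch of the "apriori" method. The two-cluster partition below
# is arbitrary and purely illustrative (it is not an ecological hypothesis);
# 'aff' and 'res.apriori' are names introduced for this example only.
aff <- rep(c(1, 2), each = nbElt / 2)
names(aff) <- colnames(dat.2004)[1 + 1:nbElt]
length(unique(aff))   # a priori level of component clustering
res.apriori <- fclust(dat.2004, nbElt,
                      opt.method = "apriori", affectElt = aff)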

options(verbose = oldOption)
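
# A sketch of the jackknife cross-validation described for 'opt.jack'.
# With jack = c(3, 4), each assembly motif is split into 4 subsets of
# 3 assemblages; motifs with fewer than 3 * 4 = 12 assemblages fall back
# to Leave-One-Out, as stated in the argument description.
# 'res.jack' is a name introduced for this example only.
res.jack <- fclust(dat.2004, nbElt, opt.jack = TRUE, jack = c(3, 4))
res.jack$nbOpt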


}
\references{
Jaillard, B., Richon, C., Deleporte, P., Loreau, M. and Violle, C. (2018)
\emph{An a posteriori species clustering
for quantifying the effects of species
interactions on ecosystem functioning}.
Methods in Ecology and Evolution, 9:704-715.
\url{https://doi.org/10.1111/2041-210X.12920}. \cr

Jaillard, B., Deleporte, P., Loreau, M. and Violle, C. (2018)
\emph{A combinatorial analysis using observational data
identifies species that govern ecosystem functioning}.
PLoS ONE 13(8): e0201135.
\url{https://doi.org/10.1371/journal.pone.0201135}.
}
\seealso{
\code{\link{fclust}}: build a functional clustering,\cr
\code{\link{fclust_plot}}: plot the results of a functional clustering,\cr
\code{\link{fclust_write}}: save the results of a functional clustering,\cr
\code{\link{fclust_read}}: read the results of a functional clustering.\cr
}
