Merge pull request #179 from chhoumann/background-fix-transformations

Consistent notation for transformations
chhoumann · Jun 5, 2024 · e492078
2 parents 67e9f5a + 47d6a73
commit e492078
Showing 1 changed file with 14 additions and 12 deletions.
diff --git a/report_thesis/src/sections/background/preprocessing/power_transform.tex b/report_thesis/src/sections/background/preprocessing/power_transform.tex
@@ -3,31 +3,33 @@ \subsubsection{Power Transformation}
 They are particularly useful in statistical modeling and data analysis to meet the assumptions of linear models.
 
 One of the first influential power transformation techniques is the Box-Cox power transform, introduced by \citet{BoxAndCox} in 1964.
-This is defined for positive data and is aimed at normalizing data or making it more symmetric. The transformation is given by:
+This is defined for positive data and is aimed at normalizing data or making it more symmetric.
+For a feature vector $\mathbf{x}$, the Box-Cox transformation is defined as:
 
 $$
-\text{BC}(\lambda, x) =
+\psi^{\text{BC}}(\lambda, \mathbf{x}) =
 \begin{cases}
-\frac{x^\lambda - 1}{\lambda} & \text{if } \lambda \neq 0 \\
-\log(x) & \text{if } \lambda = 0
-\end{cases}
+\frac{\mathbf{x}^\lambda - 1}{\lambda}, & (\lambda \neq 0) \\
+\log(\mathbf{x}), & (\lambda = 0)
+\end{cases},
 $$
-where $ \lambda $ is the transformation parameter and $x$ is the input data.
+
+where $\lambda$ is the transformation parameter.
 $\lambda$ determines the extent and nature of the transformation: nonzero values of $\lambda$ apply a power transformation, and $\lambda = 0$ applies a logarithmic transformation.
 
 To overcome the limitations of the Box-Cox transformation, \citet{YeoJohnson} introduced a new family of power transformations that can handle both positive and negative values.
 The Yeo-Johnson power transformation is defined as:
 
 $$
-y =
+\psi(\lambda, \mathbf{x}) =
 \begin{cases}
-\frac{((x + 1)^\lambda - 1)}{\lambda} & \text{for } x \geq 0, \lambda \neq 0 \\
-\log(x + 1) & \text{for } x \geq 0, \lambda = 0 \\
--\frac{((-x + 1)^{2 - \lambda} - 1)}{2 - \lambda} & \text{for } x < 0, \lambda \neq 2 \\
--\log(-x + 1) & \text{for } x < 0, \lambda = 2
+\frac{(\mathbf{x} + 1)^\lambda - 1}{\lambda} & (\mathbf{x} \geq 0, \lambda \neq 0) \\
+\log(\mathbf{x} + 1) & (\mathbf{x} \geq 0, \lambda = 0) \\
+- \frac{(-\mathbf{x} + 1)^{2 - \lambda} - 1}{2 - \lambda} & (\mathbf{x} < 0, \lambda \neq 2) \\
+-\log(-\mathbf{x} + 1) & (\mathbf{x} < 0, \lambda = 2)
 \end{cases}
 $$
-where $x$ is the input data, $y$ is the transformed data, and $\lambda$ is the transformation parameter.
+
 For non-negative values, the Yeo-Johnson transformation is equivalent to the Box-Cox transformation applied to $\mathbf{x} + 1$.
 The key benefit of the Yeo-Johnson transformation is its ability to handle any real number, making it a robust choice for transforming data to achieve approximate normality or symmetry.
 This property is particularly beneficial for preparing data for statistical analyses and machine learning models that require normally distributed input data.
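
As a rough illustration of the two definitions in this diff, the piecewise formulas can be sketched in plain Python. This is only a minimal sketch, not part of the thesis or of any library: the function names `box_cox`, `yeo_johnson`, and `transform` are hypothetical, the formulas are applied element by element to a list standing in for the feature vector $\mathbf{x}$, and $\lambda$ is assumed to be given rather than estimated from the data.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of a single positive value x (hypothetical helper)."""
    if x <= 0:
        raise ValueError("Box-Cox is only defined for positive data")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1) / lam


def yeo_johnson(x, lam):
    """Yeo-Johnson transform of a single real value x (hypothetical helper)."""
    if x >= 0:
        # Non-negative branch: same form as Box-Cox applied to x + 1.
        if lam == 0:
            return math.log(x + 1)
        return ((x + 1) ** lam - 1) / lam
    # Negative branch: uses the exponent 2 - lambda on -x + 1.
    if lam == 2:
        return -math.log(-x + 1)
    return -((-x + 1) ** (2 - lam) - 1) / (2 - lam)


def transform(values, lam, func):
    """Apply func element-wise to a feature vector given as a list."""
    return [func(v, lam) for v in values]


if __name__ == "__main__":
    feature = [-2.0, -0.5, 0.0, 1.0, 3.0]  # made-up example values
    print(transform(feature, 0.5, yeo_johnson))
```

The non-negative branch of `yeo_johnson` matches `box_cox` evaluated at `x + 1`, which is the relationship between the two transformations noted in the text. In practice $\lambda$ is typically estimated from the data, which this sketch leaves out.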