diff --git a/.github/workflows/build-joss-paper.yml b/.github/workflows/build-joss-paper.yml new file mode 100644 index 00000000..74e1e112 --- /dev/null +++ b/.github/workflows/build-joss-paper.yml @@ -0,0 +1,30 @@ +name: Build JOSS paper pdf + +on: + push: + branches: + - JOSSpaper_noT + - JOSSpaper + - main + +jobs: + paper: + runs-on: ubuntu-latest + name: Paper Draft + steps: + - name: Checkout + uses: actions/checkout@v4 + - name: Build draft PDF + uses: openjournals/openjournals-draft-action@1d5c3be74a6a8454854f099d63c6fbb1e8938052 + with: + journal: joss + # This should be the path to the paper within your repo. + paper-path: ./vignettes/paper.md + - name: Upload + uses: actions/upload-artifact@v4 + with: + name: paper + # This is the output path where Pandoc will write the compiled + # PDF. Note, this should be the same directory as the input + # paper.md + path: ./vignettes/paper.pdf diff --git a/README.md b/README.md index 5ebfc490..8d418cec 100644 --- a/README.md +++ b/README.md @@ -16,19 +16,13 @@ The package provides routines for structure learning and parameter estimation of # Installation -The [`abn`](https://CRAN.R-project.org/package=abn) R package can easily be installed from [CRAN](https://CRAN.R-project.org/package=abn) using: - -```r -install.packages("abn", dependencies = TRUE) -``` - The most recent development version is available from [Github](https://github.com/furrer-lab/abn) and can be installed with: ```r devtools::install_github("furrer-lab/abn") ``` -It is recommended to install `abn` within a virtual environment, e.g., using [renv](https://rstudio.github.io/renv/articles/renv.html)) which can be done with: +It is recommended to install `abn` within a virtual environment, e.g., using [renv](https://rstudio.github.io/renv/articles/renv.html), which can be done with: ```r renv::install("bioc::graph") @@ -36,13 +30,22 @@ renv::install("bioc::Rgraphviz") renv::install("abn", dependencies = c("Depends", "Imports", "LinkingTo", "Suggests")) ``` +Please note that the `abn` package is currently unavailable on CRAN. +We are dedicated to providing a robust and reliable package, and we appreciate your understanding as we work towards making `abn` available on CRAN soon.[^1] + +[^1]: The `abn` package includes certain features, such as multiprocessing and integration with the INLA package, that are limited or available only on specific CRAN flavors. +While it is possible to relax the testing process by, e.g., excluding tests of these functionalities, we believe that rigorous testing is important for reliable software development, especially for a package as complex as `abn`. +We have therefore implemented a testing framework similar to CRAN's to validate these functionalities in our development process. +Our aim is to maximize the reliability of `abn` under various conditions before making it available on CRAN again. + ## Additional libraries The following additional libraries are recommended to best profit from the [abn](https://cran.r-project.org/package=abn) features. - [INLA](https://www.r-inla.org/), which is an R package used for model fitting. It is hosted separately from CRAN and is easy to install on common platforms (see instructions on the INLA website).
```r -install.packages("INLA", repos=c(getOption("repos"), INLA="https://inla.r-inla-download.org/R/stable"), dep=TRUE) +install.packages("INLA", repos = c(getOption("repos"), INLA = "https://inla.r-inla-download.org/R/stable"), dep = TRUE) ``` - [Rgraphviz](https://www.bioconductor.org//packages/release/bioc/html/Rgraphviz.html) is used to produce plots of network graphs and is hosted on [Bioconductor](https://www.bioconductor.org/). @@ -52,7 +55,7 @@ if (!requireNamespace("BiocManager", quietly = TRUE)) BiocManager::install("Rgraphviz", version = "3.8") ``` -- [JAGS](https://mcmc-jags.sourceforge.io/) is a program for analysing Bayesian hierarchical models using Markov Chain Monte Carlo (MCMC) simulation. Its installation is platform-dependent and is, therefore, not covered here. +- [JAGS](https://mcmc-jags.sourceforge.io/) is a program for analyzing Bayesian hierarchical models using Markov Chain Monte Carlo (MCMC) simulation. Its installation is platform-dependent and is, therefore, not covered here. # Quickstart @@ -107,29 +110,29 @@ Unlike other packages, `abn` does not restrict the combination of parent-child d The analysis of "hierarchical" or "grouped" data, in which observations are nested within higher-level units, requires statistical models with parameters that vary across groups (e.g. mixed-effect models). -`abn` allows to control for one-layer clustering, where observations are grouped into a single layer of clusters which are themself assumed to be independent, but observations within the clusters may be correlated (e.g. students nested within schools, measurements over time for each patient, etc). +`abn` allows controlling for one-layer clustering, where observations are grouped into a single layer of clusters that are themselves assumed to be independent, but observations within the clusters may be correlated (e.g., students nested within schools, measurements over time for each patient, etc.). The argument `group.var` specifies the discrete variable that defines the group structure. The model is then fitted separately for each group, and the results are combined. -For example, studying student test scores across different schools, a varying intercept model would allow for the possibility that average test scores (the intercept) might be higher in one school than another due to factors specific to each school. This can be modelled in `abn` by setting the argument `group.var` to the variable containing the school names. The model is then fitted as a varying intercept model, where the intercept is allowed to vary across schools, but the slope is assumed to be the same for all schools. +For example, when studying student test scores across different schools, a varying intercept model would allow for the possibility that average test scores (the intercept) might be higher in one school than another due to factors specific to each school. This can be modeled in `abn` by setting the argument `group.var` to the variable containing the school names. The model is then fitted as a varying intercept model, where the intercept is allowed to vary across schools, but the slope is assumed to be the same for all schools. -Under the frequentist paradigm (`method = "mle"`), `abn` relies on the `lme4` package to fit generalised linear mixed models (GLMMs) for Binomial, Poisson, and Gaussian distributed variables. For multinomial distributed variables, `abn` fits a multinomial baseline category logit model with random effects using the `mclogit` package.
Currently, only one-layer clustering is supported (e.g., for `method = "mle"`, this corresponds to a random intercept model). +Under the frequentist paradigm (`method = "mle"`), `abn` relies on the `lme4` package to fit generalized linear mixed models (GLMMs) for Binomial, Poisson, and Gaussian distributed variables. For multinomial distributed variables, `abn` fits a multinomial baseline category logit model with random effects using the `mclogit` package. Currently, only one-layer clustering is supported (e.g., for `method = "mle"`, this corresponds to a random intercept model). With a Bayesian approach (`method = "bayes"`), `abn` relies on its own implementation of the Laplace approximation and the package `INLA` to fit a single-level hierarchical model for Binomial, Poisson, and Gaussian distributed variables. Multinomial distributed variables in general (see Section [Supported Data Types](#supported-data-types)) are not yet implemented with `method = "bayes"`. # Basic Background -Bayesian network modelling is a data analysis technique ideally suited to messy, highly correlated and complex datasets. -This methodology is rather distinct from other forms of statistical modelling in that its focus is on structure discovery—determining an optimal graphical model that describes the interrelationships in the underlying processes that generated the data. +Bayesian network modeling is a data analysis technique ideally suited to messy, highly correlated and complex datasets. +This methodology is rather distinct from other forms of statistical modeling in that its focus is on structure discovery—determining an optimal graphical model that describes the interrelationships in the underlying processes that generated the data. It is a **multivariate** technique and can be used for one or many dependent variables. -This is a data-driven approach, as opposed to relying only on subjective expert opinion to determine how variables of interest are interrelated (for example, structural equation modelling). +This is a data-driven approach, as opposed to relying only on subjective expert opinion to determine how variables of interest are interrelated (for example, structural equation modeling). [Below](#examples) and on the [package's website](https://r-bayesian-networks.org/), we provide some [cookbook](#examples)-type examples of how to perform Bayesian network **structure discovery** analyses with observational data. The particular type of Bayesian network models considered here are **additive Bayesian networks**. -These are rather different, mathematically speaking, from the standard form of Bayesian network models (for binary or categorical data) presented in the academic literature, which typically use an analytically elegant but arguably interpretation-wise opaque contingency table parametrisation. -An additive Bayesian network model is simply a **multidimensional regression model**, e.g. directly analogous to generalised linear modelling but with all variables potentially dependent. +These are rather different, mathematically speaking, from the standard form of Bayesian network models (for binary or categorical data) presented in the academic literature, which typically use an analytically elegant but arguably interpretation-wise opaque contingency table parametrization. +An additive Bayesian network model is simply a **multidimensional regression model**, e.g., directly analogous to generalized linear modeling but with all variables potentially dependent. 
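+To make the regression analogy concrete, the following sketch (plain base R, not `abn` syntax; the variables and the DAG are invented for illustration) fits one additive regression per node of a small known DAG, which is essentially what an additive Bayesian network does for every node given its parent set:
+
+```r
+# Toy data: two Gaussian variables and one binary variable (simulated).
+set.seed(1)
+n  <- 200
+g1 <- rnorm(n)
+b1 <- rbinom(n, 1, plogis(0.8 * g1))        # b1 depends on g1
+g2 <- rnorm(n, mean = 1.5 * b1 - 0.5 * g1)  # g2 depends on b1 and g1
+
+# Assume the DAG g1 -> b1, g1 -> g2, b1 -> g2: one GLM per node and its parents.
+m_g1 <- glm(g1 ~ 1,       family = gaussian())  # g1 has no parents
+m_b1 <- glm(b1 ~ g1,      family = binomial())  # binary node: logistic regression
+m_g2 <- glm(g2 ~ b1 + g1, family = gaussian())  # Gaussian node with two parents
+
+# A network score for this DAG can be built from the node-wise fits,
+# e.g. by summing their log-likelihoods (or AIC/BIC contributions).
+sum(logLik(m_g1), logLik(m_b1), logLik(m_g2))
+```
+
+Structure discovery then amounts to comparing such scores across candidate parent sets, which is what `abn` automates and extends to the Bayesian setting.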
-An example can be found in the [American Journal of Epidemiology](https://academic.oup.com/aje/article-abstract/176/11/1051/178588), where this approach was used to investigate risk factors for child diarrhoea. -A special issue of [Preventive Veterinary Medicine](http://www.sciencedirect.com/science/journal/01675877/110/1) on graphical modelling features several articles that use [abn](https://CRAN.R-project.org/package=abn) to fit epidemiological data. +An example can be found in the [American Journal of Epidemiology](https://academic.oup.com/aje/article-abstract/176/11/1051/178588), where this approach was used to investigate risk factors for child diarrhea. +A special issue of [Preventive Veterinary Medicine](http://www.sciencedirect.com/science/journal/01675877/110/1) on graphical modeling features several articles that use [abn](https://CRAN.R-project.org/package=abn) to fit epidemiological data. Introductions to this methodology can be found in [Emerging Themes in Epidemiology](https://link.springer.com/journal/12982) and in [Computers in Biology and Medicine](https://www.sciencedirect.com/science/article/pii/S0010482522005133) where it is compared to other approaches. ## What is an additive Bayesian network? @@ -140,7 +143,7 @@ They provide a framework for representing data with multiple variables, known as ABN models are a graphical representation of (Bayesian) multivariate regression. This form of statistical analysis enables the prediction of multiple outcomes from a given set of predictors while simultaneously accounting for the relationships between these outcomes. -In other words, additive Bayesian network models extend the concept of generalised linear models (GLMs), which are typically used to predict a single outcome, to scenarios with multiple dependent variables. +In other words, additive Bayesian network models extend the concept of generalized linear models (GLMs), which are typically used to predict a single outcome, to scenarios with multiple dependent variables. This makes them a powerful tool for understanding complex, multivariate datasets. ## The term Bayesian network is interpreted differently across various fields. @@ -148,9 +151,9 @@ Bayesian network models often involve binary nodes, arguably the most frequently used type of Bayesian network. These models typically use a contingency table instead of an additive parameter formulation. This approach allows for mathematical elegance and enables key metrics like model goodness of fit and marginal posterior parameters to be estimated analytically (i.e., from a formula) rather than numerically (an approximation). -However, this parametrisation may not be parsimonious, and the interpretation of the model parameters is less straightforward than the usual Generalized Linear Model (GLM) type models, which are prevalent across all scientific disciplines. +However, this parametrization may not be parsimonious, and the interpretation of the model parameters is less straightforward than in the usual Generalized Linear Model (GLM) type models, which are prevalent across all scientific disciplines. -While this is a crucial practical distinction, it’s a relatively low-level technical one, as the primary aspect of BN modelling is that it’s a form of graphical modelling – a model of the data’s joint probability distribution.
+While this is a crucial practical distinction, it’s a relatively low-level technical one, as the primary aspect of BN modeling is that it’s a form of graphical modeling – a model of the data’s joint probability distribution. This joint – multidimensional – aspect makes this methodology highly attractive for complex data analysis and sets it apart from more standard regression techniques, such as GLMs, GLMMs, etc., which are only one-dimensional as they assume all covariates are independent. While this assumption is entirely reasonable in a classical experimental design scenario, it’s unrealistic for many observational studies in fields like medicine, veterinary science, ecology, and biology. diff --git a/vignettes/paper.bib b/vignettes/paper.bib index f9475adf..7d1a966f 100644 --- a/vignettes/paper.bib +++ b/vignettes/paper.bib @@ -117,6 +117,133 @@ @article{kratzer_additive_2023 file = {Full Text PDF:/home/matteo/Zotero/storage/4HPJLAUD/Kratzer et al. - 2023 - Additive Bayesian Network Modeling with the R Pack.pdf:application/pdf}, } +@article{kalisch_causal_2012, + title = {Causal Inference Using Graphical Models with the R Package pcalg}, + volume = {47}, + rights = {Copyright (c) 2010 Markus Kalisch, Martin Mächler, Diego Colombo, Marloes H. Maathuis, Peter Bühlmann}, + issn = {1548-7660}, + url = {https://www.jstatsoft.org/index.php/jss/article/view/v047i11}, + doi = {10.18637/jss.v047.i11}, + pages = {1--26}, + number = {1}, + journaltitle = {Journal of Statistical Software}, + author = {Kalisch, Markus and Mächler, Martin and Colombo, Diego and Maathuis, Marloes H. and Bühlmann, Peter}, + urldate = {2021-04-19}, + date = {2012-05-17}, + langid = {english}, + note = {Number: 1}, + file = {Full Text:/home/matteo/Zotero/storage/WMY63N9G/Kalisch et al. - 2012 - Causal Inference Using Graphical Models with the R.pdf:application/pdf;Kalisch et al. - 2012 - Causal Inference Using Graphical Models with the .pdf:/home/matteo/Zotero/storage/YPZMUHFF/Kalisch et al. - 2012 - Causal Inference Using Graphical Models with the .pdf:application/pdf;Snapshot:/home/matteo/Zotero/storage/JZHLD22K/v047i11.html:text/html}, +} + +@article{boettcher_deal_2003, + title = {deal: A Package for Learning Bayesian Networks}, + volume = {8}, + rights = {Copyright (c) 2003 Susanne G. Boettcher, Claus Dethlefsen}, + issn = {1548-7660}, + url = {https://doi.org/10.18637/jss.v008.i20}, + doi = {10.18637/jss.v008.i20}, + shorttitle = {deal}, + abstract = {deal is a software package for use with R. It includes several methods for analysing data using Bayesian networks with variables of discrete and/or continuous types but restricted to conditionally Gaussian networks. Construction of priors for network parameters is supported and their parameters can be learned from data using conjugate updating. The network score is used as a metric to learn the structure of the network and forms the basis of a heuristic search strategy. deal has an interface to Hugin.}, + pages = {1--40}, + journaltitle = {Journal of Statistical Software}, + author = {Boettcher, Susanne G. 
and Dethlefsen, Claus}, + urldate = {2024-03-26}, + date = {2003-12-28}, + langid = {english}, + file = {Submitted Version:/home/matteo/Zotero/storage/YN3JDDFW/Boettcher and Dethlefsen - 2003 - deal A Package for Learning Bayesian Networks.pdf:application/pdf}, +} + +@article{franzin_bnstruct_2017, + title = {bnstruct: an R package for Bayesian Network structure learning in the presence of missing data}, + volume = {33}, + issn = {1367-4803}, + url = {https://doi.org/10.1093/bioinformatics/btw807}, + doi = {10.1093/bioinformatics/btw807}, + shorttitle = {bnstruct}, + abstract = {A Bayesian Network is a probabilistic graphical model that encodes probabilistic dependencies between a set of random variables. We introduce bnstruct, an open source R package to (i) learn the structure and the parameters of a Bayesian Network from data in the presence of missing values and (ii) perform reasoning and inference on the learned Bayesian Networks. To the best of our knowledge, there is no other open source software that provides methods for all of these tasks, particularly the manipulation of missing data, which is a common situation in practice.The software is implemented in R and C and is available on {CRAN} under a {GPL} licence.Supplementary data are available at Bioinformatics online.}, + pages = {1250--1252}, + number = {8}, + journaltitle = {Bioinformatics}, + shortjournal = {Bioinformatics}, + author = {Franzin, Alberto and Sambo, Francesco and Di Camillo, Barbara}, + urldate = {2024-03-26}, + date = {2017-04-15}, + file = {Full Text PDF:/home/matteo/Zotero/storage/U7DDDXPX/Franzin et al. - 2017 - bnstruct an R package for Bayesian Network struct.pdf:application/pdf}, +} + +@article{hojsgaard_graphical_2012, + title = {Graphical Independence Networks with the {gRain} Package for R}, + volume = {46}, + rights = {Copyright (c) 2009 Søren Højsgaard}, + issn = {1548-7660}, + url = {https://doi.org/10.18637/jss.v046.i10}, + doi = {10.18637/jss.v046.i10}, + abstract = {In this paper we present the R package {gRain} for propagation in graphical independence networks (for which Bayesian networks is a special instance). The paper includes a description of the theory behind the computations. The main part of the paper is an illustration of how to use the package. The paper also illustrates how to turn a graphical model and data into an independence network.}, + pages = {1--26}, + journaltitle = {Journal of Statistical Software}, + author = {Højsgaard, Søren}, + urldate = {2024-03-26}, + date = {2012-02-28}, + langid = {english}, + file = {Højsgaard - 2012 - Graphical Independence Networks with the gRain Pac.pdf:/home/matteo/Zotero/storage/NVI3D9SB/Højsgaard - 2012 - Graphical Independence Networks with the gRain Pac.pdf:application/pdf}, +} + +@article{tsagris_new_2021, + title = {A New Scalable Bayesian Network Learning Algorithm with Applications to Economics}, + volume = {57}, + issn = {1572-9974}, + url = {https://doi.org/10.1007/s10614-020-10065-7}, + doi = {10.1007/s10614-020-10065-7}, + abstract = {This paper proposes a new Bayesian network learning algorithm, termed {PCHC}, that is designed to work with either continuous or categorical data. {PCHC} is a hybrid algorithm that consists of the skeleton identification phase (learning the relationships among the variables) followed by the scoring phase that assigns the causal directions. 
Monte Carlo simulations clearly show that {PCHC} is dramatically faster, enjoys a nice scalability with respect to the sample size, and produces Bayesian networks of similar to, or of higher accuracy than, a competing state of the art hybrid algorithm. {PCHC} is finally applied to real data illustrating its performance and advantages.}, + pages = {341--367}, + number = {1}, + journaltitle = {Computational Economics}, + shortjournal = {Comput Econ}, + author = {Tsagris, Michail}, + urldate = {2024-03-26}, + date = {2021-01-01}, + langid = {english}, + keywords = {Bayesian networks, Causality, Economics data}, + file = {Full Text PDF:/home/matteo/Zotero/storage/FGN55RHT/Tsagris - 2021 - A New Scalable Bayesian Network Learning Algorithm.pdf:application/pdf}, +} + +@article{zanga_survey_2022, + title = {A Survey on Causal Discovery: Theory and Practice}, + volume = {151}, + issn = {0888-613X}, + url = {https://www.sciencedirect.com/science/article/pii/S0888613X22001402}, + doi = {10.1016/j.ijar.2022.09.004}, + shorttitle = {A Survey on Causal Discovery}, + abstract = {Understanding the laws that govern a phenomenon is the core of scientific progress. This is especially true when the goal is to model the interplay between different aspects in a causal fashion. Indeed, causal inference itself is specifically designed to quantify the underlying relationships that connect a cause to its effect. Causal discovery is a branch of the broader field of causality in which causal graphs are recovered from data (whenever possible), enabling the identification and estimation of causal effects. In this paper, we explore recent advancements in causal discovery in a unified manner, provide a consistent overview of existing algorithms developed under different settings, report useful tools and data, present real-world applications to understand why and how these methods can be fruitfully exploited.}, + pages = {101--129}, + journaltitle = {International Journal of Approximate Reasoning}, + shortjournal = {International Journal of Approximate Reasoning}, + author = {Zanga, Alessio and Ozkirimli, Elif and Stella, Fabio}, + urldate = {2024-03-26}, + date = {2022-12-01}, + keywords = {Causal discovery, Causal models, Causality, Structural learning}, + file = {Submitted Version:/home/matteo/Zotero/storage/54VIJP64/Zanga et al. - 2022 - A Survey on Causal Discovery Theory and Practice.pdf:application/pdf}, +} + +@article{kitson_survey_2023, + title = {A survey of Bayesian Network structure learning}, + volume = {56}, + issn = {1573-7462}, + url = {https://doi.org/10.1007/s10462-022-10351-w}, + doi = {10.1007/s10462-022-10351-w}, + abstract = {Bayesian Networks ({BNs}) have become increasingly popular over the last few decades as a tool for reasoning under uncertainty in fields as diverse as medicine, biology, epidemiology, economics and the social sciences. This is especially true in real-world areas where we seek to answer complex questions based on hypothetical evidence to determine actions for intervention. However, determining the graphical structure of a {BN} remains a major challenge, especially when modelling a problem under causal assumptions. Solutions to this problem include the automated discovery of {BN} graphs from data, constructing them based on expert knowledge, or a combination of the two. 
This paper provides a comprehensive review of combinatoric algorithms proposed for learning {BN} structure from data, describing 74 algorithms including prototypical, well-established and state-of-the-art approaches. The basic approach of each algorithm is described in consistent terms, and the similarities and differences between them highlighted. Methods of evaluating algorithms and their comparative performance are discussed including the consistency of claims made in the literature. Approaches for dealing with data noise in real-world datasets and incorporating expert knowledge into the learning process are also covered.}, + pages = {8721--8814}, + number = {8}, + journaltitle = {Artificial Intelligence Review}, + shortjournal = {Artif Intell Rev}, + author = {Kitson, Neville Kenneth and Constantinou, Anthony C. and Guo, Zhigao and Liu, Yang and Chobtham, Kiattikun}, + urldate = {2024-03-20}, + date = {2023-08-01}, + langid = {english}, + keywords = {Causal discovery, Graphical models, Knowledge-based constraints, Structure learning evaluation, Structure learning review}, + file = {Full Text PDF:/home/matteo/Zotero/storage/E9QZ3GQ4/Kitson et al. - 2023 - A survey of Bayesian Network structure learning.pdf:application/pdf}, +} + @Manual{rcore2024, title = {R: A Language and Environment for Statistical Computing}, author = {{R Core Team}}, diff --git a/vignettes/paper.md b/vignettes/paper.md new file mode 100644 index 00000000..9addb34b --- /dev/null +++ b/vignettes/paper.md @@ -0,0 +1,118 @@ +--- +title: "Additive Bayesian Networks" +tags: +- data science +- R +- mixed-effects models +- Bayesian networks +- graphical models +authors: +- name: Matteo Delucchi + orcid: 0000-0002-9327-1496 + affiliation: "1, 2" +- name: Jonas I. Liechti + orcid: 0000-0003-3447-3060 + affiliation: "3" +- name: Georg R. Spinner + orcid: 0000-0001-9640-8155 + affiliation: "2" +- name: Reinhard Furrer + orcid: 0000-0002-6319-2332 + corresponding: true + affiliation: "1" +affiliations: + - name: Department of Mathematical Modeling and Machine Learning, University of Zurich, Zürich, Switzerland + index: 1 + - name: Centre for Computational Health, Institute of Computational Life Sciences, Zurich University of Applied Sciences (ZHAW), Wädenswil, Switzerland + index: 2 + - name: www.T4D.ch, T4D GmbH, Zurich, Switzerland + index: 3 +date: 20 May 2024 +bibliography: paper.bib +--- + +# Summary +The R package `abn` is a comprehensive tool for the analysis of Bayesian Networks (BNs), a form of probabilistic graphical model. +BNs are a type of statistical model that leverages the principles of Bayesian statistics and graph theory to provide a framework for representing complex multivariate data. +They can derive a directed acyclic graph from empirical data to describe the dependency structure between random variables. + +Additive Bayesian Network (ABN) models extend the concept of generalized linear models, typically used for predicting a single outcome, to scenarios with multiple dependent variables [e.g., @kratzer_additive_2023]. +This makes them a powerful tool for understanding complex, multivariate datasets. +This package provides routines for structure learning and parameter estimation of ABN models. + +# Statement of need +The increasing complexity of data in various fields, ranging from healthcare research to environmental science and ecology, has resulted in a need for a tool like `abn`. +Researchers often face multivariate, tabular data where the relationships between variables are not straightforward.
+BN analysis becomes essential when traditional statistical methods fail to analyze multivariate data with intricate relationships, as it models these relationships graphically for more straightforward data interpretation. + +Commonly used implementations of BN models, such as `bnlearn` [@bnlearn2010], `bnstruct` [@franzin_bnstruct_2017], `deal` [@boettcher_deal_2003], `gRain` [@hojsgaard_graphical_2012], `pcalg` [@kalisch_causal_2012] and `pchc` [@tsagris_new_2021], limit variable types, often allowing discrete variables to have only discrete parent variables, where a parent starts a directed edge in the graph. +This limitation can pose challenges when dealing with continuous or mixed-type data (i.e., data that includes both continuous and discrete variables) or when attempting to model complex relationships that do not fit these restricted categories. +For a comprehensive overview of structure learning algorithms, including those applicable to mixed-type data, we refer the reader to the works of @kitson_survey_2023 and @zanga_survey_2022. +In the context of patient data, the study by @delucchi_bayesian_2022 discusses further details and strategies for handling this scenario, particularly in relation to the `abn` package and the widely used `bnlearn` package [@bnlearn2010]. + +The `abn` package overcomes this limitation through its additive model formulation, which generalizes the usual (Bayesian) multivariable regression to accommodate multiple dependent variables. +Additionally, the `abn` package offers a comprehensive suite of features for model selection, structure learning, and parameter estimation. +It includes exact and greedy search algorithms for structure learning and allows for integrating prior expert knowledge into the model selection process by specifying structural constraints. +For model selection, a Bayesian and an information-theoretic model scoring approach are available, allowing users to choose between a Bayesian and frequentist paradigm. +To our knowledge, this feature is not available in other software. +Furthermore, it supports mixed-effect models to control for one-layer clustering, making it suitable, e.g., for handling data from different sources. + +Previous versions of the `abn` package have been successfully used in various fields, including epidemiology [@pittavino_comparison_2017; @kratzer_information-theoretic_2018] and health [@hartnack_additive_2019; @kratzer_bayesian_2020; @delucchi_bayesian_2022]. +Despite its promise, the `abn` package encountered historical obstacles. +Sporadic maintenance and an incomplete codebase hindered its full potential. +Recognizing the need for enhancement, we undertook a substantial upgrade and meticulously addressed legacy issues, revamped the codebase, and introduced significant improvements. +The latest version 3 of `abn` is now a robust and reliable tool for BN analysis. +Applying the latest standards for open-source software, we guarantee active maintenance of `abn`. +Future updates are planned to enhance its functionality and user experience further. +We highly value feedback from the user community, which will guide our ongoing developments. + +In summary, `abn` sets itself apart through its focus on ABNs and its comprehensive features for model selection and structure learning. +Its unique contribution is the implementation of mixed-effect BN models, thereby extending its applicability to a broader range of complex, multivariate datasets of mixed continuous and discrete data.
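+
+A typical analysis, sketched below on simulated data (argument names follow the package documentation but may differ slightly between versions), combines score caching, structure learning, and parameter estimation in a few calls:
+
+```r
+library(abn)
+
+# Simulated mixed-type data observed within groups (e.g., study centres);
+# for illustration only, the variables are independent noise.
+set.seed(1)
+df <- data.frame(centre  = factor(rep(1:10, each = 30)),
+                 disease = factor(rbinom(300, 1, 0.3)),
+                 count   = rpois(300, lambda = 2),
+                 weight  = rnorm(300))
+dists <- list(disease = "binomial", count = "poisson", weight = "gaussian")
+
+# Pre-compute node scores; method = "mle" selects the information-theoretic
+# scoring and method = "bayes" the Bayesian one. group.var adds a random
+# intercept per centre (mixed-effect model).
+cache <- buildScoreCache(data.df = df, data.dists = dists, method = "mle",
+                         group.var = "centre", max.parents = 2)
+
+# Exact structure learning followed by parameter estimation of the selected DAG.
+dag <- mostProbable(score.cache = cache)
+fit <- fitAbn(object = dag, method = "mle")
+```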
+ +# Implementation +As outlined in @kratzer_additive_2023, the package's comprehensive framework integrates the mixed-effects model for clustered data, considering data heterogeneity and grouping effects. +However, this was confined to a Bayesian context and was only a preliminary implementation. +With the release of `abn` major version 3, this was completed with an implementation under the information-theoretic (`method = "mle"`) setting. + +Analyzing hierarchical or grouped data, i.e., observations nested within higher-level units, requires statistical models with group-varying parameters (e.g., mixed-effect models). +The `abn` package supports single-layer clustering, in which observations are grouped into clusters. +These clusters are assumed to be independent, but intra-cluster observations may exhibit correlation (e.g., students within schools, patient-specific measurements over time, etc.). +The ABN model is then fitted as a varying intercept model, where the intercept can vary across groups while the slope is assumed constant across all group levels. + +Under the frequentist paradigm (`method = "mle"`), `abn` employs the `lme4` package [@lme42015] to fit generalized linear mixed models for each of the Binomial, Poisson, and Gaussian distributed variables. +For multinomial distributed variables, `abn` fits a multinomial baseline category logit model with random effects using the `mclogit` package [@mclogit2022]. +Currently, only single-layer clustering is supported (e.g., for `method = "mle"`, this corresponds to a random intercept model). + +With a Bayesian approach (`method = "bayes"`), `abn` utilizes its own implementation of the Laplace approximation as well as the `INLA` package [@inla2013] to fit a single-level hierarchical model for Binomial, Poisson, and Gaussian distributed variables. + +Furthermore, the code base has been made more efficient, reliable, and user-friendly through code optimization, regular code reviews, and continuous integration practices, and it is actively maintained following current open-source software standards. + +# Validation and Testing +A comprehensive set of documented case studies has been published to validate the `abn` package (see the `abn` [website](https://r-bayesian-networks.org/)). +The numerical accuracy and quality assurance exercises were demonstrated in @kratzer_additive_2023. +A rigorous testing framework is implemented using the `testthat` package [@testthat2011], which is executed as part of an extensive continuous integration pipeline designed explicitly for non-standard R packages that rely on `Rcpp` [@rcpp2023] and `JAGS` [@plummer_jags_2003]. +Additional documentation and resources are available on the `abn` [website](https://r-bayesian-networks.org/) for further reference and guidance. + +# Availability + +The latest version of the `abn` package is available on [GitHub](https://github.com/furrer-lab/abn) and can be installed using the `devtools` package: + +```r +devtools::install_github("furrer-lab/abn") +``` + +# Acknowledgments + +The development of the `abn` package would not have been possible without the significant contributions of the former developers whose efforts have been instrumental in shaping this project.
+In particular, we acknowledge the contributions of Fraser Iain Lewis, Marta Pittavino, Gilles Kratzer, and Kalina Cherneva. +We extend our gratitude to the faculty staff at the [Department of Mathematical Modeling and Machine Learning](https://dm3l.uzh.ch/home), University of Zurich (UZH), and the [Department of Mathematics](https://www.math.uzh.ch/home), UZH, who maintain the research and teaching infrastructure. +Our appreciation also goes to the UZH and the ZHAW for their financial support. +We want to highlight the funding from both the Zurich University of Applied Sciences (ZHAW) and the Digitalization Initiative of the Zurich Higher Education Institutions (DIZH), which was instrumental in realizing this project, particularly within the context of the "Modeling of multicentric and dynamic stroke health data" and "Stroke DynamiX" projects, respectively. +This work was conducted as part of M.D.'s PhD project, co-supervised by Prof. Dr. Sven Hirsch (ZHAW) and Prof. Dr. Reinhard Furrer (UZH). + +# References +