MackChainLadder should return the same result if Triangle is passed through a pipe #57
Comments
The same is true for other functions like lm:
m_piped <- data.frame(x=1:10, y=1:10) %>% lm
m <- lm(y~x, data=data.frame(x=1:10, y=1:10))
identical(m , m_piped)
FALSE
How do you deal with those situations? |
From what I can tell the only differences are based on the call. I've modified the example to make it more clear. If you look at the differences in the original example they are all about the formula and call.
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
m_piped <- data.frame(x=1:10, y=1:10) %>% lm(formula = y ~ x)
m <- lm(y~x, data=data.frame(x=1:10, y=1:10))
all.equal(m, m_piped)
#> [1] "Component \"call\": target, current do not match when deparsed"
Created on 2019-01-16 by the reprex package (v0.2.1) |
The two objects were created differently -- with different calls:
m_piped$call
lm(formula = y ~ x, data = .)
m$call
lm(formula = y ~ x, data = data.frame(x = 1:10, y = 1:10))
Is your concern the loss of information regarding the source of 'data' in
m_piped? "data = ." is a common idiom in the tidyverse. If that source is
important to you -- and I can see why it would be -- then I suggest
avoiding piping. Otherwise, I am happy that is the only difference in the
two objects.
Maybe someone in a tidyverse list can help with the "lm(formula = y ~ x,
data = .)" issue.
Thank you for your interest in ChainLadder!
Dan
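For tests that should not depend on how the data argument was spelled, one option is to drop the call component before comparing. The following is a minimal sketch in base R, not something lm() or ChainLadder provide: strip_call() is a hypothetical helper, and the pipe is imitated by binding the data to a variable named `.`, which produces the same recorded call as m_piped above.

```r
# Hypothetical helper: remove the recorded call so that two fits can be
# compared on their remaining components only.
strip_call <- function(fit) {
  fit$call <- NULL
  fit
}

# Direct call, as in the thread.
m <- lm(y ~ x, data = data.frame(x = 1:10, y = 1:10))

# Imitate the piped call without loading magrittr: the data is bound to a
# variable named `.`, so the recorded call reads `data = .`.
. <- data.frame(x = 1:10, y = 1:10)
m_dot <- lm(formula = y ~ x, data = .)

# The recorded calls differ ...
identical(m$call, m_dot$call)

# ... but with the call removed, all.equal() only looks at what is left.
all.equal(strip_call(m), strip_call(m_dot))
```

all.equal() is used here rather than identical() on purpose: identical() can still return FALSE when two fits carry formulas created in different environments.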
|
That's what I meant by "the only difference is in the original name of the Triangle object". Apologies for being unclear.
As for my concern: This behavior got me when I wrote unit tests for a function that uses MackChainLadder(). The tests would fail when using the pipe but not otherwise. The reason turned out to be the call object.
I have no strong opinion what to do about this. By having 'call' in the return value, MackChainLadder() is in line with lm() as demonstrated by @mages. However, its output varies slightly depending on whether the Triangle is passed directly or piped.
Obviously, there are wars fought over the merits and drawbacks of the pipe and we should probably not repeat this here. Therefore, feel free to close the issue if you conclude that consistency over time and with lm() weighs more heavily than consistency if piped. |
I believe this is a *feature*, not an issue, of the piping paradigm. E.g., if the formula had not been omitted in the toy example, the call of the result would have been different still:
# original toy example
m_piped <- data.frame(x=1:10, y=1:10) %>% lm
m_piped$call
lm(formula = .)
# toy example including formula
m_piped <- data.frame(x=1:10, y=1:10) %>% lm(y~x, data = .)
m_piped$call
lm(formula = y ~ x, data = .)
# toy example including formula and another default argument value
m_piped <- data.frame(x=1:10, y=1:10) %>% lm(x~y, data = ., model = TRUE)
m_piped$call
lm(formula = x ~ y, data = ., model = TRUE)
These example results are supported by the following technical note at the magrittr site (https://magrittr.tidyverse.org/reference/pipe.html):
“For most purposes, one can disregard the subtle aspects of magrittr's evaluation, but some functions may capture their calling environment, and thus using the operators will not be exactly equivalent to the "standard call" without pipe-operators.”
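The mechanism can be illustrated without magrittr. The sketch below uses only base R; record_call() is a made-up function that, like lm(), stores the result of match.call(). The stored call holds the argument expression as the caller wrote it, not the value, so the same data passed under a different name gives a different call.

```r
# Made-up function that records how it was called, as lm() does.
record_call <- function(data) {
  match.call()
}

df <- data.frame(x = 1:10, y = 1:10)

# Passed by name: the call records the symbol `df`.
direct <- record_call(df)

# Passed through a placeholder named `.`, which is how the piped lm() calls
# in this thread ended up with `data = .`.
. <- df
via_dot <- record_call(.)

print(direct)    # record_call(data = df)
print(via_dot)   # record_call(data = .)

# Same data, different recorded call.
identical(direct, via_dot)
```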
|
I have two final comments:
Thanks for raising this issue, and thanks again for your interest in ChainLadder. |
Leaving this here in case it might be of help to someone else. I use
suppressPackageStartupMessages(library(ChainLadder))
set.seed(1024)
mcl <- MackChainLadder(RAA)
set.seed(1024)
mcl2 <- MackChainLadder(RAA)
identical(mcl, mcl2)
#> [1] FALSE
# TL;DR:
# The difference is in the model terms attribute '.Environment'. I suppose that's
# the environment in which the calls are evaluated. Nothing to worry about,
# if you ask me.
# Explanation:
# which elements aren't identical:
for (nm in names(mcl)) {
  if (!identical(mcl[[nm]], mcl2[[nm]])) {
    print(nm)
  }
}
#> [1] "Models"
# The 'Models' are a bunch of calls and coefficients. Let's work with the first
# item in their list:
a <- mcl$Models[[1]]
b <- mcl2$Models[[1]]
# Which elements are different:
for (nm in names(a)) {
  if (!identical(a[[nm]], b[[nm]])) {
    print(nm)
  }
}
#> [1] "terms"
#> [1] "model"
a_terms <- a[['terms']]
b_terms <- b[['terms']]
# check which attributes aren't identical:
for (att in names(attributes(a_terms))) {
  if (!identical(attr(a_terms, which = att), attr(b_terms, which = att))) {
    print(att)
  }
}
#> [1] ".Environment"
# attr '.Environment'
a_model <- a[['model']]
b_model <- b[['model']]
# Again for the models, only the environment attribute is different since the
# columns are identical:
for (nm in names(a_model)) {
  if (!identical(a_model[[nm]], b_model[[nm]])) {
    print(nm)
  }
}
# checking the attributes:
for (att in names(attributes(a_model))) {
  if (!identical(attr(a_model, which = att), attr(b_model, which = att))) {
    print(att)
  }
}
#> [1] "terms"
# The 'terms' are the same as we had seen before:
identical(a[['terms']], attr(a_model, which = 'terms'))
#> [1] TRUE
# Meaning only the '.Environment' attribute is different.
Created on 2022-08-29 with reprex v2.0.2 |
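A base R sketch of why two otherwise equal results can differ only in '.Environment': a formula remembers the environment in which it was created, and identical() compares environments by reference. make_formula() is a hypothetical stand-in for any function that builds a formula internally; that MackChainLadder() fits its models this way is an assumption suggested by the output above, not something confirmed here.

```r
# Hypothetical stand-in for a function that creates a formula internally.
# Every call runs in a fresh environment, which the formula captures.
make_formula <- function() {
  y ~ x
}

f1 <- make_formula()
f2 <- make_formula()

# Same expression, but the captured environments are distinct objects.
identical(f1, f2)                              # FALSE
identical(environment(f1), environment(f2))    # FALSE

# Pointing both formulas at the same environment removes the difference.
environment(f1) <- globalenv()
environment(f2) <- globalenv()
identical(f1, f2)                              # TRUE
```

If that assumption holds, comparing two runs with all.equal(), or comparing only the numerical components of interest, would be a more forgiving check than identical().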
Problem
The value returned by MackChainLadder() depends on whether Triangle is passed directly (i.e. as a function argument) or using magrittr's pipe operator (%>%):
Further information
Differences are in elements "call" and "Model":
Arguably, the only difference is in the original name of the Triangle object. This difference may look minor and cosmetic. However, it will create confusion for anybody trying to verify that two pieces of code lead to the same outcome. Also, pipes are so prevalent these days that they shouldn't be ignored.
System info
I am using the current GitHub version of ChainLadder. Here's my sessionInfo():