RStudio Freezing When Dataframe Includes Factor #18

benjaminwnelson · 2021-07-17T16:17:18Z

Synthpop is working with numeric data, but anytime I include a factor with more than 2 levels it causes the program to freeze. Any thoughts on why this might be the case? Thanks!

Sinan-Yavuz · 2021-07-19T18:30:18Z

I have the same problem, RStudio crashes.

gillian-raab · 2021-07-21T09:05:52Z

Seems very odd. Which version of synthpop are you using: CRAN or GitHub?

Gillian M Raab, Emeritus Professor, Edinburgh Napier University; Part-time Research Fellow, Administrative Data Research Centre - Scotland, Edinburgh

Sinan-Yavuz · 2021-07-22T07:39:09Z

I am using the CRAN version, 1.6.0.

benjaminwnelson · 2021-07-22T15:56:11Z

I am using R 4.1.0.

gillian-raab · 2021-07-22T23:26:25Z

And synthpop from CRAN?

gillian-raab · 2021-07-23T00:03:09Z

I'm using R version 4.5, either on its own or in RStudio, and this code, which has 3 non-binary factors, runs fine. Can you see if that works for you and/or send us the code that failed?

library(synthpop)
# help(synthpop)  # version 1.6-0
ods <- SD2011[, c(1, 4:6)]
tt <- syn(ods)
compare(tt, ods)

Best
Gillian

benjaminwnelson · 2021-07-23T18:34:32Z

I'm using synthpop 1.6-0.

When I run your code it works perfectly. I tried it on my dataset again with 17 variables and 2,000 observations and it can't get past the gender variable. I tried only loading synthpop and no other packages and the same thing happened.

gillian-raab · 2021-07-26T11:27:05Z

Very odd. Can you send me the data set that caused problems or, if you can't for confidentiality reasons, the error message you received and some details of the variables, so I can suggest other ways?

G

wbuchanan · 2023-04-10T19:52:34Z

Using synthpop version 1.8-0 and R 4.2.3:

set.seed(7779311)
library(haven)
library(dplyr)
filenm <- "https://github.com/OpenSDP/faketucky/raw/master/faketucky.dta"
df <- haven::read_dta(filenm, 
	  col_select = c("sid", "first_dist_code", "first_hs_code", 
			         "first_hs_alt", "first_hs_urbanicity", "chrt_ninth", 
	  			     "male", "race_ethnicity", "frpl_ever_in_hs", 
	  			     "sped_ever_in_hs", "lep_ever_in_hs", "gifted_ever_in_hs",
	  			     "ever_alt_sch_in_hs", "scale_score_6_math", 
	  			     "scale_score_6_read", "scale_score_8_math", 
	  			     "scale_score_8_read", "pct_absent_in_hs", 
	  			     "pct_excused_in_hs", "avg_gpa_hs", "scale_score_11_eng", 
	  			     "scale_score_11_math", "scale_score_11_read",
	  			     "scale_score_11_comp", "collegeready_ever_in_hs", 
	  			     "careerready_ever_in_hs", "ap_ever_take_class", 
	  			     "last_acadyr_observed", "transferout", "dropout", 
	  			     "still_enrolled", "ontime_grad", "chrt_grad", "hs_diploma",
	  			     "enroll_yr1_any", "enroll_yr1_2yr", "enroll_yr1_4yr",
	  			     "enroll_yr2_any"))
names(df) <- c("stdid", "distid", "schcd", "altsch", "urbanicity", 
			   "cohort", "male", "race", "frleverhs", "swdeverhs", "eleverhs",
			   "tageverhs", "alteverhs", "mthss6", "rlass6", "mthss8", 
			   "rlass8", "pctabshs", "pctexcusedhs", "hsgpa", "acteng11", 
			   "actmth11", "actrla11", "actcmp11", "evercollrdyhs", 
			   "evercarrdyhs", "aptakenever", "lastobsyr", "transfer", 
			   "dropout", "stillenrolled", "gradontime", "gradcohort", 
			   "diploma", "yr1psenrany", "yr1psenr2yr", "yr1psenr4yr", 
			   "yr2psenrany")
df$schid <- paste0(df$distid, df$schcd)
validSchools <- data.frame("schid" = sample(unique(df$schid), size = 60))
df <- dplyr::inner_join(df, validSchools)
df$altsch <- as.factor(df$altsch)
df$cohort <- as.factor(df$cohort)
df$male <- as.factor(df$male)
df$swdeverhs <- as.factor(df$swdeverhs)
df$eleverhs <- as.factor(df$eleverhs)
df$schid <- as.factor(df$schid)
df$tageverhs <- as.factor(df$tageverhs)
df$alteverhs <- as.factor(df$alteverhs)
df$evercollrdyhs <- as.factor(df$evercollrdyhs)
df$evercarrdyhs <- as.factor(df$evercarrdyhs)
df$aptakenever <- as.factor(df$aptakenever)
df$transfer <- as.factor(df$transfer)
df$dropout <- as.factor(df$dropout)
df$stillenrolled <- as.factor(df$stillenrolled)
df$gradontime <- as.factor(df$gradontime)
df$diploma <- as.factor(df$diploma)
df$yr1psenrany <- as.factor(df$yr1psenrany)
df$yr1psenr2yr <- as.factor(df$yr1psenr2yr)
df$yr1psenr4yr <- as.factor(df$yr1psenr4yr)
df$yr2psenrany <- as.factor(df$yr2psenrany)
df$schid <- as.factor(df$schid)
df$race <- as.factor(df$race)
df$urbanicity <- as.factor(df$urbanicity)
df$frleverhs <- as.factor(df$frleverhs)
df$lastobsyr <- as.factor(df$lastobsyr)
df$gradcohort <- as.factor(df$gradcohort)
df <- df[-c(2, 3)]
library(synthpop)
# This works fine and executes relatively quickly
syn <- synthpop::syn(df)
# This freezes and fails to execute every time:
syn2 <- synthpop::syn(df[-c(1)], models = TRUE, 
                    visit.sequence = c("schid", "altsch", "male", "race", "cohort", "urbanicity", 
		 "frleverhs", "swdeverhs", "eleverhs", "tageverhs", "alteverhs", 
		 "mthss6", "rlass6", "mthss8", "rlass8", "pctabshs", "pctexcusedhs", 
		 "aptakenever", "lastobsyr", "transfer", "dropout", "stillenrolled", 
		 "hsgpa", "gradontime", "gradcohort", "diploma", "evercollrdyhs", 
		 "evercarrdyhs", "actmth11", "actrla11", "acteng11", "actcmp11", 
		 "yr1psenr2yr", "yr1psenr4yr", "yr1psenrany", "yr2psenrany"))

The second call to synthpop should sample school identifiers and then start modeling student-level attributes. It fails consistently. It uses only a single core, even though the machine has 12 available, and it does not use all of the available RAM.

gillian-raab · 2023-04-11T10:56:51Z

Dear William,

I have diagnosed your problem and can offer a couple of solutions. Before I supply you with details, can you please let me know if you get this? Then I can send you code and/or we could have a chat about how you are running your synthesis. I can offer some suggestions on how you might improve it.

Best wishes
Gillian

Gillian M Raab, Research Fellow (part-time), Scottish Centre for Administrative Data Research

wbuchanan · 2023-04-11T11:21:43Z

I did receive your email and am all ears for a solution. Thanks for the quick response, and I hope my example might be useful for some of the similar issues others have raised.

gillian-raab · 2023-04-11T14:23:19Z

Dear William,

Here is a revised and edited version of your code. You should be able to rerun your synthesis by one of two methods. Here are some relevant comments.

Sometimes CART models can get stuck. This is what seems to have happened in your case when you used the default cart method in synthpop, which picks up the rpart function from the rpart package. Usually this is caused by some curious patterns in the data, often involving small numbers. In this sort of case I would alter the model in some way to see if I can get the synthesis to run.

My first try is usually another version of CART. There is another cart option that uses the ctree function (I think it comes from the party package). When something goes wrong I usually just try the other routine, as this often cures it. This worked fine here, and the code to create syn4 does it. I prefer the ctree method in general because it provides nice model plots (see code). We don't have it as the default because it too can occasionally fail with a cryptic error message. The package authors have not been able to help with this.

But I went a bit further to see why rpart had gone wrong. It can often be due to small numbers and/or exact dependencies between variables in the initial models. In your case it was because altsch is actually at the school level, and the variable male has only 2 missing values. I messed around a bit to see what would work. Changing the order of synthesis can usually cure this. Moving the school id variable to the end of the synthesis allowed this to work OK (syn5).

I don't know why you wanted to put this in first. There are several reasons why this is a bad idea.

1. The schid divides the data into subgroups, many of which are small. This means they will be unlikely to appear in tree-based models.
2. Altsch is derived from schid.
3. The dependency of one variable on others does not require it to be synthesised first.

You can see this if you look at the model for schid that the code prints out for syn5.

Where are you and what are you using our package for? We are always interested to know.

Best wishes
Gillian
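
The two data patterns named in this reply (a factor that is fully determined by another factor, and a factor with only a handful of missing values) can be checked before running a synthesis. The following is a minimal base-R sketch on made-up toy data, not the data set from this thread; the helper names `is_determined_by` and `summarise_factor` are hypothetical and are not part of synthpop.

```r
# Toy data standing in for the thread's data set (names and values are made up).
set.seed(1)
n <- 200
toy <- data.frame(
  schid = factor(sample(sprintf("S%02d", 1:8), n, replace = TRUE)),
  male  = factor(sample(c("0", "1"), n, replace = TRUE))
)

# altsch is a school-level flag, so it is fully determined by schid.
toy$altsch <- factor(ifelse(toy$schid %in% c("S01", "S02"), "1", "0"))

# Introduce a very small number of missing values, as reported for `male`.
toy$male[c(5, 17)] <- NA

# Hypothetical helper: TRUE if every level of `by` maps to at most one
# observed level of `x`, i.e. `x` is an exact function of `by`.
is_determined_by <- function(x, by) {
  tab <- table(by, x, useNA = "no")
  all(rowSums(tab > 0) <= 1)
}

# Hypothetical helper: number of levels, number of missing values, and the
# size of the smallest cell (missing values counted as their own cell).
summarise_factor <- function(x) {
  counts <- table(x, useNA = "ifany")
  c(levels = nlevels(x), n_missing = sum(is.na(x)), smallest_cell = min(counts))
}

is_determined_by(toy$altsch, toy$schid)  # school-level flag vs. school id
is_determined_by(toy$male, toy$schid)    # student-level variable vs. school id
sapply(toy[c("schid", "male", "altsch")], summarise_factor)
```

A factor that comes back as fully determined by an earlier variable, or one whose smallest cell is only a few rows, is a candidate for moving later in the visit sequence or for dropping from the predictors.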

wbuchanan · 2023-04-11T14:33:09Z

Hi @gillian-raab,

I included the school ID variable first purposefully to sample school IDs (hopefully in a manner that would retain the marginal distribution of school IDs). I initially had school-level variables listed first in the visit sequence, followed by demographic characteristics of students, and then test scores and outcomes. In terms of use, this particular example is purely for demonstration purposes, to explain how synthetic data can be used for privacy protection to increase access to data.

That said, I didn't see any other code listed, but can at least try making some of the modifications you mentioned.

gillian-raab · 2023-04-12T08:19:04Z

Apologies William, the code appears to have failed to attach. Here it is now. It is not usually necessary to put a variable at the start of the synthesis to maintain the marginal distributions. Good luck with using synthpop.

Gillian
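
The attached code did not survive in this copy of the thread. What follows is not that code. It is a rough sketch of the two approaches described in the discussion, run on the SD2011 data that ships with synthpop rather than on the faketucky extract. The object names `syn4` and `syn5` are borrowed from the earlier reply for orientation only; the choice of columns, the seed, and the use of `region` as a stand-in for a many-level identifier such as `schid` are assumptions.

```r
library(synthpop)

# Same columns as the earlier SD2011 test, selected by name
# (sex, placesize, region, edu).
ods <- SD2011[, c("sex", "placesize", "region", "edu")]

# Approach 1: keep the column order but use ctree instead of the default cart.
syn4 <- syn(ods, method = "ctree", seed = 2023, print.flag = FALSE)

# Approach 2: keep the default method but visit the many-level factor last.
# models = TRUE stores the fitted models so they can be inspected afterwards.
syn5 <- syn(ods,
            visit.sequence = c("sex", "placesize", "edu", "region"),
            models = TRUE, seed = 2023, print.flag = FALSE)

# Compare observed and synthetic marginal proportions for the factor that
# was synthesised last.
obs_p <- prop.table(table(ods$region))
syn_p <- prop.table(table(syn5$syn$region))
round(cbind(observed = obs_p, synthetic = syn_p), 3)
```

If the two columns of the final table are close, the marginal distribution of the last-visited factor has been retained without placing it first in the visit sequence. On the thread's own data, the analogous change would be to move `"schid"` to the end of `visit.sequence` or to pass `method = "ctree"`.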
