A new style spreadsheet for statistics and data science #8453
rdstern
started this conversation in
Show and tell
@rdstern this sounds great! What would we want them to write up for climatic work? I'm not sure who uses R-Instat on the agriculture side. Do you have any ideas who might be good for this? Cedric also suggested some plot-related bits back in January that I haven't been able to get to yet - they are still on my radar though!
The recent work by @N-thony (including find and word-wrap), plus the work that @Patowhiz is doing on wide data sets, is getting us ready to extol the virtues of the data view in R-Instat. I suggest it can become the spreadsheet for statistical work that you would always have wanted, if only you had known it existed. This needs to be written up - and soon. It may be that we can also tempt Cedric (CIMH) to write it up for climatic work, and perhaps someone else for agriculture - rather than just us? @lilyclements what do you think?
This is exciting, because the data view will join the output window in looking really good and being an excellent way to get and organise the results. And the script windows are already looking good - the plan is that they will soon be able to process everything that you can currently do in R in RStudio.
And just to be clear: we are not competing. We are adding. So, if you are a spreadsheet addict, then that's fine. You can continue, and R-Instat will read your data and also write any changed data back to your spreadsheet if you wish.
And, if you are a spreadsheet user, then adding code is a big step. In R-Instat, adding a bit of R code is a very small step. However, just as there are currently many spreadsheet addicts, there are also many R addicts. And they use RStudio, just as many spreadsheet addicts use Excel. And we like RStudio - just as we like spreadsheets. Etc. on this bit. I want to get back to the spreadsheet user.
Now there are various things you can't do in the data view in R-Instat that you can do in your ordinary spreadsheet. You can't have a messy spreadsheet, where you have bits of data all over the sheet, perhaps with some summary results mixed in with the data. But, if you are using your spreadsheet for data science or statistics work, then you are probably using pivot tables and graphs. They are great, so we hope you are. And, if your spreadsheet data are OK for a pivot table, then they are also fine for a data frame in R-Instat.
A second limitation is that data frames in R-Instat don't (yet) change values dynamically in the way you can do in a spreadsheet.
The third is a difference, rather than a limitation, and that is where the results go. With a spreadsheet, you put the results back on the same sheet, or they go into another sheet. In R-Instat they don't go back into the same sheet - or data frame. They go either onto another sheet or into a special results window. They go onto another sheet if they might then also become data for the next step in the analysis. Otherwise they go into the results window.
We are keen on spreadsheets, because they tempt users to look at their data. And we are concerned that when people stop using spreadsheets so much, they also stop looking at their data as much. Good statisticians are also data detectives, and data detectives look at the data as well as the results. You just have to get a bit more skilled when you get more data. For example, boxplots become a neat way to look for oddities in data. They also take us towards one of the bonuses you get from your sheets (or data frames) in R-Instat, compared to an ordinary spreadsheet.
Types of data
The column names - Select features.
Tidy data - stack and unstack. - no repeated copy and paste
How long and how wide?
etc.
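To make the "stack and unstack" item concrete, here is a minimal sketch of what the two operations do to a table. It is written in plain Python for illustration only and is not how R-Instat implements them; the station and month data, and the function names `stack` and `unstack`, are hypothetical.

```python
# Hypothetical wide table: one row per station, one column per month.
wide = [
    {"station": "A", "jan": 12.0, "feb": 30.5, "mar": 48.1},
    {"station": "B", "jan": 0.0, "feb": 4.2, "mar": 19.7},
]


def stack(rows, id_col, value_cols, names_to="variable", values_to="value"):
    """Wide to long: one output row per (input row, value column) pair."""
    return [
        {id_col: row[id_col], names_to: col, values_to: row[col]}
        for row in rows
        for col in value_cols
    ]


def unstack(rows, id_col, names_from="variable", values_from="value"):
    """Long to wide: gather the values for each id back into a single row."""
    out = {}
    for row in rows:
        wide_row = out.setdefault(row[id_col], {id_col: row[id_col]})
        wide_row[row[names_from]] = row[values_from]
    return list(out.values())


# Stack the three month columns into a "month" column and a "rain_mm" column.
long = stack(wide, "station", ["jan", "feb", "mar"],
             names_to="month", values_to="rain_mm")

# Unstacking the long table gives back the original wide layout.
round_trip = unstack(long, "station", names_from="month", values_from="rain_mm")
```

The stacked table has one row per station and month, which is the layout that pivot tables and most plots expect. In a spreadsheet, getting there usually means copying and pasting each month column in turn; here it is a single step.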