More expansive Introduction lecture #324

Closed
maxnoe opened this issue Jan 23, 2023 · 9 comments · Fixed by #339

Comments

@maxnoe
Member

maxnoe commented Jan 23, 2023

As discussed, we should start the workshop with a proper introduction to the core concepts / ideas of the workshop and motivation for all the parts, make it easier for students to see "the big picture".

Rough outline for discussion:

  • Basic Goals

    • Teach everyone basics in programming → Python
    • Basic data analysis for the lab courses → numpy, matplotlib, fitting (scipy or iminuit?)
    • Scientific Writing → LaTeX
  • Open Science

    • Free and Open Source Software → Everything
  • Reproducible Science

    • Reproducibility Crisis
    • Version Control → git
    • Workflow Automation → (snake-?) make
  • The different operating systems (aka Windows sucks)

  • Text Editors

@jpwgnr

jpwgnr commented Jan 27, 2023

Another point for the discussion:

  • Present content with the exact same working environments as described in the workshop, so codium instead of vim, Ubuntu Desktop instead of i3, etc.
  • Also discuss the pros and cons of Jupyter notebooks, as many students use notebooks for basically all of their documentation
  • For the theory part and error propagation, a basic introduction to sympy would be nice (then students with an interest in theory learn that you don't have to compute everything by hand *cough, Yanick*, also in terms of reproducibility and open source in comparison with Mathematica)
  • I would really prefer snakemake, as the learning threshold is way lower for beginners and it makes way more sense since it is completely pythonic
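To make the sympy point above concrete, here is a minimal sketch of Gaussian error propagation done symbolically. It assumes uncorrelated uncertainties; the helper name `propagate_error` and the resistance example `R = U / I` are illustrative assumptions, not part of the workshop material.

```python
import sympy as sp


def propagate_error(expr, variables):
    """Return the symbolic Gaussian uncertainty of `expr`.

    Assumes the uncertainties of `variables` are uncorrelated.
    Hypothetical helper, only meant as an illustration.
    """
    # One uncertainty symbol per input variable, e.g. sigma_U for U
    sigmas = {var: sp.Symbol(f"sigma_{var.name}", positive=True) for var in variables}
    # Sum of squared (partial derivative * uncertainty) terms
    variance = sum((sp.diff(expr, var) * sigmas[var]) ** 2 for var in variables)
    return sp.sqrt(variance), sigmas


# Example (assumed): resistance from voltage and current
U, I = sp.symbols("U I", positive=True)
R = U / I
sigma_R, sigmas = propagate_error(R, [U, I])
```

Calling `sp.latex(sigma_R)` would give an expression that can be pasted into a LaTeX report, and `sigma_R.subs({...})` evaluates it for measured values.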

In my opinion, the overall motivation should be to have an example protocol with an example data set that needs to be evaluated in the first week and turned into a nice PDF in the second week. The unix, git and snakemake parts should rather be distributed over the whole workshop, introducing each part when it becomes useful in the process.
For example:

  • one would start in the terminal -> short explanation of terminals, basic commands and folder structure (no talk about access, bash scripts etc. this can be done later)
  • open codium -> quick explanation of editors
  • Create a txt file with data like one would do in the lab course (Praktikum) -> copy and paste from some example file on our website
  • Then one could start with the python stuff but always refer to this original data set and the main goal of creating a nice analysis of the data set
  • after the first parts are finished, this can then be plugged into the snakemake automation with a simple introduction
  • git can still be put between python and latex blocks, but still in the context of our main document we want to create

@maxnoe
Member Author

maxnoe commented Jan 27, 2023

snakemake as the learning threshold is way lower for beginners

Is it though? The syntax is not dramatically simpler and the hard thing is learning how to think in targets and dependencies, not the actual tool I'd say.

Make is available and used everywhere; snakemake is relatively niche (although a nice one). Learning snakemake might only be easy for people already familiar with make and python.
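The "targets and dependencies" way of thinking mentioned above can be sketched in a few lines of plain Python, independent of whether make or snakemake is used. This is a toy illustration, not how either tool is implemented; the names `needs_rebuild` and `build` and the `rules` layout are assumptions made up for this sketch.

```python
from pathlib import Path


def needs_rebuild(target, dependencies):
    """A target is out of date if it is missing or older than any dependency."""
    target = Path(target)
    if not target.exists():
        return True
    target_mtime = target.stat().st_mtime_ns
    return any(Path(dep).stat().st_mtime_ns > target_mtime for dep in dependencies)


def build(target, rules, log):
    """Bring `target` up to date.

    `rules` maps a target path to a tuple (dependencies, recipe), where
    `recipe` is a function that creates the target. Rebuilt targets are
    appended to `log` in build order.
    """
    if target not in rules:
        # A plain source file: nothing to build, it only has to exist
        if not Path(target).exists():
            raise FileNotFoundError(f"No rule to make target {target!r}")
        return
    dependencies, recipe = rules[target]
    # First make sure every dependency is up to date itself
    for dep in dependencies:
        build(dep, rules, log)
    if needs_rebuild(target, dependencies):
        recipe()
        log.append(target)
```

With rules such as "the plot depends on the data file, the report depends on the plot", calling `build` on the report rebuilds only what changed, which is the concept participants have to learn with either tool.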

@LuckyJosh
Contributor

My two cents on snakemake:
The availability aspect is, if at all, not a big issue.
If you are stuck on a machine that can only run make and not snakemake, the participants are out of luck anyway.
These specific problems can be addressed during the work on their theses or wherever these problems arise.

In my experience over the last years, make is one of the
programs with the lowest adoption rate.

I would agree that the basic syntax is very similar,
yet I would hope that the takeaway for the participants is
"it's just 'special' python".

@LuckyJosh
Contributor

In general, I am in favor of softening the boundaries between the different tools, and an example report should be a helpful orientation.

@chrbeckm
Member

I think the common thread of a lab report is a good idea. It would tie every part of the workshop to it.

@chrbeckm
Member

If we build the exemplary lab report completely, it is on the one hand our test document (#48) and on the other a second reference for the attendees. Maybe structure and incorporate it in the following way:

  • Data files with x and y values (can be as boring as a sine and cosine wave)
  • read them in during the numpy notebook
  • matplotlib: use it after a quick intro to show labels, legends, title, limits, ticks (have multiple files as subplots)
  • one maybe even with uncertainties for the errorbar plot
  • scipy: use it as a filler between polyfit and sigmoid
  • have a final exercise where it will be prepared for the LaTeX week. Smaller steps maybe with each notebook
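As a rough sketch of the first steps in this list, the following creates a "boring" sine/cosine data file, reads it back with numpy, and does a straight-line fit with `np.polyfit`. The function names and the three-column file layout are assumptions for illustration only; plotting and the scipy fits are left out to keep it short.

```python
import numpy as np


def write_example_data(path, n_points=50):
    """Write x, sin(x), cos(x) as three whitespace-separated columns."""
    x = np.linspace(0, 2 * np.pi, n_points)
    data = np.column_stack([x, np.sin(x), np.cos(x)])
    # The header line is written as a comment ("# ...") by default
    np.savetxt(path, data, header="x sin(x) cos(x)")


def load_example_data(path):
    """Read the three columns back into separate arrays."""
    # Lines starting with "#" are skipped; unpack=True returns one array per column
    x, sin_x, cos_x = np.genfromtxt(path, unpack=True)
    return x, sin_x, cos_x


def fit_line(x, y):
    """Least-squares straight line through the data, returns (slope, intercept)."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept
```

The same file could later serve as input for the matplotlib and scipy notebooks, so that every step refers back to one data set.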

@maxnoe
Member Author

maxnoe commented Jan 30, 2023

If we build the exemplary lab report completely, it

I don't really understand where this is going. With "introduction lecture" I meant exactly that, a lecture part (30 minutes to an hour) at the start of the workshop.

Are you saying that we should structure the python, numpy and matplotlib lectures exactly so that it follows a lab report? If yes, I am opposed. We should show the relevant functionality, yes, but I think that is much easier from first principles and with a general introduction than directly connecting that to a lab report.

@chrbeckm
Member

Okay, then I misunderstood you. I thought you meant to use a lab report as motivation in the introduction lecture and as a guide through the notebooks.

@SepplL
Contributor

SepplL commented Jan 30, 2023

But maybe as a compromise we could indeed change or adapt some of the exercises to refer to or match the example/introductory lab report that we show.
Just to keep both - the grand story for achieving the lab report and at the same time it's an opportunity for showing the participants "how far they got".
That can of course be done in the same order as before.

Regarding the (snake-)make issue: I agree that there can be portability problems, but for the first set of lab reports and most likely even most of the thesis, snakemake should be fine.
As long as you are not on remote HPC clusters, it should not be a big deal to get your system of choice running.
And for the people who are - they have to have a lot more introduction and time spent on these things than we can offer in the two short weeks.

5 participants