ps 1
kriemo committed Nov 29, 2023
1 parent 865bc92 commit 1ccd202
Showing 3 changed files with 61 additions and 24 deletions.
Original file line number Diff line number Diff line change
@@ -118,11 +118,11 @@ See `Help > Cheatsheets` for very helpful graphical references.The `base R`, `dp

## The more you try the more you will learn.

-Learning a foreign language requires continual practice speaking and writing the language. To learn you need to try new phrases and expressions. To learn you have to make mistakes. The more you try and experiment the quicker you will learn.
+Learning a foreign language requires continual practice speaking and writing the language. To learn you need to try new phrases and expressions. To learn you have to make mistakes. The more you try and experiment the quicker you will learn.

-Learning a programming language is very similar. We communicate by organizing a series of steps in the right order to instruct the computer to accomplish a task.
+Learning a programming language is very similar. We communicate by organizing a series of steps in the right order to instruct the computer to accomplish a task.

-Type and execute commands, rather than copy and pasting, you will learn faster. Fiddle around with the code, see what works and what doesn't.
+Type and execute commands, rather than copy and pasting, you will learn faster. Fiddle around with the code, see what works and what doesn't.

Probably everything we do in the class can be done by a LLM such as ChatGPT. These tools can help you, but you will be more effective at using them if you understand the fundamentals. You will also be more productive in the long term if you understand the basics.

@@ -146,6 +146,7 @@ R can be used as a simple calculator:
5^2 # 5 raised to the second power
2 + 3 * 5 # R respects the order of math operations.
```

## Example datasets in R

R and R packages include small datasets to demonstrate how to use a package or functionality. `data()` will show you many of the datasets included with a base R installation. We will use the state datasets, which contain data on the 50 US states.
@@ -199,6 +200,9 @@ Now, what is the value of x?

```{r, eval = FALSE}
x = ... ?
...
#[1] 60
```


@@ -332,7 +336,10 @@ state.name[51]

```{r}
# hint use the `sort()` function
sum(sort(state.area)[1:10])
sum(sort(state.area)[41:50])
sum(sort(state.area, decreasing = TRUE)[1:10])
```
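
The same two totals can also be reached by sorting once and taking each end of the result. This is a minimal sketch offered as an alternative, not the answer key itself; the names `sorted_area`, `smallest_10`, and `largest_10` are introduced here for illustration only.

```r
# state.area is a numeric vector of state areas (square miles) shipped with base R
sorted_area <- sort(state.area)             # ascending order by default

smallest_10 <- sum(head(sorted_area, 10))   # first 10 values: the 10 smallest states
largest_10  <- sum(tail(sorted_area, 10))   # last 10 values: the 10 largest states

smallest_10
largest_10
```

Using `head()` and `tail()` avoids hard-coding the positions `41:50`, so the same lines would still work on a vector of a different length.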


@@ -403,13 +410,15 @@ Let's answer a related question, how **many** states are larger than 100,000 squ

```{r}
# multiple approaches will work
length(which(state.area > 100000))
sum(state.area > 100000)
```

Using the sum() function works because TRUE is stored as 1 and FALSE is stored as 0.

```{r}
as.integer(c(TRUE, FALSE, TRUE))
sum(c(TRUE, FALSE, TRUE))
```
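
The same coercion makes `mean()` useful on logical vectors: averaging 1s and 0s gives the proportion of `TRUE` values. The sketch below is an illustrative aside, and the variable name `is_large` is an assumption introduced here.

```r
# Logical vector: TRUE for each state larger than 100,000 square miles
is_large <- state.area > 100000

sum(is_large)    # number of TRUE values (count of large states)
mean(is_large)   # proportion of TRUE values (fraction of states that are large)
```

Because `state.area` holds 50 values, the proportion is simply the count divided by 50.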


Original file line number Diff line number Diff line change
@@ -93,8 +93,8 @@


<!-- https://schema.org/Article -->
-<meta property="article:published" itemprop="datePublished" content="2023-11-28"/>
-<meta property="article:created" itemprop="dateCreated" content="2023-11-28"/>
+<meta property="article:published" itemprop="datePublished" content="2023-11-29"/>
+<meta property="article:created" itemprop="dateCreated" content="2023-11-29"/>
<meta name="article:author" content="Kent Riemondy"/>

<!-- https://developers.facebook.com/docs/sharing/webmasters#markup -->
@@ -110,7 +110,7 @@
<!--radix_placeholder_rmarkdown_metadata-->

<script type="text/json" id="radix-rmarkdown-metadata">
{"type":"list","attributes":{"names":{"type":"character","attributes":{},"value":["title","author","output","date"]}},"value":[{"type":"character","attributes":{},"value":["Class 1: Introduction to the R statistical programming language"]},{"type":"list","attributes":{},"value":[{"type":"list","attributes":{"names":{"type":"character","attributes":{},"value":["name","url","affiliation","affiliation_url","orcid_id"]}},"value":[{"type":"character","attributes":{},"value":["Kent Riemondy"]},{"type":"character","attributes":{},"value":["https://github.com/kriemo"]},{"type":"character","attributes":{},"value":["RNA Bioscience Initiative"]},{"type":"character","attributes":{},"value":["https://medschool.cuanschutz.edu/rbi"]},{"type":"character","attributes":{},"value":["0000-0003-0750-1273"]}]}]},{"type":"list","attributes":{"names":{"type":"character","attributes":{},"value":["distill::distill_article"]}},"value":[{"type":"list","attributes":{"names":{"type":"character","attributes":{},"value":["self_contained","toc"]}},"value":[{"type":"logical","attributes":{},"value":[false]},{"type":"logical","attributes":{},"value":[true]}]}]},{"type":"character","attributes":{},"value":["11-28-2023"]}]}
{"type":"list","attributes":{"names":{"type":"character","attributes":{},"value":["title","author","output","date"]}},"value":[{"type":"character","attributes":{},"value":["Class 1: Introduction to the R statistical programming language"]},{"type":"list","attributes":{},"value":[{"type":"list","attributes":{"names":{"type":"character","attributes":{},"value":["name","url","affiliation","affiliation_url","orcid_id"]}},"value":[{"type":"character","attributes":{},"value":["Kent Riemondy"]},{"type":"character","attributes":{},"value":["https://github.com/kriemo"]},{"type":"character","attributes":{},"value":["RNA Bioscience Initiative"]},{"type":"character","attributes":{},"value":["https://medschool.cuanschutz.edu/rbi"]},{"type":"character","attributes":{},"value":["0000-0003-0750-1273"]}]}]},{"type":"list","attributes":{"names":{"type":"character","attributes":{},"value":["distill::distill_article"]}},"value":[{"type":"list","attributes":{"names":{"type":"character","attributes":{},"value":["self_contained","toc"]}},"value":[{"type":"logical","attributes":{},"value":[false]},{"type":"logical","attributes":{},"value":[true]}]}]},{"type":"character","attributes":{},"value":["11-29-2023"]}]}
</script>
<!--/radix_placeholder_rmarkdown_metadata-->

@@ -1518,7 +1518,7 @@
<!--radix_placeholder_front_matter-->

<script id="distill-front-matter" type="text/json">
-{"title":"Class 1: Introduction to the R statistical programming language","authors":[{"author":"Kent Riemondy","authorURL":"https://github.com/kriemo","affiliation":"RNA Bioscience Initiative","affiliationURL":"https://medschool.cuanschutz.edu/rbi","orcidID":"0000-0003-0750-1273"}],"publishedDate":"2023-11-28T00:00:00.000-07:00","citationText":"Riemondy, 2023"}
+{"title":"Class 1: Introduction to the R statistical programming language","authors":[{"author":"Kent Riemondy","authorURL":"https://github.com/kriemo","affiliation":"RNA Bioscience Initiative","affiliationURL":"https://medschool.cuanschutz.edu/rbi","orcidID":"0000-0003-0750-1273"}],"publishedDate":"2023-11-29T00:00:00.000-07:00","citationText":"Riemondy, 2023"}
</script>

<!--/radix_placeholder_front_matter-->
@@ -1538,7 +1538,7 @@ <h1>Class 1: Introduction to the R statistical programming language</h1>
<div class="d-byline">
Kent Riemondy <a href="https://github.com/kriemo" class="uri">https://github.com/kriemo</a> (RNA Bioscience Initiative)<a href="https://medschool.cuanschutz.edu/rbi" class="uri">https://medschool.cuanschutz.edu/rbi</a>

-<br/>11-28-2023
+<br/>11-29-2023
</div>

<div class="d-article">
@@ -1770,7 +1770,12 @@ <h2 id="assigning-values-to-variables">Assigning values to variables</h2>
</div>
<p>Now, what is the value of x?</p>
<div class="layout-chunk" data-layout="l-body">
<div class="sourceCode" id="cb20"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb20-1"><a href="#cb20-1" aria-hidden="true" tabindex="-1"></a>x <span class="ot">=</span> ... ?</span></code></pre></div>
<div class="sourceCode">
<pre class="sourceCode r"><code class="sourceCode r"><span><span class='va'>x</span> <span class='op'>=</span> <span class='va'>...</span> <span class='op'>?</span></span>
<span><span class='va'>...</span></span>
<span></span>
<span><span class='co'>#[1] 60</span></span></code></pre>
</div>
</div>
<h2 id="vectors-and-atomic-types-in-r">Vectors and atomic types in R</h2>
<p>There are fundamental data types in R which represent integer, `characters, numeric, and logical values, as well as a few other specialized types.</p>
@@ -1924,17 +1929,17 @@ <h3 id="making-vectors-from-scratch">making vectors from scratch</h3>
<pre class="sourceCode r"><code class="sourceCode r"><span><span class='co'># get 5 values from a normal distribution with mean of 0 and sd of 1</span></span>
<span><span class='fu'><a href='https://rdrr.io/r/stats/Normal.html'>rnorm</a></span><span class='op'>(</span><span class='fl'>5</span><span class='op'>)</span></span></code></pre>
</div>
-<pre><code>[1] 0.10347970 -0.32270985 0.88511658 -1.25701091 0.01266694</code></pre>
+<pre><code>[1] 0.46625168 -1.08376829 -0.39536586 -0.08183084 0.33499270</code></pre>
<div class="sourceCode">
<pre class="sourceCode r"><code class="sourceCode r"><span><span class='co'># get 5 values from uniform distribution from 0 to 1</span></span>
<span><span class='fu'><a href='https://rdrr.io/r/stats/Uniform.html'>runif</a></span><span class='op'>(</span><span class='fl'>5</span><span class='op'>)</span></span></code></pre>
</div>
-<pre><code>[1] 0.1643224 0.2905542 0.8360480 0.2602267 0.2875998</code></pre>
+<pre><code>[1] 0.5951937 0.3089459 0.8601477 0.1417087 0.3356661</code></pre>
<div class="sourceCode">
<pre class="sourceCode r"><code class="sourceCode r"><span><span class='co'># sample 5 area values </span></span>
<span><span class='fu'><a href='https://rdrr.io/r/base/sample.html'>sample</a></span><span class='op'>(</span><span class='va'>state.area</span>, <span class='fl'>5</span><span class='op'>)</span></span></code></pre>
</div>
-<pre><code>[1] 24181 82264 68192 96981 58560</code></pre>
+<pre><code>[1] 121666 9304 2057 589757 24181</code></pre>
</div>
<h3 id="subsetting-vectors-in-r">Subsetting vectors in R</h3>
<p>R uses 1-based indexing to select values from a vector. The first element of a vector is at index 1. The <code>[</code> operator can be used to extract (or assign) elements in a vector. Integer vectors or logical vectors can be used to extract values.</p>
@@ -1973,8 +1978,18 @@ <h3 id="exercise">Exercise:</h3>
<p><em>What is the total area occupied by the 10 smallest states? What is the total area occupied by the 10 largest states?</em></p>
<div class="layout-chunk" data-layout="l-body">
<div class="sourceCode">
<pre class="sourceCode r"><code class="sourceCode r"><span><span class='co'># hint use the `sort()` function </span></span></code></pre>
<pre class="sourceCode r"><code class="sourceCode r"><span><span class='co'># hint use the `sort()` function </span></span>
<span><span class='fu'><a href='https://rdrr.io/r/base/sum.html'>sum</a></span><span class='op'>(</span><span class='fu'><a href='https://rdrr.io/r/base/sort.html'>sort</a></span><span class='op'>(</span><span class='va'>state.area</span><span class='op'>)</span><span class='op'>[</span><span class='fl'>1</span><span class='op'>:</span><span class='fl'>10</span><span class='op'>]</span><span class='op'>)</span></span></code></pre>
</div>
<pre><code>[1] 84494</code></pre>
<div class="sourceCode">
<pre class="sourceCode r"><code class="sourceCode r"><span><span class='fu'><a href='https://rdrr.io/r/base/sum.html'>sum</a></span><span class='op'>(</span><span class='fu'><a href='https://rdrr.io/r/base/sort.html'>sort</a></span><span class='op'>(</span><span class='va'>state.area</span><span class='op'>)</span><span class='op'>[</span><span class='fl'>41</span><span class='op'>:</span><span class='fl'>50</span><span class='op'>]</span><span class='op'>)</span></span></code></pre>
</div>
<pre><code>[1] 1808184</code></pre>
<div class="sourceCode">
<pre class="sourceCode r"><code class="sourceCode r"><span><span class='fu'><a href='https://rdrr.io/r/base/sum.html'>sum</a></span><span class='op'>(</span><span class='fu'><a href='https://rdrr.io/r/base/sort.html'>sort</a></span><span class='op'>(</span><span class='va'>state.area</span>, decreasing <span class='op'>=</span> <span class='cn'>TRUE</span><span class='op'>)</span><span class='op'>[</span><span class='fl'>1</span><span class='op'>:</span><span class='fl'>10</span><span class='op'>]</span><span class='op'>)</span></span></code></pre>
</div>
<pre><code>[1] 1808184</code></pre>
</div>
<h3 id="using-vectors-and-subsetting-to-perform-more-complex-operations">Using vectors and subsetting to perform more complex operations</h3>
<p><em>What if we wanted to know which states have an area greater than 100,000 (square miles)?</em></p>
@@ -2041,12 +2056,25 @@ <h3 id="exercise-1">Exercise:</h3>
<p>Let’s answer a related question, how <strong>many</strong> states are larger than 100,000 square miles?</p>
<div class="layout-chunk" data-layout="l-body">
<div class="sourceCode">
<pre class="sourceCode r"><code class="sourceCode r"><span><span class='co'># multiple approaches will work</span></span></code></pre>
<pre class="sourceCode r"><code class="sourceCode r"><span><span class='co'># multiple approaches will work</span></span>
<span><span class='fu'><a href='https://rdrr.io/r/base/length.html'>length</a></span><span class='op'>(</span><span class='fu'><a href='https://rdrr.io/r/base/which.html'>which</a></span><span class='op'>(</span><span class='va'>state.area</span> <span class='op'>&gt;</span> <span class='fl'>100000</span><span class='op'>)</span><span class='op'>)</span></span></code></pre>
</div>
<pre><code>[1] 8</code></pre>
<div class="sourceCode">
<pre class="sourceCode r"><code class="sourceCode r"><span><span class='fu'><a href='https://rdrr.io/r/base/sum.html'>sum</a></span><span class='op'>(</span><span class='va'>state.area</span> <span class='op'>&gt;</span> <span class='fl'>100000</span><span class='op'>)</span></span></code></pre>
</div>
<pre><code>[1] 8</code></pre>
</div>
<p>Using the sum() function works because TRUE is stored as 1 and FALSE is stored as 0.</p>
<div class="layout-chunk" data-layout="l-body">

<div class="sourceCode">
<pre class="sourceCode r"><code class="sourceCode r"><span><span class='fu'><a href='https://rdrr.io/r/base/integer.html'>as.integer</a></span><span class='op'>(</span><span class='fu'><a href='https://rdrr.io/r/base/c.html'>c</a></span><span class='op'>(</span><span class='cn'>TRUE</span>, <span class='cn'>FALSE</span>, <span class='cn'>TRUE</span><span class='op'>)</span><span class='op'>)</span></span></code></pre>
</div>
<pre><code>[1] 1 0 1</code></pre>
<div class="sourceCode">
<pre class="sourceCode r"><code class="sourceCode r"><span><span class='fu'><a href='https://rdrr.io/r/base/sum.html'>sum</a></span><span class='op'>(</span><span class='fu'><a href='https://rdrr.io/r/base/c.html'>c</a></span><span class='op'>(</span><span class='cn'>TRUE</span>, <span class='cn'>FALSE</span>, <span class='cn'>TRUE</span><span class='op'>)</span><span class='op'>)</span></span></code></pre>
</div>
<pre><code>[1] 2</code></pre>
</div>
<h3 id="replacing-or-adding-values-at-position">Replacing or adding values at position</h3>
<p>Values in a vector can be also replaced or added by assignment at specific indexes. In this case the bracket <code>[</code> notation is left of the assignment operator <code>&lt;-</code>. You can read this as assign value on right to positions in the object on the left.</p>
@@ -2090,9 +2118,9 @@ <h1 id="r-operations-are-vectorized">R operations are vectorized</h1>
<pre><code>[1] 0.0000000 0.6931472 1.0986123 1.3862944 1.6094379</code></pre>
</div>
<p>If you are used to programming in other languages (e.g <code>C</code> or <code>python</code>) you might have written a <code>for</code> loop to do the same, something like this.</p>
<div class="sourceCode" id="cb64"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb64-1"><a href="#cb64-1" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> (i <span class="cf">in</span> x) { </span>
<span id="cb64-2"><a href="#cb64-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">log</span>(i)</span>
<span id="cb64-3"><a href="#cb64-3" aria-hidden="true" tabindex="-1"></a>}</span></code></pre></div>
<div class="sourceCode" id="cb70"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb70-1"><a href="#cb70-1" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> (i <span class="cf">in</span> x) { </span>
<span id="cb70-2"><a href="#cb70-2" aria-hidden="true" tabindex="-1"></a> <span class="fu">log</span>(i)</span>
<span id="cb70-3"><a href="#cb70-3" aria-hidden="true" tabindex="-1"></a>}</span></code></pre></div>
<p>In R this is generally not necessary. The built in vectorization saves typing and makes for very compact and efficient code in R. You can write <code>for</code> loops in R (more on this later in the course) however using the built in vectorization is generally a faster and easier to read solution.</p>
<h2 id="review">Review</h2>
<p>To review today’s material, do the following:</p>
@@ -2104,7 +2132,7 @@ <h2 class="appendix" id="acknowledgements-and-additional-references">Acknowledge
<a href="https://r4ds.had.co.nz/index.html" class="uri">https://r4ds.had.co.nz/index.html</a>
<a href="https://bookdown.org/rdpeng/rprogdatascience/" class="uri">https://bookdown.org/rdpeng/rprogdatascience/</a>
<a href="http://adv-r.had.co.nz/Style.html" class="uri">http://adv-r.had.co.nz/Style.html</a></p>
-<div class="sourceCode" id="cb65"><pre class="sourceCode r distill-force-highlighting-css"><code class="sourceCode r"></code></pre></div>
+<div class="sourceCode" id="cb71"><pre class="sourceCode r distill-force-highlighting-css"><code class="sourceCode r"></code></pre></div>
<!--radix_placeholder_article_footer-->
<!--/radix_placeholder_article_footer-->
</div>
6 changes: 3 additions & 3 deletions ex/problem-set-1/ps-1.Rmd
@@ -101,9 +101,9 @@ What is the total area of states whose abbreviation starts with `C`?
The state.x77 dataset containing various statistics on each state. Answer the following in
text or code. Consider reading the documentation for assistance (`?state.x77`).

-What data structure (e.g. vector, data.frame, matrix, array, list, etc.) and what data type(s) (e.g. logical, numeric/double, character) are in the `state.x77` object ? (1 point)
+What data structure (e.g. vector, matrix, array, list, etc.) and what data type(s) (e.g. logical, numeric/double, character) are in the `state.x77` object ? (1 point)

-Convert the state.x77 to a data.frame and answer the following:
+Convert the state.x77 to a data.frame, if needed, and answer the following:

How many states have more than 120 days of frost on average (see the `Frost` column)?

@@ -113,7 +113,7 @@ How many states have more than 120 days of frost on average (see the `Frost` col

### q10

-What is the average population of Western states? (consider using the state.region vector to assist)
+What is the average population of the Western states? (consider using the state.region vector to assist)

```{r}
