+<p>Importing data is a crucial and often time-consuming step in quantitative and computational research, perhaps especially in R. In the previous tutorial and problem set, you loaded a library (package) in R. Many packages come with pre-installed datasets that you will use for course assignments or to try out example code. You will also need to learn how to import datasets from the web and from files stored locally on your computer. Here are examples of each.</p>
+<div id="loading-a-dataset-from-an-r-package" class="section level4">
+<h4>Loading a dataset from an R package</h4>
+<p>Let’s find the <code>email50</code> dataset included in the <code>openintro</code> package provided by the textbook authors. First, I’ll load the library; then I can use the <code>data()</code> command to load the dataset.</p>
+<pre class="r"><code>library(openintro)</code></pre>
+<pre><code>## Loading required package: airports</code></pre>
+<pre><code>## Loading required package: cherryblossom</code></pre>
+<pre><code>## Loading required package: usdata</code></pre>
+<pre class="r"><code>data(email50)
+
+## Take a look at the first few rows of the email50 dataset
+head(email50)</code></pre>
+<pre><code>## # A tibble: 6 x 21
+##    spam to_multiple  from    cc sent_email time                image attach
+##   &lt;dbl&gt;       &lt;dbl&gt; &lt;dbl&gt; &lt;int&gt;      &lt;dbl&gt; &lt;dttm&gt;              &lt;dbl&gt;  &lt;dbl&gt;
+## 1     0           0     1     0          1 2012-01-04 07:19:16     0      0
+## 2     0           0     1     0          0 2012-02-16 14:10:06     0      0
+## 3     1           0     1     4          0 2012-01-04 09:36:23     0      2
+## 4     0           0     1     0          0 2012-01-04 11:49:52     0      0
+## 5     0           0     1     0          0 2012-01-27 03:34:45     0      0
+## 6     0           0     1     0          0 2012-01-17 11:31:57     0      0
+## # … with 13 more variables: dollar &lt;dbl&gt;, winner &lt;fct&gt;, inherit &lt;dbl&gt;,
+## #   viagra &lt;dbl&gt;, password &lt;dbl&gt;, num_char &lt;dbl&gt;, line_breaks &lt;int&gt;,
+## #   format &lt;dbl&gt;, re_subj &lt;dbl&gt;, exclaim_subj &lt;dbl&gt;, urgent_subj &lt;dbl&gt;,
+## #   exclaim_mess &lt;dbl&gt;, number &lt;fct&gt;</code></pre>
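+<p>If you are not sure which datasets a package provides, you can ask R to list them. The following is a minimal sketch, assuming the <code>openintro</code> package is installed; the variable name <code>available</code> is just an illustrative choice.</p>
+<pre class="r"><code>library(openintro)
+
+## data(package = ...) returns a list; its "results" matrix has one row per dataset
+available &lt;- data(package = "openintro")$results[, "Item"]
+head(available)
+
+## Confirm that email50 is among them
+"email50" %in% available
+
+## After loading, check the size and the column types of the first few variables
+data(email50)
+dim(email50)
+str(email50[, 1:4])</code></pre>
+<p>Calling <code>dim()</code> and <code>str()</code> right after loading is a quick way to confirm that you got the dataset you expected before you start working with it.</p>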
+</div>
+<div id="loading-a-dataset-from-the-web" class="section level4">
+<h4>Loading a dataset from the web</h4>
+<p>This gets a bit more complicated because you have to use the <code>url()</code> command to tell R the address you want to use, then a second command to actually import the dataset file. In this case, I’m going to point to another dataset provided by the OpenIntro authors containing NOAA temperature information (<a href="https://www.openintro.org/data/index.php?data=climate70">more information about the dataset is available on the OpenIntro website</a>). The file format is <code>.rda</code>, one of several common suffixes for R data files (another is <code>.rdata</code>), and you’ll usually use the <code>load()</code> command to import an <code>.rda</code> or <code>.rdata</code> file.</p>
+<pre class="r"><code>load(url("https://www.openintro.org/data/rda/climate70.rda"))
+
+## Again, check out the first few rows to see what you've got.
+head(climate70)</code></pre>
+<pre><code>##       station latitude  longitude dx70_1948 dx70_2018 dx90_1948 dx90_2018
+## 1 USC00203823 41.93520  -84.64110       131       147        11        16
+## 2 USC00276818 44.25800  -71.25250        80        99         1         1
+## 3 USC00186620 39.41317  -79.40025       143       150         4         1
+## 4 USC00331890 40.24030  -81.87100       156       158        18        15
+## 5 USC00235987 37.83950  -94.37400       216       175        59        51
+## 6 USC00395691 45.56550 -100.44880       138       132        39        18</code></pre>
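+<p>One thing to keep in mind is that <code>load()</code> creates objects under whatever names they were saved with, so it helps to capture the names it returns. Here is a minimal sketch that also saves a copy of the file first, assuming you have an internet connection; the names <code>rda_url</code>, <code>local_copy</code>, and <code>loaded_names</code> are just illustrative choices.</p>
+<pre class="r"><code>rda_url &lt;- "https://www.openintro.org/data/rda/climate70.rda"
+
+## Download into R's temporary directory; mode = "wb" keeps the binary file intact on Windows
+local_copy &lt;- file.path(tempdir(), "climate70.rda")
+download.file(rda_url, destfile = local_copy, mode = "wb")
+
+## load() invisibly returns the names of the objects it created
+loaded_names &lt;- load(local_copy)
+loaded_names</code></pre>
+<p>Downloading first means you can reload the file later without going back to the web, as long as you save it somewhere more permanent than the temporary directory.</p>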
+</div>
+<div id="loading-a-dataset-stored-locally" class="section level4">
+<h4>Loading a dataset stored locally</h4>
+<p>Loading from local storage comes last because, ironically, it may be the least intuitive. The best practice here is to use an <a href="https://en.wikipedia.org/wiki/Path_%28computing%29">absolute path</a> to point R to the unique location on your computer where the file in question is stored. In the example below, my code reflects the operating system and directory structure of my laptop. Your computer will likely (I assume/hope!) use something quite different. Nevertheless, I am providing an example because you may be able to adapt it, and it at least gives us a demonstration to talk about later on.</p>
+<pre class="r"><code>load("/home/ads/Documents/Teaching/2020/stats/data/week_03/group_07.RData")
+
+ls() ## list objects in my global environment</code></pre>
+<pre><code>## [1] "climate70" "d" "email50"</code></pre>
+<pre class="r"><code>head(d) ## and inspect the first few values of the new object</code></pre>
+<pre><code>## [1] -2452.018457     2.637751     3.241824     1.183585 15746.070789
+## [6]    65.013141</code></pre>
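+<p>If R reports that it cannot open a file, the path is usually the culprit. The sketch below is self-contained so you can run it on any computer: it saves a small made-up object to a temporary file, checks that the file exists, and then loads it back. The names <code>demo_path</code>, <code>d_demo</code>, and <code>restored</code> are hypothetical stand-ins, not part of the course data.</p>
+<pre class="r"><code>## Build a path inside R's temporary directory and save a small example object there
+demo_path &lt;- file.path(tempdir(), "demo_data.RData")
+d_demo &lt;- c(1.5, 2.5, 3.5)
+save(d_demo, file = demo_path)
+rm(d_demo) ## remove the object so we can see load() bring it back
+
+## Check that the file is where we think it is and print its full absolute path
+file.exists(demo_path)
+normalizePath(demo_path)
+
+## load() restores the object under the name it was saved with and returns that name
+restored &lt;- load(demo_path)
+restored
+head(d_demo)</code></pre>
+<p>Running <code>file.exists()</code> on a path before calling <code>load()</code> tells you right away whether the problem is the path or the file itself.</p>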
+</div>