+ summarize(
+ n_stops = n(),
+ prop_total_stops = round(n() / nrow(ilstops), digits = 3),
+ )</code></pre>
+<pre><code>## # A tibble: 5 x 3
+## subject_race n_stops prop_total_stops
+## <fct> <int> <dbl>
+## 1 asian/pacific islander 4053 0.032
+## 2 black 25627 0.202
+## 3 hispanic 16940 0.133
+## 4 other 335 0.003
+## 5 white 80105 0.63</code></pre>
+<p>In that block I first make a call to <code>group_by()</code> to tell R that I want it to run subsequent commands on the data “grouped” within the categories of <code>subject_race</code>. Then I pipe the grouped data to <code>summarize()</code>, which I use to calculate the number of stops within each group (in this data that’s just the number of observations within each group) as well as the proportion of total stops within each group.</p>
+<p>What about counting up the number and proportion of searches within each group? One way to think about that is as another call to <code>summarize()</code> (since, after all, I want to calculate the summary information for searches within the same groups). Within the Tidyverse approach to things, this kind of summarizing within groups and within another variable (<code>search_conducted</code> in this case) can be accomplished with the <code>across()</code> function.</p>
+<p>The <code>across()</code> function is usually called within another verb like <code>summarize()</code> or <code>mutate()</code>, and its syntax is similar to theirs. It requires two things: (1) at least one variable to summarize across (<code>search_conducted</code> here) and (2) the function or functions that produce the outputs I want.</p>
+<p>In this particular case, I’ll use it to calculate the within-group sums of <code>search_conducted</code>. Notice that I also filter out the missing values from <code>search_conducted</code> before I call <code>summarize()</code> here.</p>
+<pre class="r"><code>ilstops %>%
+ group_by(subject_race) %>%
+ filter(!is.na(subject_race), !is.na(search_conducted)) %>%
+ summarize(
+ across(search_conducted, sum)
+ )</code></pre>
+<pre><code>## # A tibble: 5 x 2
+## subject_race search_conducted
+## <fct> <int>
+## 1 asian/pacific islander 68
+## 2 black 1806
+## 3 hispanic 1049
+## 4 other 14
+## 5 white 3010</code></pre>
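+<p>To make it clearer what <code>across()</code> is doing, here is a minimal sketch on a tiny made-up dataset (<code>toy_stops</code> is a hypothetical stand-in, not the real <code>ilstops</code> data). With a single variable and a single function, <code>across()</code> should give the same result as naming the summary column by hand inside <code>summarize()</code>.</p>
+<pre class="r"><code>library(dplyr)
+
+# Hypothetical toy data standing in for ilstops (values are made up)
+toy_stops <- tibble(
+  subject_race = c("black", "black", "white", "white", "white"),
+  search_conducted = c(TRUE, FALSE, FALSE, TRUE, TRUE)
+)
+
+# Summing with across(): the output column keeps the input column's name
+with_across <- toy_stops %>%
+  group_by(subject_race) %>%
+  summarize(across(search_conducted, sum))
+
+# The same summary written out by hand
+without_across <- toy_stops %>%
+  group_by(subject_race) %>%
+  summarize(search_conducted = sum(search_conducted))
+
+# Compare the two results
+all.equal(with_across, without_across)</code></pre>
+<p>The payoff of <code>across()</code> comes when I want several summaries (or several variables) at once, since I don’t have to write out each one separately.</p>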
+<p>If I want <code>across()</code> to calculate more than one summary, I need to provide it a list of things (in a <code>name = value</code> format sort of similar to <code>summarize()</code> or <code>mutate()</code>).</p>
+<pre class="r"><code>ilstops %>%
+ group_by(subject_race) %>%
+ filter(!is.na(subject_race) & !is.na(search_conducted)) %>%
+ summarize(
+ across(
+ search_conducted,
+ list(
+ sum = sum,
+ over_n_stops = mean
+ )
+ )
+ )</code></pre>
+<pre><code>## # A tibble: 5 x 3
+## subject_race search_conducted_sum search_conducted_over_n_stops
+## <fct> <int> <dbl>
+## 1 asian/pacific islander 68 0.0168
+## 2 black 1806 0.0707
+## 3 hispanic 1049 0.0620
+## 4 other 14 0.0419
+## 5 white 3010 0.0376</code></pre>
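+<p>Two details in that block are easy to miss. First, using <code>mean</code> to get a proportion works because R counts <code>TRUE</code> as 1 and <code>FALSE</code> as 0 (I’m assuming here that <code>search_conducted</code> is stored as a logical). Second, the output column names are built from the variable name and the list names, and the <code>.names</code> argument of <code>across()</code> lets me change that pattern. Here is a small sketch on made-up data (<code>toy_stops</code> and the name <code>prop</code> are hypothetical, chosen just for illustration):</p>
+<pre class="r"><code>library(dplyr)
+
+# Hypothetical toy data standing in for ilstops (values are made up)
+toy_stops <- tibble(
+  subject_race = c("black", "black", "white", "white", "white", "white"),
+  search_conducted = c(TRUE, FALSE, FALSE, TRUE, FALSE, FALSE)
+)
+
+toy_summary <- toy_stops %>%
+  group_by(subject_race) %>%
+  summarize(
+    across(
+      search_conducted,
+      list(
+        sum = sum,   # number of TRUE values in each group
+        prop = mean  # share of TRUE values in each group
+      ),
+      # Put the function name first instead of the default "{.col}_{.fn}"
+      .names = "{.fn}_{.col}"
+    )
+  )
+
+toy_summary</code></pre>
+<p>With this pattern the columns come out as <code>sum_search_conducted</code> and <code>prop_search_conducted</code> rather than the default <code>search_conducted_sum</code> style.</p>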
+<p>I can clean this up a bit by sorting the output in descending order by one of the columns. I do this with a nested call to two functions, <code>arrange()</code> and <code>desc()</code>. I can also insert my earlier summary statistics for the number and proportion of stops by group back into the table.</p>
+<pre class="r"><code>ilstops %>%
+ group_by(subject_race) %>%
+ filter(!is.na(subject_race) & !is.na(search_conducted)) %>%