Suppose that you’re camping near a river and you see a campfire in the distance that’s getting out of hand. You have a bucket, but it is empty. You know that you should run to the river with the bucket, fill it with water, and then run to the fire and put it out. Suppose that when you carry an empty bucket your running speed is 1.5 times your running speed when the bucket is filled with water. What’s the optimal path? Where along the river should you go to get the water so that you reach the campfire as quickly as possible?

The geometry is shown in the diagram above. You are at a distance y1 = 10 units from the river, and the campfire is at a distance y2 = 30 units from the river. The distance between you and the campfire along the river (i.e. the distance between points H and L) is s = 50 units. You want to choose a path YPC that minimizes the time t it takes you to go from Y to P and then to C. Since your speed from Y to P is 1.5 times your speed from P to C, in appropriate units the time taken is t = d(YP)/1.5 + d(PC). Using the Pythagorean theorem, we have \(d(YP)=\sqrt{x^2+y_1^2}\) and \(d(PC)=\sqrt{(s-x)^2+y_2^2}\), where x is the distance between H and P. So we want to find the value of x that minimizes \[t=\frac{\sqrt{x^2+10^2}}{1.5} + \sqrt{(50-x)^2+30^2} \qquad (1)\]

Here we introduce a straightforward (brute-force) method to solve this optimization problem. Suppose we want to find x to two decimal places. We know x must be between 0 and 50. We can try every two-decimal-place number between 0 and 50, calculate the corresponding values of t, and find the x that minimizes t.

# load tidyverse packages
library(tidyverse)

What is the minimum value in the vector t and the corresponding optimal value of x?

In this problem, it is useful to create a function that computes t from x:

compute_t <- function(x) {
  sqrt(x^2+10^2)/1.5 + sqrt((50-x)^2 + 30^2)
}

We didn’t do it in Week 2 because we hadn’t covered functions at that time.
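The constants 10, 30, 50 and 1.5 are hard-coded in compute_t(). As a minimal sketch, the same formula can be written with the geometry passed in as arguments, which makes it easy to try other setups. The name compute_t_general and its argument names are illustrative assumptions, not part of the original exercise.

```r
# Hypothetical generalization of compute_t(); all names here are illustrative.
# y1: your distance from the river
# y2: the campfire's distance from the river
# s:  distance between H and L along the river
# speed_ratio: empty-bucket speed divided by full-bucket speed
compute_t_general <- function(x, y1 = 10, y2 = 30, s = 50, speed_ratio = 1.5) {
  sqrt(x^2 + y1^2) / speed_ratio + sqrt((s - x)^2 + y2^2)
}

# With the default arguments this evaluates equation (1)
compute_t_general(26.13)
```

Because the arithmetic is vectorized, x can still be a whole vector such as seq(0,50,0.01).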

As before, we use seq(0,50,0.01) to construct the vector x containing every two-decimal-place number between 0 and 50, use it to construct the vector t, and then search for the minimum t and the corresponding optimal value of x. With tidyverse, we can put x and t in a tibble and then take the subset of the tibble in which t is minimum:

tibble(x = seq(0,50,0.01), t = compute_t(x)) %>% filter(t==min(t))
# A tibble: 1 x 2
      x        t
  <dbl>    <dbl>
1 26.13 56.98977

So the minimum value of t is 56.99 and the optimal value of x is 26.13 (to two decimal places).

Compare this approach to the following base R solution:

x <- seq(0,50,0.01)
t <- compute_t(x)
i <- which.min(t)
c(x[i],t[i])
[1] 26.13000 56.98977

Or use subsetting:

x <- seq(0,50,0.01)
t <- compute_t(x)
(t_min <- min(t))
[1] 56.98977
(x_optimal <- x[t==t_min])
[1] 26.13

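The tidyverse approach is arguably cleaner: x and t stay together in one tibble, and a single pipeline returns both the minimum and the optimal x.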

Suppose you want to improve the accuracy of the calculation. Instead of two decimal places, you want to find x accurate to 4 decimal places. Repeat the calculation by trying numbers of 4 decimal places in the appropriate range and search for the minimum.

From the result of the previous question, we know the optimal x is between 26.12 and 26.14. So we try all four-decimal-place numbers in that interval and then find the minimum. The command is:

tibble(x = seq(26.12,26.14,1e-4), t = compute_t(x)) %>% filter(t==min(t))
# A tibble: 1 x 2
        x        t
    <dbl>    <dbl>
1 26.1298 56.98977
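The zoom-in step used here (search a coarse grid, then search a finer grid around the best point) can be automated. The sketch below is one way to do it in base R; refine_grid is a hypothetical helper, not something from the original exercise. It assumes the function has a single minimum in the interval, which holds for t because each of its two terms is convex.

```r
# Same objective as compute_t() in the text, repeated so this block runs on its own
compute_t <- function(x) {
  sqrt(x^2 + 10^2) / 1.5 + sqrt((50 - x)^2 + 30^2)
}

# Hypothetical helper: grid search refined one decimal place at a time.
# Assumes f is vectorized and has a single minimum in [lower, upper].
refine_grid <- function(f, lower, upper, digits) {
  best <- NA
  for (d in 0:digits) {
    step <- 10^(-d)
    x <- seq(lower, upper, by = step)
    best <- x[which.min(f(x))]
    # For a single-minimum function, the true minimizer lies within
    # one grid step of the best grid point, so shrink the interval to that
    lower <- max(lower, best - step)
    upper <- min(upper, best + step)
  }
  best
}

refine_grid(compute_t, 0, 50, digits = 4)
```

Each pass evaluates only a few dozen points, so reaching four decimal places this way takes far fewer evaluations than a single grid of step 1e-4 over the whole range from 0 to 50.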

Aside

As mentioned in Week 2, the brute-force approach introduced here is a quick-and-dirty way to solve this problem using techniques you already know.

The best way to solve this optimization problem is to use R’s optimize() function: optimize(f, interval) finds the minimum of a one-variable function f in the interval specified by the vector interval containing the two end-points. For our problem, we can use this command:

optimize(compute_t, c(0,50), tol=1e-6)
$minimum
[1] 26.12982

$objective
[1] 56.98977

The result is consistent with our method above. The tol parameter controls the accuracy of the optimization search. I set it to 1e-6 because the default value (about 1.2e-4) is slightly larger than 1e-4, which would make the result slightly less accurate than our method above. See ?optimize for more information on this command.
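To see the effect of tol, the following sketch prints the default tolerance given in ?optimize (.Machine$double.eps^0.25) and compares the minimizer found with the default and with tol=1e-6. The variable names are illustrative, and no output is shown because the exact digits returned with the default tolerance may differ from one setup to another.

```r
# Same objective as compute_t() in the text, repeated so this block runs on its own
compute_t <- function(x) {
  sqrt(x^2 + 10^2) / 1.5 + sqrt((50 - x)^2 + 30^2)
}

# Default tolerance used by optimize(), as documented in ?optimize
default_tol <- .Machine$double.eps^0.25
default_tol

# Minimizer with the default tolerance and with the tighter one
x_default <- optimize(compute_t, c(0, 50))$minimum
x_tight   <- optimize(compute_t, c(0, 50), tol = 1e-6)$minimum

c(default = x_default, tight = x_tight, difference = abs(x_default - x_tight))
```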

We didn’t mention optimize() in Week 2 because it requires you to provide the compute_t() function and we hadn’t covered functions at that time.
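As a further cross-check that does not rely on any search, the minimum of equation (1) can be characterized with calculus. The derivation below is a sketch added for illustration, using the same symbols as equation (1): setting the derivative of t with respect to x to zero gives a condition that the optimal x must satisfy.

```latex
\frac{dt}{dx}
  = \frac{x}{1.5\sqrt{x^2+10^2}} - \frac{50-x}{\sqrt{(50-x)^2+30^2}} = 0
\quad\Longrightarrow\quad
\frac{x}{1.5\sqrt{x^2+10^2}} = \frac{50-x}{\sqrt{(50-x)^2+30^2}}
```

Substituting x = 26.1298, both sides come out to about 0.6226, which agrees with the numerical results.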

The Pipe Operator %>%

As you have seen, the pipe operator %>% is very useful, and it is widely used by people who work with tidyverse. If you think it involves a lot of typing, you can use the RStudio shortcut Ctrl+Shift+M (Cmd+Shift+M on Mac). While this operator is very cool, it shouldn’t be abused. For example, the following is a bad use of %>%:

set.seed(72671)
x <- rnorm(1000)
x %>% mean()
[1] 0.02292381

There is no reason to use x %>% mean() when you can simply use mean(x). Even

sqrt(sum(abs(x)))
[1] 28.10843

is better than

abs(x) %>% sum() %>% sqrt()
[1] 28.10843

in my opinion, not to mention this:

x %>% abs() %>% sum() %>% sqrt()
[1] 28.10843

Even worse, the () can be omitted, resulting in obscure code:

x %>% abs %>% sum %>% sqrt
[1] 28.10843

However, some people prefer this style of coding. Its structure reads like a step-by-step procedure: start with the vector x, take the absolute value of each element, then sum over the elements, then take the square root.
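For reference, the pipe just passes its left-hand side to the function on the right as the first argument. The short sketch below checks this on the same vector x; it loads magrittr directly so that it runs on its own (%>% is also available after library(tidyverse)).

```r
library(magrittr)  # provides %>%

set.seed(72671)
x <- rnorm(1000)

# x %>% f() is evaluated as f(x)
identical(x %>% mean(), mean(x))          # TRUE

# x %>% f(y) is evaluated as f(x, y): the left-hand side becomes the first argument
identical(x %>% round(2), round(x, 2))    # TRUE

# A chain applies the functions from left to right
all.equal(x %>% abs() %>% sum() %>% sqrt(), sqrt(sum(abs(x))))  # TRUE
```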