Lec 02 - Logic and types in R

class: center, middle, inverse, title-slide

# Lec 02 - Logic and types in R
## <br/> Statistical Programming
### Sem 1, 2020
### <br/> Dr. Colin Rundel

---

exclude: true

---
class: middle
count: false

# In R (almost) <br/> everything is a vector

---

## Vectors

The fundamental building blocks of data in R are vectors (collections of related values, objects, data structures, functions, etc). R has two types of vectors:

* **atomic** vectors (*vectors*)

    - homogeneous collections of the *same* type (e.g. all `TRUE`/`FALSE` values, all numbers, or all character strings).

* **generic** vectors (*lists*)
  
    - heterogeneous collections of *any* type of R object, even other lists <br/> 
    (meaning they can have a hierarchical/tree-like structure).
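
A minimal sketch of the distinction (the names `v` and `l` are arbitrary, chosen only for illustration):

```r
# Atomic vector: every element has the same type
v = c(1.5, 2, 3)

# List: elements may have different types, including other lists
l = list(1.5, "two", TRUE, list(4L, "five"))

typeof(v)       # "double"
typeof(l)       # "list"
typeof(l[[4]])  # "list" - a list nested inside a list
```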

---
class: middle
count: false

# Atomic Vectors

---

## Atomic Vectors

R has six atomic vector types. We can check the type of any object in R using the `typeof()` function.

`typeof()`  |  `mode()`     
:-----------|:------------
logical     |  logical    
double      |  numeric    
integer     |  numeric    
character   |  character  
complex     |  complex    
raw         |  raw

Mode is a higher-level abstraction; we will discuss this in more detail later.

---

## Vector types

`logical` - boolean values `TRUE` and `FALSE`
.pull-left[

```r
typeof(TRUE)
```

```
## [1] "logical"
```
]

.pull-right[

```r
mode(TRUE)
```

```
## [1] "logical"
```
]

<br/>

`character` - text strings

<div>

.pull-left[

```r
typeof("hello")
```

```
## [1] "character"
```

```r
typeof('world')
```

```
## [1] "character"
```
]

.pull-right[

```r
mode("hello")
```

```
## [1] "character"
```

```r
mode('world')
```

```
## [1] "character"
```
]

</div>

---

`double` - floating point numerical values (default numerical type)

.pull-left[

```r
typeof(1.33)
```

```
## [1] "double"
```

```r
typeof(7)
```

```
## [1] "double"
```
]

.pull-right[

```r
mode(1.33)
```

```
## [1] "numeric"
```

```r
mode(7)
```

```
## [1] "numeric"
```
]

<br/>

`integer` - integer numerical values (indicated with an `L`)

<div>

.pull-left[

```r
typeof( 7L )
```

```
## [1] "integer"
```

```r
typeof( 1:3 )
```

```
## [1] "integer"
```
]

.pull-right[

```r
mode( 7L )
```

```
## [1] "numeric"
```

```r
mode( 1:3 )
```

```
## [1] "numeric"
```
]

</div>
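
The two remaining types from the table, `complex` and `raw`, can be checked the same way. A small sketch (the names `z` and `b` are illustrative only):

```r
z = 1 + 2i        # complex literal
b = as.raw(255)   # a single byte

typeof(z)  # "complex"
mode(z)    # "complex"
typeof(b)  # "raw"
mode(b)    # "raw"
```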

---

## Concatenation

Atomic vectors can be constructed using the concatenate `c()` function.

```r
c(1, 2, 3)
```

```
## [1] 1 2 3
```

```r
c("Hello", "World!")
```

```
## [1] "Hello"  "World!"
```

```r
c(1, 1:10)
```

```
##  [1]  1  1  2  3  4  5  6  7  8  9 10
```

```r
c(1,c(2, c(3)))
```

```
## [1] 1 2 3
```

**Note** - atomic vectors are *always* flat.
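
To see what "flat" means in practice, compare nested calls to `c()` with nested calls to `list()`. This is only a sketch and the variable names are arbitrary:

```r
flat   = c(1, c(2, c(3)))            # nesting is discarded
nested = list(1, list(2, list(3)))   # nesting is kept

length(flat)                  # 3
length(nested)                # 2 - a number and a list
identical(flat, c(1, 2, 3))   # TRUE
```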

---
class: split-thirds

## Inspecting types

* `typeof(x)` - returns a character vector (length 1) of the *type* of object `x`.

* `mode(x)` - returns a character vector (length 1) of the *mode* of object `x`.

.pull-left[

```r
typeof(1)
```

```
## [1] "double"
```

```r
typeof(1L)
```

```
## [1] "integer"
```

```r
typeof("A")
```

```
## [1] "character"
```

```r
typeof(TRUE)
```

```
## [1] "logical"
```
]

.pull-right[

```r
mode(1)
```

```
## [1] "numeric"
```

```r
mode(1L)
```

```
## [1] "numeric"
```

```r
mode("A")
```

```
## [1] "character"
```

```r
mode(TRUE)
```

```
## [1] "logical"
```
]

---

## Type Predicates

* `is.logical(x)`   - returns `TRUE` if `x` has *type* `logical`.
* `is.character(x)` - returns `TRUE` if `x` has *type* `character`.
* `is.double(x)`    - returns `TRUE` if `x` has *type* `double`.
* `is.integer(x)`   - returns `TRUE` if `x` has *type* `integer`.
* `is.numeric(x)`   - returns `TRUE` if `x` has *mode* `numeric`.

.col3_left[

```r
is.integer(1)
```

```
## [1] FALSE
```

```r
is.integer(1L)
```

```
## [1] TRUE
```

```r
is.integer(3:7)
```

```
## [1] TRUE
```
]

.col3_mid[

```r
is.double(1)
```

```
## [1] TRUE
```

```r
is.double(1L)
```

```
## [1] FALSE
```

```r
is.double(3:8)
```

```
## [1] FALSE
```
]

.col3_right[

```r
is.numeric(1)
```

```
## [1] TRUE
```

```r
is.numeric(1L)
```

```
## [1] TRUE
```

```r
is.numeric(3:7)
```

```
## [1] TRUE
```
]

---

## Other useful predicates

* `is.atomic(x)` - returns `TRUE` if `x` is an *atomic vector*.
* `is.list(x)` - returns `TRUE` if `x` is a *list*.
* `is.vector(x)` - returns `TRUE` if `x` is either an *atomic vector* or *list*.

.pull-left[

```r
is.atomic(c(1,2,3))
```

```
## [1] TRUE
```

```r
is.list(c(1,2,3))
```

```
## [1] FALSE
```

```r
is.vector(c(1,2,3))
```

```
## [1] TRUE
```
]

.pull-right[

```r
is.atomic(list(1,2,3))
```

```
## [1] FALSE
```

```r
is.list(list(1,2,3))
```

```
## [1] TRUE
```

```r
is.vector(list(1,2,3))
```

```
## [1] TRUE
```
]
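
One caveat worth knowing: `is.vector()` returns `FALSE` when the object carries any attribute other than names. A sketch, where the attribute name `"units"` is just an example:

```r
x = c(1, 2, 3)
attr(x, "units") = "cm"   # attach an arbitrary attribute

is.atomic(x)   # TRUE  - still an atomic vector
is.vector(x)   # FALSE - has an attribute other than names

y = c(a = 1, b = 2)       # names are the one attribute that is allowed
is.vector(y)   # TRUE
```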

---

## Type Coercion

R is a dynamically typed language -- it will automatically convert between most types without raising warnings or errors. Keep in mind the rule that atomic vectors must always contain values of the same type.

```r
c(1, "Hello")
```

```
## [1] "1"     "Hello"
```

.top-pad[]

```r
c(FALSE, 3L)
```

```
## [1] 0 3
```

.top-pad[]

```r
c(1.2, 3L)
```

```
## [1] 1.2 3.0
```

---

## Operator coercion

Operators and functions will also attempt to coerce values to an appropriate type for the given operation.

<div>
.pull-left[

```r
3.1+1L
```

```
## [1] 4.1
```

```r
5 + FALSE
```

```
## [1] 5
```
]

.pull-right[

```r
log(1)
```

```
## [1] 0
```

```r
log(TRUE)
```

```
## [1] 0
```
]
</div>

.pull-left[

```r
TRUE & FALSE
```

```
## [1] FALSE
```

```r
TRUE & 7
```

```
## [1] TRUE
```
]

.pull-right[

```r
TRUE | FALSE
```

```
## [1] TRUE
```

```r
FALSE | !5
```

```
## [1] FALSE
```
]
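
A common use of this coercion is counting: since `TRUE` becomes 1 and `FALSE` becomes 0, `sum()` and `mean()` of a logical vector give a count and a proportion. A sketch with made-up data:

```r
x = c(3, 8, 1, 9, 4)
big = x > 3          # FALSE TRUE FALSE TRUE TRUE

n_big = sum(big)     # 3   - how many values exceed 3
p_big = mean(big)    # 0.6 - what proportion exceed 3
```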

---

## Explicit Coercion

Most of the `is` functions we just saw have an `as` variant which can be used for *explicit* coercion.

.pull-left[

```r
as.logical(5.2)
```

```
## [1] TRUE
```

```r
as.character(TRUE)
```

```
## [1] "TRUE"
```

```r
as.integer(pi)
```

```
## [1] 3
```
]

.pull-right[

```r
as.numeric(FALSE)
```

```
## [1] 0
```

```r
as.double("7.2")
```

```
## [1] 7.2
```

```r
as.double("one")
```

```
## Warning: NAs introduced by coercion
```

```
## [1] NA
```
]
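
A few further cases that are easy to get wrong, shown as a sketch (variable names are illustrative):

```r
trunc_pos = as.integer(2.9)    # 2  - truncates, does not round
trunc_neg = as.integer(-2.9)   # -2 - truncates toward zero

# Only a few spellings are recognised as logical values
lgl = as.logical(c("TRUE", "true", "T", "yes"))  # TRUE TRUE TRUE NA

zero = as.logical(0)           # FALSE - any other number gives TRUE
```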

---
count: false
class: middle

# Conditionals

---

## Logical (boolean) operators

|  Operator                     |  Operation    |  Vectorized? 
|:-----------------------------:|:-------------:|:------------:
| <code>x &#124; y</code>       |  or           |   Yes        
| `x & y`                       |  and          |   Yes        
| `!x`                          |  not          |   Yes        
| <code>x &#124;&#124; y</code> |  or           |   No         
| `x && y`                      |  and          |   No         
|`xor(x, y)`                    |  exclusive or |   Yes

---

## Vectorized?

```r
x = c(TRUE,FALSE,TRUE)
y = c(FALSE,TRUE,TRUE)
```

.pad-top[]

.pull-left[

```r
x | y
```

```
## [1] TRUE TRUE TRUE
```

```r
x || y
```

```
## [1] TRUE
```
]

.pull-right[

```r
x & y
```

```
## [1] FALSE FALSE  TRUE
```

```r
x && y
```

```
## [1] FALSE
```
]

.footnote[
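**Note** - in the version of R used for these slides, both `||` and `&&` use only the *first* value of each vector and ignore all other values without any warning. Since R 4.3.0, giving either operator a vector of length greater than one is an error.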
]
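
The non-vectorized operators are intended for length-one values, where they also *short-circuit*: the right-hand side is only evaluated if it is needed. A minimal sketch:

```r
# The right-hand side is never evaluated here, so stop() is not called
r1 = FALSE && stop("never evaluated")   # FALSE
r2 = TRUE  || stop("never evaluated")   # TRUE

# Typical use: guard a check that only makes sense for some inputs
x = "a"
ok = is.numeric(x) && x > 0             # FALSE - `x > 0` is never run
```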
---

## Vectorization and math

Almost all of the basic mathematical operations (and many other functions) in R are vectorized.

.pull-left[

```r
c(1, 2, 3) + c(3, 2, 1)
```

```
## [1] 4 4 4
```

```r
c(1, 2, 3) / c(3, 2, 1)
```

```
## [1] 0.3333333 1.0000000 3.0000000
```
]

.pull-right[

```r
log(c(1, 3, 0))
```

```
## [1] 0.000000 1.098612     -Inf
```

```r
sin(c(1, 2, 3))
```

```
## [1] 0.8414710 0.9092974 0.1411200
```
]

---

## Length coercion

```r
x = c(TRUE, FALSE, TRUE)
y = c(TRUE)
z = c(FALSE, TRUE)
```

.pad-top[]

.pull-left[

```r
x | y
```

```
## [1] TRUE TRUE TRUE
```

```r
x & y
```

```
## [1]  TRUE FALSE  TRUE
```
]

.pull-right[

```r
y | z
```

```
## [1] TRUE TRUE
```

```r
y & z
```

```
## [1] FALSE  TRUE
```
]

<div/>

<br/>

.pad-top[]

```r
x | z
```

```
## Warning in x | z: longer object length is not a multiple of shorter object
## length
```

```
## [1] TRUE TRUE TRUE
```
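
The same recycling applies to arithmetic. As a sketch, adding a length-2 vector to a length-4 vector gives the same result as repeating the shorter one explicitly with `rep_len()`:

```r
a = c(1, 2, 3, 4)
b = c(10, 20)

s1 = a + b                        # 11 22 13 24 - b is reused twice
s2 = a + rep_len(b, length(a))    # the same thing, written out

identical(s1, s2)                 # TRUE
```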

---


## Comparisons

```r
x = c("A","B","C")
z = c("A")
```

.pad-top[]

.pull-left[

```r
x == z
```

```
## [1]  TRUE FALSE FALSE
```

```r
x != z
```

```
## [1] FALSE  TRUE  TRUE
```

```r
x > z
```

```
## [1] FALSE  TRUE  TRUE
```
]

.pull-right[

```r
x %in% z
```

```
## [1]  TRUE FALSE FALSE
```

```r
z %in% x
```

```
## [1] TRUE
```
]
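
The difference between `==` and `%in%` is clearer when both sides have more than one element: `==` compares position by position, while `%in%` asks whether each left-hand value appears anywhere on the right. A sketch with an extra vector `q` introduced for illustration:

```r
x = c("A", "B", "C")
q = c("C", "A", "D")

eq     = x == q     # FALSE FALSE FALSE - A vs C, B vs A, C vs D
member = q %in% x   # TRUE TRUE FALSE   - "D" is not in x
```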

---

## Conditional Control Flow

Conditional execution of code blocks is achieved via `if` statements.

```r
x = c(1,3)
```

.pad-top[]

```r
if (3 %in% x)
  print("This!")
```

```
## [1] "This!"
```

.pad-top[]

```r
if (1 %in% x)
  print("That!")
```

```
## [1] "That!"
```

.pad-top[]

```r
if (5 %in% x)
  print("Other!")
```

---

## `if` is not vectorized

```r
x = c(1,3)
```

.pad-top[]

```r
if (x == 1)
  print("x is 1!")
```

```
## Warning in if (x == 1) print("x is 1!"): the condition has length > 1 and only
## the first element will be used
```

```
## [1] "x is 1!"
```

.pad-top[]

```r
if (x == 3)
  print("x is 3!")
```

```
## Warning in if (x == 3) print("x is 3!"): the condition has length > 1 and only
## the first element will be used
```
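
Note that from R 4.2.0 onward a condition of length greater than one is an error rather than a warning. When an element-by-element choice is what you want, one option in base R is `ifelse()`, which is vectorized. A sketch:

```r
x = c(1, 3)

# One result per element of x, instead of a single branch
labels = ifelse(x == 1, "x is 1", "x is not 1")
labels   # "x is 1" "x is not 1"
```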

---

## Collapsing logical vectors

There are a couple of helper functions for collapsing a logical vector down to a single value: `any()` and `all()`.

```r
x = c(3,4,1)
```

.top-pad[]

.pull-left[

```r
x >= 2
```

```
## [1]  TRUE  TRUE FALSE
```

```r
any(x >= 2)
```

```
## [1] TRUE
```

```r
all(x >= 2)
```

```
## [1] FALSE
```
]

.pull-right[

```r
x <= 4
```

```
## [1] TRUE TRUE TRUE
```

```r
any(x <= 4)
```

```
## [1] TRUE
```

```r
all(x <= 4)
```

```
## [1] TRUE
```
]

<div/>

<br/>

```r
if (any(x == 3)) 
  print("x contains 3!")
```

```
## [1] "x contains 3!"
```
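
An edge case to keep in mind is the empty vector, where `any()` and `all()` disagree. A sketch:

```r
empty = logical(0)

any_empty = any(empty)   # FALSE - no element is TRUE
all_empty = all(empty)   # TRUE  - no element is FALSE
```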

---

## Nesting Conditionals

.pull-left[

```r
x = 3

if (x < 0) {
  "x is negative"
} else if (x > 0) {
  "x is positive"
} else {
  "x is zero"
}
```

```
## [1] "x is positive"
```
]

.pull-right[

```r
x = 0

if (x < 0) {
  "x is negative"
} else if (x > 0) {
  "x is positive"
} else {
  "x is zero"
}
```

```
## [1] "x is zero"
```
]
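
The examples above print a value because `if` is an expression in R: it returns the value of whichever branch runs. That value can also be assigned, as in this sketch (the name `label` is arbitrary):

```r
x = -2

label = if (x < 0) {
  "x is negative"
} else if (x > 0) {
  "x is positive"
} else {
  "x is zero"
}

label   # "x is negative"
```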

---
class: middle
count: false

# Error Checking

---

## `stop` and `stopifnot`

Often we want to validate user input or function arguments - if our assumptions are not met then we often want to report the error and stop execution.

```r
ok = FALSE
if (!ok)
  stop("Things are not ok.")
```

```
## Error in eval(expr, envir, enclos): Things are not ok.
```

.pad-top[]

```r
stopifnot(ok)
```

```
## Error: ok is not TRUE
```

.pad-top[]

```r
stopifnot(is.logical(ok))
```

.pad-top[]

```r
stopifnot(is.logical(ok+0))
```

```
## Error: is.logical(ok + 0) is not TRUE
```
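
Put together, argument validation might look like the sketch below. The function `safe_log()` is hypothetical, and `tryCatch()` is used only to capture the error message so the example can run to completion:

```r
# Hypothetical helper: log of a single positive number
safe_log = function(x) {
  stopifnot(is.numeric(x), length(x) == 1)   # check type and length
  if (x <= 0)
    stop("x must be positive.")              # custom error message
  log(x)
}

ok  = safe_log(1)   # 0
msg = tryCatch(safe_log(-1), error = function(e) conditionMessage(e))
msg                 # "x must be positive."
```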

---

## Style choices

Simple is usually better than complicated. Generally it is better to have fewer clauses and to put the more important conditions first (e.g. failure conditions).

.pull-left[
Do stuff (ok):

```r
if (condition_one) {
  ##
  ## Do stuff
  ##
} else if (condition_two) {
  ##
  ## Do other stuff
  ##
} else if (condition_error) {
  stop("Condition error occurred")
}
```
]

.pull-right[
Do stuff (better):

```r
# Do stuff better
if (condition_error) {
  stop("Condition error occurred")
}

if (condition_one) {
  ##
  ## Do stuff
  ##
} else if (condition_two) {
  ##
  ## Do other stuff
  ##
}
```
]

---
class: middle, center

# Missing Values

---

## Missing Values

R uses `NA` to represent missing values in its data structures. What may not be obvious is that there are different `NA`s for the different types.

.pull-left[

```r
typeof(NA)
```

```
## [1] "logical"
```

```r
typeof(NA+1)
```

```
## [1] "double"
```

```r
typeof(NA+1L)
```

```
## [1] "integer"
```
]

.pull-right[

```r
typeof(NA_character_)
```

```
## [1] "character"
```

```r
typeof(NA_real_)
```

```
## [1] "double"
```

```r
typeof(NA_integer_)
```

```
## [1] "integer"
```
]
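
The typed `NA`s are useful when creating a placeholder vector that should already have a particular type. A sketch (variable names are illustrative):

```r
a = rep(NA, 3)             # plain NA is logical
b = rep(NA_integer_, 3)
d = rep(NA_character_, 3)

typeof(a)   # "logical"
typeof(b)   # "integer"
typeof(d)   # "character"
```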

---

## NA contagion

Because `NA`s represent missing values it makes sense that any calculation using them should also be missing.

.pull-left[

```r
1 + NA
```

```
## [1] NA
```

```r
1 / NA
```

```
## [1] NA
```

```r
NA * 5
```

```
## [1] NA
```
]

.pull-right[

```r
mean(c(1, 2, 3, NA))
```

```
## [1] NA
```

```r
sqrt(NA)
```

```
## [1] NA
```

```r
3^NA
```

```
## [1] NA
```
]
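
Many summary functions in base R accept an `na.rm` argument that drops missing values before computing. A sketch:

```r
x = c(1, 2, 3, NA)

m = mean(x, na.rm = TRUE)   # 2 - mean of 1, 2, 3
s = sum(x, na.rm = TRUE)    # 6
```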

---

## NAs are not always contagious

A useful mental model for `NA`s is to consider them as an unknown value that could take any of the possible values for that type.

For numbers or characters this isn't very helpful, but for a logical value we know that the value must either be `TRUE` or `FALSE` and we can use that when deciding what value to return.

```r
TRUE & NA
```

```
## [1] NA
```

```r
FALSE & NA
```

```
## [1] FALSE
```

```r
TRUE | NA
```

```
## [1] TRUE
```

```r
FALSE | NA
```

```
## [1] NA
```

---

## Conditionals and missing values

`NA`s can be problematic in some cases (particularly for control flow)

```r
1 == NA
```

```
## [1] NA
```

.pad-top[]

```r
if (2 != NA)
  "Here"
```

```
## Error in if (2 != NA) "Here": missing value where TRUE/FALSE needed
```

.pad-top[]

```r
if (all(c(1,2,NA,4) >= 1))
  "There"
```

```
## Error in if (all(c(1, 2, NA, 4) >= 1)) "There": missing value where TRUE/FALSE needed
```

.pad-top[]

```r
if (any(c(1,2,NA,4) >= 1))
  "There"
```

```
## [1] "There"
```
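
Two ways to make such a condition safe are sketched below: `isTRUE()` treats anything other than a single `TRUE` as `FALSE`, and `all()` / `any()` accept `na.rm = TRUE`. Which one is appropriate depends on what a missing value should mean in your problem.

```r
x = c(1, 2, NA, 4)
cond = all(x >= 1)                     # NA

r1 = isTRUE(cond)                      # FALSE - NA is not TRUE
r2 = all(x >= 1, na.rm = TRUE)         # TRUE  - NA is dropped first

msg = if (isTRUE(cond)) "There" else "Not sure"
msg                                    # "Not sure"
```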

---

## Testing for `NA`

To explicitly test if a value is missing it is necessary to use `is.na` (often along with `any` or `all`).

.pull-left[

```r
NA == NA
```

```
## [1] NA
```

```r
is.na(NA)
```

```
## [1] TRUE
```

```r
is.na(1)
```

```
## [1] FALSE
```
]

.pull-right[

```r
is.na(c(1,2,3,NA))
```

```
## [1] FALSE FALSE FALSE  TRUE
```

```r
any(is.na(c(1,2,3,NA)))
```

```
## [1] TRUE
```

```r
all(is.na(c(1,2,3,NA)))
```

```
## [1] FALSE
```
]
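
Because `is.na()` returns a logical vector, it combines naturally with other base functions to count or locate missing values. A sketch:

```r
x = c(1, NA, 3, NA)

n_missing = sum(is.na(x))     # 2   - how many are missing
where     = which(is.na(x))   # 2 4 - positions of the NAs
has_na    = anyNA(x)          # TRUE - shortcut for any(is.na(x))
```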

---

## Other Special values (double)

These are defined as part of the IEEE floating point standard (not unique to R)

* `NaN` - Not a number
* `Inf` - Positive infinity
* `-Inf` - Negative infinity

.pull-left[

```r
pi / 0
```

```
## [1] Inf
```

```r
0 / 0
```

```
## [1] NaN
```

```r
1/0 + 1/0
```

```
## [1] Inf
```
]

.pull-right[

```r
1/0 - 1/0
```

```
## [1] NaN
```

```r
NaN / NA
```

```
## [1] NaN
```

```r
NaN * NA
```

```
## [1] NaN
```
]

---

## Testing for `Inf` and `NaN`

`NaN` and `Inf` don't have the same testing issues that `NA`s do, but there are still convenience functions for testing for these types of values

.pull-left[

```r
NA
```

```
## [1] NA
```

```r
1/0+1/0
```

```
## [1] Inf
```

```r
1/0-1/0
```

```
## [1] NaN
```

```r
1/0 == Inf
```

```
## [1] TRUE
```

```r
-1/0 == Inf
```

```
## [1] FALSE
```
]

.pull-right[

```r
is.finite(1/0+1/0)
```

```
## [1] FALSE
```

```r
is.finite(1/0-1/0)
```

```
## [1] FALSE
```

```r
is.nan(1/0-1/0)
```

```
## [1] TRUE
```

```r
is.finite(NA)
```

```
## [1] FALSE
```

```r
is.nan(NA)
```

```
## [1] FALSE
```
]
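
These predicates are vectorized, so the special values can be compared side by side. Note in particular that `is.na()` is `TRUE` for `NaN`, while `is.nan()` is `FALSE` for `NA`. A sketch:

```r
vals = c(1, Inf, -Inf, NaN, NA)

fin = is.finite(vals)     # TRUE  FALSE FALSE FALSE FALSE
inf = is.infinite(vals)   # FALSE TRUE  TRUE  FALSE FALSE
nan = is.nan(vals)        # FALSE FALSE FALSE TRUE  FALSE
na  = is.na(vals)         # FALSE FALSE FALSE TRUE  TRUE
```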

---

## Coercion for infinity and NaN

First remember that `Inf`, `-Inf`, and `NaN` have type double. However, their coercion behavior is not the same as for other doubles.

```r
as.integer(Inf)
```

```
## Warning: NAs introduced by coercion to integer range
```

```
## [1] NA
```

```r
as.integer(NaN)
```

```
## [1] NA
```

.top-pad[]

.pull-left[

```r
as.logical(Inf)
```

```
## [1] TRUE
```

```r
as.logical(NaN)
```

```
## [1] NA
```
]

.pull-right[

```r
as.character(Inf)
```

```
## [1] "Inf"
```

```r
as.character(NaN)
```

```
## [1] "NaN"
```
]

---

## Exercise 1

**Part 1**

What is the type of the following vectors? Explain why they have that type.

* `c(1, NA+1L, "C")`
* `c(1L / 0, NA)`
* `c(1:3, 5)`
* `c(3L, NaN+1L)`
* `c(NA, TRUE)`

**Part 2**

Considering only the four (common) data types, what is R's implicit type conversion hierarchy (from highest priority to lowest priority)?

*Hint* - think about the pairwise interactions between types.

---
class: middle
count: false

# Loops

---

## for loops

The simplest and most common type of loop in R: given a vector, iterate through its elements and evaluate the code block for each.

```r
res = c()
for(x in 1:10) {
  res = c(res, x^2)
}
res
```

```
##  [1]   1   4   9  16  25  36  49  64  81 100
```

.pad-top[]

```r
res = c()
for(y in list(1:3, LETTERS[1:7], c(TRUE,FALSE))) {
  res = c(res, length(y))
}
res
```

```
## [1] 3 7 2
```
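
Growing `res` with `c()` copies the vector on every iteration. When the final length is known in advance, a common alternative is to create the result first and fill it by index. A sketch of the first loop rewritten that way:

```r
n = 10
res = numeric(n)          # a double vector of n zeros

for (i in seq_len(n)) {
  res[i] = i^2            # fill position i
}

res   # 1 4 9 16 25 36 49 64 81 100
```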

---

## `while` loops

Repeat until the given condition is **not** met (i.e. evaluates to `FALSE`)

```r
i = 1
res = rep(NA,10)

while (i <= 10) {
  res[i] = i^2
  i = i+1
}

res
```

```
##  [1]   1   4   9  16  25  36  49  64  81 100
```
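
A `while` loop is most useful when the number of iterations is not known in advance. As a sketch, count how many times 100 must be halved before it drops below 1:

```r
x = 100
n = 0

while (x >= 1) {
  x = x / 2
  n = n + 1
}

n   # 7
x   # 0.78125
```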

---

## `repeat` loops

Repeat the loop until a `break` is encountered

```r
i = 1
res = rep(NA,10)

repeat {
  res[i] = i^2
  i = i+1
  if (i > 10)
    break
}

res
```

```
##  [1]   1   4   9  16  25  36  49  64  81 100
```

---
class: split-50

## Special keywords - `break` and `next`

These are special actions that only work *inside* of a loop

* `break` - ends the current **loop** (inner-most)
* `next` - ends the current **iteration**

.pull-left[

```r
res = c()
for(i in 1:10) {
    if (i %% 2 == 0)
        break
    res = c(res, i)
    print(res)
}
```

```
## [1] 1
```
]

.pull-right[

```r
res = c()
for(i in 1:10) {
    if (i %% 2 == 0)
        next
    res = c(res,i)
    print(res)
}
```

```
## [1] 1
## [1] 1 3
## [1] 1 3 5
## [1] 1 3 5 7
## [1] 1 3 5 7 9
```
]
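
With nested loops, `break` only leaves the loop it appears in. In the sketch below the inner loop stops as soon as `j` reaches 2, but the outer loop keeps going:

```r
res = c()
for (i in 1:3) {
  for (j in 1:3) {
    if (j == 2)
      break                 # leaves the j loop only
    res = c(res, 10*i + j)  # record which (i, j) pairs were reached
  }
}

res   # 11 21 31
```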

---

## Some helpful functions

Often we want to use a loop across the indexes of an object and not the elements themselves. There are several useful functions to help you do this: `:`, `length`, `seq`, `seq_along`, `seq_len`, etc.

.pull-left[

```r
4:7
```

```
## [1] 4 5 6 7
```

```r
length(4:7)
```

```
## [1] 4
```

```r
seq(4,7)
```

```
## [1] 4 5 6 7
```
]

.pull-right[

```r
seq_along(4:7)
```

```
## [1] 1 2 3 4
```

```r
seq_len(length(4:7))
```

```
## [1] 1 2 3 4
```

```r
seq(4,7,by=2)
```

```
## [1] 4 6
```
]
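
One reason to prefer `seq_along(x)` over `1:length(x)` is the empty case: `1:0` counts *down*, so a loop over it would run twice. A sketch:

```r
x = c()                   # an empty vector

bad  = 1:length(x)        # 1 0        - two iterations
good = seq_along(x)       # integer(0) - no iterations

n_iter = 0
for (i in seq_along(x))
  n_iter = n_iter + 1

n_iter   # 0
```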

---

## Exercise 2

Below is a vector containing all prime numbers between 2 and 100:

.center[
```r
primes = c( 2,  3,  5,  7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 
      43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97)
```
]

If you were given the vector `x = c(3,4,12,19,23,51,61,63,78)`, write the R code necessary to print only the values of `x` that are *not* prime (without using subsetting or the `%in%` operator).

Your code should use *nested* loops to iterate through the vector of primes and `x`.

---
count: false

# Acknowledgments

Above materials are derived in part from the following sources:

* Hadley Wickham - [Advanced R](http://adv-r.had.co.nz/)
* [R Language Definition](http://stat.ethz.ch/R-manual/R-devel/doc/manual/R-lang.html)