# # Writing functions in R

## # Anonymous functions

An anonymous function is, as the name implies, not assigned a name. This can be useful when the function is a part of a larger operation, but in itself does not take much place.
One frequent use-case for anonymous functions is within the `*apply`

family of Base functions.

Calculate the root mean square for each column in a `data.frame`

:

```
df <- data.frame(first=5:9, second=(0:4)^2, third=-1:3)
apply(df, 2, function(x) { sqrt(sum(x^2)) })
first second third
15.968719 18.814888 3.872983
```

Create a sequence of step-length one from the smallest to the largest value for each row in a matrix.

```
x <- sample(1:6, 12, replace=TRUE)
mat <- matrix(x, nrow=3)
apply(mat, 1, function(x) { seq(min(x), max(x)) })
```

An anonymous function can also stand on its own:

```
(function() { 1 })()
[1] 1
```

is equivalent to

```
f <- function() { 1 })
f()
[1] 1
```

## # RStudio code snippets

This is just a small hack for those who use self-defined functions often.

Type "fun" RStudio IDE and hit TAB.

The result will be a skeleton of a new function.

```
name <- function(variables) {
}
```

One can easily define their own snippet template, i.e. like the one below

```
name <- function(df, x, y) {
require(tidyverse)
out <-
return(out)
}
```

The option is `Edit Snippets`

in the `Global Options -> Code`

menu.

## # Named functions

R is full of functions, it is after all a functional programming language (opens new window), but sometimes the precise function you need isn't provided in the Base resources. You could conceivably install a package (opens new window) containing the function, but maybe your requirements are just so specific that no pre-made function fits the bill? Then you're left with the option of making your own.

A function can be very simple, to the point of being being pretty much pointless. It doesn't even need to take an argument:

```
one <- function() { 1 }
one()
[1] 1
two <- function() { 1 + 1 }
two()
[1] 2
```

What's between the curly braces `{ }`

is the function proper. As long as you can fit everything on a single line they aren't strictly needed, but can be useful to keep things organized.

A function can be very simple, yet highly specific. This function takes as input a vector (`vec`

in this example) and outputs the same vector with the vector's length (6 in this case) subtracted from each of the vector's elements.

```
vec <- 4:9
subtract.length <- function(x) { x - length(x) }
subtract.length(vec)
[1] -2 -1 0 1 2 3
```

Notice that `length()`

is in itself a pre-supplied (i.e. **Base**) function. You can of course use a previously self-made function within another self-made function, as well as assign variables and perform other operations while spanning several lines:

```
vec2 <- (4:7)/2
msdf <- function(x, multiplier=4) {
mult <- x * multiplier
subl <- subtract.length(x)
data.frame(mult, subl)
}
msdf(vec2, 5)
mult subl
1 10.0 -2.0
2 12.5 -1.5
3 15.0 -1.0
4 17.5 -0.5
```

`multiplier=4`

makes sure that `4`

is the default value of the argument `multiplier`

, if no value is given when calling the function `4`

is what will be used.

The above are all examples of **named** functions, so called simply because they have been given names (`one`

, `two`

, `subtract.length`

etc.)

## # Passing column names as argument of a function

Sometimes one would like to pass names of columns from a data frame to a function. They may be provided as strings and used in a function using `[[`

. Let's take a look at the following example, which prints to R console basic stats of selected variables:

```
basic.stats <- function(dset, vars){
for(i in 1:length(vars)){
print(vars[i])
print(summary(dset[[vars[i]]]))
}
}
basic.stats(iris, c("Sepal.Length", "Petal.Width"))
```

As a result of running above given code, names of selected variables and their basic summary statistics (minima, first quantiles, medians, means, third quantiles and maxima) are printed in R console. The code `dset[[vars[i]]]`

selects i-th element from the argument `vars`

and selects a corresponding column in declared input data set `dset`

. For example, declaring `iris[["Sepal.Length"]]`

alone would print the `Sepal.Length`

column from the `iris`

data set as a vector.