# Reading and writing strings
# Printing and displaying strings
R has several built-in functions that can be used to print or display information, but
cat are the most basic. As R is an interpreted language (opens new window), you can try these out directly in the R console:
print("Hello World") # "Hello World" cat("Hello World\n") #Hello World
Note the difference in both input and output for the two functions. (Note: there are no quote-characters in the value of
x created with
x <- "Hello World". They are added by
cat takes one or more character vectors as arguments and prints them to the console. If the character vector has a length greater than 1, arguments are separated by a space (by default):
cat(c("hello", "world", "\n")) #hello world
Without the new-line character (
\n) the output would be:
cat("Hello World") #Hello World>
The prompt for the next command appears immediately after the output. (Some consoles such as RStudio's may automatically append a newline to strings that do not end with a newline.)
"Hello World", the result is similar to the output of
cat. However, the character string is quoted and a number
 is output to indicate the first element of a character vector (In this case, the first and only element):
print("Hello World") # "Hello World"
This default print method is also what we see when we simply ask R to print a variable. Note how the output of typing
s is the same as calling
s <- "Hello World" s # "Hello World"
Or even without assigning it to anything:
"Hello World" # "Hello World"
If we add another character string as a second element of the vector (using the
c() function to concatenate the elements together), then the behavior of
print() looks quite a bit different from that of
print(c("Hello World", "Here I am.")) # "Hello World" "Here I am."
Observe that the
c() function does not do string-concatenation. (One needs to use
paste for that purpose.) R shows that the character vector has two elements by quoting them separately. If we have a vector long enough to span multiple lines, R will print the index of the element starting each line, just as it prints
 at the start of the first line.
c("Hello World", "Here I am!", "This next string is really long.") # "Hello World" "Here I am!" # "This next string is really long."
The particular behavior of
If we call
print(1) # 1 print(TRUE) # TRUE
Factor objects get printed in the same fashion as character variables which often creates ambiguity when console output is used to display objects in SO question bodies. It is rare to use
print() is particularly rare (unless you wanted to suppress the appearance of the quotes or view an object that is returned as
invisible by a function), as entering
foo at the console is a shortcut for
print(foo). The interactive console of R is known as a REPL, a "read-eval-print-loop". The
cat function is best saved for special purposes (like writing output to an open file connection). Sometimes it is used inside functions (where calls to
print() are suppressed), however using
cat() inside a function to generate output to the console is bad practice. The preferred method is to
warning() for intermediate messages; they behave similarly to
cat but can be optionally suppressed by the end user. The final result should simply returned so that the user can assign it to store it if necessary.
message("hello world") #hello world suppressMessages(message("hello world"))
# Capture output of operating system command
# Functions which return a character vector
R has two functions for invoking a system command. Both require an additional parameter to capture the output of the system command.
system("top -a -b -n 1", intern = TRUE) system2("top", "-a -b -n 1", stdout = TRUE)
Both return a character vector.
 "top - 08:52:03 up 70 days, 15:09, 0 users, load average: 0.00, 0.00, 0.00"  "Tasks: 125 total, 1 running, 124 sleeping, 0 stopped, 0 zombie"  "Cpu(s): 0.9%us, 0.3%sy, 0.0%ni, 98.7%id, 0.1%wa, 0.0%hi, 0.0%si, 0.0%st"  "Mem: 12194312k total, 3613292k used, 8581020k free, 216940k buffers"  "Swap: 12582908k total, 2334156k used, 10248752k free, 1682340k cached"  ""  " PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND "  "11300 root 20 0 1278m 375m 3696 S 0.0 3.2 124:40.92 trala "  " 6093 user1 20 0 1817m 269m 1888 S 0.0 2.3 12:17.96 R "  " 4949 user2 20 0 1917m 214m 1888 S 0.0 1.8 11:16.73 R "
For illustration, the UNIX command
top -a -b -n 1 is used. This is OS specific and may need to be amended to run the examples on your computer.
devtools has a function to run a system command and capture the output without an additional parameter. It also returns a character vector.
devtools::system_output("top", "-a -b -n 1")
# Functions which return a data frame
fread function in package
data.table allows to execute a shell command and to read the output like
read.table. It returns a
data.table or a
fread("top -a -b -n 1", check.names = TRUE) PID USER PR NI VIRT RES SHR S X.CPU X.MEM TIME. COMMAND 1: 11300 root 20 0 1278m 375m 3696 S 0 3.2 124:40.92 trala 2: 6093 user1 20 0 1817m 269m 1888 S 0 2.3 12:18.56 R 3: 4949 user2 20 0 1917m 214m 1888 S 0 1.8 11:17.33 R 4: 7922 user3 20 0 3094m 131m 1892 S 0 1.1 21:04.95 R
fread automatically has skipped the top 6 header lines.
Here the parameter
check.names = TRUE was added to convert
TIME+ to syntactically valid column names.
# Reading from or writing to a file connection
Not always we have liberty to read from or write to a local system path. For example if R code streaming map-reduce must need to read and write to file connection. There can be other scenarios as well where one is going beyond local system and with advent of cloud and big data, this is becoming increasingly common. One of the way to do this is in logical sequence.
Establish a file connection to read with
file() command ("r" is for read mode):
conn <- file("/path/example.data", "r") #when file is in local system conn1 <- file("stdin", "r") #when just standard input/output for files are available
As this will establish just file connection, one can read the data from these file connections as follows:
line <- readLines(conn, n=1, warn=FALSE)
Here we are reading the data from file connection
conn line by line as
n=1. one can change value of
n (say 10, 20 etc.) for reading data blocks for faster reading (10 or 20 lines block read in one go). To read complete file in one go set
After data processing or say model execution; one can write the results back to file connection using many different commands like
writeLines(),cat() etc. which are capable of writing to a file connection. However all of these commands will leverage file connection established for writing. This could be done using
file() command as:
conn2 <- file("/path/result.data", "w") #when file is in local system conn3 <- file("stdout", "w") #when just standard input/output for files are available
Then write the data as follows:
writeLines("text",conn2, sep = "\n")