Wednesday, February 29, 2012

Programmatically rename files (or do other stuff to them) in R

In order to do something to a bunch of files at once, we first need a vector which contains the file paths of just the files we are interested in.

startingDir<-"/myDirectory"
filez<-list.files(startingDir,pattern="searchPattern",full.names=TRUE)

head(filez)
[1] "/myDirectory/xxxFile1.txt"
[2] "/myDirectory/xxxFile2.txt"
[3] "/myDirectory/xxxFile3.txt"
[4] "/myDirectory/xxxFile4.txt"
[5] "/myDirectory/xxxFile5.txt"
[6] "/myDirectory/xxxFile6.txt"
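
Two details are easy to miss here: list.files() returns bare file names unless full.names=TRUE, and pattern is a regular expression rather than a shell wildcard. Here is a minimal sketch in a throwaway directory (the directory and file names are made up for illustration):

```r
# Work in a scratch directory so nothing real is touched (names are hypothetical)
demoDir <- file.path(tempdir(), "listDemo")
dir.create(demoDir, showWarnings = FALSE)
invisible(file.create(file.path(demoDir, c("xxxFile1.txt", "xxxFile2.txt", "notes.csv"))))

# Default: file names only, relative to demoDir
namesOnly <- list.files(demoDir, pattern = "^xxx")

# full.names = TRUE prepends the directory, giving usable paths
fullPaths <- list.files(demoDir, pattern = "^xxx", full.names = TRUE)

# pattern is a regular expression: "\\.txt$" means "ends in .txt"
txtFiles <- list.files(demoDir, pattern = "\\.txt$")

print(namesOnly)
print(fullPaths)
```

Only the fullPaths version can be handed straight to functions like file.rename() without first changing the working directory.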

Once you have a vector that consists of a bunch of file paths, you can do useful stuff with it, like renaming files in a consistent way. I can use sapply() to apply a function to every element of a vector.  For example, I can use the file.rename() function along with the sub() function to change the filename.

sapply(filez,FUN=function(eachPath){
      file.rename(from=eachPath,to=sub(pattern="xxx",replacement="newTextString",eachPath))
})
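
Before renaming anything for real, it can help to preview what the new names would be. The sketch below runs the same sub() call through sapply() on a plain character vector, so no files are touched (the paths are just the illustrative ones from above). It also shows that sub() is already vectorized, so sapply() is not strictly required for this step.

```r
# Illustrative paths only; nothing on disk is read or changed
examplePaths <- c("/myDirectory/xxxFile1.txt", "/myDirectory/xxxFile2.txt")

# sapply() calls the function once per element and collects the results
viaSapply <- sapply(examplePaths, FUN = function(eachPath) {
  sub(pattern = "xxx", replacement = "newTextString", eachPath)
})

# sub() accepts a whole vector, so this produces the same new names
viaVector <- sub(pattern = "xxx", replacement = "newTextString", examplePaths)

# sapply() labels each result with the original element by default
print(viaSapply)
print(unname(viaSapply) == viaVector)
```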

If you aren't familiar with sapply(), I highly recommend you take a moment to study up on how it works.  Using the apply() family of functions is really important to being productive in R.  Now my filenames look like this.

[1] "/myDirectory/newTextStringFile1.txt"
[2] "/myDirectory/newTextStringFile2.txt"
[3] "/myDirectory/newTextStringFile3.txt"
[4] "/myDirectory/newTextStringFile4.txt"
[5] "/myDirectory/newTextStringFile5.txt"
[6] "/myDirectory/newTextStringFile6.txt"
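
To try the whole workflow safely, here is a self-contained sketch that builds some dummy files in a temporary directory and renames them. It departs from the snippet above in two ways, both my own choices rather than anything required: it applies sub() to basename() only, so a directory that happens to contain "xxx" is left alone, and it calls file.rename() once on the whole vector, since that function is vectorized. All directory and file names are made up.

```r
# Scratch directory with six dummy files (hypothetical names)
demoDir <- file.path(tempdir(), "renameDemo")
unlink(demoDir, recursive = TRUE)   # start clean if the demo was run before
dir.create(demoDir)
invisible(file.create(file.path(demoDir, sprintf("xxxFile%d.txt", 1:6))))

oldPaths <- list.files(demoDir, pattern = "^xxx", full.names = TRUE)

# Change only the file-name part, then put the directory back on the front
newPaths <- file.path(dirname(oldPaths),
                      sub("^xxx", "newTextString", basename(oldPaths)))

# file.rename() returns one TRUE/FALSE per file, so failures are easy to spot
renamed <- file.rename(from = oldPaths, to = newPaths)
if (!all(renamed)) warning("Some files could not be renamed")

print(list.files(demoDir))
```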


This is a whole lot faster and less error-prone than trying to rename 500 files by hand.  Just type ?files for a list of other file functions that might be useful, like file.copy() and file.remove(), and see ?unlink as well.  Be careful with unlink(): it will delete files from your machine!
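
As a rough illustration of a few base R file helpers, the sketch below copies, checks, and deletes files inside a temporary directory. The file names are invented for the example; file.copy(), file.exists(), file.remove(), and unlink() are all base R functions.

```r
# Scratch directory (hypothetical name) so nothing important can be deleted
demoDir <- file.path(tempdir(), "fileOpsDemo")
unlink(demoDir, recursive = TRUE)
dir.create(demoDir)

original <- file.path(demoDir, "data.txt")
backup   <- file.path(demoDir, "data_backup.txt")
writeLines("some text", original)

# file.copy() leaves the original in place; it will not overwrite by default
copied <- file.copy(from = original, to = backup)

# file.remove() deletes files and reports TRUE/FALSE for each one
removed <- file.remove(original)

# unlink() can also delete whole directories when recursive = TRUE,
# which is exactly why it deserves care. It returns 0 on success.
status <- unlink(demoDir, recursive = TRUE)

print(c(copied = copied, removed = removed, dirGone = !file.exists(demoDir)))
```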

Graphical message boxes with R package tcltk

Sometimes you just need a graphical message box... know what I mean? If only because it pops up in front of all the other open windows and alerts you to the fact that your R script is waiting for you to do something, or is finished doing something else.

The R package "tcltk" is the easiest way I have found to do this.  On Mac OS X you first need to install the Universal Tcl/Tk for X11 (available here).  I haven't tried it on Windows, but I don't think this extra step is necessary.  The "tcltk" package itself ships with R, so there is nothing to install from CRAN; just load it in the normal way, for example

library("tcltk")

Now you should have access to functions like this:
tk_messageBox(type="ok",message="I am a tkMessageBox!")


There are different types of message box (yesno, okcancel, etc.). See ?tk_messageBox.
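
tk_messageBox() returns the name of the button that was pressed as a character string, so a script can branch on the answer. Here is a minimal sketch, assuming a desktop session where Tcl/Tk is available; the helper describeAnswer() and the message text are my own illustration, not part of tcltk.

```r
# Hypothetical helper: turn the button name into an action for the script
describeAnswer <- function(answer) {
  switch(answer,
         yes = "continuing",
         no  = "stopping",
         "unexpected answer")
}

# Only pop up a dialog when someone is there to click it
if (interactive() && requireNamespace("tcltk", quietly = TRUE)) {
  answer <- tcltk::tk_messageBox(type = "yesno",
                                 message = "Finished step 1. Continue?")
  print(describeAnswer(answer))
}
```

The interactive() guard keeps the same script usable from Rscript or a scheduled job, where nobody is around to click a button.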