Wednesday, February 29, 2012

Programmatically rename files (or do other stuff to them) in R

In order to do something to a bunch of files at once, we first need a vector which contains the file paths of just the files we are interested in.

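startingDir <- "/myDirectory"
# full.names=TRUE returns full paths; the default returns bare file names
filez <- list.files(startingDir, pattern = "searchPattern", full.names = TRUE)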

head(filez)
[1] "/myDirectory/xxxFile1.txt"
[2] "/myDirectory/xxxFile2.txt"
[3] "/myDirectory/xxxFile3.txt"
[4] "/myDirectory/xxxFile4.txt"
[5] "/myDirectory/xxxFile5.txt"
[6] "/myDirectory/xxxFile6.txt"
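If you want to try this without touching real files, here is a minimal sketch that builds a throwaway directory first. The directory and file names are made up for illustration. The two things worth noticing are that pattern is a regular expression, and that full.names=TRUE is what makes list.files() return full paths instead of bare file names.

```r
# Hypothetical sandbox: a temporary directory with a few made-up files
sandbox <- file.path(tempdir(), "listDemo")
dir.create(sandbox, showWarnings = FALSE)
invisible(file.create(file.path(sandbox, c("xxxFile1.txt", "xxxFile2.txt", "notes.csv"))))

# pattern is a regular expression: names that start with "xxx" and end in ".txt"
bareNames <- list.files(sandbox, pattern = "^xxx.*\\.txt$")
fullPaths <- list.files(sandbox, pattern = "^xxx.*\\.txt$", full.names = TRUE)

print(bareNames)   # file names only
print(fullPaths)   # same files, with the directory in front
```

The anchors (^ and $) keep the pattern from matching "xxx" somewhere in the middle of an unrelated file name.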

Once you have a vector of file paths, you can do useful stuff with it, like renaming files in a consistent way. I can use sapply() to apply a function to every element of a vector. For example, I can use the file.rename() function along with the sub() function, which replaces the first match of a pattern in each string, to change the filename.

sapply(filez,FUN=function(eachPath){
      file.rename(from=eachPath,to=sub(pattern="xxx",replacement="newTextString",eachPath))
})
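Before renaming anything, it can help to preview the new names. Here is a small sketch using made-up paths. sub() is vectorized, so it can be called on the whole vector at once, and it replaces only the first match in each string. Since that first match could in principle land in the directory part of the path, this version applies sub() to basename() only, and fixed=TRUE treats the pattern as literal text rather than a regular expression.

```r
# Made-up paths for illustration; nothing on disk is touched here
oldPaths <- c("/myDirectory/xxxFile1.txt", "/myDirectory/xxxFile2.txt")

# Change only the file name, then glue the directory back on
newNames <- sub(pattern = "xxx", replacement = "newTextString",
                x = basename(oldPaths), fixed = TRUE)
newPaths <- file.path(dirname(oldPaths), newNames)

# Side-by-side preview of what file.rename() would be asked to do
print(data.frame(from = oldPaths, to = newPaths))
```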

If you aren't familiar with sapply(), I highly recommend you take a moment to study up on how it works.  Using the apply() family of functions is really important to being productive in R.  Now my filenames look like this.

[1] "/myDirectory/newTextStringFile1.txt"
[2] "/myDirectory/newTextStringFile2.txt"
[3] "/myDirectory/newTextStringFile3.txt"
[4] "/myDirectory/newTextStringFile4.txt"
[5] "/myDirectory/newTextStringFile5.txt"
[6] "/myDirectory/newTextStringFile6.txt"
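The whole workflow can be rehearsed in a throwaway directory. This is a sketch with hypothetical file names, not the exact code from above: it relies on file.rename() accepting vectors for from and to, so no sapply() is needed, and it checks the logical vector that file.rename() returns.

```r
# Hypothetical sandbox with six empty files
sandbox <- file.path(tempdir(), "renameDemo")
dir.create(sandbox, showWarnings = FALSE)
invisible(file.create(file.path(sandbox, paste0("xxxFile", 1:6, ".txt"))))

# Collect full paths, then build the new path for each file
filez <- list.files(sandbox, pattern = "^xxx", full.names = TRUE)
newPaths <- file.path(dirname(filez),
                      sub("xxx", "newTextString", basename(filez), fixed = TRUE))

# file.rename() is vectorized and returns TRUE/FALSE for each file
renamed <- file.rename(from = filez, to = newPaths)
if (!all(renamed)) warning("Some files could not be renamed")

print(list.files(sandbox))
```

Checking the return value matters because file.rename() reports a failed rename as FALSE with a warning rather than stopping with an error.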


This is a whole lot faster and less error prone than trying to rename 500 files by hand. Just type ?files for a list of other file functions that might be useful, like file.copy() and file.remove(), and see ?unlink as well. Be careful with file.remove() and unlink(): they will delete files from your machine!
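As a sketch of a few of those functions, again in a throwaway directory with made-up names: file.copy() makes a backup, file.remove() deletes the original, and unlink() cleans up the whole directory. file.copy() and file.remove() return a logical value per file, which is worth checking before deleting anything.

```r
# Hypothetical sandbox with one small text file
sandbox <- file.path(tempdir(), "copyDemo")
dir.create(sandbox, showWarnings = FALSE)
original <- file.path(sandbox, "xxxFile1.txt")
backup <- file.path(sandbox, "xxxFile1_backup.txt")
writeLines("some text", original)

# Make a backup copy first
copied <- file.copy(from = original, to = backup, overwrite = TRUE)

# Delete the original only if the copy succeeded
removed <- FALSE
if (copied) removed <- file.remove(original)

# unlink() with recursive = TRUE deletes the directory and everything in it;
# it returns 0 on success and 1 on failure
status <- unlink(sandbox, recursive = TRUE)
print(c(copied = copied, removed = removed, unlinkStatus = status))
```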
