Thursday, October 1, 2009

How to rename files recursively in Debian GNU/Linux 5.0 "lenny"

Recently I mirrored a website to my local disk. I found that most pages were stored in files with a ".php" extension. Those files were plain HTML, but my web browser flatly refused to display them. Yes, I confess, I simply overlooked a very useful wget option, "--html-extension" :)

I didn't want to download everything again, so I decided to simply rename all *.php files to *.html and fix the links inside so that they point to the renamed files. It turned out not to be an easy task for a novice Linux user like me :)

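The following command recursively replaces the extension ".php" with ".html", starting from the current directory. The pattern is quoted so that the shell does not expand it before find sees it, and the regular expression is anchored with "$" so that only a ".php" at the very end of the name is replaced:
find . -name '*.php' -type f -print0 | xargs -0 -I '{}' rename 's/\.php$/.html/' '{}'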
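
The "rename" used here is the Perl script that Debian ships; on some other systems "rename" is a different program with a different syntax. As a rough sketch of the same idea using only find, sh and mv, something like the following should work. The scratch directory and file names are made up for the demonstration and are not part of my mirror:

```shell
# Scratch directory so the sketch does not touch real files (hypothetical layout).
demo=$(mktemp -d)
mkdir -p "$demo/site/sub dir"
touch "$demo/site/index.php" "$demo/site/sub dir/page.php" "$demo/site/notes.php.txt"

# Rename every regular file whose name ends in .php.
# ${f%.php} strips the trailing ".php", so only the extension is changed.
find "$demo/site" -type f -name '*.php' -exec sh -c '
  for f in "$@"; do
    mv -- "$f" "${f%.php}.html"
  done
' sh {} +

# List the result.
find "$demo/site" -type f | sort
```

Names containing spaces are handled because find passes each path to sh as a separate argument, and "notes.php.txt" is left alone because it does not end in ".php".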

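Now fix the hyperlinks inside the web pages. The pattern is quoted again, and the "g" flag makes sed replace every occurrence on a line, not just the first one:
find . -name '*.html' -type f -exec sed -i 's/\.php/.html/g' '{}' \;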
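
Note that the command above touches every ".php" it finds in a page, including ones in ordinary text. If that is a concern, a narrower substitution can be sketched that only rewrites ".php" at the end of the path inside a double-quoted href attribute. This is only an illustration: the sample page is invented, and it assumes GNU sed (for "-i") and links written as href="...":

```shell
# Scratch page for the demonstration (hypothetical content).
demo=$(mktemp -d)
cat > "$demo/index.html" <<'EOF'
<p>The original site was generated from config.php for details.</p>
<a href="about.php">About</a> <a href="news.php#top">News</a>
EOF

# Rewrite .php only where it ends the path part of a double-quoted href value,
# i.e. where it is followed by the closing quote, a "?" or a "#".
find "$demo" -type f -name '*.html' -exec \
  sed -i 's/\(href="[^"?#]*\)\.php\(["?#]\)/\1.html\2/g' {} +

cat "$demo/index.html"
```

Links in single quotes, in src attributes or in scripts would not be caught by this pattern, so it trades coverage for safety.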

That's it :)

1 comment:

  1. find plus xargs plus sed is certainly one way to do it, but it's not the easiest way. A guy who calls himself seth has written an excellent Perl script that makes even the most complex file and directory renaming operations very easy for anybody who's familiar with regular expressions, Perl or otherwise. The script is named ren_ext.pl and can be found here:

    http://www.wg-karlsruhe.de/seth/tools.php

    This would change all file extensions from php to html recursively:

    ren_ext.pl -r '\.php$' '.html'

    Quite a bit shorter, isn't it?
