Posts Tagged ‘command line’

Fixing a broken link throughout the site

December 12th, 2008

Today our link-checking script reported 225 broken links. Most of these were due to Adobe changing the location of their Acrobat Reader download page. Usually when this happens I’m too lazy to figure out how to script the update. But this number of links finally tipped the scales in favor of my being too lazy to update them by hand. It turned out that most of them were produced by one or two dynamic pages, but at least I learned something :-)

First I used grep to store the list of files containing the broken link in a text file:

steve@oracledev:~/perforce/depot/mainline/weblive/wwwroot$ grep -rlF 'http://www.adobe.com/products/acrobat/readstep2.html' . > /tmp/acrobat_link_files.txt

Then I marked those files for edit in Perforce:

steve@oracledev:~/perforce/depot/mainline/weblive/wwwroot$ cat /tmp/acrobat_link_files.txt | p4 -x - edit

Then I used sed to update the link in those files. If you’ve used perl-style regular expressions this will look familiar:

steve@oracledev:~/perforce/depot/mainline/weblive/wwwroot$ cat /tmp/acrobat_link_files.txt | xargs sed -i 's|http://www.adobe.com/products/acrobat/readstep2.html|http://get.adobe.com/reader/|g'

The xargs command reads the file names from acrobat_link_files.txt and passes them to sed as arguments, batching as many as fit on one command line rather than running sed once per file. The -i switch (a GNU sed option) tells sed to update each file in place.
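One caveat with the pipeline above: xargs splits its input on whitespace by default, so a file name containing a space would reach sed in pieces. Here is a minimal sketch of a more defensive variant, assuming GNU grep, xargs and sed. The function name update_acrobat_links is made up for illustration, and the Perforce step is left out because p4 -x expects one file name per line.

```shell
# Hypothetical helper, not part of the original post.
# grep: -F treats the URL as a fixed string, -Z ends each file name with a NUL byte.
# xargs: -0 splits on NUL only, -r skips sed entirely when grep finds nothing.
update_acrobat_links() {
    grep -rlZF 'http://www.adobe.com/products/acrobat/readstep2.html' "$1" |
        xargs -0 -r sed -i 's|http://www\.adobe\.com/products/acrobat/readstep2\.html|http://get.adobe.com/reader/|g'
}
```

Calling it as update_acrobat_links . from the web root would cover the same tree as the commands above.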

Perhaps next time I’ll really get my unix geek on and figure out how to do it in 1 line instead of 3.

Update: I’ve got it down to 1 command! The tee command copies stdin to multiple outputs. Here it sends the file list both to the p4 edit command (via bash’s >(...) process substitution) and to stdout. We need to redirect p4 edit’s output to /dev/null; otherwise it would also land on stdout, and sed wouldn’t know what to do with it.

steve@oracledev:~/perforce/depot/mainline/weblive/wwwroot/students/ugrad$ grep -rl http://certification.cornell.edu . | tee >(p4 -x - edit 1>/dev/null) | xargs sed -i 's|http://certification.cornell.edu/\?|https://certification.cornell.edu/|g'

Update: I’ve created a shell script to make this easier:

steve@oracledev:~/depot/mainline/common/scripts/bin$ ./bulk_update_urls.sh
Usage: ./bulk_update_urls.sh http://original.url.net/ http://new.url.net/ /path/to/target/dir

steve@oracledev:~/depot/mainline/common/scripts/bin$ ./bulk_update_urls.sh http://www.payments.cornell.edu/Travel_Forms.cfm http://www.dfa.cornell.edu/dfa/payments/essentials/advances/index.cfm ~/depot/mainline/websw/intraroot/

//depot/mainline/websw/intraroot/howdoi/mngcourse/host.html#3 - opened for edit
//depot/mainline/websw/intraroot/howdoi/travel/cashProcedures.html#1 - opened for edit
//depot/mainline/websw/intraroot/howdoi/travel/tranform.html#3 - opened for edit
//depot/mainline/websw/intraroot/howdoi/travel.html#14 - opened for edit
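
The script itself isn't shown here, so what follows is only a rough sketch of how something like bulk_update_urls.sh could be put together, not the actual file. The function names are invented for illustration. It assumes GNU grep, xargs and sed, a p4 client on the PATH, URLs that contain no | character, and file names without newlines.

```shell
# Hypothetical reconstruction; every function name below is an assumption.

# Escape characters that are special in a sed basic regular expression.
escape_sed_pattern() {
    printf '%s' "$1" | sed -e 's/[][\.*^$]/\\&/g'
}

# Escape characters that are special on the replacement side of s|||.
escape_sed_replacement() {
    printf '%s' "$1" | sed -e 's/[\&]/\\&/g'
}

bulk_update_urls() {
    if [ "$#" -ne 3 ]; then
        echo "Usage: bulk_update_urls http://original.url.net/ http://new.url.net/ /path/to/target/dir" >&2
        return 1
    fi
    local old_url="$1" new_url="$2" target_dir="$3"
    # '|' is the sed delimiter used below, so refuse URLs that contain it.
    case "$old_url$new_url" in
        *'|'*) echo "URLs containing '|' are not supported" >&2; return 1 ;;
    esac
    local pattern replacement file_list
    pattern=$(escape_sed_pattern "$old_url")
    replacement=$(escape_sed_replacement "$new_url")
    file_list=$(mktemp) || return 1

    # Collect the files that contain the old URL as a fixed string.
    grep -rlF -- "$old_url" "$target_dir" > "$file_list"
    if [ -s "$file_list" ]; then
        # Open the files for edit in Perforce, then rewrite them in place.
        p4 -x - edit < "$file_list" &&
            tr '\n' '\0' < "$file_list" | xargs -0 sed -i "s|$pattern|$replacement|g"
    fi
    local status=$?
    rm -f "$file_list"
    return "$status"
}

# In a standalone script, finish with: bulk_update_urls "$@"
```

Escaping the old URL keeps characters such as . from acting as regex wildcards, and escaping & in the new URL keeps sed from substituting the matched text back in.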

Screen

November 13th, 2008

I got so excited about the screen program that I set up this blog just so I could write about it. When I ssh to one of our servers I often end up su-ing between various users to perform actions that need to be invoked with different privileges. That can get pretty tedious. I could set up sudo for this, but I’d rather have dedicated sessions than constantly have to prefix my commands with ‘sudo’. I finally decided to learn how to use screen, which is a sort of window manager for console sessions.

Screen allows you to:

  • create a number of consoles and switch between them with keyboard shortcuts
  • detach from a session and re-attach to it later, even from a different machine!
  • share the same session simultaneously with others — this could be very useful for learning administrative tasks by going through them together
  • split the window so multiple consoles are visible at the same time

Here are a few brief tutorials on using it: