Unit 3: File manipulation
Untar output.tgz and change to the directory output.
Remark: You can loop over several strings in bash using a for loop, e.g. here program1 is run with the arguments bli, bla, ble and blu:
for key in bli bla ble blu; do program1 $key; echo $key; done
You want to re-run the same set of calculations and maybe compare the outcome afterwards.
How would you rename all files from out-XX-anneal to original-out-XX-anneal?
for file in out-*-anneal; do mv -i $file original-$file; done # use echo instead of mv before you really run the loop
If it is present, it is much simpler to use the program prename, sometimes just called rename:
prename 's/^/original-/' out-*-anneal
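The advice to try the loop with echo first can be practised safely in a scratch directory. The following is a minimal sketch: the directory comes from mktemp, and the three file names are made up for illustration.

```shell
# Work in a throwaway directory so no real data is touched.
scratch=$(mktemp -d)
cd "$scratch"
touch out-01-anneal out-02-anneal out-03-anneal   # made-up example files

# Dry run: echo prints the mv commands instead of executing them.
for file in out-*-anneal; do
    echo mv -i "$file" "original-$file"
done

# Real run: quoting "$file" keeps names containing spaces intact.
for file in out-*-anneal; do
    mv -i "$file" "original-$file"
done

ls
```

Once the echoed commands look right, the same loop is run again with mv in place of echo mv.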
More often, you would just move the files into a sub-directory:
mkdir old-2018_03_01; mv calculation*out old-2018_03_01
Q: Now rename the files back. Use a for loop if you have extra time. Otherwise just unpack bashcourse.tgz again.
for file in original*; do mv $file ${file#original-} ; done # use echo instead of mv before use
The simplest way is to use prename (sometimes under the name rename) if it is available:
prename 's/original-//' original-*
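The loop above relies on the bash parameter expansion ${file#original-}. A small sketch of what it does, using a made-up file name:

```shell
file=original-out-18-anneal

# ${file#original-} removes the shortest match of "original-" from the front.
echo "${file#original-}"        # prints: out-18-anneal

# If the prefix is absent, the name is returned unchanged.
other=out-19-anneal
echo "${other#original-}"       # prints: out-19-anneal

# % works the same way on the end of the string.
echo "${file%-anneal}"          # prints: original-out-18
```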
All right! Good news, you still get the same results on those calculations, so now you can proceed. The files came from a calculation you ran with the colleague's program. From the output, you are interested in the total energies in that file.
Q: Pick one of the files. Use a command to print all lines which contain the string "Etotal"
grep Etotal filename
Q1: Use awk to do the same thing (Section awk in the cheat sheet).
awk '/Etotal/{print}' filename
Q1: Use "awk" to print only the column of those lines that contains the energy value. Awk puts the content of each column into the variables $1 $2 $3 etc.
awk '/Etotal/{print $3}' out-18-anneal
To get the column without the "=", you can look up gsub in the awk manual (man awk) and use it to remove the =, or you can use the -F option to awk and make = a field separator. This option uses regular expressions to describe the delimiter between columns (you can search in man awk for -F). What do you have to supply after -F to get the column without the "="?
awk -F'[ =]+' '/Etotal/{print $4}' out-18-anneal # you now have to adjust the column number, because the -F option causes an additional column $1.
The ' are quotes to protect the string from the shell. The inside is a regular expression. [] contains a list of characters, here a space and a =. The "+" means that one or more of those characters have to occur for a match.
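Both ways of removing the "=" can be tried on a single line before touching the real files. The sample line below is invented for this sketch; the actual layout of the out-XX-anneal files may differ, so the column numbers may need adjusting.

```shell
# Invented sample line; the real file format may differ.
line=' Etotal (eV) =-9998.657'

# Default splitting on whitespace: the "=" stays attached to the number.
printf '%s\n' "$line" | awk '/Etotal/{print $3}'

# Variant 1: treat runs of spaces and "=" as the separator.
# The leading space now yields an empty $1, so the number moves to $4.
printf '%s\n' "$line" | awk -F'[ =]+' '/Etotal/{print $4}'

# Variant 2: keep the default splitting and delete the "=" with gsub.
printf '%s\n' "$line" | awk '/Etotal/{gsub(/=/, "", $3); print $3}'
```

The first command prints =-9998.657, the other two print -9998.657.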
Q: Pick one of the output files, extract the numbers and write them into a new file with a prefix "energies-" to the filename.
awk -F'[ =]+' '/Etotal/{print $4}' out-18-anneal > energies-out-18-anneal
Q: Repeat the writing of energies from the last exercise for all output files using a for loop.
for file in out*; do awk -F'[ =]+' '/Etotal/{print $4}' $file > energies-$file ; done
gnuplot
gnuplot> plot 'energies-out-18-anneal'
If this is on a terminal, where you cannot get a graphical window, first do
gnuplot> set term dumb
You can also create a PNG graphics file by doing:
gnuplot> set term png
gnuplot> set output "energies-out-18.png"
gnuplot> replot
gnuplot> exit # note that the file is only written when it is closed. The exit closes it in this case.
Q: You want to make the file with the extracted energies available to your colleagues on the cluster, so they can read it from your home directory. How do you do this?
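One possible approach, assuming the cluster uses ordinary Unix permission bits (no ACLs or site-specific policies), is to let other users enter the directory and read the file. The function name share_file is a hypothetical helper introduced only for this sketch.

```shell
# Hypothetical helper: make one file readable for other users.
# Only the directory that directly contains the file is opened up;
# every directory above it must already be enterable by others.
share_file() {
    chmod o+x "$(dirname "$1")"   # others may enter the directory, but not list it
    chmod o+r "$1"                # others may read the file
}

# Example call (commented out so nothing changes by accident):
# share_file "$HOME/energies-out-18-anneal"
```

Using o+x without o+r on the directory means colleagues can open the file if they know its name, but cannot browse the rest of the directory.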
find . -name 'out*' -type f -exec grep -il 9998.657 '{}' +
If there is still time: Write a bash script that will automate the process of what you have just done to plot the energies from the annealing files producing one png plot file for each output produced by the calculation.
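A possible starting point for such a script is sketched below. It assumes that the files match out-*-anneal, that the energy is in column $4 as in the awk command above, and that gnuplot with the png terminal is installed. The function names extract_energies and plot_energies are chosen for this sketch only.

```shell
#!/bin/bash
# Sketch: extract the energies and plot them, one PNG per output file.

# Write the energy column of file $1 into energies-$1.
extract_energies() {
    awk -F'[ =]+' '/Etotal/{print $4}' "$1" > "energies-$1"
}

# Plot data file $1 into $1.png; gnuplot reads its commands from stdin
# and writes the PNG when it exits at the end of the input.
plot_energies() {
    gnuplot <<EOF
set term png
set output "$1.png"
plot "$1"
EOF
}

for file in out-*-anneal; do
    [ -e "$file" ] || continue      # skip if the pattern matched nothing
    extract_energies "$file"
    plot_energies "energies-$file"
done
```

Run in the directory output, this would produce energies-out-XX-anneal and energies-out-XX-anneal.png for every output file.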