Unit 1: Documentation, Quoting, Regular Expressions

Documents coming with this exercise:

Exercise File: 

Beginnings...

You are a PhD student and you have just started in a new work group. Your advisor gave you a general overview of the research you will be doing, then sent you to a senior post-doc to fill you in on the details. He says you will need a piece of software that has been written in the group. A different colleague sends you the file bashcourse.tgz. He is a bit weird and you suspect he doesn't like you. Or people. He writes some of the scientific software for your group.
Copy and paste in terminals on Linux desktops usually works in two ways:
  • Mark text with the mouse to copy, press middle mouse button to paste (For ssh via putty: right mouse button to paste)
  • Mark text with the mouse, then Ctrl-Shift-C for copy and Ctrl-Shift-V for paste
Those two mechanisms fill two independent buffers.

Before you start: enter and execute the following command: export LC_COLLATE=C.
Without this setting, some searches will behave unexpectedly. This command only affects your current shell. To make it permanent, add the command to your .bashrc file with:
echo "export LC_COLLATE=C" >> ~/.bashrc
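Running the echo command above more than once appends the same line again each time. A minimal sketch of a guarded variant follows. The variable rcfile is an assumption made for this example: it is a temporary file standing in for ~/.bashrc, so that trying it out does not touch your real configuration.

```shell
# rcfile stands in for ~/.bashrc (assumption: a temp file keeps the demo harmless)
rcfile=$(mktemp)
line='export LC_COLLATE=C'

# grep -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
grep -qxF "$line" "$rcfile" || echo "$line" >> "$rcfile"
grep -qxF "$line" "$rcfile" || echo "$line" >> "$rcfile"   # second run adds nothing

cat "$rcfile"
```

To use it for real, you would set rcfile to ~/.bashrc instead of a temporary file.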

Task: Get the file bashcourse.tgz from the beginning of this page by copying the link and using wget to download the file directly to the cluster. Unpack the file bashcourse.tgz, change into the newly created directory and read the text files within. It is a compressed tar file, "tgz" is a short form of the file ending .tar.gz (a tar file compressed with gzip).
Run man tar to look up the options -z, -v, -x and -f. You can search for them by typing / -f (there is a space before the -f).

Remark: / is the command to search inside less. Use n to find the next match and < to jump to the start of a file for a new search.
The search terms you used in "less" are regular expressions.
Have a quick look at (but don't read) man 7 regex (and at man 3 regex). The 3 and 7 denote different sections of the manual and you could find out about them running man man.
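If you want to try the tar options from the task without touching the course files, the following sketch builds a tiny archive in a temporary directory and unpacks it again. All file and directory names in it (workdir, demo, readme.txt, demo.tgz) are made up for this example.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/demo"
echo "hello" > "$workdir/demo/readme.txt"

# -c create, -z compress with gzip, -f archive file name; -C changes directory first
tar -czf "$workdir/demo.tgz" -C "$workdir" demo

# -t lists the contents without unpacking
listing=$(tar -tzf "$workdir/demo.tgz")
echo "$listing"

# -x extract, -z gunzip, -v verbose (print each name), -f archive file name
mkdir "$workdir/unpacked"
tar -xzvf "$workdir/demo.tgz" -C "$workdir/unpacked"
cat "$workdir/unpacked/demo/readme.txt"
```

Listing an archive with -t before extracting it is a cheap way to see where the files will end up.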

Regular expressions are used for searches in programs such as grep, awk, perl, sed, vi and less. The shell uses a different kind of matching called globbing. To make it more complicated, there are three different types of regular expressions, as used by grep ("basic"), egrep ("extended") and perl (man perl will tell you that this is documented in man perlre).
The following, most basic search elements are the same in all of them:

strings: any string composed just of letters and numbers matches itself.
. matches any character
*: repeat the previous match as often as possible (including 0 times), i.e. .* matches any string, x* matches xxx or xxxxx. Either also matches an empty line!
[]: match any of the listed characters. E.g. [abc] matches a, b or c, [154] matches 1, 5 or 4.
[a-z]: you can use ranges, this matches all letters from a-z. An alphanumeric match would be [a-zA-Z0-9]
[^a-z]: ^ at the beginning of a character list negates it. This matches anything but a-z
^: If at the beginning of the regex, matches the beginning of the line
$: If at the end of the regex, matches the end of line
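The elements listed above can be tried out with grep on a small sample file. This is only a sketch: the sample lines are invented, and LC_ALL=C is set inside the example because LC_ALL overrides LC_COLLATE if it happens to be set in your environment.

```shell
export LC_ALL=C     # force C collation for this demo, so [a-z] means lowercase only

sample=$(mktemp)
printf '%s\n' 'abc' 'a1c' 'xxx' '' 'Hello' '42' > "$sample"

# grep -c prints the number of matching lines
grep -c 'a.c'     "$sample"   # 2: abc, a1c (. matches any one character)
grep -c '^x*$'    "$sample"   # 2: xxx and the empty line (x* may match zero times)
grep -c '[0-9]'   "$sample"   # 2: a1c, 42 (lines containing a digit)
grep -c '^[^a-z]' "$sample"   # 2: Hello, 42 (first character is not a-z)
grep -c '^$'      "$sample"   # 1: the empty line
```

Replace -c with -n to see which lines match instead of just counting them.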

If you want to learn a bit more about regular expressions, https://likegeeks.com/regex-tutorial-linux/ gives you examples for the more common types of matching, http://mywiki.wooledge.org/RegularExpression has a more formal description and explains the differences between the different flavors of regex.

wget https://www.uni-ulm.de/fileadmin/website_uni_ulm/kiz/it/rechner-compute-server/workshops/bashcourse_01.tgz # fetches the file from the web
ls -la
tar xvf bash[tab] # use tab completion! Yields:
tar xvf bashcourse_01.tgz
ls -la
cd bashcourse/
ls -la

You change into the directory and enter ls -la *.
This returns an error. But why does the command fail?

Ok, he is really weird. Why did he name the files like this and how are you supposed to look at them?

Remark: You may think filenames like this are far-fetched, but sadly you will encounter about everything in a filename, given enough time.

You decide to look up how you can quote the file names so you can look at the file's contents.
Run "man bash" and search for the correct chapter by entering /^QUOTING

You decide to try some simple examples to see the effects of the information from the quoting chapter.

Try the following examples, in which you set the bash variable "var" to different values.
echo $something # prints only an empty line, the variable is not set
var=something # silently sets the variable
echo $var # prints the variable

var="something
else";

#(yes, there is a newline in the variable. The quote prevents the sending of the command until it is closed, so the newline really becomes part of the value of the variable)

echo $var

echo "$var"

echo '$var'
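To compare the three echo variants side by side, you can capture each result and print it between brackets. This is a sketch that assumes the default IFS (space, tab, newline); the variable names unquoted, double and single are chosen just for this example.

```shell
var="something
else"

unquoted=$(echo $var)     # word splitting turns the newline into an argument separator
double=$(echo "$var")     # double quotes keep the value intact, newline included
single=$(echo '$var')     # single quotes prevent the expansion altogether

printf 'unquoted: [%s]\n' "$unquoted"
printf 'double:   [%s]\n' "$double"
printf 'single:   [%s]\n' "$single"
```

The brackets make it visible where each value starts and ends, which plain echo does not show.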

Also look at https://mywiki.wooledge.org/Quotes for help. wooledge.org is one of the few sources you should trust on advice on bash next to the manpage.

Now back to those files: open the file $README (and then the file It's my "first draft") with the program less.

Use the quotes from above. You can open and close any number of quotes in a string. Single quotes will quote double quotes and vice versa.

first file:
less '$README' # single quotes ' prevent expansion of variables, double quotes " would not
less \$README # backslash escapes a single character
second file:
use tab completion: It[tab] auto-completes the filename with (hopefully) correct quoting
there are several possibilities: single quotes can quote double quotes and vice versa,
or you can use \ to quote each of them
less 'It'\''s my "first draft"'
less "It's my "'"first draft"'
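You can convince yourself that the different spellings name the same file by comparing them as strings. The sketch below recreates two files with these names in a temporary directory; it assumes the second file is called It's my "first draft", as the less commands above suggest. The variables a, b and c are only used for the comparison.

```shell
workdir=$(mktemp -d)
cd "$workdir"

touch '$README'
touch "It's my \"first draft\""

# three spellings of the first file name: the same file is listed three times
ls -l '$README' \$README "\$README"

# three spellings of the second file name
a='It'\''s my "first draft"'
b="It's my "'"first draft"'
c=It\'s\ my\ \"first\ draft\"
[ "$a" = "$b" ] && [ "$b" = "$c" ] && echo "all three spellings are identical"
ls -l "$a"
```

The shell removes the quotes before the program starts, so less or ls only ever sees the plain file name.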

Still, you cannot look at the files --ban and -la . Quoting them doesn't help and you also seem to be unable to remove them.
If you have time left over, look in the man-page of rm how you might be able to remove them.
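The rm man page describes two ways to handle names that start with a dash. The following sketch tries both in a temporary directory; the files created here are empty stand-ins, not the files from the tar archive.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# create the awkward names; the leading ./ hides the dash from touch
touch ./-la ./--ban ./normal.txt

# after "--", every argument is treated as a file name, not as an option
ls -l -- *

# a leading ./ also works, because the argument no longer starts with a dash
cat ./-la
rm -- --ban
rm ./-la

ls -l
```

Quoting does not help in this situation because the shell removes the quotes before the program sees its arguments; the program itself decides that a leading dash means "option".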

Run the python script numpy_test.py from the tarfile. It is a short test program that does a matrix multiplication several times and prints out how long that took.

After running that successfully, run the bash script killme

The program numpy_test.py is a python script. You can give the script as an argument to python to run it. killme is a bash script and you can run it using bash killme. You can also make either of them executable using chmod.
chmod can change options for the user (u), group (g), others (o) and all of them (a). It can change read (r), write (w) and execute (x) permissions.
chmod a+r filename gives everyone read permission, chmod a-r filename removes read permission for all. To run a script directly, you need to grant "execute" permission (x) in the same way.
See "shared file access" on the cheat sheet for a short chmod description, man chmod for a long one.
The program should run just long enough to allow you to monitor it. If the time is too short for you, edit it and change the line with number=xxx to a higher value. You can edit with "nano" or with "vi". If you are not used to vi(m), nano is probably easier to use. "vi" seems un-intuitive, because it has a command mode and an edit mode. Your cheat sheet has the most basic things needed to do simple edits with it.

The permissions of the file do not mark it as executable, i.e. you are not allowed to execute it.
Try: ./numpy_test.py
You can tell python to read it (and execute the commands within):
python numpy_test.py
or you can grant execute permissions to it:
ls -la numpy_test.py
chmod a+x numpy_test.py
ls -la numpy_test.py # note the changed permissions to the left. We will do a bit on permissions later.
./numpy_test.py
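The same sequence can be practiced on a throwaway script. This sketch uses a hypothetical file hello.sh in a temporary directory and records whether the shell considers it executable before and after chmod.

```shell
workdir=$(mktemp -d)
script="$workdir/hello.sh"      # hypothetical demo script
printf '#!/bin/bash\necho "hello from script"\n' > "$script"
chmod a-x "$script"             # make sure no execute bit is set

if [ -x "$script" ]; then before="executable"; else before="not executable"; fi

# without execute permission you can still hand the file to the interpreter
via_interpreter=$(bash "$script")

chmod a+x "$script"
if [ -x "$script" ]; then after="executable"; else after="not executable"; fi

ls -l "$script"
echo "before: $before, after: $after, via bash: $via_interpreter"
```

Handing the script to bash only needs read permission, which is why it works before the chmod.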


Suppose you will want to run that program from the command line quite often. You don't want to change to the correct directory every time, but just call it.
Q: Create $HOME/bin if necessary and copy numpy_test.py to that location. Print the variable $PATH and check if $HOME/bin is included. If it is not included, add $HOME/bin to $PATH in the shell and check that the command was successful. Then edit your $HOME/.bashrc to add the directory to your $PATH on each login.
mkdir -p ~/bin
cp numpy_test.py ~/bin
chmod a+x ~/bin/numpy_test.py # what happens if execute permissions are missing?
echo $PATH
PATH=$PATH:~/bin
numpy_test.py # test run
nano ~/.bashrc # add the line PATH=$PATH:~/bin to .bashrc
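Every new shell that reads .bashrc appends the directory again, so $PATH can grow with duplicates. One possible guard is sketched below. path_append is a hypothetical helper name invented for this example, not a bash builtin.

```shell
# path_append: add a directory to PATH only if it is not already in there
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present: do nothing
        *) PATH="$PATH:$1" ;;     # otherwise append it
    esac
}

path_append "$HOME/bin"
path_append "$HOME/bin"    # the second call changes nothing

# print one PATH entry per line and show the entries equal to $HOME/bin
echo "$PATH" | tr ':' '\n' | grep -xF "$HOME/bin"
```

Wrapping $PATH in colons on both sides lets the pattern match the directory at the start, in the middle or at the end of the list.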

Q: The killme script does seem to be stuck and doesn't stop. How can you force ending it?

Look at the section "Process management" on the cheat sheet
Use ps or htop to find out the PID (process ID) and kill it with kill. help kill shows you how it is used, and kill -l lists the signal names. man 7 signal explains the signals in detail.
You will need the signal "SIGKILL".
A: ps aux | grep killme shows you the PID of the process in the 2nd column, let's say it is 123456.
kill 123456 sends signal 15 to the process.
kill -SIGKILL 123456 sends the SIGKILL signal 9 to the process, which cannot be ignored. This is mostly written as kill -9 123456 , because it is shorter.
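The difference between the two signals can be observed with harmless sleep processes instead of killme. This is a sketch: the second process is made to ignore SIGTERM on purpose, which is only a guess at how a stubborn script might behave. The exit status reported by wait is 128 plus the signal number.

```shell
# a well-behaved process: it ends on the default signal, SIGTERM (15)
sleep 30 &
pid=$!
kill "$pid"
term_status=0
wait "$pid" || term_status=$?
echo "after SIGTERM: exit status $term_status"    # 128 + 15

# a stubborn process that ignores SIGTERM
bash -c 'trap "" TERM; exec sleep 30' &
pid=$!
sleep 1                 # give it time to install the trap
kill "$pid"             # SIGTERM is ignored
sleep 1
kill -0 "$pid" && echo "still running after SIGTERM"
kill -9 "$pid"          # same as kill -SIGKILL: cannot be caught or ignored
kill_status=0
wait "$pid" || kill_status=$?
echo "after SIGKILL: exit status $kill_status"    # 128 + 9
```

kill -0 sends no signal at all; it only checks whether the process still exists.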
Q: You want to make the file numpy_test.py available to your colleagues on the cluster, so they can read or execute it from your $HOME/bin directory. How do you do this?

A: Using traditional permissions, you have to chmod a+x $HOME and $HOME/bin if the file stays in $HOME/bin. All directories between the root and the directory the file is in have to be executable for a user to be able to access it. The "a" is for "all" and "x" is "executable". For a file, the execute right allows people to run it as a program; for a directory, it allows them to change into it and access files inside it by name, but without read permission on the directory they cannot list its contents. Additionally, you have to give execute permission to the program with chmod a+x, and read permission with chmod a+r so they can also read it. Mind that anyone who knows the filename can now read that file! If you want to give permissions to a single user only, you have to use ACLs.
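The permission changes from this answer can be rehearsed on a stand-in directory tree before you apply them to your real home directory. In this sketch, fakehome and tool.sh are made-up names; fakehome plays the role of $HOME.

```shell
# fakehome stands in for $HOME (assumption: a demo directory, not your real home)
fakehome=$(mktemp -d)            # mktemp -d creates it with mode 700
mkdir "$fakehome/bin"
chmod 700 "$fakehome/bin"        # start private, like a closed home directory
printf '#!/bin/bash\necho hi\n' > "$fakehome/bin/tool.sh"
chmod 600 "$fakehome/bin/tool.sh"

chmod a+x  "$fakehome" "$fakehome/bin"    # others may pass through, but not list
chmod a+rx "$fakehome/bin/tool.sh"        # others may read and execute the file

# -d shows the directories themselves instead of their contents
ls -ld "$fakehome" "$fakehome/bin" "$fakehome/bin/tool.sh"
```

The directories end up as drwx--x--x and the file as -rwxr-xr-x: colleagues can reach and run the file if they know its path, but they cannot list your directories.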


If you have time left over, read "man setfacl" and allow some other user to read a file. You can also use ACLs to give execute permission to the directory only to certain people.