Saturday, February 4, 2017

Hey students, don't buy the teachers' lies!

When I was a university student, I had a teacher in the Operating Systems course who taught me (and a lot of others) to write really bad shell scripts! I will not name her, but I have to say that today I find the very same errors in the scripts my colleagues write every day, and this is a kind of watermark of the damage she did.

Luckily I found my way out, studying other books and practicing on my own (Linux) computer.

So what were the problems?
To understand them, the exercise schema the teacher adopted must be clear, and it was pretty much always the same: a main script (let's call it "coordinator") whose aim is to parse the argument list and invoke a "worker" script that does its job through recursion.
It can be sketched as the following code:

#!/bin/sh
# coordinator

# argument validation...

# export the current directory
# in the path to invoke the worker script
PATH=$PATH:`pwd`
export PATH

# first call of the worker
worker



#!/bin/sh
# worker

# recursion on myself
for f in *
do
    if [ -d "$f" ]
    then
        worker "$f"
    fi
done

# do other work...



The first problem, in my opinion, is invoking the worker script by name instead of by full path, and therefore the need to export the PATH variable. First of all, launching a script this way makes it a little slower to start, since the shell has to search for it in each PATH entry. Second, and much more important, it is the key to exploitation: without control over the full path of the script, someone can plant a malicious script in a directory that comes earlier in the PATH and have it run as the worker.
When I raised this objection with the teacher, the answer was simply to invert the order of the PATH manipulation:

PATH=`pwd`:$PATH
export PATH

But again, this is a kick in the ass for security: what if I give my script the same name as a system-wide command? I can alter the behaviour of this and other programs...
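
To make the danger concrete, here is a minimal sketch that is not taken from the original exercises. The function name demo_shadow and the fake ls are hypothetical, and I am assuming mktemp -d is available (it is common, but POSIX does not require it). It puts a directory in front of the PATH, the same way the teacher's fix does with the current directory, and shows that a script named like a system command wins the lookup:

```sh
#!/bin/sh
# demo_shadow (hypothetical name): show that a directory placed in front
# of PATH can shadow a system-wide command such as ls.
demo_shadow() {
    # scratch directory standing in for "the current directory"
    dir=$(mktemp -d) || return 1

    # a fake "ls" that an attacker (or a careless student) could drop there
    printf '#!/bin/sh\necho "not the real ls"\n' > "$dir/ls"
    chmod +x "$dir/ls"

    # same manipulation as the teacher's fix, confined to a subshell
    ( PATH="$dir:$PATH"; export PATH; ls )

    rm -rf "$dir"
}
```

Calling demo_shadow runs the fake script instead of the real ls, which is exactly what would happen to any program started after the PATH was exported.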
So what is the solution? Of course, to invoke the worker script with an absolute path and not manipulate the PATH variable at all. After all, what is the point in showing (to the teacher) that you can export a variable?
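
A possible way to get that absolute path, as a sketch only: I am assuming the worker lives in the same directory as the coordinator, and worker_path is a helper name I made up for illustration.

```sh
#!/bin/sh
# worker_path (hypothetical helper): print the absolute path of a script
# called "worker", assumed to sit next to the script given as $1.
worker_path() {
    # CDPATH is cleared so that cd cannot jump elsewhere or print anything
    dir=$(CDPATH= cd -- "$(dirname -- "$1")" && pwd) || return 1
    printf '%s/worker\n' "$dir"
}

# In the coordinator this could be used as follows, with no PATH changes:
#   worker=$(worker_path "$0") || exit 1
#   "$worker" "$@"
```

The PATH is never touched, so nothing placed in another directory can be picked up as the worker by accident or on purpose.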

Another problem is the recursion in the worker script: usually such a script scanned a directory's contents, invoking itself each time a subdirectory was found. Now, while this can work in theory, every subdirectory costs a new shell process, and you can easily imagine the worker script becoming a fork bomb. It is quite easy to see how find(1), xargs(1) and friends can help in this situation.
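
As a rough sketch of the find(1) approach (list_dirs is a hypothetical name, and the printf is only a placeholder for whatever the worker really had to do), a single find invocation can walk the whole tree and hand the directories to one shell in batches:

```sh
#!/bin/sh
# list_dirs (hypothetical name): visit every directory below $1 without
# the script ever calling itself.
list_dirs() {
    # "-exec ... {} +" passes many directories to each inner shell, so the
    # number of processes does not grow with the number of subdirectories
    find "$1" -type d -exec sh -c '
        for d in "$@"; do
            # placeholder for the real per-directory work
            printf "processing %s\n" "$d"
        done
    ' sh {} +
}
```

Because the names travel as arguments and not through a pipe, directories with spaces in their names are handled correctly too.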

Another oddity that comes to mind is the way students were forced to test whether an argument was an absolute or a relative path:

case $1 in
/*) # absolute
;;
*) # relative
;;
esac


Do you believe the above is easy to read? Is it efficient, and does it scale well? Why not use Unix tools and pipes, regular expressions and awk? Even better, getopt, anyone?
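
Just to give an idea of what the regular expression route might look like, here is one possible sketch based on awk. The function name is_absolute is my own invention, and this is only one of many ways to write the test:

```sh
#!/bin/sh
# is_absolute (hypothetical name): succeed when $1 begins with a slash.
is_absolute() {
    # only a BEGIN block is used, so awk reads no input at all; the
    # argument is inspected through ARGV[1] and matched against ^/
    awk 'BEGIN { exit (ARGV[1] ~ /^\//) ? 0 : 1 }' "$1"
}

# Possible usage:
#   if is_absolute "$1"; then echo absolute; else echo relative; fi
```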

So, dear ex-teacher, what is the whole point in teaching such shit?
