Some Very Basic Unix Stuff

by Dr. Robert Heckendorn
University of Idaho

This is how to do some basic things in UNIX. If you are on Windows, install the Cygwin UNIX emulation tools on your machine (or use the CS UNIX machines and your CS account). OS X is UNIX, so you don't need to do anything special.

Basic File System Stuff: Ls, Pwd, Cd, Cp, Mv, Rm, Mkdir, Rmdir, Chmod

Your shell is "in a directory" or has a "current working directory". You can list the files in your current working directory with the ls command, or with ls -l for more information. Your current working directory can be found with the pwd command, which prints its full name. cd directory will change your current working directory to the directory you gave it. cd alone will change your directory back to your "home directory". mv from to will rename or move your file named from to to. cp from to will copy the file from to to. rm fred will remove file fred. rm -r Jane will remove directory Jane and all its contents!!! Watch out. mkdir Jane will make directory Jane. rmdir Jane will remove directory Jane, but only if it is empty.
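
The commands above can be tried safely in a throwaway directory. The following is a minimal sketch, not a required workflow; the scratch directory and the file names from, to, copy, and Jane are made up for illustration.

```shell
# Work in a throwaway directory so no real files are touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

pwd                     # print the full name of the current working directory
echo "hello" > from     # create a small file named from
cp from copy            # copy from to copy; both files now exist
mv from to              # rename from to to; from no longer exists
mkdir Jane              # make directory Jane
mv to Jane              # Jane is a directory, so the file to moves into it
ls -l                   # long listing: shows copy and Jane

# rmdir only removes empty directories, so this attempt is refused.
rmdir Jane 2>/dev/null || echo "rmdir refused: Jane is not empty"

rm -r Jane              # removes Jane and everything inside it, no questions asked
listing=$(ls)           # only copy is left
echo "$listing"
```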

chmod u+x zap will make the file zap executable. If you want to run a program zap you need to make it executable and then invoke it in the shell: ./zap. The ./ tells the shell to look in the current directory for zap. You can also put "." in your PATH variable so you don't have to type ./ every time you want to run it.
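
As a small illustration of chmod and ./, the sketch below creates a two-line shell script named zap (its contents are invented for this example), makes it executable, and runs it from the current directory.

```shell
# Work in a throwaway directory.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Create a tiny script; the contents of zap are hypothetical.
printf '#!/bin/sh\necho "zap ran"\n' > zap

ls -l zap               # the permission bits show no x yet
chmod u+x zap           # give the owner (u) execute (x) permission
ls -l zap               # the owner bits now include x

zap_output=$(./zap)     # ./ tells the shell to look in the current directory
echo "$zap_output"
```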

I/O

Standard I/O
Consider a random program we can call fred. In UNIX we can run it in a console window by simply typing fred.
fred
Standard input (denoted stdin) is the input that comes in from the console (that is, what you type in). If a program accepts input from standard input, it gets its input from the person typing. Standard output or standard out (denoted stdout) is the output that goes to the console. It can be redirected. It is often the default output.

If your program is fred, which accepts input from standard input, then just type fred (return) and then type in your input. In UNIX, end the input with a control-D character. In Windows, end it with a control-Z character.

Standard output is the character output that goes to the terminal. If you type

fred
it waits for you to type. When you type, the characters are fed via standard input into the program. When fred uses cout in a C++ program or printf in a C program, the output goes to standard out and appears on the terminal.

Redirection
If you put a < after a command, it redirects standard input so that it comes from a file. To feed file "dogs" into command fred, type:

fred < dogs

If you put a > then it redirects standard output to a file. If fred outputs to standard output then

fred > cats
will write the output of fred to the file cats, replacing whatever cats contained before.
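
Here is a short sketch of both kinds of redirection. Since fred is only a placeholder name, the example defines a hypothetical stand-in for it as a shell function that upper-cases its input. The >> form, which appends rather than overwrites, is a standard shell feature not covered above.

```shell
# Work in a throwaway directory.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Hypothetical stand-in for fred: reads stdin, writes upper case to stdout.
fred() { tr '[:lower:]' '[:upper:]'; }

printf 'rex\nfido\n' > dogs   # create the input file

fred < dogs             # input comes from dogs, output goes to the terminal
fred < dogs > cats      # input comes from dogs, output goes into cats
fred < dogs >> cats     # >> appends to cats instead of overwriting it

cats_contents=$(cat cats)
echo "$cats_contents"
```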

Piping
Piping is a way to attach the standard output of one program into the standard input of another much like you might put two pieces of pipe together. The stream of characters coming out of one program then becomes the stream of characters going into the input of the other.

fred | sally
pipes the output of fred into the input of sally. So for instance if you have a program fred that you have written, you could write a Perl test program fredTest.pl and use it to "drive" your program and use a histogram program to plot the output:
fredTest.pl | fred | hist 1
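
The same idea can be tried with standard tools only. In this sketch printf plays the role of the program producing output; the word list is made up.

```shell
# Each | connects the stdout of the command on its left
# to the stdin of the command on its right.
printf 'pear\napple\npear\nfig\n' | sort          # stage 1 and 2: sorted words
printf 'pear\napple\npear\nfig\n' | sort | uniq   # stage 3: duplicates dropped

# Full pipeline: count how many distinct words there are.
# tr -d ' ' strips the padding some versions of wc add.
distinct=$(printf 'pear\napple\npear\nfig\n' | sort | uniq | wc -l | tr -d ' ')
echo "$distinct"
```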

Man pages

A correctly configured UNIX system (OS X included) has manual pages that can be accessed by invoking the man command. Suppose you want to know more about the sort command. Type this

man sort
to get the manual page or "man page".

Cat

cat is a command that will concatenate all the files given as arguments and write them to standard output.
cat fred sally
will write the file fred to standard output and then write the file sally to standard output.
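
A minimal sketch of cat combined with redirection, which is a common way to join files into a new one. The file contents and the output name both are invented for the example.

```shell
# Work in a throwaway directory.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

echo "first file" > fred
echo "second file" > sally

cat fred                # one argument: just prints that file
cat fred sally          # two arguments: fred's contents, then sally's
cat fred sally > both   # redirect stdout to save the concatenation

combined=$(cat both)
echo "$combined"
```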

Tr

tr is the translate tool. It maps characters. For example:

cat output.txt | tr ' ' '.' 
will take the contents of the output.txt file and replace all blanks with periods and then print the results to stdout.
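
A few more uses of tr, sketched with made-up input. The -s (squeeze repeats) and -d (delete) options are standard tr options not described above; see the man page for the full list.

```shell
# Map one character to another.
dotted=$(echo "a b c" | tr ' ' '.')

# Map a whole class of characters: lower case to upper case.
upper=$(echo "hello" | tr '[:lower:]' '[:upper:]')

# -s squeezes runs of the given character down to a single one.
squeezed=$(echo "a    b" | tr -s ' ')

# -d deletes the given characters entirely.
stripped=$(echo "a b c" | tr -d ' ')

echo "$dotted"
echo "$upper"
echo "$squeezed"
echo "$stripped"
```

Because tr reads standard input, tr ' ' '.' < output.txt does the same job as the cat pipeline without the extra process.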

Wc

wc is the word count program. It reads stdin (or the files named as arguments) and counts the number of lines, words, and characters. You can choose to print only some of the stats: -l for line count, etc. For example

wc -l < data.txt
prints the number of lines in data.txt
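
The three counts can be requested separately. This sketch uses a made-up two-line data.txt; the tr -d ' ' only strips the padding that some versions of wc put in front of the number.

```shell
# Work in a throwaway directory.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

printf 'one two\nthree\n' > data.txt

wc data.txt             # lines, words, and bytes, followed by the file name

lines=$(wc -l < data.txt | tr -d ' ')   # line count
words=$(wc -w < data.txt | tr -d ' ')   # word count
chars=$(wc -c < data.txt | tr -d ' ')   # byte count (same as characters for plain ASCII)

echo "$lines lines, $words words, $chars characters"
```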

Sort

The sort command will sort a file of data. It has a large number of options allowing you to sort on particular columns, to keep only unique items in the list, etc. See the man page for details. For example: If the file z contains:

ingestion
snowstorm
personalize
underutilize
testified
septennial
variegate
didactic
buckskins
laughter
repugnance
then: sort < z > zz
will put
buckskins
didactic
ingestion
laughter
personalize
repugnance
septennial
snowstorm
testified
underutilize
variegate
in file zz.
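
A sketch of a few of the options mentioned above, using made-up data: -n sorts numerically, -u keeps only unique lines, and -k picks the column to sort on.

```shell
# Default sort compares text, so 100 sorts before 9.
printf '10\n9\n100\n' | sort

# -n compares the lines as numbers instead.
numeric=$(printf '10\n9\n100\n' | sort -n)

# -u keeps only one copy of each line.
unique=$(printf 'b\na\nb\n' | sort -u)

# -k 2,2n sorts on the second column only, numerically.
by_column=$(printf 'x 3\ny 1\nz 2\n' | sort -k 2,2n)

echo "$numeric"
echo "$unique"
echo "$by_column"
```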

Uniq

The uniq command allows you to work with lines that occur multiple times in the same file. It only compares adjacent lines, so the input is usually sorted first. See the man page for the details of this useful command. For example: uniq -c will count the number of times each line occurs in a sorted file.
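
The sketch below shows why sorting first matters: uniq only merges repeated lines when they sit next to each other. The input words are made up, and -d (print only the repeated lines) is a standard uniq option not mentioned above.

```shell
# Unsorted: the two cat lines are not adjacent, so uniq sees three groups.
printf 'cat\ndog\ncat\n' | uniq -c
unsorted_groups=$(printf 'cat\ndog\ncat\n' | uniq -c | wc -l | tr -d ' ')

# Sorted first: the cat lines are adjacent and are counted together.
printf 'cat\ndog\ncat\n' | sort | uniq -c

# -d prints only the lines that occur more than once.
repeated=$(printf 'cat\ndog\ncat\n' | sort | uniq -d)

echo "$unsorted_groups"
echo "$repeated"
```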

Find

find will do general searches of your directory structure. This is a very powerful command.
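
Two common searches, sketched on a made-up directory tree: by file name pattern with -name and by file type with -type. These are only a small part of what find can do; the man page covers the rest.

```shell
# Build a small directory tree to search (all names are hypothetical).
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir -p src/lib
touch notes.txt src/main.cpp src/lib/util.cpp

# Search from the current directory (.) downward for names ending in .cpp.
# The pattern is quoted so the shell does not expand it first.
find . -name '*.cpp'

# List only directories.
find . -type d

cpp_count=$(find . -name '*.cpp' | wc -l | tr -d ' ')
dir_count=$(find . -type d | wc -l | tr -d ' ')
echo "$cpp_count cpp files in $dir_count directories"
```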

Grep

grep is a regular expression search tool (the name comes from the ed editor command g/re/p: globally search for a regular expression and print). It takes a regular expression pattern (see the regular expression handout) and prints all the lines from a file that match.

grep 'fred' cmdlog
will print out all lines that contain "fred" anywhere in the line out of a mythical file that contains a log of commands typed in.

Grep with no filenames reads from standard input. Has the 3 digit lotto number ever been 666? Using the lotto history command for 2008 I get a list of lotto numbers and grep them for 666:

lottoHistory 2008 | grep '666'
Pipelines can combine several of the tools above. For example,
cat rosters2007 rosters2008 | sort -u | wc -l
concatenates the game player rosters from 2007 and 2008, sorts them while removing all duplicates, and then counts the remaining lines with the word count tool. This is how many different players were playing over the two years.
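
A sketch of a few widely used grep options on a made-up cmdlog file: -i ignores case, -v prints the lines that do not match, -c prints a count instead of the lines, and -n prefixes each match with its line number.

```shell
# Work in a throwaway directory with a made-up command log.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
printf 'ls -l\ngrep fred cmdlog\nFred was here\nmake\n' > cmdlog

grep 'fred' cmdlog       # matches only the line with lower-case fred
grep -i 'fred' cmdlog    # -i ignores case, so Fred matches too
grep -v 'fred' cmdlog    # -v prints the lines that do NOT match

count_any_case=$(grep -ci 'fred' cmdlog)   # -c counts matching lines
numbered=$(grep -n '^make$' cmdlog)        # ^ and $ anchor the whole line

echo "$count_any_case"
echo "$numbered"
```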

Tar files

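A tar file is like a zip file. It is a single file that contains any number of files and/or directories. Unlike zip, tar does not compress its contents by default; tar files are often compressed afterwards with gzip.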

To create a tar file you use the unix tar command. For example: to put the files cats.cpp, dogs.cpp and makefile into tar file animals.tar do:

tar -cvf animals.tar cats.cpp dogs.cpp makefile
The c means create, v is verbose, and f says that what follows is the filename of the tar file.

DANGER: don't forget to put the name of the tar file (animals.tar) you want to write to or you could OVERWRITE the files you are trying to tar! The first file after -cvf is the output and always overwritten with the tar formatted file. No warnings. It will simply overwrite it if it is there.

To unpack a tar file use the extract option:

tar xvf animals.tar
To list the table of contents of a tar file animals.tar:
tar tvf animals.tar
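
The whole round trip can be rehearsed in a throwaway directory. This is a sketch with placeholder file contents; the unpacked directory is invented so the extracted copies do not land on top of the originals.

```shell
# Work in a throwaway directory with placeholder source files.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
echo "// cats" > cats.cpp
echo "// dogs" > dogs.cpp
echo "# empty" > makefile

# Create: the tar file name comes FIRST, right after the f option.
tar -cf animals.tar cats.cpp dogs.cpp makefile

# List the table of contents without unpacking anything.
tar -tf animals.tar
entries=$(tar -tf animals.tar | wc -l | tr -d ' ')

# Extract into a separate directory by changing into it first.
mkdir unpacked
(cd unpacked && tar -xf ../animals.tar)
ls unpacked
```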

Make files

make is a dependency driven build tool. It allows you to rebuild only the parts of your project that have changed. I provide a make primer to show you the general features of make. However, for a single flex file (see flex primer) you only need to put this into a file called makefile in the same directory as your code.


BIN  = tree
CC   = g++
CFLAGS = -g -DCPLUSPLUS

SRCS = $(BIN).l
OBJS = lex.yy.o
LIBS = -lfl

$(BIN): $(OBJS)
	$(CC) $(CFLAGS) $(OBJS) $(LIBS) -o $(BIN)

lex.yy.c: $(BIN).l
	flex $(BIN).l


NOTE: The variable BIN above is the name of the binary to be created.

NOTE: the lines that begin with whitespace begin with exactly one tab (I am not kidding).

If you execute the program make, it will look in the directory in which it was executed for a file named makefile and use that to direct a build in that directory. For example, suppose tree.l is the only file you need to build and you want to turn it in for an assignment. You could:

  1. create file tree.l
  2. put the makefile code above into file makefile
  3. type make if you want to build your code
  4. type tar -cvf assignment1.tar tree.l makefile to build a tar file to submit for assignment 1.
  5. submit it.
  6. wait for email about whether your code built and passed the tests.
  7. go to step 1.

Sdiff

sdiff stands for "side by side difference". It is a UNIX command used to compare the output of your program to the expected output in our homework test system. You can invoke sdiff as a command:

sdiff fileA fileB
Lines will be truncated for display, but all characters are tested in the compare. You can increase the width of the display with the -w <width> option:
sdiff -w120 fileA fileB

Here is how to read the output of sdiff. Assume you have two files: bigthings is a list of big things and mammals is a list of mammals. The files are alphabetized and tend to have some overlap. Here is bigthings:

cows
horses
houses
planets
trucks
zebra
Here is mammals:
cowz
elephants
horses
zebra
Executing the command:
sdiff bigthings mammals
would give you this output:
cows               |    cowz
                   >    elephants
horses                  horses
houses             <
planets            <
trucks             <
zebra                   zebra
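The two files are presented side by side with a central column containing one of 4 characters: |, <, >, or space. These mean:

  |      the two lines differ
  <      the line appears only in the first (left) file
  >      the line appears only in the second (right) file
  space  the two lines are the same

Note: lines may be different because one has blanks substituted for tabs, or one may have trailing blanks that the other doesn't.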

Note: sdiff tries to get the lines to "line up" in sequence. A random reordering of one file tends to make all the lines not match.

In the above example 3 things are in bigthings that are not in mammals, 1 thing is in mammals that is not in bigthings, and 1 line has been substituted for another.

Check the man page for sdiff for the options, in particular the -w option.
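
The bigthings and mammals example can be reproduced with the sketch below. It assumes an sdiff that supports the -s option (suppress lines that are the same in both files), as the GNU version does; the check at the top skips the comparison if sdiff is not installed.

```shell
# Recreate the two example files in a throwaway directory.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
printf 'cows\nhorses\nhouses\nplanets\ntrucks\nzebra\n' > bigthings
printf 'cowz\nelephants\nhorses\nzebra\n' > mammals

if command -v sdiff >/dev/null 2>&1; then
    # Full side by side comparison, as shown above.
    sdiff bigthings mammals

    # -s hides the lines that match, leaving only the |, < and > lines.
    sdiff -s bigthings mammals

    # Count the differing lines: 1 substituted, 1 added, 3 removed.
    changed=$(sdiff -s bigthings mammals | wc -l | tr -d ' ')
    echo "$changed"
else
    echo "sdiff is not installed on this machine"
fi
```

Note that sdiff exits with a nonzero status when the files differ, which matters if you call it from a script that stops on errors.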