One way to view assignment 2 is as a revision of our notion of what a
word is.  So far we have operated under the assumption that words are
strings of alphabetic characters delimited by spaces. However, many
languages of the world do not delimit words with spaces.

So we can think of what we are doing as developing a tool that can
deal with languages where spaces do not always delimit words. Sometimes
they do; other times a space is embedded in a word and simply acts like
an alphabetic character. While assignment 1 limited us to a rather
English-centric view of what a word consists of, assignment 2 forces us
to broaden our perspective.

You can view assignment 1 as computing pointwise mutual information
values for regular expressions that were made up strictly of string
literals such as /interest/ and /rate/.  We are extending this notion in 
assignment 2 such that more powerful regular expressions can be 
used. **However, the implicit assumption in assignment 1 that each regular 
expression matches a word remains equally valid in assignment 2.**
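
As a reminder of the quantity involved, here is a minimal sketch of
pointwise mutual information computed from raw counts. It assumes the
usual definition, log2( P(x,y) / (P(x) P(y)) ), with probabilities
estimated as simple relative frequencies. The function name pmi and its
parameter names are made up for illustration, and Python is used only
as an example language.

```python
import math

def pmi(n_xy, n_x, n_y, n_total):
    """Pointwise mutual information in bits.

    n_xy    -- times x was followed by y
    n_x     -- times x occurred as the first member of a pair
    n_y     -- times y occurred as the second member of a pair
    n_total -- total number of pairs observed
    Assumes plain relative-frequency estimates (no smoothing).
    """
    if n_xy == 0:
        # The pair was never observed, so the log of zero is undefined;
        # this sketch reports negative infinity.
        return float("-inf")
    # P(x,y) / (P(x) * P(y)) simplifies to n_xy * n_total / (n_x * n_y).
    return math.log2((n_xy * n_total) / (n_x * n_y))
```

Nothing in this computation cares whether x and y came from string
literals or from more general patterns; only the counts change.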

For example, suppose the regexes are:

/interest rates are / /\w+/

The matches in the following sentence are shown in parentheses.

The (interest rates are ) (falling) sharply.  

Now, the question arises as to how to count these "bigrams". If you
view each pattern match as a word, then the counting remains exactly
the same as in assignment 1.

W1                    W2
The                   interest rates are     
interest rates are    falling
falling               sharply 

Crucially, note that 'interest' is never counted as a word. This is
because 'interest rates are' is being treated as a single word. If it
helps to visualize things, imagine that the embedded spaces are some
other character (e.g., interest#rates#are).
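
The example above can be sketched in a few lines of code. This is only
one possible reading, not a specification: it assumes the patterns are
tried in the order given at each position in the text, that the first
one to match wins, and that any character no pattern matches is simply
skipped. The names tokenize and bigram_counts are made up for
illustration, and Python is used only as an example language.

```python
import re
from collections import Counter

def tokenize(text, patterns):
    """Split text into 'words', where a word is whatever a pattern matches.

    Assumption for this sketch: scan left to right, try the patterns in
    the order given, take the first that matches at the current position,
    and skip any character that no pattern matches.
    """
    compiled = [re.compile(p) for p in patterns]
    tokens = []
    pos = 0
    while pos < len(text):
        for rx in compiled:
            m = rx.match(text, pos)
            if m and m.end() > pos:   # ignore empty matches
                tokens.append(m.group())
                pos = m.end()
                break
        else:
            pos += 1                  # nothing matched here; move on
    return tokens

def bigram_counts(tokens):
    """Count adjacent pairs of tokens, exactly as for ordinary words."""
    return Counter(zip(tokens, tokens[1:]))

patterns = [r"interest rates are ", r"\w+"]
tokens = tokenize("The interest rates are falling sharply.", patterns)
bigrams = bigram_counts(tokens)

# The trailing space is part of the first pattern, so it is part of the token.
print(tokens)   # ['The', 'interest rates are ', 'falling', 'sharply']
```

Each of the three pairs in the table is counted once, and 'interest' on
its own never shows up as a token.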

A similar analogy can be made for characters, but I will leave that for
you to flesh out. This is simply an analogy meant to help you visualize
the problem. If you already have a clear picture of what you need to
do, then don't worry if the above doesn't quite click for you. There
are other ways to visualize the problem that make just as much sense.

Good luck!