I have assigned grades to assignment 4. I've looked at all of your
bitext files in varying degrees of detail, and in general I have
not found any major problems. I have found a few minor problems 
with language tags being inaccurate, questionable urls in the
article tags, etc., but in general the data appears to me to be 
translated text. Thus, I have assigned full credit to all assignment
4 submissions. (Please read on, however; there is a condition 
upon which your full credit will be maintained.) 

I have posted all of the bitext files on the class web page. 
You can go directly to the data at:
http://www.d.umn.edu/~tpederse/Courses/CS8995-SPR01/Assign/bitext.html

I would encourage you to use this data for sentence boundary
and alignment testing. While no gold standard is available, it may
at least give you a sense of how your code performs on real-world
data. There are some novel language pairs available (languages
that did not make it to the gold standard stage, most notably
perhaps several examples of English-German bitext). 

Here's the condition on full credit - if you notice a problem in 
any of the bitext as you work with it, please let me and the creator 
know. If such a problem is reported, I expect that the creator will 
fix or replace the offending article(s) within 3 days. If this
is not done, then I will adjust the assignment 4 score of the 
creator downward. (In general, the problems I am referring to are 
articles that are clearly not translations of one another.) 

We will likely use some of this data in stage 3, so it is in your
best interest to spot-check some of this data to make sure
it is reasonable. 

Please let me know if you have any questions.