The presence of Ns correlates well with sequencing errors. An N in a read indicates that the GS20 could not accurately call a base at that position. The number of other sequencing errors (substitutions, insertions and deletions) within a read correlates with the number of uncalled bases it contains. Removing all reads that contain one or more Ns substantially lowers the overall sequencing error rate.
Huse et al. Genome Biology 2007 8:R143 doi:10.1186/gb-2007-8-7-r143
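The filtering step described in the caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: it assumes the reads are already in memory as plain strings, and the function name `drop_reads_with_n` and the example reads are hypothetical.

```python
def drop_reads_with_n(reads):
    """Return only the reads that contain no uncalled base (N).

    Assumption: reads are plain strings; lowercase 'n' is also
    treated as an uncalled base.
    """
    return [read for read in reads if "N" not in read.upper()]


# Made-up example reads, for illustration only.
reads = ["ACGTACGT", "ACGNACGT", "TTGACCA", "nnACGT"]
kept = drop_reads_with_n(reads)
print(kept)  # ['ACGTACGT', 'TTGACCA']
```

Discarding the whole read, rather than trimming or masking the N, follows the caption's reasoning: the N is used as a marker that the rest of the read is also more likely to contain errors.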