Filtering Stop Words in Pig

It's as easy as a replicated left outer join followed by a NULL filter (in other words, an anti-join):
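/* Remove stop words. The replicated join holds stop_words (the last relation listed) in memory, which makes it fast. */
stop_words = LOAD 'stopwords.txt' AS (word:chararray);
joined = JOIN my_words BY word LEFT OUTER, stop_words BY word USING 'replicated';
/* Rows with no match in stop_words have a NULL right-hand side: keep only those. */
words = FILTER joined BY stop_words::word IS NULL;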
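
After the filter, each row in `words` still carries both join columns, and the `stop_words` side is NULL in every one of them. Here is a minimal sketch of projecting back down to a single column, assuming `my_words` has just the one field `word` (the alias `clean_words` is made up for illustration):

```pig
/* Keep only the left-hand column; stop_words::word is NULL in every remaining row. */
clean_words = FOREACH words GENERATE my_words::word AS word;
```

If `my_words` carries other fields, list them in the `GENERATE` clause with the same `my_words::` prefix.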