Monday 20 February 2017

Pig Exercise-part III

Here is a Pig program for word count:

myinput = load '/sample.txt' as (line:chararray);
-- TOKENIZE splits the line into a bag of words (one tuple per word).
-- flatten takes the bag returned by TOKENIZE and produces a separate
-- record for each word, naming the single field in each record "word".

words = foreach myinput generate flatten(TOKENIZE(line)) as word;

grpd = group words by word;

cntd = foreach grpd generate group, COUNT(words);

dump cntd;
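
To see what each relation holds along the way, here is a rough Python sketch that mimics the same three steps on a tiny in-memory input. It is only an illustration, not Pig itself: the function name word_count and the sample lines are made up for this example, and the delimiter set (space, double quote, comma, parentheses, asterisk) is an assumption based on the documented default behaviour of TOKENIZE.

```python
import re
from collections import defaultdict

# Assumed to mirror Pig's default TOKENIZE delimiters:
# space, double quote, comma, parentheses, asterisk.
_DELIMS = re.compile(r'[ ",()*]+')


def word_count(lines):
    """Emulate the Pig word count script on a list of text lines."""
    # words = foreach myinput generate flatten(TOKENIZE(line)) as word;
    # One record per word; empty tokens are dropped.
    words = [w for line in lines for w in _DELIMS.split(line) if w]

    # grpd = group words by word;
    # Each key maps to the bag of records that share that word.
    grpd = defaultdict(list)
    for w in words:
        grpd[w].append(w)

    # cntd = foreach grpd generate group, COUNT(words);
    return {group: len(bag) for group, bag in grpd.items()}


if __name__ == "__main__":
    sample = ["hello world", "hello pig"]  # hypothetical stand-in for /sample.txt
    # dump cntd;  (sorted here only to make the printout stable)
    for word, n in sorted(word_count(sample).items()):
        print((word, n))
```

In this sketch "hello" ends up with a count of 2 and the other words with 1. In Pig, the order of the tuples printed by dump is not guaranteed unless you add an order step.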


Stay updated at:

www.facebook.com/coebda

If you need the raw data, leave a comment below.
