A typical error in writing reducer code

Suppose the input is:

00403D91436B76D22DDD88ACEAA41FB4
00403D91436B76D22DDD88ACEAA41FB4        s1
00403DA1239B66B92BD91FFF0EC2DC3F        s1|s3
00403DC1F314463D904A0C03C9714743

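The reducer should output each key (the first column) that is not unique, together with its tag (the second column). The columns are separated by a tab. A first attempt, which contains the error: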

#!/usr/bin/perl -w

use strict;

# loop vars
my $key     = "";
my $cur_key = "";

# key and vals
my $uid;
my $tag;
my $count = 0;

sub onBeginKey( ) {
    $cur_key = $key;
    $count   = 0;
}

sub onSameKey( ) {
    $count++;
}

sub onEndKey( ) {
    # BUG: by the time this runs, $tag may already hold the tag of the
    # line that started the next key, or a stale tag from an earlier line.
    if ( $count == 2 ) {
        printf STDOUT "%s\t%s\n", $cur_key, $tag;
    }
}

while ( my $line = <STDIN> ) {
    chomp($line);

    my @fields      = split( /\t/, $line );
    my $fields_size = scalar @fields;

    $key = $fields[0];
    if ( $fields_size == 2 && $fields[1] ) {
        $tag = $fields[1];
    }

    if ($cur_key) {
        if ( $key ne $cur_key ) {
            &onEndKey();
            &onBeginKey();
        }
        &onSameKey();
    }
    else {
        &onBeginKey();
        &onSameKey();
    }
}

# flush the last key
if ($cur_key) {
    &onEndKey();
}

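The error is in how $tag is handled. It is a global that is overwritten by every line that carries a tag, including the line that starts the next key, and onEndKey() for the previous key runs only after that line has been parsed. It is also never cleared when a line has no tag. For the input above, the script prints the first key with the tag s1|s3, which belongs to the second key, instead of s1.

The correct version clears $tag when a line has no tag and keeps the tag of the current key in a separate variable, $o_tag: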

#!/usr/bin/perl -w

use strict;

# loop vars
my $key     = "";
my $cur_key = "";
my $o_tag = "";

# key and vals
my $uid;
my $tag;
my $count = 0;

sub onBeginKey( ) {
    $cur_key = $key;
    $count   = 0;
    $o_tag   = "";    # do not carry the previous key's tag over
}

sub onSameKey( ) {
    $count++;
    # remember the tag seen for the current key
    if ($tag) { $o_tag = $tag; }
}

sub onEndKey( ) {
    if ( $count == 2 && $o_tag ) {
        printf STDOUT "%s\t%s\n", $cur_key, $o_tag;
    }
}

while ( my $line = <STDIN> ) {
    chomp($line);

    my @fields      = split( /\t/, $line );
    my $fields_size = scalar @fields;

    $key = $fields[0];
    if ( $fields_size == 2 && $fields[1] ) {
        $tag = $fields[1];
    }
    else {
        # a line without a tag must not inherit the previous line's tag
        $tag = "";
    }

    if ($cur_key) {
        if ( $key ne $cur_key ) {
            &onEndKey();
            &onBeginKey();
        }
        &onSameKey();
    }
    else {
        &onBeginKey();
        &onSameKey();
    }
}

# flush the last key
if ($cur_key) {
    &onEndKey();
}

Note: The code is Perl, but the same logic applies to a reducer written in Java.
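
To illustrate the note above, here is a minimal Java sketch of the same per-key bookkeeping. It is not the author's code and does not use the Hadoop API: the class name DuplicateKeyReducer and the method reduce are made up for this example, and the input is assumed to be tab-separated lines already sorted by key, as a streaming reducer would receive them. Unlike a real streaming reducer, main() collects all input lines before reducing, which keeps the sketch short.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class for illustration only; names are not from the original post.
class DuplicateKeyReducer {

    // Emits "key<TAB>tag" for every key that occurs on exactly two lines
    // and has a non-empty tag on at least one of them.
    public static List<String> reduce(List<String> sortedLines) {
        List<String> out = new ArrayList<>();
        String curKey = null; // key of the group currently being counted
        String keyTag = "";   // tag remembered for curKey only
        int count = 0;        // number of lines seen for curKey

        for (String line : sortedLines) {
            String[] fields = line.split("\t");
            String key = fields[0];
            // The tag is local to this line, so it can never leak into another key.
            String tag = (fields.length == 2) ? fields[1] : "";

            if (!key.equals(curKey)) {
                // Close the previous group before touching any per-key state.
                if (curKey != null) {
                    emit(out, curKey, keyTag, count);
                }
                curKey = key;
                keyTag = "";
                count = 0;
            }
            count++;
            if (!tag.isEmpty()) {
                keyTag = tag;
            }
        }
        // Flush the last group.
        if (curKey != null) {
            emit(out, curKey, keyTag, count);
        }
        return out;
    }

    private static void emit(List<String> out, String key, String tag, int count) {
        if (count == 2 && !tag.isEmpty()) {
            out.add(key + "\t" + tag);
        }
    }

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        List<String> lines = new ArrayList<>();
        for (String l = in.readLine(); l != null; l = in.readLine()) {
            lines.add(l);
        }
        for (String result : reduce(lines)) {
            System.out.println(result);
        }
    }
}
```

For the four sample lines at the top of the post, reduce returns a single entry: the key 00403D91436B76D22DDD88ACEAA41FB4 followed by a tab and s1. The point is the same as in the Perl version: anything printed when a key ends must come from state that belongs to that key, not from variables describing the line that was just read.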
