Given two vectors, X = [x1, x2, …] and Y = [y1, y2, …], their inner product is Z = x1 * y1 + x2 * y2 + … .

Two vectors are usually given in a sparse representation:

Vector One

1,0.4
3,0.6
7,0.6
4,0.7
10,0.9
12,0.8
20,0.1

Vector Two

1,0.4
3,0.1
6,0.6
4,0.7
10,0.9
12,0.2
21,0.1

Mapper streaming code in Perl:

#!/usr/bin/perl
while($line=<STDIN>){
@fields = split(/,/, $line);
if($fields[0] && $fields[1]){
print "$fields[0]\t$fields[1]";
}
}

Reducer streaming code in Perl:

#!/usr/bin/perl
$lastKey="";
$product=1;
$count=0;
while($line=<STDIN>){
@fields=split(/\t/, $line);
$key = $fields[0];
$value = $fields[1];
if($lastKey ne "" && $key ne $lastKey){
if($count==2){
print "$lastKey\t$product\n";
}
$product=$value;
$lastKey=$key;
$count=1;
}
else{
$product=$product*$value;
$lastKey=$key;
$count++;
}
}
#the last key
if($count==2){
print "$lastKey\t$product\n";
}

Run the code:

hadoop@fxu-t60:/usr/local/hadoop$ bin/hadoop jar contrib/streaming/hadoop-0.20.2-streaming.jar -D mapred.reducer.tasks=1 -input /home/hadoop/vectors -output /home/hadoop/vectors-output -mapper "VectorProduct.pl" -file ~/Desktop/HadoopInAction/VectorProduct.pl -reducer "VectorProductReducer.pl" -file ~/Desktop/HadoopInAction/VectorProductReducer.pl

Output

1 0.16
10 0.81
12 0.16
3 0.06
4 0.49

One more post-processing code is needed to do the sum of all values. Or use this reducer code to get the sum directly:

#!/usr/bin/perl
$lastKey="";
$product=1;
$count=0;
$sum=0;
while($line=<STDIN>){
@fields=split(/\t/, $line);
$key = $fields[0];
$value = $fields[1];
if($lastKey ne "" && $key ne $lastKey){
if($count==2){
$sum=$sum+$value;
}
$product=$value;
$lastKey=$key;
$count=1;
}
else{
$product=$product*$value;
$lastKey=$key;
$count++;
}
}
#the last key
if($count==2){
$sum=$sum+$value;
}
print $sum;

Here is only showing how to do product of two vectors. If there are more than two, giving each vector an ID might solve the problem with little modification. This is actually the *numerator* to compute the **Cosine similarity**.

### Like this:

Like Loading...

*Related*

Product of two Vectors

Product of vectors, cross product, scalar product, dot product, vector product, properties of vector product.

http://www.infoaw.com/article.php?articleId=865