Saturday, 22 March 2025

MATRIX MULTIPLICATION

 

Step 1: Define the Input Format

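For matrix multiplication, we represent each matrix in a sparse format with one element per record, (i, j, value), where i is the row index, j is the column index, and value is the element value. Because both matrices are read from the same input, each record is also prefixed with the name of its matrix (A or B).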
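
To make the record layout concrete, here is a minimal sketch of parsing one record in plain Java, outside Hadoop. It assumes each line carries a leading matrix name (A or B) followed by row, column, and value, as in the input example later in this post. The class name MatrixRecord is hypothetical and is not part of the job itself.

```java
// Hypothetical helper for illustration only; the Hadoop mapper does its own parsing.
class MatrixRecord {
    final String matrix; // "A" or "B"
    final int row;       // i
    final int col;       // j
    final int value;     // element value

    MatrixRecord(String matrix, int row, int col, int value) {
        this.matrix = matrix;
        this.row = row;
        this.col = col;
        this.value = value;
    }

    // Parses a line such as "A, 0, 1, 2" into its four fields.
    static MatrixRecord parse(String line) {
        String[] tokens = line.split(",");
        if (tokens.length != 4) {
            throw new IllegalArgumentException("Expected 4 fields: " + line);
        }
        return new MatrixRecord(
                tokens[0].trim(),
                Integer.parseInt(tokens[1].trim()),
                Integer.parseInt(tokens[2].trim()),
                Integer.parseInt(tokens[3].trim()));
    }
}
```

Trimming each token lets the parser accept both "A,0,1,2" and "A, 0, 1, 2".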

We will assume two matrices: Matrix A (size MxN) and Matrix B (size NxP). The result will be Matrix C (size MxP).

Step 2: Mapper Class

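The mapper will emit intermediate key-value pairs that route each input element to every output cell it contributes to. The key will be a tuple (i, k), where i is a row of Matrix A and k is a column of Matrix B. The value carries the source matrix, the shared index j, and the element value: an element A(i, j) is sent to the keys (i, k) for every column k of B, and an element B(j, k) is sent to the keys (i, k) for every row i of A. The mapper does not multiply anything itself; the products are formed in the reducer.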
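
The routing can be sketched in plain Java without Hadoop. This is only an illustration under stated assumptions: the class MapSketch, the tab-separated "key, value" strings, and the parameters m (rows of A) and p (columns of B) are all hypothetical names chosen for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the map step; returns "key<TAB>value" strings.
class MapSketch {
    // m = number of rows in A, p = number of columns in B (assumed known up front).
    static List<String> map(String matrix, int row, int col, int val, int m, int p) {
        List<String> out = new ArrayList<>();
        if (matrix.equals("A")) {
            // A(i, j) is needed by every C(i, k); the shared index is the column j.
            for (int k = 0; k < p; k++) {
                out.add(row + "," + k + "\tA," + col + "," + val);
            }
        } else if (matrix.equals("B")) {
            // B(j, k) is needed by every C(i, k); the shared index is the row j.
            for (int i = 0; i < m; i++) {
                out.add(i + "," + col + "\tB," + row + "," + val);
            }
        }
        return out;
    }
}
```

For example, with 3x3 matrices, MapSketch.map("A", 0, 1, 2, 3, 3) yields three pairs with keys 0,0, 0,1 and 0,2, each carrying the value A,1,2.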

Step 3: Reducer Class

The reducer will receive all values for one (i, k) key, pair each Matrix A element with the Matrix B element that has the same shared index j, multiply each pair, and sum the products to produce C(i, k).
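
A minimal sketch of that aggregation in plain Java follows. It assumes the values for one (i, k) key arrive as strings of the form "A,j,value" or "B,j,value", where j is the shared index; that encoding and the class name ReduceSketch are assumptions made for this illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the reduce step for a single output cell C(i, k).
class ReduceSketch {
    static int reduce(List<String> values) {
        Map<Integer, Integer> a = new HashMap<>(); // j -> A(i, j)
        Map<Integer, Integer> b = new HashMap<>(); // j -> B(j, k)
        for (String value : values) {
            String[] parts = value.split(",");
            int j = Integer.parseInt(parts[1].trim());
            int v = Integer.parseInt(parts[2].trim());
            if (parts[0].trim().equals("A")) {
                a.put(j, v);
            } else {
                b.put(j, v);
            }
        }
        int sum = 0;
        for (Map.Entry<Integer, Integer> entry : a.entrySet()) {
            Integer match = b.get(entry.getKey());
            if (match != null) { // a missing element counts as zero
                sum += entry.getValue() * match;
            }
        }
        return sum;
    }
}
```

Keying both maps by j means the order in which values arrive does not matter, which is important because Hadoop does not guarantee value order within a key.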

Step 4: Final Output

The output will be the resulting matrix C, where each element is the sum of the products of the corresponding row of Matrix A and the column of Matrix B.
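
For small inputs, the expected result can be checked against an ordinary in-memory multiplication. The sketch below is a plain triple loop, not the MapReduce job; the class name ReferenceMultiply is hypothetical, and it assumes dense, rectangular int arrays whose inner dimensions match.

```java
// Hypothetical reference implementation for checking small results by hand.
class ReferenceMultiply {
    // a is MxN, b is NxP, the result is MxP.
    static int[][] multiply(int[][] a, int[][] b) {
        int m = a.length;
        int n = b.length;
        int p = b[0].length;
        int[][] c = new int[m][p];
        for (int i = 0; i < m; i++) {
            for (int k = 0; k < p; k++) {
                int sum = 0;
                for (int j = 0; j < n; j++) {
                    sum += a[i][j] * b[j][k]; // row i of A times column k of B
                }
                c[i][k] = sum;
            }
        }
        return c;
    }
}
```

Comparing a few cells of the job output against this function is a quick sanity check before running on larger matrices.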

Step-by-Step Code Example:

1. MatrixMultiplication.java (Main Driver Class)

java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MatrixMultiplication {
    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: MatrixMultiplication <input_path> <output_path>");
            System.exit(-1);
        }

        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "Matrix Multiplication");
        job.setJarByClass(MatrixMultiplication.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setMapperClass(MatrixMultiplicationMapper.class);
        job.setReducerClass(MatrixMultiplicationReducer.class);

        // The mapper emits Text values (tagged elements); the reducer emits IntWritable sums.
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

2. MatrixMultiplicationMapper.java (Mapper Class)

java
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import java.io.IOException;

public class MatrixMultiplicationMapper extends Mapper<Object, Text, Text, Text> {
    // Dimensions are hardcoded for the 3x3 example; adapt for general cases.
    private static final int M = 3; // Rows of Matrix A
    private static final int P = 3; // Columns of Matrix B

    @Override
    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        String[] tokens = value.toString().split(",");
        if (tokens.length != 4) {
            return; // Skip blank or malformed lines
        }
        String matrixType = tokens[0].trim();         // "A" or "B"
        int row = Integer.parseInt(tokens[1].trim()); // Row index
        int col = Integer.parseInt(tokens[2].trim()); // Column index
        int val = Integer.parseInt(tokens[3].trim()); // Matrix element value

        if (matrixType.equals("A")) {
            // A(i, j) contributes to C(i, k) for every k; tag it with the shared index j
            for (int k = 0; k < P; k++) {
                context.write(new Text(row + "," + k), new Text("A," + col + "," + val));
            }
        } else if (matrixType.equals("B")) {
            // B(j, k) contributes to C(i, k) for every i; tag it with the shared index j
            for (int i = 0; i < M; i++) {
                context.write(new Text(i + "," + col), new Text("B," + row + "," + val));
            }
        }
    }
}

3. MatrixMultiplicationReducer.java (Reducer Class)

java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class MatrixMultiplicationReducer extends Reducer<Text, Text, Text, IntWritable> {
    @Override
    public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
        Map<Integer, Integer> a = new HashMap<>(); // j -> A(i, j)
        Map<Integer, Integer> b = new HashMap<>(); // j -> B(j, k)
        for (Text value : values) {
            String[] parts = value.toString().split(",");
            int j = Integer.parseInt(parts[1]);
            int v = Integer.parseInt(parts[2]);
            if (parts[0].equals("A")) {
                a.put(j, v);
            } else {
                b.put(j, v);
            }
        }

        // Multiply elements that share the same j and sum the products
        int sum = 0;
        for (Map.Entry<Integer, Integer> entry : a.entrySet()) {
            Integer bValue = b.get(entry.getKey());
            if (bValue != null) {
                sum += entry.getValue() * bValue;
            }
        }
        context.write(key, new IntWritable(sum));
    }
}

4. Input Format Example

Both matrices are supplied as CSV records in the same input path, one element per record, in the form matrix, row, column, value:

Matrix A (3x3):

csv
A, 0, 0, 1
A, 0, 1, 2
A, 0, 2, 3
A, 1, 0, 4
A, 1, 1, 5
A, 1, 2, 6
A, 2, 0, 7
A, 2, 1, 8
A, 2, 2, 9

Matrix B (3x3):

csv
B, 0, 0, 1
B, 0, 1, 2
B, 0, 2, 3
B, 1, 0, 4
B, 1, 1, 5
B, 1, 2, 6
B, 2, 0, 7
B, 2, 1, 8
B, 2, 2, 9
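
If the matrices start out as ordinary 2D arrays, records in this layout can be generated rather than typed by hand. The following is a minimal sketch; the class name MatrixRecords and the method toRecords are hypothetical, and writing the resulting lines to a file in the job's input path is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that turns a dense matrix into "name, row, col, value" records.
class MatrixRecords {
    static List<String> toRecords(String name, int[][] matrix) {
        List<String> records = new ArrayList<>();
        for (int i = 0; i < matrix.length; i++) {
            for (int j = 0; j < matrix[i].length; j++) {
                records.add(name + ", " + i + ", " + j + ", " + matrix[i][j]);
            }
        }
        return records;
    }
}
```

Calling MatrixRecords.toRecords("A", ...) and MatrixRecords.toRecords("B", ...) and concatenating the two lists gives the full input for one multiplication.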

5. Output Format Example

The final matrix C (size 3x3) is written with one element per line: the key i,k followed by the value of C(i, k):

0,0 30
0,1 36
0,2 42
1,0 66
1,1 81
1,2 96
2,0 102
2,1 126
2,2 150

Step 5: Compilation and Execution

  1. Compile the Java files and package them into a JAR:

bash
javac -classpath `hadoop classpath` -d . MatrixMultiplication.java MatrixMultiplicationMapper.java MatrixMultiplicationReducer.java
jar cf matrix_multiplication.jar MatrixMultiplication*.class

  2. Run the Hadoop job:

bash
hadoop jar matrix_multiplication.jar MatrixMultiplication input_path output_path
