
Saturday, 22 March 2025

matrix-3

 To implement matrix multiplication using MapReduce in Hadoop for 2x2 matrices, we can break the process into the following steps:

  1. Input Format: We'll use a text file where each matrix element is one line containing the matrix name, row, column, and value.

  2. Map Function: The mapper tags each element of A and B with the position of every result cell it contributes to, and emits one key-value pair per such cell.

  3. Reduce Function: For each result cell, the reducer matches the corresponding row of A with the column of B and sums the products.

For simplicity, let's assume:

  • Matrix A is represented as A = [[a11, a12], [a21, a22]]

  • Matrix B is represented as B = [[b11, b12], [b21, b22]]

The matrix multiplication C = A * B will give us:

C = [[c11, c12], [c21, c22]]

Where (a quick plain-Java check of these formulas follows this list):

  • c11 = a11 * b11 + a12 * b21

  • c12 = a11 * b12 + a12 * b22

  • c21 = a21 * b11 + a22 * b21

  • c22 = a21 * b12 + a22 * b22
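
Before moving to Hadoop, it is worth verifying these formulas with plain Java. The sketch below is a minimal standalone check; the sample values match the input file used later in this post (A = [[1, 2], [3, 4]], B = [[5, 6], [7, 8]]):

java
public class MatrixCheck {
    public static void main(String[] args) {
        int[][] a = {{1, 2}, {3, 4}};  // matrix A
        int[][] b = {{5, 6}, {7, 8}};  // matrix B
        int[][] c = new int[2][2];     // result C = A * B

        // c[i][j] = a[i][0] * b[0][j] + a[i][1] * b[1][j]
        for (int i = 0; i < 2; i++) {
            for (int j = 0; j < 2; j++) {
                for (int k = 0; k < 2; k++) {
                    c[i][j] += a[i][k] * b[k][j];
                }
            }
        }
        // Prints [[19, 22], [43, 50]]
        System.out.println(java.util.Arrays.deepToString(c));
    }
}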

Here's a simple code example using Hadoop MapReduce to implement 2x2 matrix multiplication:

Step 1: Mapper

The Mapper reads each matrix element and emits it once for every cell of the result matrix it contributes to, keyed by that cell's position.

java
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

import java.io.IOException;

public class MatrixMultiplicationMapper extends Mapper<Object, Text, Text, Text> {

    private static final int N = 2; // dimension of the square matrices

    // Input lines look like: "A, 0, 1, 2" -> (matrix name, row, col, value)
    @Override
    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] parts = value.toString().split(",");
        String matrix = parts[0].trim(); // "A" or "B"
        int row = Integer.parseInt(parts[1].trim());
        int col = Integer.parseInt(parts[2].trim());
        int val = Integer.parseInt(parts[3].trim());

        if (matrix.equals("A")) {
            // A(i,k) is needed by every result cell C(i,j),
            // so emit it once per output column j.
            for (int j = 0; j < N; j++) {
                context.write(new Text(row + "_" + j), new Text("A," + col + "," + val));
            }
        } else if (matrix.equals("B")) {
            // B(k,j) is needed by every result cell C(i,j),
            // so emit it once per output row i.
            for (int i = 0; i < N; i++) {
                context.write(new Text(i + "_" + col), new Text("B," + row + "," + val));
            }
        }
    }
}
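
For example, the single input line A, 0, 0, 1 makes the mapper emit two pairs, (0_0, "A,0,1") and (0_1, "A,0,1"), because a11 is needed to compute both c11 and c12.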

Step 2: Reducer

The Reducer collects the intermediate key-value pairs for each result cell and computes its sum of products.

java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

import java.io.IOException;

public class MatrixMultiplicationReducer extends Reducer<Text, Text, Text, IntWritable> {

    private static final int N = 2; // dimension of the square matrices

    // For key "i_j", the values contain every A(i,k) and B(k,j)
    // needed to compute C(i,j).
    @Override
    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        int[] aRow = new int[N]; // aRow[k] holds A(i,k)
        int[] bCol = new int[N]; // bCol[k] holds B(k,j)

        for (Text val : values) {
            String[] parts = val.toString().split(",");
            int k = Integer.parseInt(parts[1]);
            int v = Integer.parseInt(parts[2]);
            if (parts[0].equals("A")) {
                aRow[k] = v;
            } else {
                bCol[k] = v;
            }
        }

        // C(i,j) = A(i,0) * B(0,j) + A(i,1) * B(1,j)
        int sum = 0;
        for (int k = 0; k < N; k++) {
            sum += aRow[k] * bCol[k];
        }
        context.write(new Text("C_" + key.toString()), new IntWritable(sum));
    }
}
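
Hadoop groups all values by key before calling reduce, so for key 0_0 the reducer receives A,0,1, A,1,2, B,0,5, and B,1,7 from the sample input, and computes 1 * 5 + 2 * 7 = 19, which is exactly c11.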

Step 3: Driver

Finally, the driver will configure and execute the Hadoop job. It specifies input/output paths, the Mapper, and the Reducer.

java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MatrixMultiplicationJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "Matrix Multiplication");
        job.setJarByClass(MatrixMultiplicationJob.class);
        job.setMapperClass(MatrixMultiplicationMapper.class);
        job.setReducerClass(MatrixMultiplicationReducer.class);

        // The mapper emits (Text, Text) while the final output is (Text, IntWritable),
        // so the map output types must be declared separately.
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));   // input path
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output path
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
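
Assuming the three classes are packaged into a jar (the jar and path names here are only examples), the job can be submitted with hadoop jar matrix-mult.jar MatrixMultiplicationJob /user/hadoop/matrix-input /user/hadoop/matrix-output. The two arguments map to args[0] and args[1] above; note that the output directory must not already exist, or Hadoop will refuse to start the job.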

Step 4: Input Format

Ensure that the input to the job matches what the mapper expects: one matrix element per line, in the order matrix name, row, column, value. For example, with A = [[1, 2], [3, 4]] and B = [[5, 6], [7, 8]], the input file contains:

text
A, 0, 0, 1
A, 0, 1, 2
A, 1, 0, 3
A, 1, 1, 4
B, 0, 0, 5
B, 0, 1, 6
B, 1, 0, 7
B, 1, 1, 8

Output

For these matrices, C = [[19, 22], [43, 50]], so the job produces:

text
C_0_0	19
C_0_1	22
C_1_0	43
C_1_1	50

Conclusion

This simple MapReduce program performs 2x2 matrix multiplication. To extend it to larger matrices, replace the hard-coded dimension N = 2 (for example, by passing the dimensions through the job Configuration) while keeping the same key scheme. Make sure to manage the input and output formats carefully when running the job on a real Hadoop cluster.

