hadoop - Knowing usage of mapper and reducer


I'm running a Pig Latin script on 550 GB of data. The number of reducers is the default, 1. It takes approximately 38 minutes to generate the result. I want to know whether increasing the number of reducers would make the script execute faster.

Any help would be appreciated.

In addition to this, I would like to understand the concept behind changing the number of mappers and reducers.

Increasing the number of reducers will definitely help if the operation you are performing is an aggregation. The actual aggregation happens in the reduce phase, so running many reducers in parallel will improve performance.

You can use the PARALLEL keyword to set the number of reducers in Pig. Ex: A = LOAD 'myfile' AS (t, u, v); B = GROUP A BY t PARALLEL 18;

The number of mappers is determined by the input size and how the input is split. The number of mappers is usually equal to the number of input splits.
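As a rough illustration of that relationship, the sketch below estimates the mapper count for the 550 GB input from the question. The 128 MB split size is an assumption (a common HDFS block size, not something stated above), and the helper name estimate_mappers is hypothetical. It also assumes the input is splittable and ignores per-file boundaries and any split combination Pig may apply.

```python
def estimate_mappers(input_bytes: int, split_bytes: int) -> int:
    """Rough number of input splits (and so map tasks) for splittable input.

    This is only an estimate: real jobs create splits per file, so many
    small files or unsplittable compressed files will change the count.
    """
    if input_bytes <= 0:
        return 0
    # Integer ceiling division: a partial final split still needs a mapper.
    return -(-input_bytes // split_bytes)


MB = 1024 ** 2
GB = 1024 ** 3

# 550 GB of input with an assumed 128 MB split size.
print(estimate_mappers(550 * GB, 128 * MB))  # 4400
```

Under these assumptions the map phase already runs with thousands of tasks, while a single reducer handles the whole aggregation, which is why raising the reducer count is the first thing to try here.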
