Tuesday, April 21, 2015

How can I run several Apache Spark RDD jobs in parallel?

I have the following vals:

// `sc` here must be a SQLContext (or HiveContext), since SparkContext has no sql method
val bcs = sc.sql("select * from bcs")
val imps = sc.sql("select * from imps")

I want to do:

bcs.map(x => wrapBC(x)).collect
imps.map(x => wrapIMP(x)).collect

but when I do this, the two collect jobs run sequentially, not asynchronously, because collect blocks the driver until the job finishes. I can do it with Future, like this:

import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

Future { bcs.map(x => wrapBC(x)).collect }
Future { imps.map(x => wrapIMP(x)).collect }
...

I want to do this without wrapping the calls in Futures myself. How can I do it?
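One possible approach, sketched under the assumption that `bcs` and `imps` are RDDs and that `wrapBC`/`wrapIMP` are the functions from the question: Spark itself ships asynchronous variants of the blocking actions in `AsyncRDDActions` (available on any RDD through an implicit conversion). `collectAsync()` submits the job immediately and returns a `FutureAction`, so both jobs can be in flight at once without constructing `scala.concurrent.Future`s by hand.

```scala
// Sketch, assuming `bcs` and `imps` are RDDs and wrapBC/wrapIMP are defined.
// collectAsync (from org.apache.spark.rdd.AsyncRDDActions) submits the job
// right away and returns a FutureAction instead of blocking the driver.
val bcsFuture  = bcs.map(x => wrapBC(x)).collectAsync()   // job 1 submitted
val impsFuture = imps.map(x => wrapIMP(x)).collectAsync() // job 2 submitted

// Both jobs are now running; block only when the results are actually needed.
val bcsResult  = bcsFuture.get()
val impsResult = impsFuture.get()
```

Note that `FutureAction` still extends `Future` under the hood, but Spark creates and manages it for you. Also, with the default FIFO scheduler the second job only receives resources the first job leaves free; setting `spark.scheduler.mode` to `FAIR` lets concurrent jobs share the cluster more evenly.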
