Tuesday, April 21, 2015

How to run Spark interactively in cluster mode

I have a Spark cluster running on

spark://host1:7077
spark://host2:7077
spark://host3:7077

and connect to it with ./bin/spark-shell --master spark://host1:7099. When I try to read a file with:

val textFile = sc.textFile("README.md")
textFile.count()

the shell prints:

WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

The web UI at host1:8080 shows:

Workers: 0
Cores: 0 Total, 0 Used
Memory: 0.0 B Total, 0.0 B Used
Applications: 0 Running, 2 Completed
Drivers: 0 Running, 0 Completed
Status: ALIVE
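
For reference, the driver's own view of registered executors can also be printed from inside the shell (a rough check, assuming the SparkContext is bound to sc, as it is by default in spark-shell); with zero workers it lists nothing but the driver itself:

// Print every block manager the driver knows about, together with its
// maximum and remaining storage memory. With no workers registered,
// only the driver's own entry appears.
sc.getExecutorMemoryStatus.foreach { case (executor, (maxMem, remaining)) =>
  println(s"$executor  max: $maxMem bytes  free: $remaining bytes")
}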

My question is: how do I specify cores and memory when running spark-shell against the cluster? Or do I have to package my Scala code into a .jar file and then submit the job to Spark?
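
From the documentation it looks like spark-shell takes the same resource options as spark-submit, so something like the following should cap memory and cores for the interactive session; the 2g / 4 values are just placeholders, not something verified against this cluster:

# Launch the interactive shell against the standalone master, limiting
# executor memory and the total cores the application may claim.
./bin/spark-shell \
  --master spark://host1:7077 \
  --executor-memory 2g \
  --total-executor-cores 4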

Thanks
