Tuesday, 21 April 2015

Scala: read a file, split each line, and save the fields into vals

I have a hello.txt

hello.txt

     [,1]
1       2
2       2
5      12
6       4

and here is my Scala code:

val textFile = sc.textFile("/home/winsome/share/hello.txt")
val ratings = textFile.map { line =>
  val fields = line.split(" ")
  val (id, linksStr) = (fields(0).toInt, fields(1).toInt)
  println(id)        // expected: 1 2 5 6
  println(linksStr)  // expected: 2 2 12 4
}

println(id) and println(linksStr) print nothing. Please tell me how to display the output in the format I want.
Thank you.
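
For reference, here is a minimal sketch of the parsing step on plain Scala collections, so it runs without a Spark context. It assumes the header row `[,1]` should be skipped and that the columns are separated by runs of spaces, which is why it splits on `\\s+` rather than a single space. The names `sample` and `parseLine` are hypothetical. On an RDD, `map` is a lazy transformation, so nothing inside it runs until an action such as `collect()` or `foreach` is called.

```scala
import scala.util.Try

// Sample lines mirroring hello.txt, header row included (assumed layout).
val sample: Seq[String] = Seq(
  "     [,1]",
  "1       2",
  "2       2",
  "5      12",
  "6       4"
)

// Parse one line into (id, value); None for any line that is not two integers.
def parseLine(line: String): Option[(Int, Int)] =
  line.trim.split("\\s+") match {
    case Array(a, b) => Try((a.toInt, b.toInt)).toOption
    case _           => None
  }

val pairs: Seq[(Int, Int)] = sample.flatMap(line => parseLine(line))
val ids: Seq[Int] = pairs.map(_._1)
val values: Seq[Int] = pairs.map(_._2)

println(ids.mkString(" "))    // 1 2 5 6
println(values.mkString(" ")) // 2 2 12 4
```

With Spark, the same `parseLine` could be applied with `textFile.flatMap(line => parseLine(line))`, followed by `collect()` to bring the pairs back to the driver before printing.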

Slick 3.0.0 (RC3) Hook "onCreate"

I'm currently evaluating porting an application written in PHP (Laravel) to the Play Framework and Slick (3.0).

One thing I really liked about Laravel was that it offers "hooks" such as "onCreate". By that I mean that when a new "ModelA" is created, I'd like to create multiple "ModelB" rows based on that A. So I'd like to hook into the creation process of A and define how many Bs should be created and what they should look like.

Think of nodes in a tree: for each new node, I want to add some paths to the node tree.

Is there any way to achieve this?
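
To make the request concrete, here is a plain-Scala sketch of the "hook" shape; it does not use Slick. `Node`, `Path`, and `NodeStore` are hypothetical names, and the in-memory vectors stand in for the two tables. The idea is that creation always goes through one function, which also derives the dependent rows. Whether Slick offers a built-in equivalent is not assumed here.

```scala
// Hypothetical model: a node and the ancestor paths derived from it.
final case class Node(id: Int, parentId: Option[Int])
final case class Path(ancestorId: Int, descendantId: Int, depth: Int)

// In-memory stand-in for the two tables; this is not a Slick API.
final class NodeStore {
  private var nodes = Vector.empty[Node]
  private var paths = Vector.empty[Path]

  // The "onCreate" step: derive the Path rows that belong to a new node.
  private def pathsFor(node: Node): Vector[Path] = {
    val self = Path(node.id, node.id, 0)
    val inherited = node.parentId.toVector.flatMap { pid =>
      paths
        .filter(_.descendantId == pid)
        .map(p => Path(p.ancestorId, node.id, p.depth + 1))
    }
    self +: inherited
  }

  // Every creation goes through here, so the derived rows cannot be skipped.
  def create(node: Node): Node = {
    val derived = pathsFor(node)
    nodes = nodes :+ node
    paths = paths ++ derived
    node
  }

  def allNodes: Vector[Node] = nodes
  def allPaths: Vector[Path] = paths
}

val store = new NodeStore
store.create(Node(1, None))
store.create(Node(2, Some(1)))
store.create(Node(3, Some(2)))
println(store.allPaths.size) // 6
```

With a real database, the insert of A and the inserts of the derived B rows would normally need to run in a single transaction so that a failure cannot leave an A without its Bs.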

Play Framework form binding fails with body parsers

I have a problem with multipartFormData. I want to send a form with files, and the files have a size limit. After adding maxLength, the data from sampleForm seems to be lost. The file size validation works, however.

This is my code, with two options:

def addFormWithFiles = Action(parse.maxLength(10 * 1024 * 1024, parse.multipartFormData)) { implicit request =>
  // Option 1
  request.body match {
    case Left(_) => BadRequest("The file you attached was too large")
    case Right(multipartdata) =>
      sampleForm.bindFromRequest().fold(
        hasErrors => Ok(hasErrors),
        goodThingy => {
          multipartdata.file("myFile") match {
            case Some(file) =>
              import java.io.File
              file.ref.moveTo(new File("PATH"))
              MyDAO.insert(goodThingy)
              Ok("File injection good, and form good")
            case _ => Ok("at least the form is good, but no file")
          }
        }
      )
  }

  // Option 2
  sampleForm.bindFromRequest().fold(
    hasErrors => {
      request.body match {
        case Left(_) => Ok(hasErrors)
        case Right(_) => Ok(hasErrors)
      }
    },
    goodThingy =>
      request.body match {
        // file too big but the form is ok
        case Left(_) => Ok(goodThingy)
        case Right(multipartform) => multipartform.file("myFile") match {
          case Some(file) =>
            import java.io.File
            file.ref.moveTo(new File("path"))
            MyDAO.insert(goodThingy)
            Ok("adding file and item to db success")
          case _ =>
            MyDAO.insert(goodThingy)
            Ok("at least the form was good, no file attached")
        }
      }
  )
}

Number of actor instances

I'm new to akka-actor and confused about a few things:

  1. When I create an ActorSystem and use actorOf(Props(classOf[AX], ...)) to create an actor in the main method, how many instances of my actor AX are there?
  2. If the answer to Q1 is just one, does this mean that whatever data structure I create inside the AX actor class is only ever touched by one thread, so I don't need to worry about concurrency problems?
  3. What if one of my actor's actions (one case in the receive method) is a time-consuming task that takes a long time to finish? Will my single actor instance stop responding until it finishes that task?
  4. If the answer to Q3 is yes, what am I supposed to do to keep my actor responsive? Should I start another thread and have it send a message back to the actor when the task finishes? Is there a best practice I should follow?
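
On question 4, one commonly used approach is to run the slow work in a Future on a separate thread pool and deliver the result back to the actor as a message; Akka's `akka.pattern.pipe` offers `pipeTo` for that last step. The sketch below shows only the Future part using the standard library, so it runs without Akka. `slowTask` and the pool size are hypothetical, and the `Await` at the end is there only so the demo can print a result.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Dedicated pool for slow or blocking work (size chosen arbitrarily for the demo).
val pool = Executors.newFixedThreadPool(2)
implicit val blockingEc: ExecutionContext = ExecutionContext.fromExecutorService(pool)

// Hypothetical stand-in for the time-consuming task.
def slowTask(n: Int): Int = {
  Thread.sleep(100)
  n * 2
}

// Future(...) returns immediately; the caller is free to handle other work
// while slowTask runs on the dedicated pool.
val result: Future[Int] = Future(slowTask(21))

// Blocking here is for demonstration only; an actor would instead receive
// the result as a message.
val answer: Int = Await.result(result, 5.seconds)
println(answer) // 42

pool.shutdown()
```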

json4s (de)serialisation of Java POJOs

I'm working on a Scala/Spray/Akka system in which we need to serialise and deserialise objects to JSON, either for the REST interface or for persisting the model.

Some of the model objects are Java POJOs. We're using Json4s as the serialiser, but it seems to lack support for POJOs. When serialising to JSON, I was able to overcome this limitation by implementing a CustomSerializer. However, when deserialising, Json4s tries to do its own reflection magic, resulting in a "Can't find ScalaSig for class ..." exception. The custom serialiser is never called.

I created a small project on GitHub to reproduce this issue. Does anyone know how to solve it? Has anyone had a similar problem?

The issue has also been reported to json4s (no. 228).

How to run Spark interactively in cluster mode

I have a Spark cluster running on

spark://host1:7077
spark://host2:7077
spark://host3:7077

and connect through /bin/spark-shell --master spark://host1:7099. When trying to read a file with:

val textFile = sc.textFile("README.md")
textFile.count()

The prompt says

WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

The web UI on host1:8080 shows:

Workers: 0
Cores: 0 Total, 0 Used
Memory: 0.0 B Total, 0.0 B Used
Applications: 0 Running, 2 Completed
Drivers: 0 Running, 0 Completed
Status: ALIVE

My question is: how do I specify cores and memory when running spark-shell in cluster mode? Or do I have to package my Scala code into a .jar file and then submit the job to Spark?

Thanks

How to return all the elements in a ListBuffer?

I'm a bit confused about how to return all the elements in a list. I'm trying to write a function that takes a string as input and returns a ListBuffer containing all the matching elements. For example, for the string "Author", my function would return a ListBuffer of all the authors in the list. First, I need to figure out which method to use to put the strings into the ListBuffer. Then I would use .toList to convert the ListBuffer. After that, I think I need a for loop in this function where, for each element in the list, I call append to collect the matching elements? I'm really confused about how to set things up and what the code could look like.
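
A minimal sketch of what such a function could look like, assuming each element of the list is a record that maps a field name such as "Author" to a value. The question does not show its data, so `records`, `valuesOf`, and the sample entries are all hypothetical.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical data: each record maps a field name to its value.
val records: List[Map[String, String]] = List(
  Map("Author" -> "Alice", "Title" -> "First Book"),
  Map("Author" -> "Bob", "Title" -> "Second Book"),
  Map("Title" -> "Anonymous Pamphlet") // no "Author" field
)

// Collect every value stored under `field`, in list order,
// using a for loop and append as described in the question.
def valuesOf(field: String): ListBuffer[String] = {
  val buffer = ListBuffer.empty[String]
  for (record <- records; value <- record.get(field)) {
    buffer.append(value)
  }
  buffer
}

val authors: ListBuffer[String] = valuesOf("Author")
println(authors.toList) // List(Alice, Bob)
```

The second generator, `value <- record.get(field)`, skips records that lack the field, so no explicit null or contains check is needed.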