Tuesday, April 21, 2015

scala read file and split and then save into val

I have a hello.txt

hello.txt

     [,1]
1       2
2       2
5      12
6       4

and here is scala code:

val textFile = sc.textFile("/home/winsome/share/hello.txt")
val ratings = textFile.map { line => 
    val fields = line.split(" ")  
    val (id, linksStr) = (fields(0).toInt, fields(1).toInt)
    println(id)        //1 2 5 6
    println(linksStr)  //2 2 12 4
 }

println(id) and println(linksStr) print nothing. Please tell me how to display the values in the format I want.
Thank you.
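A minimal sketch of one way to get the values back onto the driver, assuming the file really contains whitespace-separated "id link" pairs (the [,1] header and row labels shown above look like R output rather than actual file contents). println inside map runs on the executors, so parse, collect() and print from the driver instead:

val pairs = sc.textFile("/home/winsome/share/hello.txt")
  .map(_.trim.split("\\s+"))           // tolerate aligned, multi-space columns
  .filter(_.length >= 2)               // skip headers / blank lines
  .map(f => (f(0).toInt, f(1).toInt))
  .collect()                           // Array[(Int, Int)] on the driver

val ids   = pairs.map(_._1)
val links = pairs.map(_._2)
println(ids.mkString(" "))    // 1 2 5 6
println(links.mkString(" "))  // 2 2 12 4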

Slick 3.0.0 (RC3) Hook "onCreate"

I'm currently evaluating porting an application written in PHP (Laravel) to the Playframework and Slick (3.0).

One thing I really liked about working with Laravel was that there were some "hooks" you could use, like "onCreate". What I mean by that is that upon creation of a new "ModelA" I'd like to create multiple "ModelB" rows based on that A, so I'd like to hook right into the creation process of A and define how (many) B's should be created and what they should look like.

Think of a node and a tree where for each node I want to add some paths to the nodetree.

Is there any way to achieve this?
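One pattern that gets close to an onCreate hook in Slick 3.0 is to compose the parent insert and the dependent child inserts into a single DBIO and run it transactionally. A minimal sketch, assuming simple tuple-mapped tables standing in for ModelA/ModelB (all table and column names here are made up):

import slick.driver.H2Driver.api._
import scala.concurrent.ExecutionContext.Implicits.global

// Hypothetical tuple-mapped tables standing in for ModelA / ModelB.
class As(tag: Tag) extends Table[(Long, String)](tag, "MODEL_A") {
  def id   = column[Long]("ID", O.PrimaryKey, O.AutoInc)
  def name = column[String]("NAME")
  def *    = (id, name)
}
class Bs(tag: Tag) extends Table[(Long, Long, String)](tag, "MODEL_B") {
  def id    = column[Long]("ID", O.PrimaryKey, O.AutoInc)
  def aId   = column[Long]("A_ID")
  def label = column[String]("LABEL")
  def *     = (id, aId, label)
}
val as = TableQuery[As]
val bs = TableQuery[Bs]

val db = Database.forConfig("mydb")  // hypothetical config key

// "onCreate"-style hook: every created A gets its B rows, in one transaction.
def createA(name: String) = db.run {
  (for {
    aId <- (as returning as.map(_.id)) += ((0L, name))
    _   <- bs ++= Seq((0L, aId, s"$name-left"), (0L, aId, s"$name-right"))
  } yield aId).transactionally
}

If creation of A only ever goes through a repository method like createA, the hook is effectively enforced at the application level rather than inside Slick itself.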

play framework form binding with body parsers failure

I have a problem with multipartFormData. I want to send a form with files. The files have a size limitation. After adding the maxLength it seems the data from the simpleForm is somehow lost. The file size validations works however.

This is my code - 2 options:

def addFormWithFiles = Action(parse.maxLength(10 * 1024 * 1024,parse.multipartFormData)) {implicit request =>
    //Option 1
    request.body match {
      case Left(_) => BadRequest("The file you attached was too large")
      case Right(multipartdata) =>
        sampleForm.bindFromRequest().fold(
          hasErrors => Ok(hasErrors),
          goodThingy => {
            multipartdata.file("myFile") match {
              case Some(file) =>
                import java.io.File
                file.ref.moveTo(new File("PATH"))
                MyDAO.insert(goodThingy)
                Ok("File injection good, and form good")
              case _ => Ok("at least the form is good, but no file")
        }
      }
    )
}

// Option 2
sampleForm.bindFromRequest().fold(
hasErrors => {
  request.body match {
    case Left(_) => Ok(hasErrors)
    case Right(_) => Ok(hasErrors)
  }
},
goodThingy =>
  request.body match {
      // file too big but the form is ok
    case Left(_) => Ok(goodThingy)
    case Right(multipartform) => multipartform.file("myFile") match {
      case Some(file) =>
        import java.io.File
        file.ref.moveTo(new File("path"))
        MyDAO.insert(goodThingy)
        Ok("adding file and item to db success")
      case _ =>
        MyDAO.insert(goodThingy)
        Ok("at least the form was good, no file attached")

    }
  }
)
  }

Num of actor instance

I'm new to akka-actor and confused with some problems:

  1. When I create an ActorSystem and use actorOf(Props(classOf[AX], ...)) to create an actor in the main method, how many instances of my actor AX are there?
  2. If the answer to Q1 is just one, does this mean whatever data structure I created in the AX actor class's definition will only be touched by one thread, and I should not worry about concurrency problems?
  3. What if one of my actor's actions (one case in the receive method) is a time-consuming task that takes quite a long time to finish? Will my single actor instance be unresponsive until it finishes that task?
  4. If the answer to Q3 is yes, what am I supposed to do to prevent my actor from becoming unresponsive? Should I start another thread and have it send a message back when the task finishes? Is there a best practice I should follow? (See the sketch after this list.)
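On points 3 and 4: the usual pattern is to keep receive non-blocking by running the slow work in a Future and piping the result back to the actor as an ordinary message. A minimal sketch (the message types here are hypothetical):

import akka.actor.Actor
import akka.pattern.pipe
import scala.concurrent.Future

case class Work(input: String)
case class WorkDone(result: String)

class Worker extends Actor {
  import context.dispatcher // execution context for the Future

  def receive = {
    case Work(input) =>
      // run the long task off the actor's thread, then send the result back to self
      Future(expensiveComputation(input)).map(WorkDone).pipeTo(self)
    case WorkDone(result) =>
      // safe to touch the actor's state here: messages are still processed one at a time
      println(s"finished: $result")
  }

  private def expensiveComputation(s: String): String = s.reverse // stand-in for real work
}

For truly blocking work a dedicated dispatcher is often recommended so the default pool is not starved.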

json4s (de)serialisation of Java Pojo's

I'm working on a Scala/Spray/Akka system on which we have the need to serialise and deserialise objects to json, either for the REST interface or for persisting the model.

Some of the model objects are Java POJOs. We're using Json4s as the serialiser, but it seems to lack support for POJOs. When serialising to JSON I was able to overcome this limitation by implementing a CustomSerializer. However, when deserialising, Json4s tries to do its own reflection magic, resulting in a "Can't find ScalaSig for class ..." exception. The custom serialiser is never called.

I created a small project on Github to replicate this issue. Does anyone know how to solve this issue? Did anyone have a similar issue?

The issue is also reported with json4s (issue 228).
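For reference, the shape of a json4s CustomSerializer is a pair of partial functions (deserialize, serialize). A minimal sketch for a hypothetical POJO with a single name property (the real POJOs and field sets will differ):

import org.json4s._

// Hypothetical Java-style POJO (no-arg constructor plus getters/setters).
class PersonPojo() {
  private var name: String = _
  def getName: String = name
  def setName(n: String): Unit = { name = n }
}

case object PersonPojoSerializer extends CustomSerializer[PersonPojo](_ => (
  { // deserialize: JValue => PersonPojo
    case JObject(JField("name", JString(n)) :: Nil) =>
      val p = new PersonPojo(); p.setName(n); p
  },
  { // serialize: Any => JValue
    case p: PersonPojo => JObject(JField("name", JString(p.getName)) :: Nil)
  }
))

implicit val formats: Formats = DefaultFormats + PersonPojoSerializer

Whether this sidesteps the ScalaSig reflection path on deserialisation depends on json4s picking the custom serializer before its reflective extraction, which is exactly what the linked issue is about, so treat this as a starting point rather than a guaranteed fix.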

How to run spark interactively in cluster mode

I have a spark cluster running on

spark://host1:7077
spark://host2:7077
spark://host3:7077

and connect through /bin/spark-shell --master spark://host1:7099. When trying to read a file with:

val textFile = sc.textFile("README.md")
textFile.count()

The prompt says

WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

When checked through Web ui on host1:8080 it shows:

Workers: 0
Cores: 0 Total, 0 Used
Memory: 0.0 B Total, 0.0 B Used
Applications: 0 Running, 2 Completed
Drivers: 0 Running, 0 Completed
Status: ALIVE

My question is: how do I specify cores and memory when running spark-shell in cluster mode? Or do I have to package my Scala code into a .jar file and then submit the job to Spark?

Thanks
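For what it's worth, with a standalone master the per-application limits are usually given either as spark-shell options (--executor-memory, --total-executor-cores) or as configuration properties. A hedged Scala sketch of the programmatic variant for a packaged application (property names taken from the standalone-mode configuration; double-check them against your Spark version):

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setMaster("spark://host1:7077")
  .setAppName("interactive")
  .set("spark.executor.memory", "2g")  // memory per executor
  .set("spark.cores.max", "4")         // total cores this application may take on the cluster
val sc = new SparkContext(conf)

That said, the "Workers: 0" line in the UI suggests no workers are registered with this master at all, which no amount of per-application tuning will fix.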

How to return all the elements in class ListBuffer?

I'm a bit confused about how to return all matching elements from a list. I'm trying to create a function that takes a string as input and returns a ListBuffer containing all the matching elements. For example, for the string "Author", my function would return the ListBuffer of all the authors in the list. First, I need to figure out which method to use to put the strings into the ListBuffer. Then I would use .toList to convert the ListBuffer. I'm also thinking I need a for loop where, for each element in the list, I could use an append method to add the matching elements. I'm just really confused about how to set things up and what the code could look like.
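A minimal sketch of the shape being described, under the assumption that the data is a list of (field, value) pairs (the real data type wasn't shown): accumulate matches in a ListBuffer inside a for loop, then convert with toList.

import scala.collection.mutable.ListBuffer

// Hypothetical data: (field, value) pairs.
val records = List(("Author", "Asimov"), ("Title", "Foundation"), ("Author", "Herbert"))

def valuesFor(field: String): List[String] = {
  val buf = ListBuffer[String]()
  for ((k, v) <- records)       // the "for loop" from the question
    if (k == field) buf += v    // append matching elements
  buf.toList                    // convert the buffer to an immutable List
}

valuesFor("Author")  // List(Asimov, Herbert)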

R and scala: scala get R variable

import org.ddahl.jvmr.RInScala
object Birt_obj {
   def main(args: Array[String]) {
        R> """ 
        ...
        Rstr2 =(as.matrix(Rstr1))"""

       //var resultArray : Array[String] = Array()
    }

}

result:

  [,1]
1    2
2    2
5   12
6    4
7    1
8   12

I want to know how Scala can get the R variable Rstr2 and save it in an array.

How to parallelize several apache spark rdds?

I have the next vals:

val bcs = sc.sql("select * from bcs")
val imps = sc.sql("select * from imps")

I want to do:

bcs.map(x => wrapBC(x)).collect
imps.map(x => wrapIMP(x)).collect

but when I do this, the two jobs do not run asynchronously. I can do it with Future, like this:

Future { bcs.map(x => wrapBC(x)).collect }
Future { imps.map(x => wrapIMP(x)).collect }
...

I want to do this without Future; how can I do it?

What is the difference between method and function in Scala [duplicate]

This question already has an answer here:

Recently I have been learning Scala, and the difference between a function and a method confuses me. What is the difference, and when should I use each? Thanks!
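A small illustration of the distinction (not from the question): a method belongs to a class or object and is not itself a value, while a function is a value of a FunctionN type; _ (eta-expansion) turns the former into the latter.

def addMethod(x: Int, y: Int): Int = x + y        // a method: defined on the enclosing object, not a value
val addFunction: (Int, Int) => Int = addMethod _  // a function value, created by eta-expansion

addMethod(1, 2)     // 3, ordinary method call
addFunction(1, 2)   // 3, goes through Function2.apply
val inc: Int => Int = addMethod(1, _)  // partial application also yields a function value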

Type mismatch with identical types in Spark-shell

I have built a scripting workflow around the spark-shell, but I'm often vexed by bizarre type mismatches (probably inherited from the Scala REPL) occurring with identical found and required types. The following example illustrates the problem. Executed in paste mode, there is no problem:

scala> :paste
// Entering paste mode (ctrl-D to finish)


import org.apache.spark.rdd.RDD
case class C(S:String)
def f(r:RDD[C]): String = "hello"
val in = sc.parallelize(List(C("hi")))
f(in)

// Exiting paste mode, now interpreting.

import org.apache.spark.rdd.RDD
defined class C
f: (r: org.apache.spark.rdd.RDD[C])String
in: org.apache.spark.rdd.RDD[C] = ParallelCollectionRDD[0] at parallelize at <console>:13
res0: String = hello

but

scala> f(in)
<console>:29: error: type mismatch;
 found   : org.apache.spark.rdd.RDD[C]
 required: org.apache.spark.rdd.RDD[C]
              f(in)
                ^ 

There are related discussions about the Scala REPL and about the spark-shell, but the mentioned issue seems unrelated (and resolved) to me.

This problem makes it hard to write passable code to be executed interactively in the REPL, and it loses most of the advantage of working in a REPL to begin with. Is there a solution? (And/or is it a known issue?)

Edits:

Problems occurred with Spark 1.2 and 1.3.0. The test was made on Spark 1.3.0 using Scala 2.10.4.

It seems that, at least in this test, repeating the statements that use the class separately from the case class definition mitigates the problem:

scala> :paste
// Entering paste mode (ctrl-D to finish)


def f(r:RDD[C]): String = "hello"
val in = sc.parallelize(List(C("hi1")))

// Exiting paste mode, now interpreting.

f: (r: org.apache.spark.rdd.RDD[C])String
in: org.apache.spark.rdd.RDD[C] = ParallelCollectionRDD[1] at parallelize at <console>:26

scala> f(in)
res2: String = hello

Disassembling Scala code

How do I disassemble Scala code? Can it be done without first building the jar and decompiling the resulting .class files, or alternatively, is there a faster way to do so?

In Python you have dis and the following is an example of how it's used:

def myfunc(alist):
    return len(alist)

>>> dis.dis(myfunc)
  2           0 LOAD_GLOBAL              0 (len)
              3 LOAD_FAST                0 (alist)
              6 CALL_FUNCTION            1
              9 RETURN_VALUE

How can I view the SQL generated by Anorm?

I am trying out anorm to execute a number of insert statements, and then return the value of LAST_INSERT_ID(). But I am getting an error saying my SQL syntax is invalid.

Can anyone tell me how to check and see what the final generated SQL that is sent to MySQL looks like?

Akka:Error In merging two list of Futures

I have two lists of futures: List<Future<Object1>> and List<Future<Object2>>. I want to merge them to create a new List<Future<Object1>>, so that each modified Object1 will have an Object2 within it.

Please see the stripped down version of my code

 public Future<Object> mergeResponse(List<Future<Object1>> listOfObject1Future, final List<Future<Object2>> listOfObject1Future2) {
    Iterable<Future<Object>> listOfObject1 = listOfObject1Future; //first list of future
    Future<Iterable<Object>> listOfObject1Future = Futures.sequence(listOfObject1, ec);

    Future<Object> mergedResponse = listOfObject1Future.flatMap(new  Mapper<Iterable<Object>, Future<Object>>() {
        public Future<Object> apply(final Iterable<Object> response1){
            return Futures.future(new Callable<Object>() {
                public Object call() throws Exception { 
                Iterator<Object> listObject1Iterator = response1.iterator();
                final List<Future<Object>> finalResponse = new ArrayList<>();
                while (listObject1Iterator.hasNext()) {
                    Object1 responseObject1 = listObject1Iterator.next();
                    finalResponse = merge(responseObject1,  listOfObject1Future); //passing the second list of futures

                }
                 return finalResponse;
                }
            }, ec);
        }
    }, ec);
    return mergedResponse;
 }


public Future<Object> merge(final Object1 responseObject1, List<Future<Object>> listOfObject2Future){

    Iterable<Future<Object>> listOfObject2 = listOfObject2Future;
    Future<Iterable<Object>> listOfObject2Future = Futures.sequence(listOfObject2, ec);

    Future<Object> response = listOfObject2Future.flatMap(new Mapper<Iterable<Object>, Future<Object>>() {
        public Future<Object> apply(final Iterable<Object> response2){
            return Futures.future(new Callable<Object>() {
                public Object call() throws Exception { 
                Iterator<Object> listObject2Iterator = response2.iterator();
                while (listObject2Iterator.hasNext()) {
                    Object2 responseObject2 = listObject2Iterator.next();

                if(responseObject1.getXXX().equals(responseObject2.getXXX())   //setting object2 in object1 based on some condition
                {
                   responseObject1.setObject2(responseObject2);
                }
                }
                return responseObject1;
                }
            }, ec);
        }
    }, ec);

    return response;
}

However, when I return the final response, intermittently Object1 is not getting updated with Object2. All the futures are populated by making REST calls to different services. On analysis, I can see that in those scenarios it is not entering the second flatMap at all. I think the problem is that the first future's flatMap is executed before the second list of futures is populated. Please help.
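A Scala sketch of the intended merge (Object1/Object2 and the matching key are stand-ins for the real types): sequence both lists first, then combine the materialised results, instead of nesting one sequence inside the callback of the other.

import scala.concurrent.{ExecutionContext, Future}

case class Object2(key: String)
case class Object1(key: String, var other: Option[Object2] = None)

def merge(fs1: List[Future[Object1]], fs2: List[Future[Object2]])
         (implicit ec: ExecutionContext): Future[List[Object1]] =
  for {
    list1 <- Future.sequence(fs1)  // wait for the whole first list
    list2 <- Future.sequence(fs2)  // and the whole second list, before matching anything
  } yield list1.map { o1 =>
    o1.other = list2.find(_.key == o1.key)
    o1
  }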

saveToCassandra could not find implicit value for parameter rwf

I'm trying to save a dataset to a Cassandra database using Spark Scala, but I am getting an exception while running the code. Link used: http://ift.tt/1wdt6tj

error:

could not find implicit value for parameter rwf: com.datastax.spark.connector.writer.RowWriterFactory[FoodToUserIndex]
 food_index.saveToCassandra("tutorial", "food_to_user_index")
                          ^

.scala

def main(args: Array[String]): Unit = {

val conf = new SparkConf(true)
  .set("spark.cassandra.connection.host", "localhost")
  .set("spark.executor.memory", "1g")
  .set("spark.cassandra.connection.native.port", "9042")
val sc = new SparkContext(conf)


case class FoodToUserIndex(food: String, user: String)

val user_table = sc.cassandraTable[CassandraRow]("tutorial",   "user").select("favorite_food","name")

val food_index = user_table.map(r => new   FoodToUserIndex(r.getString("favorite_food"), r.getString("name")))
food_index.saveToCassandra("tutorial", "food_to_user_index")}

build.sbt

name := "intro_to_spark"

version := "1.0"

scalaVersion := "2.11.2"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.2.0"

libraryDependencies += "com.datastax.spark" %% "spark-cassandra-connector" %  "1.2.0-rc3"

If I change the versions of Scala and the Cassandra connector to 2.10 and 1.1.0, it works, but I need to use Scala 2.11:

scalaVersion := "2.10.4"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.2.0"

libraryDependencies += "com.datastax.spark" %% "spark-cassandra-connector" %   "1.1.0" withSources() withJavadoc()

What are the differences between Either and Option?

According to the documentation:

A common use of Either is as an alternative to Option for dealing with possible missing values.

Why would you use one over the other?
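A small illustrative sketch: Option can only say "absent", whereas Either also carries why the value is missing (by convention the failure goes in Left).

import scala.util.Try

def parsePort(s: String): Option[Int] =
  Try(s.toInt).toOption.filter(p => p > 0 && p < 65536)

def parsePortE(s: String): Either[String, Int] =
  Try(s.toInt).toOption
    .filter(p => p > 0 && p < 65536)
    .toRight(s"'$s' is not a valid port")  // keep the reason for the failure

parsePort("http")   // None
parsePortE("http")  // Left('http' is not a valid port)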

Binding a datepicker form value in Play framework

I am using a datepicker in my page. The datepicker is described as follows:

<div class="form-group  " id="startDate_field">
    <label class="control-label col-md-2 " for="startDate">@Messages.get("event.startDate")</label>
    <div class="col-md-2">
        <div class="input-group date" id="startDatePicker">
            <input type="text" class="form-control" id="startdate" name="startDate" value='@{myForm.form("startDate")}'>
                <span class="input-group-addon">
                    <i class="glyphicon glyphicon-th">
                    </i>
                </span>
        </div>
    </div>
</div> 

Which gives me the compilation error:

value form is not a member of play.data.Form[models.Event]

How can I bind this value to the form value?

Apache Spark: network errors between executors

I'm running Apache Spark 1.3.1 on Scala 2.11.2, and when running on an HPC cluster with large enough data, I get numerous errors like the ones at the bottom of my post (repeated multiple times per second, until the job gets killed for being over time). Based on the errors, the executor is attempting to get shuffle data from other nodes but is unable to do so.

This same program executes fine with either (a) a smaller amount of data or (b) local-only mode, so it seems to have something to do with the data getting sent over the network (and isn't triggered by a very small amount of data).

The code that is being executed around the time this happens is as follows:

val partitioned_data = data  // data was read as sc.textFile(inputFile)
  .zipWithIndex.map(x => (x._2, x._1))
  .partitionBy(partitioner)  // A custom partitioner
  .map(_._2)

// Force previous lazy operations to be evaluated. Presumably adds some
// overhead, but hopefully the minimum possible...
// Suggested on Spark user list: http://ift.tt/1HgtRvw
sc.runJob(partitioned_data, (iter: Iterator[_]) => {})

Is this indicative of a bug, or is there something I'm doing wrong?

Here's a small snippet of the stderr log of one of the executors (full log is here):

15/04/21 14:59:28 ERROR TransportRequestHandler: Error sending result ChunkFetchSuccess{streamChunkId=StreamChunkId{streamId=1601401593000, chunkIndex=0}, buffer=FileSegmentManagedBuffer{file=/tmp/spark-0f8d0598-b137-4d14-993a-568b2ab3709a/spark-12d5ff0a-2793-4b76-8a0b-d977a5924925/spark-7ad9382d-05cf-49d4-9a52-d42e6ca7117d/blockmgr-b72d4068-d065-47e6-8a10-867f723000db/15/shuffle_0_1_0.data, offset=26501223, length=6227612}} to /10.0.0.5:41160; closing connection
java.io.IOException: Resource temporarily unavailable
    at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
    at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:415)
    at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:516)
    at org.apache.spark.network.buffer.LazyFileRegion.transferTo(LazyFileRegion.java:96)
    at org.apache.spark.network.protocol.MessageWithHeader.transferTo(MessageWithHeader.java:89)
    at io.netty.channel.socket.nio.NioSocketChannel.doWriteFileRegion(NioSocketChannel.java:237)
    at io.netty.channel.nio.AbstractNioByteChannel.doWrite(AbstractNioByteChannel.java:233)
    at io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:264)
    at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:707)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.flush0(AbstractNioChannel.java:315)
    at io.netty.channel.AbstractChannel$AbstractUnsafe.flush(AbstractChannel.java:676)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.flush(DefaultChannelPipeline.java:1059)
    at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:688)
    at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:669)
    at io.netty.channel.ChannelOutboundHandlerAdapter.flush(ChannelOutboundHandlerAdapter.java:115)
    at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:688)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:718)
    at io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:706)
    at io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:741)
    at io.netty.channel.DefaultChannelPipeline.writeAndFlush(DefaultChannelPipeline.java:895)
    at io.netty.channel.AbstractChannel.writeAndFlush(AbstractChannel.java:240)
    at org.apache.spark.network.server.TransportRequestHandler.respond(TransportRequestHandler.java:147)
    at org.apache.spark.network.server.TransportRequestHandler.processFetchRequest(TransportRequestHandler.java:119)
    at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:95)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:91)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:44)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
    at java.lang.Thread.run(Thread.java:619)
15/04/21 14:59:28 ERROR TransportRequestHandler: Error sending result ChunkFetchSuccess{streamChunkId=StreamChunkId{streamId=1601401593000, chunkIndex=1}, buffer=FileSegmentManagedBuffer{file=/tmp/spark-0f8d0598-b137-4d14-993a-568b2ab3709a/spark-12d5ff0a-2793-4b76-8a0b-d977a5924925/spark-7ad9382d-05cf-49d4-9a52-d42e6ca7117d/blockmgr-b72d4068-d065-47e6-8a10-867f723000db/27/shuffle_0_5_0.data, offset=3792987, length=2862285}} to /10.0.0.5:41160; closing connection
java.nio.channels.ClosedChannelException
15/04/21 14:59:28 ERROR TransportRequestHandler: Error sending result ChunkFetchSuccess{streamChunkId=StreamChunkId{streamId=1601401593002, chunkIndex=0}, buffer=FileSegmentManagedBuffer{file=/tmp/spark-0f8d0598-b137-4d14-993a-568b2ab3709a/spark-12d5ff0a-2793-4b76-8a0b-d977a5924925/spark-7ad9382d-05cf-49d4-9a52-d42e6ca7117d/blockmgr-b72d4068-d065-47e6-8a10-867f723000db/15/shuffle_0_1_0.data, offset=0, length=10993212}} to /10.0.0.6:42426; closing connection
java.io.IOException: Resource temporarily unavailable
    at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
    at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:415)
    at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:516)
    at org.apache.spark.network.buffer.LazyFileRegion.transferTo(LazyFileRegion.java:96)
    at org.apache.spark.network.protocol.MessageWithHeader.transferTo(MessageWithHeader.java:89)
    at io.netty.channel.socket.nio.NioSocketChannel.doWriteFileRegion(NioSocketChannel.java:237)
    at io.netty.channel.nio.AbstractNioByteChannel.doWrite(AbstractNioByteChannel.java:233)
    at io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:264)
    at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:707)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.flush0(AbstractNioChannel.java:315)
    at io.netty.channel.AbstractChannel$AbstractUnsafe.flush(AbstractChannel.java:676)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.flush(DefaultChannelPipeline.java:1059)
    at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:688)
    at io.netty.channel.AbstractChannelHandlerContext.flush(AbstractChannelHandlerContext.java:669)
    at io.netty.channel.ChannelOutboundHandlerAdapter.flush(ChannelOutboundHandlerAdapter.java:115)
    at io.netty.channel.AbstractChannelHandlerContext.invokeFlush(AbstractChannelHandlerContext.java:688)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:718)
    at io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:706)
    at io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:741)
    at io.netty.channel.DefaultChannelPipeline.writeAndFlush(DefaultChannelPipeline.java:895)
    at io.netty.channel.AbstractChannel.writeAndFlush(AbstractChannel.java:240)
    at org.apache.spark.network.server.TransportRequestHandler.respond(TransportRequestHandler.java:147)
    at org.apache.spark.network.server.TransportRequestHandler.processFetchRequest(TransportRequestHandler.java:119)
    at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:95)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:91)
    at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:44)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
    at java.lang.Thread.run(Thread.java:619)
15/04/21 14:59:28 WARN TransportChannelHandler: Exception in connection from http://ift.tt/1GfG2pt
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
    at sun.nio.ch.IOUtil.read(IOUtil.java:206)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
    at io.netty.buffer.PooledHeapByteBuf.setBytes(PooledHeapByteBuf.java:234)
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
    at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:225)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
    at java.lang.Thread.run(Thread.java:619)
15/04/21 14:59:28 ERROR TransportResponseHandler: Still have 2 requests outstanding when connection from http://ift.tt/1GfG2pt is closed
15/04/21 14:59:28 INFO RetryingBlockFetcher: Retrying fetch (1/3) for 2 outstanding blocks after 5000 ms

Running tests in Intellij for Play Framework is very slow

Is there a way to speed up the execution of unit tests for the Play Framework in IntelliJ? I am doing TDD. Whenever I execute a test, it takes about 30-60 seconds to compile and run. Even a simple Hello World test takes time. Rerunning the same test, even without any change, will still start the make process.

I am on IntelliJ 14.1 and Play 2.3.8, written in Scala.

I already tried setting the java compiler to eclipse, and also tried setting Scala compiler to SBT.

Why don't Akka actors have a postStart method?

Here is the situation I have to deal with: I am using WebSockets with the Play framework, and each WebSocket connection has its own actor, as described here. Now, as soon as the WebSocket connection is made, I need to start another actor that subscribes to a Redis channel and, on getting any message published to the channel, passes that message to its parent, i.e. the WebSocket actor. So I need to start the Redis subscriber actor after the WebSocket actor has started. But actors don't have a postStart method. I tried creating the Redis subscriber actor in the preStart method of the WebSocket actor and it works fine, but I don't understand the reason for actors not having a postStart method. Isn't this a common scenario where actors create child actors? Or is this approach incorrect?

How can I hide my implicit method or disable `LabelledGeneric` for a specific sealed family?

I have to use third-party traits which are designed as:

trait Super[T]
trait Sub[T] extends Super[T]

and I have defined the following method to provide instances for arbitrary sealed families:

implicit def subFor[T, Repr](
  implicit
  gen: LabelledGeneric.Aux[T, Repr],
  sg: Lazy[JsonFormat[Repr]]
): Sub[T] = ???

but somewhere else in the code, somebody has already defined

implicit def superFor[T]: Super[T] = ???

for some specific types, e.g. Option and a few others.

Actually, I'd like to use the existing implementations instead of using my subFor generic implementation. However, because my method returns Sub[T] and the existing implementation returns Super[T] I can't see any way to define my implementation in such a way that it is considered lower priority.

How can I block my subFor from being called for specific types? Is there some way to disable LabelledGeneric for specific types, or some priority reducing mechanism that would stop the compiler from ever looking for my subFor if it already sees a Super[T]?
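For the "priority reducing mechanism" part of the question, the usual trick is to put the generic derivation in a base trait and the specific instances in an object that extends it, so the specific ones win implicit resolution. A minimal sketch (Super/Sub stand in for the third-party traits; this shows the layering only, and does not by itself make a Super[T] suppress a Sub[T] of a different type):

trait Super[T]
trait Sub[T] extends Super[T]

trait LowPrioritySubInstances {
  // generic fallback, found only when nothing more specific is in scope
  implicit def subFor[T]: Sub[T] = new Sub[T] {}
}

object Instances extends LowPrioritySubInstances {
  // specific instances declared here take precedence over subFor
  implicit def superForOption[T]: Super[Option[T]] = new Super[Option[T]] {}
}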

sbt-ensime not playing nice w/ SBT name hashing

I am following the installation guide here to install ENSIME for Emacs. I added the addSbtPlugin("org.ensime" % "ensime-sbt" % "0.1.5") line to my ~/.sbt/0.13/plugins/plugins.sbt file and started SBT in an SBT multi-project. On start I get the following error:

/myprojectpath/project/project/build.sbt:3: error: value withNameHashing is not a member of sbt.inc.IncOptions
incOptions := incOptions.value.withNameHashing(true)
                               ^
[error] Type error in expression
Project loading failed: (r)etry, (q)uit, (l)ast, or (i)gnore?

If I remove the addSbtPlugin line then SBT starts up fine.

Thanks!

Building Spark 1.3.1 for Scala 2.11 locally fails

Following its guide, my Spark 1.3.1 installation fails after executing these commands:

$ dev/change-version-to-2.11.sh
$ build/sbt -Pyarn -Phadoop-2.4 -Dscala-2.11 -DskipTests clean package assembly

with the error message

sbt.ResolveException: download failed: org.twitter4j#twitter4j-core;3.0.3!twitter4j-core.jar

Then I tried to install normally:

$ mvn -Pyarn -Phadoop-2.4 -Dscala-2.11 -DskipTests clean package

and I get the error:

[error] Required file not found: sbt-interface.jar
...
[ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.2.0:compile (scala-compile-first) on project spark-network-common_2.11: Execution scala-compile-first of goal net.alchim31.maven:scala-maven-plugin:3.2.0:compile failed. CompileFailed -> [Help 1]

Installing in a standard way:

$ build/mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package

gives me the second error shown above.

Any help would be much appreciated!

ScalaTest Plus not recognizing tests

I've been tasked to update and write a series of tests on an app in Scala Play, a language and framework I'm unfamiliar with. Part of what I'd like to do is integrate the ScalaTestPlus library. To get started I have been following the following tutorial:

http://ift.tt/1Iy9Spj

Unfortunately I am not getting very far. I have added a new unit test file to the tests folder:

import org.scalatestplus.play._

class StackSpec extends PlaySpec {

  "A Test" must {
    "pass" in {
      assert(1 == 1)
    }
    "Fail" in {
      assert(1 != 1)
    }
  }
}

and I have updated my build.sbt to include the scalatestplus library

  "org.scalatestplus" % "play_2.37" % "1.2.0" % "test"//,

Using Activator, I am trying to run my test file with test-only. Everything compiles without errors, but activator is not finding any tests

[info] No tests were executed.

I don't believe the issue is with activator, since I can run old test files (from the previous engineer) using the test and test-only commands. A quick sample of one of the previous (working) test files:

import java.util.concurrent.TimeUnit
import com.sun.xml.internal.bind.v2.TODO
import scala.collection.JavaConverters._
import controllers.Application
import models.{Item, PriorityBucket}
import play.api.test._

class WebSpec extends PlaySpecification {

  "Home page" should {
    "do something" in new WithSeleniumDbData(TestUtil.testApp) {
      Redacted.deleteAll()

      val ObId = TestUtil.create(Some(PriorityBucket.Low),
          Some(Application.ENGLISH))
      val item = Item.find(ItemId).get

      browser.goTo("/")
      browser.await().atMost(2, 
          TimeUnit.SECONDS).until(Selectors.all_obs).isPresent
    }

Any ideas where I've gone astray? Thanks in advance for the help!

I am using Scala 2.11 and Play 2.3.7.

EDIT: Possibly relevant: I switched the spec to extend FlatSpec instead of PlaySpec and saw the following error when compiling:

SampleSpec.scala:10: value in is not a member of String
[error]     "pass" in {

I made sure to import FlatSpec as well, which has me a bit confused: is FlatSpec a member of ScalaTest but not of ScalaTestPlus? I don't see why else the compilation would fail.

UPDATE: To further investigate the issue I spun up a brand new Play app and copied over my sample test. After some tooling around with versions I've been able to get my test to run on the activator test command with the rest of the suite. Unfortunately, any other commands like test-only are still returning no tests run.

RDD filter in scala spark

I have a dataset and I want to extract those records whose (review/text) have a (review/time) between x and y, for example (1183334400 < time < 1185926400).

here are samples of my data:

product/productId: B000278ADA
product/title: Jobst Ultrasheer 15-20 Knee-High Silky Beige Large
product/price: 46.34
review/userId: A17KXW1PCUAIIN
review/profileName: Mark Anthony "Mark"
review/helpfulness: 4/4
review/score: 5.0
review/time: 1174435200
review/summary: Jobst UltraSheer Knee High Stockings
review/text: Does a very good job of relieving fatigue.

product/productId: B000278ADB
product/title: Jobst Ultrasheer 15-20 Knee-High Silky Beige Large
product/price: 46.34
review/userId: A9Q3932GX4FX8
review/profileName: Trina Wehle
review/helpfulness: 1/1
review/score: 3.0
review/time: 1352505600
review/summary: Delivery was very long wait.....
review/text: It took almost 3 weeks to recieve the two pairs of stockings .

product/productId: B000278ADB
product/title: Jobst Ultrasheer 15-20 Knee-High Silky Beige Large
product/price: 46.34
review/userId: AUIZ1GNBTG5OB
review/profileName: dgodoy
review/helpfulness: 1/1
review/score: 2.0
review/time: 1287014400
review/summary: sizes recomended in the size chart are not real
review/text: sizes are much smaller than what is recomended in the chart. I tried to put it and sheer it!.

my Spark-Scala Code :

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
import org.apache.spark.{SparkConf, SparkContext}

object test1 {
  def main(args: Array[String]): Unit = {
    val conf1 = new SparkConf().setAppName("golabi1").setMaster("local")
    val sc = new SparkContext(conf1)
    val conf: Configuration = new Configuration
    conf.set("textinputformat.record.delimiter", "product/title:")
    val input1=sc.newAPIHadoopFile("data/Electronics.txt",     classOf[TextInputFormat], classOf[LongWritable], classOf[Text], conf)
    val lines = input1.map { text => text._2}
    val filt = lines.filter(text=>(text.toString.contains(tt => tt in (startdate until enddate))))
    filt.saveAsTextFile("data/filter1")
  }
}

But my code does not work.

How can I filter these lines?
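A hedged sketch of one way to do the filtering, assuming every record produced by the custom delimiter contains exactly one "review/time: <epoch>" line: pull the timestamp out with a regex and compare it against the bounds.

val startdate = 1183334400L
val enddate   = 1185926400L
val timeRe = """review/time:\s*(\d+)""".r

val filt = lines.filter { text =>
  timeRe.findFirstMatchIn(text.toString) match {
    case Some(m) => val t = m.group(1).toLong; t > startdate && t < enddate
    case None    => false
  }
}
filt.saveAsTextFile("data/filter1")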

Formatting the join rdd - Apache Spark

I have two key-value pair RDDs. I join the two RDDs and save the result as a text file. Here is the code:

val enKeyValuePair1 = rows_filter6.map(line => (line(8) -> (line(0),line(4),line(10),line(5),line(6),line(14),line(1),line(9),line(12),line(13),line(3),line(15),line(7),line(16),line(2),line(14))))

val enKeyValuePair = DATA.map(line => (line(0) -> (line(2),line(3))))

val final_res = enKeyValuePair1.leftOuterJoin(enKeyValuePair)

val output = final_res.saveAsTextFile("C:/out")

my output is as follows:
(534309,((17999,5161,45005,00000,XYZ,,29.95,0.00),None))

How can I get rid of all the parentheses? I want my output as follows:

534309,17999,5161,45005,00000,XYZ,,29.95,0.00,None
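A hedged sketch: instead of fighting the default tuple toString, flatten each (key, (values, option)) record into one comma-separated line before saving (productIterator walks a tuple's fields).

val flat = final_res.map { case (key, (values, extra)) =>
  val left  = values.productIterator.mkString(",")
  val right = extra.map(_.productIterator.mkString(",")).getOrElse("None")
  s"$key,$left,$right"
}
flat.saveAsTextFile("C:/out")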

Task not serializable when using object in REPL

So, another SO question prompted me to try the following:

object Foo{
  def f = 1
}

sc.parallelize(List(1)).map(x=>{
  val myF = Foo.f _
  x + myF()
})

Which works, but the following does not:

object Foo{
  def f = 1
  def run(rdd: org.apache.spark.rdd.RDD[Int]) = rdd.map(x=>{
    val myF = Foo.f _
    x + myF()
  })
}

Foo.run(sc.parallelize(List(1)))

I will take a deeper look at the serialization stack tomorrow when I can remove the mess of the REPL output. But this should be the same thing. Why does one way complain and the other not?

how to run scalding test in local mode with local input file

Scalding has a great utility to run an integration test for the job flow. In this approach the inputs and outputs are in-memory buffers:

val input = List("0" -> "This a a day")
val expectedOutput = List(("This", 1),("a", 2),("day", 1))
 JobTest(classOf[WordCountJob].getName)
  .arg("input", "input-data")
  .arg("output", "output-data")
  .source(TextLine("input-data"), input)
  .sink(Tsv("output-data")) {
  buffer: mutable.Buffer[(String, Int)] => {
    buffer should equal(expectedOutput)
  }
}.run

How can I transfer/write other code that will read input from and write output to real local files, like FileTap/LFS in Cascading, and not the in-memory approach?

How to make -Dsbt.override.build.repos=true global for SBT?

I want to override all repos, even the ones introduced inadvertently in my build.sbt files, so we can point to our proxy and have a common binary base for the whole team. The option

$ sbt -Dsbt.override.build.repos=true

does the job but I'd like to make this option permanent. I've been looking at http://ift.tt/1DyY4Pf but I don't know how to translate that option to the global.sbt file they mention.

How would you configure that option globally?

WS API in Play framework using Scala is giving java.net.ConnectException error

I am trying to read a remote JSON file and then parse it using the WS API in Scala with the Play framework. I am getting the following error:

[info] application - Application has just started
[info] application - scheduler-initalDelay in minutes : 13
[info] application - The json file is scala.concurrent.impl.Promise$DefaultPromise@2f2fbc09
[info] application - Application shutdown...
java.net.ConnectException: http://ift.tt/1Gg89aO
    at com.ning.http.client.providers.netty.NettyConnectListener.operationComplete(NettyConnectListener.java:103)
    at org.jboss.netty.channel.DefaultChannelFuture.notifyListener(DefaultChannelFuture.java:427)
    at org.jboss.netty.channel.DefaultChannelFuture.notifyListeners(DefaultChannelFuture.java:418)
    at org.jboss.netty.channel.DefaultChannelFuture.setFailure(DefaultChannelFuture.java:380)
    at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$1.operationComplete(NioClientSocketPipelineSink.java:115)
    at org.jboss.netty.channel.DefaultChannelFuture.notifyListener(DefaultChannelFuture.java:427)
    at org.jboss.netty.channel.DefaultChannelFuture.notifyListeners(DefaultChannelFuture.java:413)
    at org.jboss.netty.channel.DefaultChannelFuture.setSuccess(DefaultChannelFuture.java:362)
    at org.jboss.netty.channel.AbstractChannel$ChannelCloseFuture.setClosed(AbstractChannel.java:355)
    at org.jboss.netty.channel.AbstractChannel.setClosed(AbstractChannel.java:185)
    at org.jboss.netty.channel.socket.nio.AbstractNioChannel.setClosed(AbstractNioChannel.java:197)
    at org.jboss.netty.channel.socket.nio.NioSocketChannel.setClosed(NioSocketChannel.java:84)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:357)
    at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink.eventSunk(NioClientSocketPipelineSink.java:58)
.
.
.
.
.
.
    at sbt.RunnerWrapper$1.runRunner(FrameworkWrapper.java:200)
    at sbt.RunnerWrapper$1.execute(FrameworkWrapper.java:236)
    at sbt.ForkMain$Run.runTest(ForkMain.java:239)
    at sbt.ForkMain$Run.runTestSafe(ForkMain.java:211)
    at sbt.ForkMain$Run.runTests(ForkMain.java:187)
    at sbt.ForkMain$Run.run(ForkMain.java:251)
    at sbt.ForkMain.main(ForkMain.java:97)
Caused by: java.nio.channels.ClosedChannelException
    ... 98 more

Below is the code snippet that is throwing the error.

case class Payout(payout: Double)
implicit val readPayout: Reads[Payout] = (__ \ "data" \ "data" \ "Stat" \ "payout").read[Double].map { Payout(_)}
  def PayoutChange() = {

    val apiUrl: String = "http://ift.tt/1Gg89aO"

    Logger.info("Requesting the payout")
    val temp = WS.url(apiUrl).withRequestTimeout(1000)
    val result: Future[JsValue] = temp.get().map {
      response =>
        (response.json).as[JsValue]
    }
    Logger.info("The json file is " + result)
    result.onComplete {
      result =>
        val jsonurl = result.get
        val payout = jsonurl.\("payout")
       Logger.debug("payout "+payout)
    }

  }

I am able to access the apiUrl when I put the URL in a browser. The URL has some identification keys, but the keys are hardcoded within the URL. If I am able to access the URL through a browser, what might be preventing it from being accessed from the code?

Can't create column of type double when making a shapefile

// My type
val typeBuilder = new SimpleFeatureTypeBuilder()
typeBuilder.setName("line-query-seg")
typeBuilder.add("value", classOf[Double])
typeBuilder.add("seg", classOf[LineString])
val sft = typeBuilder.buildFeatureType()

// Trying to create a shapefile of this type
val dataStoreFactory = new ShapefileDataStoreFactory()
val shp = new File("/tmp/collection.shp")
val params = scala.collection.Map[String, Serializable]("url" -> shp.toURI.toURL,
  "create spatial index" -> true
)
val shpDS = dataStoreFactory.createNewDataStore(params.asJava).asInstanceOf[ShapefileDataStore]
shpDS.createSchema(sft)

The above code fails with:

java.io.IOException: Unable to write column value : double

I'm using scala version 2.10.4 and geotools version 11.2

What is the difference between Try and Either?

According to the documentation:

The Try type represents a computation that may either result in an exception, or return a successfully computed value. It's similar to, but semantically different from the scala.util.Either type.

The docs do not go into further detail as to what the semantic difference is. Both seem to be able to communicate successes and failures. Why would you use one over the other?
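A small illustration of the semantic difference: Try is specifically about code that may throw and catches the exception for you, while Either is a general "one of two values" that does no exception handling and whose Left can be any type you choose.

import scala.util.{Failure, Success, Try}

val t: Try[Int] = Try("abc".toInt)   // Failure(java.lang.NumberFormatException)

val e: Either[String, Int] =         // no catching happens unless you write it yourself
  try Right("abc".toInt)
  catch { case _: NumberFormatException => Left("not a number") }

t match {
  case Success(n)  => println(n)
  case Failure(ex) => println(s"failed: ${ex.getMessage}")
}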

How to get the structure of JSON in scala

I have a lot of JSON files which are not uniformly structured, and I want to get the deepest elements along with all the elements on the path leading to them.

For example :

{
"menu": {
    "id": "file",
    "popup": {
        "menuitem": {
                  "module"{
                      "-vdsr": "New",
                      "-sdst": "Open",
                      "-mpoi": "Close" }
        ...
    }
}

In this case the result would be :

menu.popup.menuitem.module.-vdsr
menu.popup.menuitem.module.-sdst
menu.popup.menuitem.module.-mpoi

I tried Jackson and Json4s; they are fine for getting to the leaf values, but I don't see how I can get the whole structure.

I want to use this to run an Apache Spark job on very large JSON files whose structure is very complex for each file. I also tried Spark SQL, but if I don't know the entire structure I can't get it.
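A hedged sketch of one way to get the structure, using the json4s AST (assuming the documents parse; array handling and the exact path format are choices to adapt): recurse through the tree and emit the dotted path of every leaf, which for the example above would include menu.popup.menuitem.module.-vdsr.

import org.json4s._
import org.json4s.jackson.JsonMethods.parse

def leafPaths(j: JValue, prefix: String = ""): List[String] = j match {
  case JObject(fields) =>
    fields.flatMap { case (name, value) =>
      leafPaths(value, if (prefix.isEmpty) name else s"$prefix.$name")
    }
  case JArray(items) =>
    items.zipWithIndex.flatMap { case (value, i) => leafPaths(value, s"$prefix[$i]") }
  case _ => List(prefix)  // leaf: string, number, boolean or null
}

// leafPaths(parse(jsonString)).foreach(println)

In a Spark job, each partition of raw JSON strings could be parsed and passed through leafPaths independently.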

Format a double without decimals & adding commas for thousands in Scala Java Play framework

I have doubles such as:

87654783927493.00

23648.00

I want to output them as:

87,654,783,927,493

23,648


I found the following solution:

@("%,.2f".format(myDoubleHere).replace(".00",""))

This was with the help of:

how to format a number/date in play 2.0 template?

What is the proper way to format a double in a Play 2 template


I'd like to replace this shameless solution with something cleaner. Chopping off those decimals using a .replace() method is really not pretty.
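A small sketch of a cleaner version: round to a whole number first and let %,d do the grouping (separators follow the default locale), instead of formatting decimals and stripping them.

def formatWhole(d: Double): String = "%,d".format(math.round(d))

formatWhole(87654783927493.00)  // 87,654,783,927,493
formatWhole(23648.00)           // 23,648

In a template that would read @("%,d".format(math.round(myDoubleHere))), mirroring the expression already used above.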

how to run scala jar file within java file?

I need to run a Scala jar file from Java code.

So if I had this Scala code:

 object test extends App{
   override def main(args: Array[String]) {
      println("Hello, world! " + args.toList)
    }
  }

and I exported it in demo.jar, I want to execute it from inside a Java application. The way I'm using it is through a runtime process, which is not working for me:

import java.io.InputStream;

import org.apache.commons.io.IOUtils;

public class testscalajar{

    public static void main(String[] args) {
        // TODO Auto-generated method stub
        try{
            Process proc = Runtime.getRuntime().exec(new String[]{"scala","-cp","d:\\demo.jar", "test"});
         // Then retreive the process output

         InputStream in = proc.getInputStream();
         InputStream err = proc.getErrorStream();
         String inString = IOUtils.toString(in, "UTF8"); 
         String errString = IOUtils.toString(err, "UTF8"); 
       System.out.println(inString);
       System.out.println(errString);
            }
            catch ( Exception e)
            {
                  System.out.println(e.getMessage());
            }

    }

}

Does anybody have any other solutions?

Are Futures executed on a single thread? (Scala)

Using the default implicit execution context in Scala, will each new future be computed on a single, dedicated thread or will the computation be divided up and distributed to multiple threads in the thread pool?

I don't know if this helps, but the background to this question is that I want to perform multiple concurrent operations using the HtmlUnit API. To do this, I would wrap each new WebClient instance in a Future. The only problem is that the WebClient class is not thread-safe, so I'm worried that the computation might be broken up and sent to different threads.

Understanding @specialized with Non-Primitives

Looking at @specialized's docs, I see:

scala> class MyList[@specialized T]
defined class MyList

My incomplete understanding is that MyList accepts a generic parameter, T, that must be a primitive.

scala> new MyList[Int] {}
res1: MyList[Int] = $anon$1@17884d

But, I then made a case class.

scala> case class Zig(x: String)
defined class Zig

However, given my above assumption, I did not expect to be able to new a MyList with a parameterized type of Zig.

scala> new MyList[Zig]
res2: MyList[Zig] = MyList@62de73eb

What am I missing?

Scala: Either, Right, Left

I am new to Scala and currently trying to work with the play framework.

This is working code I wrote:

def authenticate = Action (BodyParsers.parse.json) { req =>
    req.body.validate[AuthenticationForm].map {form =>
        UserRepository.findByCredentials(form).map { user =>
            user.apiKeys.find(_.deviceId == form.deviceId).map { apiKey =>
                Ok(Json.toJson(apiKey))
            }.getOrElse({
                // HOW DO I TRANSFORM THIS INTO MORE BEAUTIFUL CODE
                val createdApiKey = ApiKeyRepository.create(new ApiKey(form.deviceId, form.deviceId))
                val userToWithNewApiKey = user.copy(apiKeys = user.apiKeys.:+(createdApiKey))
                UserRepository.update(userToWithNewApiKey)

                Ok(Json.toJson(createdApiKey))
            })
        }.getOrElse {
            Unauthorized
        }
    }.getOrElse {
        BadRequest
    }
}

Well, that does not look too nice. Can we do better? I cannot, yet. But I saw this Stack Overflow post: http://ift.tt/1yOAFOg That looks pretty nice :)

Now I am wondering how to transform my code so it looks like the given example. Of course I already tried, but I could not make it compile, and I also do not know how to handle the code after the comment ("HOW DO I TRANSFORM THIS INTO MORE BEAUTIFUL CODE"). In my case I am using play.api.mvc.Result instead of "Failure" as given in the link above. So what type should my Either[play.api.mvc.Result, ?WhatHere?] be?

Best regards
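For what it's worth, a hedged sketch of the Either-based shape, reusing the names from the code above (on Scala 2.10/2.11 the .right projections are needed to use Either in a for-comprehension). The right type would be ApiKey, since that is what ends up in the Ok:

def authenticate = Action(BodyParsers.parse.json) { req =>
  val apiKeyOrError: Either[Result, ApiKey] = for {
    form <- req.body.validate[AuthenticationForm].asOpt.toRight(BadRequest).right
    user <- UserRepository.findByCredentials(form).toRight(Unauthorized).right
  } yield user.apiKeys.find(_.deviceId == form.deviceId).getOrElse {
    val createdApiKey = ApiKeyRepository.create(new ApiKey(form.deviceId, form.deviceId))
    UserRepository.update(user.copy(apiKeys = user.apiKeys :+ createdApiKey))
    createdApiKey
  }
  apiKeyOrError.fold(identity, apiKey => Ok(Json.toJson(apiKey)))
}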

What's the C# equivalent of Scala's Crypto.encryptAES?

I'm currently trying to create an interface with a customer and I need to encrypt some headers before sending them. The example they have provided is in Scala, but our system uses C#.

The method they use is Crypto.encryptAES(name, privateKey), what's the equivalent of this in C#?

scala challenge - what am I missing?

Working through the 99 Scala problems and confused by problem 23. To my eyes, the example is incongruent with the stated problem. Specifically, the Symbol 'e in the resulting list isn't among the input. Am I missing something?

Problem and example are as follows:

P23 (**) Extract a given number of randomly selected elements from a list.
    Example:

    scala> randomSelect(3, List('a, 'b, 'c, 'd, 'f, 'g, 'h))
    res0: List[Symbol] = List('e, 'd, 'a)

    Hint: Use the solution to problem P20

How to implement a nested java interface in scala

Say I have the following legacy Java code defined:

abstract class A {

  abstract I foo();

  public interface I
  {
    int bar();
  }
}

And I want to implement this in Scala with something like the following:

class MyA extends A {

  def foo() = new I {

    def bar = 3

  }
}

The Scala code will not compile, giving the error:

not found: type I

How can I refer to the Java interface I in my Scala code?
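A minimal sketch: from Scala, a nested Java interface is referenced through its enclosing class, so the type is A.I rather than a bare I.

class MyA extends A {
  def foo(): A.I = new A.I {
    def bar: Int = 3
  }
}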

Converting BigInteger to Long causes error

I am moving a Play framework app's database from PostgreSQL to MySQL.

And I get this error when launching the app:

Cannot convert 126: class java.math.BigInteger to Long for column ColumnName(Speaker.id,Some(id))

It seems to come from here:

def listAll: List[Speaker] = DB.withConnection {implicit c =>
SQL("SELECT * FROM Speaker;")
  .as(speakerParser *)}<---

Here is the code of the speakerParser:

private val speakerParser: RowParser[Speaker] = {
  get[Long]("id") ~
  get[String]("firstName") ~
  get[String]("lastName") ~
  get[String]("title") ~
  get[String]("team") ~
  get[String]("organisation") ~
  get[String]("email") map {
    case id ~ firstName ~ lastName ~ title ~ team ~ organisation ~ email => Speaker(id, firstName, lastName, title, team, organisation, email)
  }}

And the column which causes my trouble is id:

Table: Speaker
Columns:
id  bigint(20) UN AI PK
title   varchar(20)
firstName   varchar(255)
lastName    varchar(255)
email   varchar(255)
team    varchar(255)
organisation    varchar(255)

This code is not mine, but I need to modify it.

I am new to PostgreSQL, Scala and Play, so I may be forgetting something really simple.

Preventing the creation of anonymous classes from traits, without an enclosing object

Suppose I have the following trait and object:

trait NoAnon {
    val i: Int
}

object NoAnon {
    def create = new NoAnon {
        val i = 123
    }
}

I would like to prevent anonymous instances of NoAnon from being created outside of the companion object, i.e. only create in this example should be allowed to do this. I can enclose these within another object and make the trait private to that object to accomplish this:

object Ctx {

    private[Ctx] trait NoAnon {
        val i: Int
    }

    object NoAnon {
        def create = new Ctx.NoAnon {
            val i = 123
        }
    }

}

Is it possible to do this without the enclosing object Ctx?

Map tuples to tuples using Iterator

Why does the following code not work, and how can I overcome it using Iterator?

def f(str : String) : (String, String) = {
  str.splitAt(1)
}
var with_id : Iterator[(String, Int)] = List(("test", 1), ("list", 2), ("nothing", 3), ("else", 4)).iterator

println(with_id.mkString(" "))

val result = with_id map { (s : String, i : Int) => (f(s), i) }

println(result.mkString(" "))

Expected output is:

(("t", "est"), 1) (("l", "ist"), 2) ...

Error:

Error:(28, 54) type mismatch;
found   : (String, Int) => ((String, String), Int)
required: ((String, Int)) => ?
val result = with_id map { (s : String, i : Int) => (f(s), i) }
                                                 ^
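A small sketch of the usual fix: map on an Iterator of pairs expects a function of the pair, so destructure it with a pattern-matching anonymous function rather than a two-argument function.

val result = with_id.map { case (s, i) => (f(s), i) }
println(result.mkString(" "))  // ((t,est),1) ((l,ist),2) ...

Note also that an Iterator is single-use: the earlier with_id.mkString call already exhausts it, so drop that line or rebuild the iterator before mapping.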

Play Scala Anorm dynamic SQL for UPDATE query

My Google-fu is letting me down, so I'm hoping you can help.

I'm building some web services in the Play framework using Scala and Anorm for database access.

One of my calls updates an existing row in a database, i.e. runs a query like:

UPDATE [Clerks]
   SET [firstName] = {firstName}
  ,[lastName] = {lastName}
  ,[login] = {login}
  ,[password] = {password}
 WHERE [id] = {id}

My method receives a clerk object, BUT all the parameters are optional (except the id, of course), as the caller may only wish to update a single column of the row, like so:

UPDATE [Clerks]
   SET [firstName] = {firstName}
 WHERE [id] = {id}

So I want the method to check which clerk params are defined and build the 'SET' part of the update statement accordingly.

It seems like there should be a better way than to go through each param of the clerk object, check if it is defined and build the query string, but I've been unable to find anything on the topic so far.

Does anyone have any suggestions on how this is best handled?

how to pass a Scala function to another function as a function?

I'm just not understanding how to build a function from previous functions.

For example, the math.min() function takes the minimum of two numbers. What if I wanted to create a function min3Z(a: Int, b: Int, c: Int)? How would I build this from min?
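A minimal sketch of both readings of the question: building min3Z directly on top of math.min, and a version where the two-argument function is passed in as a parameter.

def min3Z(a: Int, b: Int, c: Int): Int = math.min(math.min(a, b), c)

def combine3(f: (Int, Int) => Int)(a: Int, b: Int, c: Int): Int = f(f(a, b), c)
val min3 = combine3(math.min) _   // pass math.min itself as the function value

min3Z(3, 1, 2)  // 1
min3(3, 1, 2)   // 1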

What does treating a method as a function mean in Scala?

By assigning a variable (or value?) a method name followed by a space and an underscore, you tell Scala to treat the method as a function, which apparently means doing more than simply taking the value generated by a call to the method and assigning it to the variable. What else is, or can be, going on through such an assignment?

Conditional drop of tables in slick

Any idea how to do a conditional drop in Slick 3.0, to prevent "An exception or error caused a run to abort: Unknown table 'MY_TABLE'" if for some reason the table doesn't exist?

def clear = {
    val operations = DBIO.seq(
      myTable.schema.drop,
      // other table definitions
      ...
    )
    db.run(operations)
  }

mina "Discarding output packet because channel is being closed" missing command output

We use the Mina SSHD client to connect to a Linux-based server. It looks like when we receive SSH_MSG_CHANNEL_DATA along with SSH_MSG_CHANNEL_EOF, Mina will not put the data into the output buffer and will log: "ChannelExec - Discarding output packet because channel is being closed".

Here is my sample code and the Mina debug log.

Is this a bug, or am I using Mina async wrong?

val channel: ChannelExec = clientSession.createExecChannel(cmd)
channel.setupSensibleDefaultPty()
channel.setUsePty(true)
channel.setStreaming(ClientChannel.Streaming.Async)

val bais: ByteArrayInputStream = new ByteArrayInputStream(Array[Byte]())
channel.setIn(bais)

val openFuture = channel.open()

val baosOut: ByteArrayOutputStream = new ByteArrayOutputStream()

openFuture.addListener(new SshFutureListener[OpenFuture] {
    override def operationComplete(future: OpenFuture): Unit = {
        channel.getAsyncOut.read(new Buffer).addListener(new SshFutureListener[IoReadFuture] {
            def operationComplete(future: IoReadFuture) {
                readBuffer(channel, future, channel.getAsyncOut, baosOut, this)
            }
        })
    }
})
openFuture.await(connectionConfig.maxConnTime)

DEBUG Nio2Session - Read 244 bytes    
DEBUG ChannelExec - Received SSH_MSG_CHANNEL_WINDOW_ADJUST on channel ChannelExec[id=153, recipient=0]    
DEBUG Window - Increase client remote window by 2097152 up to 2097152    
DEBUG ChannelExec - Received SSH_MSG_CHANNEL_DATA on channel ChannelExec[id=153, recipient=0]    
DEBUG ChannelExec - Received SSH_MSG_CHANNEL_EOF on channel ChannelExec[id=153, recipient=0]    
DEBUG ChannelExec - Received SSH_MSG_CHANNEL_REQUEST exit-status on channel ChannelExec[id=153, recipient=0] (wantReply false)    
DEBUG ChannelExec - Received SSH_MSG_CHANNEL_CLOSE on channel ChannelExec[id=153, recipient=0]    
DEBUG ChannelExec - Closing ChannelExec[id=153, recipient=0] gracefully    
DEBUG ChannelSession$1 - Closing ChannelAsyncOutputStream[ChannelExec[id=153, recipient=0]] gracefully    
DEBUG ChannelExec - Send SSH_MSG_CHANNEL_EOF on channel ChannelExec[id=153, recipient=0]    
DEBUG ChannelExec - Discarding output packet because channel is being closed    
DEBUG ChannelSession$1 - ChannelAsyncOutputStream[ChannelExec[id=153, recipient=0]] closed    
DEBUG ChannelAsyncInputStream - Closing ChannelAsyncInputStream[ChannelExec[id=153, recipient=0]] gracefully    
DEBUG ChannelAsyncInputStream - ChannelAsyncInputStream[ChannelExec[id=153, recipient=0]] closed    
DEBUG ChannelAsyncInputStream - Closing ChannelAsyncInputStream[ChannelExec[id=153, recipient=0]] gracefully    
DEBUG ChannelAsyncInputStream - ChannelAsyncInputStream[ChannelExec[id=153, recipient=0]] closed    
DEBUG ChannelExec - Send SSH_MSG_CHANNEL_CLOSE on channel ChannelExec[id=153, recipient=0]    
DEBUG Nio2Session - Writing 52 bytes    
DEBUG Nio2Session - Finished writing    
DEBUG ChannelExec - Message SSH_MSG_CHANNEL_CLOSE written on channel ChannelExec[id=153, recipient=0]    
DEBUG ChannelSession$1 - ChannelAsyncOutputStream[ChannelExec[id=153, recipient=0]] is already closed    
DEBUG ChannelAsyncInputStream - ChannelAsyncInputStream[ChannelExec[id=153, recipient=0]] is already closed    
DEBUG ChannelAsyncInputStream - ChannelAsyncInputStream[ChannelExec[id=153, recipient=0]] is already closed    
DEBUG ChannelExec - ChannelExec[id=153, recipient=0] closed

How combine data from several responses in Scala Play2?

I need to make requests to different URLs, get data from their responses and put this info in one list, but I have some misunderstanding about how to do it. 1) For one request I do:

def doRequest: Future[WSResponse] = {
client
  .url("MY_URL")
  .withRequestTimeout(5000)
  .get()}

Then I parse the JSON in the response into a List of my objects:

def list: Future[List[FoobarEntry]] = {
doRequest.map {
  response => {
    val json = response.json \ "foobar"
      json.validate[List[FoobarEntry]] match {
      case js:JsSuccess[List[FoobarEntry]]=>
        js.get
      case e:JsError => Logger.error(JsError.toFlatJson(e).toString()); List()
    }
  }
}}

I think that for several URLs I should write something like:

def doRequests: List[Future[WSResponse]] = {
List(client
     .url("URL_1")
     .withRequestTimeout(5000)
     .get(),
     client
     .url("URL_2")
     .withRequestTimeout(5000)
     .get())}

But how do I parse this List[Future[WSResponse]] the way my def list: Future[List[FoobarEntry]] does?
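
A minimal sketch of the usual pattern: flip the List[Future[WSResponse]] into a Future[List[WSResponse]] with Future.sequence, then reuse the same per-response parsing. The parseFoobar helper name is an assumption; it just extracts the existing parsing logic.

import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

def parseFoobar(response: WSResponse): List[FoobarEntry] =
  (response.json \ "foobar").validate[List[FoobarEntry]] match {
    case js: JsSuccess[List[FoobarEntry]] => js.get
    case e: JsError => Logger.error(JsError.toFlatJson(e).toString()); List()
  }

def lists: Future[List[FoobarEntry]] =
  Future.sequence(doRequests).map(responses => responses.flatMap(parseFoobar))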

Run flyway-test-extensions on scalatest

Is there a simple way to use flyway-test-extensions with ScalaTest?

I need to clean and fill the DB with test data before all DB tests and before some individual tests.
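
An untested sketch of one way to get the same effect without the flyway-test-extensions JUnit annotations: call the plain Flyway API from ScalaTest hooks (the connection settings below are hypothetical).

import org.flywaydb.core.Flyway
import org.scalatest.{BeforeAndAfterAll, FlatSpec}

class DbSpec extends FlatSpec with BeforeAndAfterAll {
  private val flyway = new Flyway()
  flyway.setDataSource("jdbc:h2:mem:test", "sa", "")   // hypothetical test database

  override def beforeAll(): Unit = {
    // wipe the schema and re-apply all migrations (including test-data migrations)
    flyway.clean()
    flyway.migrate()
  }

  "the database" should "contain the seeded test data" in {
    // query the freshly migrated schema here
  }
}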

Pair sum algorithm - performance with binary search

I am implementing an algorithm with Scala for finding values x1 and x2 from separate arrays l1 and l2 such that x1 + x2 = t, where t is some target value.

Algorithm 1 iterates through l1 and l2 one by one and checks if x1 + x2 = t; it runs in O(n^2). Algorithm 2 sorts l2, then performs a binary search on it for each item in l1. It should run in O(n log n) but does not. Why is it running slower than algorithm 1?

Note that this is a course assignment; I'm looking for clues only.

Algorithm 1:

def hasPairSlow(l1: List[Int], l2: List[Int], target: Int): Option[Pair[Int,         Int]] = {
  l1 foreach { i => 
    l2 foreach { j => if (i+j == target) return Some(i -> j) } 
  }
  None
}

Algorithm 2:

def hasPair(l1: List[Int], l2: List[Int], target: Int): Option[Pair[Int, Int]]   = {
  val s2 = l2.sorted
  l1 foreach { i =>
    val p = checkPair(i, s2, target)
    if (p.isDefined) return Some(i, p.get)
  }
  None
}

private def checkPair(x: Int, l: List[Int], target: Int): Option[Int] = {
  val mid = l.length / 2
  if (mid == 0) { // Length == 1
    if (l.head + x == target) Some(l.head) else None
  } else {
    val candidate = l(mid)
    if (candidate + x == target) Some(candidate)
    else {
      val s = l.splitAt(mid)
      if (candidate + x > target) {
        checkPair(x, s._1, target)
      }
      else /* candidate + x < target */ {
        checkPair(x, s._2, target)
      }
    }
  }
}
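
A sketch of the kind of thing worth measuring here, not a full solution: indexing with l(mid) and splitAt are linear-time on a linked List, so each "binary search" step does O(n) work; the same midpoint search over an indexed structure keeps every lookup at O(1).

// sketch: the same midpoint search over a sorted Array, where indexing is O(1)
private def checkPairIndexed(x: Int, sorted: Array[Int], target: Int): Option[Int] = {
  var lo = 0
  var hi = sorted.length - 1
  while (lo <= hi) {
    val mid = lo + (hi - lo) / 2
    val candidate = sorted(mid)
    if (candidate + x == target) return Some(candidate)
    else if (candidate + x > target) hi = mid - 1
    else lo = mid + 1
  }
  None
}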

implementation of multithreading and synchronization in scala [on hold]

How do I create threads in Scala as in Java? I used the following; is there a way to do it without the Java syntax, as shown in Scala School?

    val pool = Executors.newFixedThreadPool(3)
    var t = new Test
    // Test is a Scala class that extends Runnable and defines a run method
    for (x <- 1 to 3)
      pool.execute(t)

    pool.shutdown()

How do I apply synchronization in Scala? I used the following in the above code; is it correct?

     val t=new Test

How do I create a synchronized block or method? There is no synchronized keyword in Scala. Without using Java's syntax, is it possible to synchronize blocks and methods? And is using a val for the object the correct way to synchronize on it?
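
A minimal sketch: in Scala, synchronized is not a keyword but a method inherited from AnyRef, so any object can serve as a lock, much like a Java synchronized block.

class Counter {
  private var count = 0
  private val lock = new Object   // a dedicated lock object

  def increment(): Unit = lock.synchronized {
    count += 1
  }

  // synchronizing on `this`, the closest equivalent of a Java synchronized method
  def current: Int = this.synchronized { count }
}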

Flattening nested java lists in Scala

I am working in Scala with java libraries. One of these libraries returns a list of lists. I want to flatten the list.

Example:

import scala.collection.JavaConverters._
var parentList : util.List[util.List[Int]] = null
parentList = new util.ArrayList[util.List[Int]]

parentList.asScala.flatten // error

I have used the asScala converter but I still get an error.
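
A minimal sketch of the likely fix: the outer asScala gives a Buffer whose elements are still java.util.List values, so convert each inner list before flattening.

import scala.collection.JavaConverters._

val flattened: Seq[Int] = parentList.asScala.flatMap(_.asScala)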

Resources not found by sbt and eclipse in Scala?

Having a Scala project built with sbt and opened in Eclipse, I want to access my files in ./src/main/resources automatically. For that I add the line EclipseKeys.createSrc := EclipseCreateSrc.Default + EclipseCreateSrc.Resource to my build.sbt in the root directory and do clean, update, reload, eclipse. The resource files still cannot be found by either sbt or Eclipse.

Why is RichException (getStrackTraceString) deprecated in Scala 2.11?

RichException, a "library enhancement" for Throwable which provides the method getStackTraceString: String, is deprecated as of Scala 2.11. The deprecation flag includes only the message Use Throwable#getStackTrace.

Is this supposed to imply that one should migrate away from the deprecated method by simply inlining the implementation, e.g. by replacing

_.getStackTraceString

with

_.getStackTrace().mkString("", EOL, EOL)

or is the deprecation intended to indicate that this approach is, for some reason, a bad idea?

Type mismatch with identical types in Spark-shell

I have built a scripting workflow around the spark-shell, but I'm often vexed by bizarre type mismatches (probably inherited from the Scala REPL) occurring with identical found and required types. The following example illustrates the problem. Executed in paste mode, there is no problem:

scala> :paste
// Entering paste mode (ctrl-D to finish)


import org.apache.spark.rdd.RDD
case class C(S:String)
def f(r:RDD[C]): String = "hello"
val in = sc.parallelize(List(C("hi")))
f(in)

// Exiting paste mode, now interpreting.

import org.apache.spark.rdd.RDD
defined class C
f: (r: org.apache.spark.rdd.RDD[C])String
in: org.apache.spark.rdd.RDD[C] = ParallelCollectionRDD[0] at parallelize at <console>:13
res0: String = hello

but

scala> f(in)
<console>:29: error: type mismatch;
 found   : org.apache.spark.rdd.RDD[C]
 required: org.apache.spark.rdd.RDD[C]
              f(in)
                ^ 

There are related discussions about the Scala REPL and about the spark-shell, but the mentioned issue seems unrelated (and resolved) to me.

This causes serious problems for writing code to be executed interactively in the REPL, and loses most of the advantage of working in a REPL to begin with. Is there a solution? (And/or is this a known issue?)

How to fill the contents of a combo parameter with a Scala script

I know how to get dynamic behavior when filling parameters before running a Jenkins project.

I do this by writing Groovy scripts, but I am much more comfortable writing Scala scripts.

Is there any Scala-based solution for this?

I suppose the solution is to create a plugin.

Subscribing multiple actors to Dead Letters in Akka

I am trying to create a simple application that has two actors:

  • Master actor that handles
  • DeadLettersListener that is supposed to handle all dead or unhandled messages

Here is the code that works perfectly:

object Hw extends App {
  // creating Master actor
  val masterActorSystem = ActorSystem("Master")
  val master = masterActorSystem.actorOf(Props[Master], "Master")

  // creating Dead Letters listener actor
  val deadLettersActorSystem = ActorSystem.create("DeadLettersListener")
  val listener = deadLettersActorSystem.actorOf(Props[DeadLettersListener])

  // subscribe listener to Master's DeadLetters
  masterActorSystem.eventStream.subscribe(listener, classOf[DeadLetter])
  masterActorSystem.eventStream.subscribe(listener, classOf[UnhandledMessage])
}

According to the Akka manual, though, ActorSystem is a heavyweight object and we should create only one per application. But when I replace these lines:

val deadLettersActorSystem = ActorSystem.create("DeadLettersListener")
val listener = deadLettersActorSystem.actorOf(Props[DeadLettersListener])

with this code:

val listener = masterActorSystem.actorOf(Props[DeadLettersListener], "DeadLettersListener")

The subscription does not work any more and DeadLettersListener is not getting any Dead or Unhandled messages.

Can you please explain what I am doing wrong and advise how to subscribe to dead letters in this case?

Scala/Spark efficient partial string match

I am writing a small program in Spark using Scala, and came across a problem. I have a List/RDD of single-word strings and a List/RDD of sentences which might or might not contain words from the list of single words. For example:

val singles = Array("this", "is")
val sentence = Array("this Date", "is there something", "where are something", "this is a string")

and I want to select the sentences that contains one or more of the words from singles such that the result should be something like:

output[(this, Array(this Date, this is a String)),(is, Array(is there something, this is a string))]

I thought about two approaches: one is to split the sentences and filter using .contains; the other is to split and format the sentences into an RDD and use .join for the RDD intersection. I am looking at around 50 single words and 5 million sentences; which method would be faster? Are there any other solutions? Could you also help me with the coding? I seem to get no results with my code (although it compiles and runs without error).
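
A minimal sketch of the split-and-filter idea, assuming singles is small enough to broadcast and the sentences have been parallelized into an RDD[String]; variable names here are illustrative.

val singlesBc = sc.broadcast(singles.toSet)
val sentencesRdd = sc.parallelize(sentence)

val bySingle = sentencesRdd
  .flatMap { s =>
    val words = s.split(" ").toSet
    // emit one (word, sentence) pair per matched single word
    singlesBc.value.filter(words.contains).map(w => (w, s))
  }
  .groupByKey()   // (word, all sentences containing it)

bySingle.collect().foreach(println)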

saveTocassandra could not find implicit value for parameter rwf

I'm trying to save a dataset to a Cassandra database using Spark Scala, but I am getting an exception while running the code. Link used: [http://ift.tt/1yLe6d7]

Error:

could not find implicit value for parameter rwf: com.datastax.spark.connector.writer.RowWriterFctory[FoodToUserIndex] food_index.saveToCassandra("tutorial", "food_to_user_index") ^
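
A sketch of the usual cause (an assumption, since the full code is not shown): saveToCassandra and the implicit RowWriterFactory for case classes both come from the connector's package import, so it needs to be in scope where the case class and the call live.

import com.datastax.spark.connector._

case class FoodToUserIndex(food: String, user: String)   // hypothetical column mapping

val food_index = sc.parallelize(Seq(FoodToUserIndex("pizza", "alice")))
food_index.saveToCassandra("tutorial", "food_to_user_index")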

How to generate an asynchronous reset in verilog always blocks with chisel

Chisel generates always blocks with only the clock in the sensitivity list:

always @posedge(clk) begin
  [...]
end

Is it possible to configure a Module to use an asynchronous reset and generate an always block like this?

always @(posedge clk or posedge reset) begin
   [...]
end

How to properly validate JSON in Scala?

import scala.io.Source
import play.api.libs.json._
import play.api.libs.functional.syntax._
import scala.util.{ Try, Success, Failure }

case class Project(name: String, description: String)

def getScalaProjects: Seq[JsValue] = {

  val url = "http://ift.tt/1Ooj8Ow"
  val gitScalaRepos = Try(Source.fromURL(url).getLines) recover {
    case e: FileNotFoundException =>
      throw new AppException(s"Requested page does not exist: ${e.getMessage}.")
    case e: MalformedURLException =>
      throw new AppException(s"Please make sure to enter a valid URL: ${e.getMessage}.")
    case _ => throw new AppException("An unexpected error has occurred.")
  }

  val gitJSON = Try(Json.parse(gitScalaRepos.get mkString "\n")) match {
    case Success(json) => json
    case Failure(f) => throw new AppException("Could not parse JSON.")
  }

  implicit val projectReads: Reads[Project] = (
    (JsPath \ "name").read[String] and
    (JsPath \ "description").read[String]
  )(Project.apply _)

  for {
    i <- 0 until 5
    p = (gitJSON \ "items")(i).validate[Project]
    p match {
      case s: JsSuccess[Project] => s.get
      case e: JsError => throw new AppException("Could not parse JSON: " +
          JsError.toFlatJson(e).toString())
    }
  } yield p

}

This does not compile (particularly the p match in the for-expression). Please help me to correctly validate the JSON that has been downloaded from the above URL, and to handle the case when, say, we change items to items2, so that it returns a JsError that I need to handle.
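
A minimal sketch of one way to make the loop compile, reusing the Reads and AppException from the code above: a for-comprehension only allows generators, definitions and guards before yield, so move the match into the yielded expression instead of leaving it as a bare statement.

val projects: Seq[Project] =
  for {
    i <- 0 until 5
    p = (gitJSON \ "items")(i).validate[Project]
  } yield p match {
    case s: JsSuccess[Project] => s.get
    case e: JsError =>
      throw new AppException("Could not parse JSON: " + JsError.toFlatJson(e).toString())
  }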

Scala Reflection : why getMethods can return the val members?

In Scala, I have an Abstract Class:

abstract class AbstractSQLParser {

  def apply(input: String): Int = {

    println(reserveWords)
    return 1
  }

  protected case class Keyword(str: String)
  // use Reflection here
  protected lazy val reserveWords: Int =
      this.getClass
          .getMethods
          .filter(_.getReturnType == classOf[Keyword])
          .length
}

and another extends it :

class SqlParserDemo extends AbstractSQLParser{

  def apply(input: String, onError: Boolean) : Int = {

    apply(input)
  }

  protected val CREATE = Keyword("CREATE")
  protected val TEMPORARY = Keyword("TEMPORARY")
  protected val TABLE = Keyword("TABLE")
  protected val IF = Keyword("IF")
  protected val NOT = Keyword("NOT")
  protected val EXISTS = Keyword("EXISTS")

  def func1 = Keyword("HI")
}

but when I call

val sqlParser = new SqlParserDemo
print(sqlParser.apply("hello", false))

the output is :

7

1

What confuses me is: why can getMethods return the val members?

You see, in the subclass I have six vals and one function.

The Oracle doc says:

public Method[] getMethods() throws SecurityException

Returns an array containing Method objects reflecting all the public member methods of the class or interface represented by this Class object, including those declared by the class or interface and those inherited from superclasses and superinterfaces.

Scala: match case syntax vs => S?

 def phrase[T](p: Parser[T]) = new Parser[T] {
    def apply(in: Input) = lastNoSuccessVar.withValue(None) {
      p(in) match {
      case s @ Success(out, in1) =>
        if (in1.atEnd)
          s
        else
            lastNoSuccessVar.value filterNot { _.next.pos < in1.pos } getOrElse Failure("end of input expected", in1)
        case ns => lastNoSuccessVar.value.getOrElse(ns)
      }
    }
  }

The function above is found in Scala's source code.

What confuses me is: withValue's declaration is

def withValue[S](newval: T)(thunk: => S): S

then,

  • what is the meaning of => S? (see the sketch after this list)

  • and what is its relationship with the match/case syntax?
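
A minimal sketch of the first point: thunk: => S is a by-name parameter, an unevaluated block of type S that the callee may run zero or more times. It is unrelated to match/case as such; the match expression in phrase just happens to be the block passed as that argument.

def twice[S](thunk: => S): S = { thunk; thunk }   // evaluates the block two times

twice { println("hi") }   // prints "hi" twice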

Running tests in Intellij for Play Framework is very slow

Is there a way to speed up the execution of unit tests for the Play Framework in IntelliJ? I am doing TDD. Whenever I execute a test, it takes about 30-60 seconds to compile and run. Even a simple Hello World test takes time. Rerunning the same test even without any change still starts the make process.

I am on IntelliJ 14.1 and Play 2.3.8, written in Scala.

I already tried setting the Java compiler to Eclipse, and also tried setting the Scala compiler to SBT.

SqlContext is not a member of package org.apache.spark.sql

this is my build.sbt file:

name := "words"

version := "1.0"

scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.3.0",
  "org.apache.spark" %% "spark-sql"  % "1.3.0"
)

sbt.version=0.13.8-RC1

When I compile the program, I get the following error:

    [error] D:\projects\bd\words\src\main\scala\test.scala:8: 
            type SqlContext is not a member of package org.apache.spark.sql
    [error]     val sqlContext = new org.apache.spark.sql.SqlContext(sc)
    [error]                                               ^
    [error] one error found
    [error] (compile:compileIncremental) Compilation failed
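
A sketch of the likely fix: the Spark 1.3 class is spelled SQLContext, not SqlContext.

import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)   // assumes an existing SparkContext `sc`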

Using underlying POI object in Spoiwo

I found the Spoiwo library for Excel generation in Scala. However, for my spreadsheet I need to use data validations as well.

It seems that Spoiwo does not support that feature yet. Can I access the underlying POI object somehow to use the sheet.addValidationData method directly?

Scala functional algorithm perfomance

I've been solving this year's Code Jam task "Dijkstra". Long story short: you have to find the right 3 subsets of chars out of an X-element set of chars.

I implemented my algorithm in Scala, but it didn't meet the requirements: it ran for more than 10 minutes and therefore made me lose the assignment.

class Dijkstra extends CodeJamProblem{    
  override def run(input: List[String]): String = {

    val params :: letters :: Nil = input
    val l :: x :: Nil = params.split(" ").toList.map{_.toInt}
    val inputString = (1 to x).map{_ => letters}.reduce(_ + _).toCharArray.toList

    val rests = for(
      iRests <- check(inputString, 'i);
      jRests <- check(iRests, 'j);
      kRests <- check(jRests, 'k)
      if kRests.length == 0)
      yield true

    if(rests.length > 0) "YES" else "NO"
    "NO"
  }
  @tailrec
  final def check(input: List[Char], equalTo: Symbol, lq: Quaternion = Quaternion('l, false), r: List[List[Char]] = List()): List[List[Char]] = {
    if(input.isEmpty) r
    else {
      val rest = input.tail
      val newq = lq * Quaternion(Symbol(input.head.toString), false)
      if (newq.symbol == equalTo && !newq.negative){
        check(rest, equalTo, newq, rest :: r)
      } else {
        check(rest, equalTo, newq, r)
      }
    }
  }
  override val linesPerInput: Int = 2
}

case class Quaternion(symbol: Symbol, negative: Boolean){
  def *(that: Quaternion) = {
    val negative = this.negative ^ that.negative
    val (symbol, negate) = (this.symbol, that.symbol) match {
      case ('l, a) => (a, false)
      case (a, 'l) => (a, false)
      case (a, b) if a == b => ('l, true)

      case ('i,'j) => ('k, false)
      case ('j,'k) => ('i, false)
      case ('k,'i) => ('j, false)

      case ('j,'i) => ('k, true)
      case ('k,'j) => ('i, true)
      case ('i,'k) => ('j, true)

    }
    Quaternion(symbol, negate ^ negative)
  }
}

I thought it was just the algorithm itself. However, I then implemented the same algorithm in Erlang and it ran in less than a second.

main(String) ->
  AtomList = [{list_to_atom([X]), false} || X <- String],
  Result = [ true ||    RestI <- find_rests(AtomList, {i, false}),
                        RestJ <- find_rests(RestI, {j, false}),
                        RestK <- find_rests(RestJ, {k, false}),
                        RestK == []
              ],
  case Result of
    [true | _] -> "YES";
    _ -> "NO"
  end.

find_rests(I, S) -> find_rests(I, S, {1, false}, []).
find_rests([], _, _, Rests) -> Rests;
find_rests(Input, Symbol, LastQ, Rests) ->
  [H | Rest] = Input,
  case multiply(LastQ, H) of
    Symbol   -> find_rests(Rest, Symbol, Symbol, [Rest | Rests]);
    Q        -> find_rests(Rest, Symbol, Q, Rests)
  end.



multiply({Sym,Sign}, {Sym2, Sign2}) ->
  {S, N} = mul(Sym, Sym2),
  {S, Sign xor Sign2 xor N}.

mul(S, 1) ->
  {S, false};
mul(1, S) ->
  {S, false};
mul(S, S) ->
  {1, true};
mul(i,j) ->
  {k, false};
mul(j,k) ->
  {i, false};
mul(k,i) ->
  {j, false};
mul(j,i) ->
  {k, true};
mul(k,j) ->
  {i, true};
mul(i,k) ->
  {j, true}.

So here is my question: what makes Scala run so much slower? My guess is that it's some object copying that I can't see. But if that's the case, why does Scala copy immutable structures?

Please help

For people trying to understand the problem better, here's the assignment: http://ift.tt/1J2Twoy

lundi 20 avril 2015

Converting from Java List to Scala mutable Buffer

I'm having some issues converting from Scala to Java, and then back to Scala. I'm trying to convert from a Scala mutable buffer to a Java List, and then back to a Scala mutable buffer after I apply Java's nifty shuffling function.

I tried using the shuffle function from Scala's Random library (even when the buffer is converted to a Scala list), but it's not working for me, as the buffer holds elements of type "Card", an object type that I have set up for the project I am working on. The code in question looks like this:

def shuffleDeck() {
  val list: java.util.List[Card] = cards
  val newList = java.util.Collections.shuffle(list)
  asScalaBuffer(newList)
}

In the Scala IDE I'm using, the error given to me is:

type mismatch; found : Unit required: java.util.List[?]

I'm not really sure what to do. Any and all help would be greatly appreciated!
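
A minimal sketch, assuming cards is a scala.collection.mutable.Buffer[Card]: Collections.shuffle returns Unit and shuffles the Java list in place, so don't assign its result; convert, shuffle, then convert back.

import scala.collection.JavaConverters._

def shuffleDeck(): scala.collection.mutable.Buffer[Card] = {
  val list: java.util.List[Card] = cards.asJava   // a view over the same buffer
  java.util.Collections.shuffle(list)             // mutates in place, returns Unit
  list.asScala
}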

Scala Saddle Filtering Column Values

I am new to Scala Saddle. I have three columns (customer name, age and status) in a frame, and I have to apply a filter on the age column: if a customer's age is more than 18 I need to set the status to "eligible", otherwise I need to put "noteligible". Please help me to solve this issue.

Code:

f.col("age").filterAt(x => x > 18)  //but how to update Status column

Scala/Spark ClassCastException when accesing list element

I am new to Spark/Scala and am trying to work with (specifically, average a subset of values from) data returned by a method:

def getRDDForTask(sc: SparkContext, taskName: String, attributeName: String, numberOfPoints: Int): RDD[List[BigDecimal]].

  val Stats=getRDDForTask(sc, "example_name", "memory_usage", 200) 
  val initList=Stats.toArray.head
  val initValue=initList.head

Calling Stats.toArray.head returns: res100: List[BigDecimal] = List(1429505205000, 68400001, 210800640)

However, when I call initList.head to access the first element of the List, I get an error. This message occurs when I try to run any operation (e.g. foreach or map) on values of the list. I've tried creating a List[BigDecimal] from scratch and can run all standard operations on it. Am I missing something obvious? The specific error is:

java.lang.ClassCastException: scala.math.BigInt cannot be cast to scala.math.BigDecimal
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:42)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:47)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:49)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:51)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:53)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:55)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:57)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:59)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:61)
    at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:63)
    at $iwC$$iwC$$iwC$$iwC.<init>(<console>:65)
    at $iwC$$iwC$$iwC.<init>(<console>:67)
    at $iwC$$iwC.<init>(<console>:69)
    at $iwC.<init>(<console>:71)
    at <init>(<console>:73)
    at .<init>(<console>:77)
    at .<clinit>(<console>)
    at .<init>(<console>:7)
    at .<clinit>(<console>)
    at $print(<console>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:622)
    at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
    at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1338)
    at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
    at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:856)
    at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:901)
    at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:813)
    at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:656)
    at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:664)
    at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:669)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:996)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:944)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:944)
    at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
    at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:944)
    at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1058)
    at org.apache.spark.repl.Main$.main(Main.scala:31)
    at org.apache.spark.repl.Main.main(Main.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:622)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Task not serializable when using object

So, another SO question prompted me to try the following:

object Foo{
  def f = 1
}

sc.parallelize(List(1)).map(x => {
  val myF = Foo.f _
  x + myF()
})

Which works, but the following does not:

object Foo{
  def f = 1
  def run(rdd: org.apache.spark.rdd.RDD[Int]) = rdd.map(x => {
    val myF = Foo.f _
    x + myF()
  })
}

Foo.run(sc.parallelize(List(1)))

I will take a deeper look at the serialization stack tomorrow when I can clean up the mess of the REPL output. But this should be the same thing. Why does one way complain and the other does not?

Formatting the join rdd - Apache Spark

I have two key-value pair RDDs. I join the two RDDs and save the result as a text file; here is the code:

val enKeyValuePair1 = rows_filter6.map(line => (line(8) -> (line(0),line(4),line(10),line(5),line(6),line(14),line(1),line(9),line(12),line(13),line(3),line(15),line(7),line(16),line(2),line(14))))

val enKeyValuePair = DATA.map(line => (line(0) -> (line(2),line(3))))

val final_res = enKeyValuePair1.leftOuterJoin(enKeyValuePair)

val output = final_res.saveAsTextFile("C:/out")

My output is as follows: (534309,((17999,5161,45005,00000,XYZ,,29.95,0.00),None))

How can I get rid of all the parentheses? I want my output as follows: 534309,17999,5161,45005,00000,XYZ,,29.95,0.00,None
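
A minimal sketch, assuming the join output shape produced above: build the comma-separated line yourself instead of relying on the default Tuple.toString, flattening the nested tuples and the Option explicitly (the output path is hypothetical).

val flattened = final_res.map { case (key, (left, right)) =>
  val rightPart = right.map(_.productIterator.mkString(",")).getOrElse("None")
  (key +: left.productIterator.toSeq).mkString(",") + "," + rightPart
}
flattened.saveAsTextFile("C:/out_flat")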

Scalike JDBC insert Geometry

In psql I can insert a type of GEOMETRY with:

insert into table (geo) 
values (ST_SetSRID(ST_MakePoint(-28.095519, 153.459987), 4326));

Given I have a type Point(lat: Double, long: Double), can I do an insert like this using scalikejdbc?
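
An untested sketch using ScalikeJDBC's sql"" interpolation: the interpolated doubles become bind parameters, while the PostGIS functions stay literal SQL. The table name and argument order simply mirror the psql example above.

import scalikejdbc._

case class Point(lat: Double, long: Double)

def insertPoint(p: Point)(implicit session: DBSession): Int =
  sql"""insert into table (geo)
        values (ST_SetSRID(ST_MakePoint(${p.lat}, ${p.long}), 4326))"""
    .update.apply()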

scala method name as variable name

I want to define case class demo(notify: String), but the IDE (IntelliJ IDEA) complains that "notify cannot override final member".

I know that notify is a member method of AnyRef, and the IDE may be treating the field name as that method name. I need a notify field; how can I do that?

any advantage to declaring a method arg with a parameterized type vs declaring the arg as 'Any' in scala?

I recently ran across this example, which prints [7]

class Decorator(left: String, right: String) {
  def layout[A](x: A) = left + x.toString() + right
}


def apply(f: Int => String, v: Int) = f(v)
val decorator = new Decorator("[", "]")
println(apply(decorator.layout, 7))

It would also be possible to declare Decorator like this:

class Decorator(left: String, right: String) {
  def layout(x: Any) = left + x.toString() + right
}

I'm curious to know: what does the first approach buy you? Thanks in advance!

-chris

curried function that returns a function in scala via '=>', and (secondly) via one arg list followed by another

Scala Experts:

I'm starting to learn a little Scala, and I basically understand functions that return functions and currying, but I've seen two syntaxes for doing this, and I'd like to better understand the differences, and maybe a little of the theory behind what's going on.

In the first method (using =>) I can curry the function by just specifying the argument to be bound to variable 'x'. However, when I try to do this with the second approach, the compiler tells me I need to specify the '_' wildcard for the second argument.

I understand what I need to do... but I am not sure why I need to do things this way. Can someone please tell me what the Scala compiler is doing here? Thanks!

First Method using =>

def add(x:Int) = (y:Int) => x + (-y)
add: (x: Int)Int => Int

scala> def adder = add(100)   // x is bound to 100 in the returned closure
adder: Int => Int

scala> adder(1)
res42: Int = 99

Second Method using one arg list followed by another

scala> def add2(x:Int)(y:Int) :  Int =  x + y
add2: (x: Int)(y: Int)Int

scala> def  adder2 = add2(100)
<console>:9: error: missing arguments for method add2;
follow this method with `_' if you want to treat it 
 as a partially applied function

       def  adder2 = add2(100)
                         ^

scala> def  adder2 = add2(100) _    // Okay, here is the '_'
adder2: Int => Int

scala> adder2(1)                    // Now i can call the curried function
res43: Int = 101

Either wait for a function to finish or timeout after 5 seconds in Akka

I'm trying to wait for a function to finish or time out after 5 seconds, but whatever I do, I can't prevent the following exception. Interestingly, it is caught by the parent actor:

java.util.concurrent.TimeoutException: Futures timed out after [5 seconds]

One of the solutions that I tried (from this question):

val f = Future { dockerClient.execStartCmd(execCreateCmdResponse.getId()).exec() }
val result: Try[InputStream] = Await.ready(f, 5.seconds).value.get

val resultEither = result match {
  case Success(t) => log.info("right")
  case Failure(e) => log.info("left")
}
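
A minimal sketch, assuming the same f: Future[InputStream] and actor log as above: Await.ready itself throws the TimeoutException, so wrap the whole await in a Try and handle the timeout as one of the failure cases rather than letting it escape to the parent actor.

import scala.concurrent.Await
import scala.concurrent.duration._
import scala.util.{Try, Success, Failure}

Try(Await.result(f, 5.seconds)) match {
  case Success(stream) => log.info("right")
  case Failure(_: java.util.concurrent.TimeoutException) => log.info("timed out")
  case Failure(e) => log.info("left: " + e.getMessage)
}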

Apache Weblog Parsing using regex. Which is better, case class or a Java like class?

I am writing a general Scala class which can parse Apache weblog files. So far the solution I have is to use regex groups to match the different parts of the log string. To illustrate, each line of the incoming log looks something like the string below:

25.198.250.35 - - [2014-07-19T16:05:33Z] "GET / HTTP/1.1" 404 1081 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"

class HttpLogStringParser(logLine: String) {
  // Regex Pattern matching the logLine
  val pattern = """^([\d.]+) (\S+) (\S+) \[(.*)\] \"(.+?)\" (\d{3}) (\d+) \"(\S+)\" \"([^\"]+)\"$""".r
  val matched = pattern.findFirstMatchIn(logLine)

  def getIP: String = {
    val IP = matched match {
      case Some(m) => m.group(1)
      case _ => None
    }
    IP.toString
  }

  def getTimeStamp: String = {
    val timeStamp = matched match {
      case Some(m) => m.group(4)
      case _ => None
    }
    timeStamp.toString
  }

  def getRequestPage: String = {
    val requestPage = matched match {
      case Some(m) => m.group(5)
      case _ => None
    }
    requestPage.toString
  }

  def getStatusCode: String = {
    val statusCode = matched match {
      case Some(m) => m.group(6)
      case _ => None
    }
    statusCode.toString
  }
}

Calling these methods should give me the IP, date, timestamp or status code. Is this the best way to do it? I have also tried pattern matching on a case class, but that just gives me a match boolean. Am I getting it completely wrong? What would be the best way to get the values I need from the input log string?
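
A minimal sketch of the case-class approach, reusing the same regex: a Scala Regex can be used as an extractor in a pattern match, binding each group in one shot. The field selection and names here are assumptions.

case class LogEntry(ip: String, timestamp: String, request: String, status: String)

val pattern = """^([\d.]+) (\S+) (\S+) \[(.*)\] \"(.+?)\" (\d{3}) (\d+) \"(\S+)\" \"([^\"]+)\"$""".r

def parse(logLine: String): Option[LogEntry] = logLine match {
  // nine groups in the regex, nine bindings in the pattern
  case pattern(ip, _, _, ts, req, status, _, _, _) =>
    Some(LogEntry(ip, ts, req, status))
  case _ => None
}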

Dealing with JSON in Scala?

In Scala 2.11, having the below code:

import play.api.libs.json._
...
val data = // read json from file                             (3)
val JSON: JsValue = Json.parse(data mkString "\n")            (4)
val items = JSON \ "items"
for (i <- 0 until 100) yield items(i)

  1. if I unite the last two lines into for (i <- 0 until 100) yield (JSON \ "items")(i), will the term JSON \ "items" be evaluated for each i or only once?
  2. is it worth parallelising the list construction with this for-expression (I don't care about the order in which the items appear in the list), where items is an array of JSON objects?
  3. what is the best way to handle exceptions from parsing the JSON in lines (3)-(4) and to validate it?

Split string by "|~|" in scala

In Python I can do this:

In [4]: "string1|~|string2".split("|~|")
Out[4]: ['string1', 'string2']

However, the same code in Scala does not give me the expected output:

scala> "string1|~|string2".split("|~|")
res3: Array[java.lang.String] = Array("", s, t, r, i, n, g, 1, |, ~, |, s, t, r, i, n, g, 2)

I looked into this question How to split a string by a string in Scala and it seems that my code should work, but it does not. What am I missing? How do I get my desired output?
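
A minimal sketch of the usual fix: String.split takes a regular expression, and in the regex "|~|" the pipes are alternation operators, so the pattern matches the empty string (or a lone ~) and splits between every character. Quote or escape the delimiter so it is treated literally.

import java.util.regex.Pattern

"string1|~|string2".split(Pattern.quote("|~|"))   // Array(string1, string2)
// or, escaping by hand:
"string1|~|string2".split("""\|~\|""")            // Array(string1, string2)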

Handling by-name parameters in Scala macro

I have a macro that does some analysis on nested function applications. It matches applications and retrieves the parameter types this way:

case q"$f[..$targs](..$args)(...$otherArgs)" =>

    // retrieve the list of all parameter types
    val paramTpes = f.tpe match {
      case pmt: PolyType if pmt.paramLists.size == 0 =>
        Seq()
      case pmt: PolyType =>
        pmt.paramLists(0) map {_.typeSignature
             .substituteTypes(pmt.typeParams, targs map (_.tpe))}
      case pmt: MethodType if pmt.paramLists.size == 0 =>
        Seq()
      case pmt: MethodType =>
        pmt.paramLists(0) map (_.typeSignature)
    }

Now, if there happen to be by-name parameters, what I get is some weird type that prints as => T but cannot be matched with anything. There seems to be no facility in the reflection API to handle these properly. What I would like to do is retrieve T when the type is a => T, because => T causes further problems in the generated code.
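
An untested sketch of one possible approach, assuming the same macro context c as above: in scala-reflect a by-name type is represented as a TypeRef whose symbol is definitions.ByNameParamClass, so the underlying T can be recovered from its single type argument.

import c.universe._

// unwrap => T to T, leave every other type untouched
def dropByName(tpe: Type): Type = tpe match {
  case TypeRef(_, sym, List(arg)) if sym == definitions.ByNameParamClass => arg
  case other => other
}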

Shapeless: case classes with attributes and typeclasses

I am currently implementing a library to serialize and deserialize to and from XML-RPC messages. It's almost done, but now I am trying to remove the boilerplate of my current asProduct methods using Shapeless. My current code:

trait Serializer[T] {
  def serialize(value: T): NodeSeq
} 

trait Deserializer[T] {
  type Deserialized[T] = Validation[AnyErrors, T]
  type AnyErrors = NonEmptyList[AnyError]
  def deserialize(from: NodeSeq): Deserialized[T]
}

trait Datatype[T] extends Serializer[T] with Deserializer[T]

// Example of asProduct, there are 20 more methods like this, from arity 1 to 22
def asProduct2[S, T1: Datatype, T2: Datatype](apply: (T1, T2) => S)(unapply: S => Product2[T1, T2]) = new Datatype[S] {
  override def serialize(value: S): NodeSeq = {
    val params = unapply(value)
    val b = toXmlrpc(params._1) ++ toXmlrpc(params._2)
    b.theSeq
  }

  // Using scalaz
  override def deserialize(from: NodeSeq): Deserialized[S] = (
      fromXmlrpc[T1](from(0)) |@| fromXmlrpc[T2](from(1))
    ) {apply}
}

My goal is to allow the user of my library to serialize/deserialize case classes without forcing them to write boilerplate code. Currently, you have to declare the case class and an implicit val using the aforementioned asProduct method to have a Datatype instance in context. This implicit is used in the following code:

def toXmlrpc[T](datatype: T)(implicit serializer: Serializer[T]): NodeSeq =
  serializer.serialize(datatype)

def fromXmlrpc[T](value: NodeSeq)(implicit deserializer: Deserializer[T]): Deserialized[T] =
  deserializer.deserialize(value)

This is the classic strategy of serializing and deserializing using type classes. I have grasped how to convert from case classes to HList via Generic or LabelledGeneric. The problem is once I have this conversion done how I can call the methods fromXmlrpc and toXmlrpc as in the asProduct2 example. I don't have any information about the types of the attributes in the case class and, therefore, the compiler cannot find any implicit that satisfy fromXmlrpc and toXmlrpc.

As I am a beginner with Shapeless, I would like to know what's the best way of get this functionality. I have some insights but I definitely have no idea of how to get it done using Shapeless. The ideal would be to have a way to get the type from a given attribute of the case class and pass this type explicitly to fromXmlrpc and toXmlrpc. I imagine that this is not how it can be done.