Hi, I just want to play with and learn how to use the algorithm provided by spark-ml to extract features from text using Google's Word2Vec algorithm. I mean, why not use my actual CV? Before that, you will probably have to convert the PDF file to a text file. Actually I am working with…
About how to parallelize multiple machine learning algorithms using a Pipeline with Spark.
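You basically need to create a Pipeline and build a ParamGrid that uses the different algorithms as alternative values for the pipeline's stages. Here is a simple example:

    import org.apache.spark.ml.{Pipeline, PipelineStage}
    import org.apache.spark.ml.classification.{DecisionTreeClassifier, LogisticRegression}
    import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}

    // The two candidate algorithms read the same label and feature columns.
    val dt = new DecisionTreeClassifier()
      .setLabelCol("label")
      .setFeaturesCol("features")
    val lr = new LogisticRegression()
      .setLabelCol("label")
      .setFeaturesCol("features")

    // The stages are left unset here; the grid supplies them, one algorithm per entry.
    val pipeline = new Pipeline()
    val paramGrid = new ParamGridBuilder()
      .addGrid(pipeline.stages, Array(Array[PipelineStage](dt), Array[PipelineStage](lr)))
      .build() // setEstimatorParamMaps expects the built Array[ParamMap]

    val cv = new CrossValidator()
      .setEstimator(pipeline)
      .setEstimatorParamMaps(paramGrid)
    …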
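The idea above can be sketched end to end as follows. This is only a minimal sketch under several assumptions that are not in the original post: Spark 2.x or later with spark-mllib on the classpath, a local SparkSession, a small synthetic dataset, a BinaryClassificationEvaluator with its default metric, and 2 folds. The object name CompareAlgorithms is hypothetical.

```scala
import org.apache.spark.ml.{Pipeline, PipelineStage}
import org.apache.spark.ml.classification.{DecisionTreeClassifier, LogisticRegression}
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}
import org.apache.spark.sql.SparkSession

// Hypothetical driver object, for illustration only.
object CompareAlgorithms {
  def main(args: Array[String]): Unit = {
    // Assumption: local mode, just to try the idea out.
    val spark = SparkSession.builder()
      .appName("compare-algorithms")
      .master("local[*]")
      .getOrCreate()

    // Assumption: synthetic two-class data; replace with your own DataFrame
    // that has a "label" column and a "features" vector column.
    val rows = (0 until 40).map { i =>
      val label = (i % 2).toDouble
      (label, Vectors.dense(label + 0.1 * (i % 5), (i % 3).toDouble))
    }
    val training = spark.createDataFrame(rows).toDF("label", "features")

    val dt = new DecisionTreeClassifier()
      .setLabelCol("label")
      .setFeaturesCol("features")
    val lr = new LogisticRegression()
      .setLabelCol("label")
      .setFeaturesCol("features")

    // The pipeline's stages are the tuned parameter: each grid entry is a
    // one-stage pipeline holding a different algorithm.
    val pipeline = new Pipeline()
    val paramGrid = new ParamGridBuilder()
      .addGrid(pipeline.stages, Array(Array[PipelineStage](dt), Array[PipelineStage](lr)))
      .build()

    // Assumption: evaluator and fold count chosen for the sketch.
    val cv = new CrossValidator()
      .setEstimator(pipeline)
      .setEstimatorParamMaps(paramGrid)
      .setEvaluator(new BinaryClassificationEvaluator())
      .setNumFolds(2)

    val cvModel = cv.fit(training)

    // avgMetrics is aligned with the grid: one averaged score per algorithm.
    Seq("DecisionTreeClassifier", "LogisticRegression")
      .zip(cvModel.avgMetrics)
      .foreach { case (name, metric) => println(s"$name: average metric = $metric") }

    spark.stop()
  }
}
```

Treating the stages themselves as a grid parameter lets CrossValidator score each algorithm on the same folds and keep the best one in cvModel.bestModel, without having to write a separate validation loop per algorithm.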
About how to build a recommendation engine using Kafka and Spark Streaming in Scala
Hi, I recently took classes at formacionhadoop.com (Master online big data expert, 150 hours), which gave me notions of Hadoop, Spark and NoSQL databases. It is a good course, and I recommend it to everyone who wants to learn the basics of big data technology. I had already taken Andrew Ng's Machine Learning course on Coursera.…
About how to build a recommendation engine using Spark MLlib, Spark Streaming, Kafka and MongoDB in Scala.
This is the first post about how to create a recommendation engine: the preliminary know-how needed to build a good one. What is a recommendation engine? Well, users of Spotify, Amazon and Netflix know that the recommendations shown to us are really accurate; it looks like there is somebody who knows us…