
Welcome to the Spark-notes wiki!

https://docs.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-streaming-overview

https://github.com/simplesteph/medium-blog-kafka-udemy

https://www.cloudkarafka.com/blog/2016-11-30-part1-kafka-for-beginners-what-is-apache-kafka.html
http://cloudurable.com/blog/kafka-tutorial-v1/index.html
http://cloudurable.com/blog/kafka-tutorial-kafka-producer/index.html
https://www.javaworld.com/article/3060078/big-data/big-data-messaging-with-kafka-part-1.html
https://www.javaworld.com/article/3060078/big-data/big-data-messaging-with-kafka-part-1.html?page=2

http://site.clairvoyantsoft.com/understanding-resource-allocation-configurations-spark-application/

https://www.wikitechy.com/tutorials/azure/azure-tutorial

http://sqlandhadoop.com/spark-sql-dataframe/

https://databricks.com/blog/2016/05/11/apache-spark-2-0-technical-preview-easier-faster-and-smarter.html
http://cdn2.hubspot.net/hubfs/438089/notebooks/spark2.0/Whole-stage%20code%20generation.html
https://databricks.com/blog/2016/05/23/apache-spark-as-a-compiler-joining-a-billion-rows-per-second-on-a-laptop.html

ont_prov_Df = ont_prov_Df.filter(months_between(current_date(), col("create_date")) + 1 > 12)

http://xinhstechblog.blogspot.co.uk/2016/05/overview-of-spark-dataframe-api.html
https://smtebooks.com/file/5785
https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/DateFunctionsSuite.scala
https://drive.google.com/uc?export=download&id=0B4hhbFaItiPxZW9XZk1GQ2toZUk
Complete RDD functions: https://github.com/anjijava16/HadoopRelatedBooks
https://hackernoon.com/managing-spark-partitions-with-coalesce-and-repartition-4050c57ad5c4
https://github.com/anjijava16/HadoopRelatedBooks/blob/master/Hadoop%20Application%20Architectures%20By%20Mark%20Grover%20Jul%202015%20Orielly.pdf
https://spark-test.github.io/sparksqldoc/
https://learning.acm.org/webinar_pdfs/MateiZaharia_Webinar_121715.pdf
http://asiandatascience.com/wp-content/uploads/2017/12/Data-Scientists-Guide-to-Apache-Spark.pdf
https://www.cloudera.com/documentation/enterprise/5-6-x/PDF/cloudera-spark.pdf
https://arxiv.org/ftp/arxiv/papers/1403/1403.3375.pdf

http://cloudurable.com/ppt/6-kafka-consumers-advanced.pdf
https://docs.databricks.com/spark/latest/dataframes-datasets/introduction-to-dataframes-scala.html

https://github.com/spirom/LearningSpark/tree/master/src/main/scala
http://homepage.cs.latrobe.edu.au/zhe/ZhenHeSparkRDDAPIExamples.html

https://github.com/vaquarkhan -- more notes

DataFrame: http://sqlandhadoop.com/spark-dataframe-alias-as/
https://docs.databricks.com/spark/latest/dataframes-datasets/complex-nested-data.html
https://docs.databricks.com/spark/latest/faq/join-two-dataframes-duplicated-column.html
https://docs.databricks.com/spark/latest/dataframes-datasets/introduction-to-dataframes-scala.html
https://www.linkedin.com/pulse/custom-udf-apache-spark-harjeet-kumar
https://codeload.github.com/AgilData/spark-rdd-dataframe-dataset/zip/master
https://github.com/JerryLead/SparkInternals/blob/master/markdown/english/0-Introduction.md
https://lets-do-something-big.blogspot.in/2016/06/custom-udf-in-apache-spark.html
http://datastrophic.io/core-concepts-architecture-and-internals-of-apache-spark/

https://medium.com/@mrpowers/dealing-with-null-in-spark-cfdbb12f231e

https://hortonworks.com/tutorial/dataframe-and-dataset-examples-in-spark-repl/

https://www.linkedin.com/today/author/iamabhishekchoudhary
https://github.com/spirom/LearningSpark/blob/master/src/main/scala/dataframe/UDF.scala
https://www.linkedin.com/pulse/apache-spark-big-data-dataframe-things-know-abhishek-choudhary
http://alvincjin.blogspot.co.uk/search/label/Spark?updated-max=2016-01-15T15:37:00-05:00&max-results=20&start=14&by-date=false
http://alvincjin.blogspot.co.uk/search/label/Spark
http://arun-teaches-u-tech.blogspot.co.uk/p/cca-175-prep-problem-scenario-1.html

http://edureka.freshdesk.com/support/solutions/articles/4000077779-spark-assignments-

http://edureka.freshdesk.com/support/solutions/articles/4000077778-spark-project
http://edureka.freshdesk.com/support/solutions/articles/4000103463-maven-with-spark-and-scala

https://www.linkedin.com/pulse/spark-performance-tuning-harjeet-kumar?trk=portfolio_article-card_title

https://stackoverflow.com/questions/37189802/how-to-convert-a-dataframe-column-to-sequence

http://daily-scala.blogspot.co.uk/2009/10/groupby-collection-processing.html
