Friday, December 18, 2020

Yacht Most Data workshop Spark

Hosted by Richard Groen

Data Professional


This workshop about Spark is organized by and for Most Data, the Yacht community for data professionals. For this workshop we team up with Pipple’s CTO Ruud Mullers, who will lead the workshop. It is part of a series of three online data science workshops: 1) Coding Standards and Logging in Python, 2) Machine Learning Exploration in Python and 3) Spark.


More and more data is generated in every company, which means that we increasingly run up against the limits of our servers and computers. Fortunately, when upgrading your CPU, GPU or RAM becomes too expensive or is no longer sufficient, there are now techniques to scale horizontally, i.e. to spread the work across multiple computers (nodes). During this workshop you will get an introduction to distributed computing using Hadoop and Spark, with a focus on preprocessing data and training models using PySpark.

After the basics have been explained in the theory part, you will work in a group on a case with a large dataset on a Spark cluster. The case also shows the difference between single-node and distributed work. At the end of the workshop, the results of the different teams are compared and one team is declared the winner.

About the workshop

During the workshop Google Colab will be used. Make sure you have a laptop on which Google Colab runs properly. This workshop requires at least basic experience with Python. The workshop will be held via a Google Meet/Hangout video call. When you sign up via the form below, you will receive the link to join this workshop.

Yacht supports Pipple’s Women in Data Science conference

Women in Data Science (WiDS) Amsterdam is an independent conference organized by Pipple on Thursday, September 24th, under the banner of Stanford University. The event is followed by a series of online lectures in September and October. Yacht supports diversity and inclusion and is proud to be a partner of Women in Data Science Amsterdam.


Friday, December 18th, 14:00 - 17:00 hrs.

Time and location

Friday, December 18, 2020

14:00 to 17:00


Google Meet/Hangout & Google Colab


More info or questions?

Please contact me


Richard Groen

Data Professional

Related events


Most Data community

In our community, people work on topics including Machine Learning, Natural Language Processing and Neural Networks. Most Data is the place to keep your knowledge up to date and to experiment.

Sign up for Most Data

Sign up now for the Yacht Most Data workshop Spark

Your privacy is important to us. Yacht handles your personal data with care; see our privacy statement for more information.