33: Katharine Jarmul - Testing in Data Science

From Test and Code


Length: 38 minutes
Released: Nov 30, 2017
Format: Podcast episode

Description

A discussion with Katharine Jarmul, aka kjam, about some of the challenges of data science with respect to testing.
Some of the topics we discuss:
experimentation vs testing
testing pipelines and pipeline changes
automating data validation
property based testing
schema validation and detecting schema changes
using unit test techniques to test data pipeline stages
testing nodes and transitions in DAGs
testing expected and unexpected data
missing data and non-signals
corrupting a dataset with noise
fuzz testing for both data pipelines and web APIs
datafuzz
hypothesis
testing internal interfaces
documenting and sharing domain expertise to build good reasonableness
intermediary data and stages
neural networks
speaking at conferences
Special Guest: Katharine Jarmul.

Test & Code is a weekly podcast hosted by Brian Okken. The show covers software engineering, development, testing, Python programming, and related topics. Implementation specifics are usually in Python, such as Python packaging, tox, pytest, and unittest. However, well over half of the topics are language agnostic, such as data science, DevOps, TDD, public speaking, mentoring, feature testing, NoSQL databases, end-to-end testing, automation, continuous integration, development methods, Selenium, and the testing pyramid.