This episode is a must for every data scientist who scrapes web data. Together with Prof. Johannes Boegershausen and Prof. Hannes Datta, we discuss their new paper, which lists state-of-the-art standards in web scraping.
Listen to this episode to learn about the most critical errors that practitioners make when selecting data sources, designing the extraction, and executing it. Visit web-scraping.org to explore more resources that our guests designed to improve the quality of web scraping practice. Lastly, make sure their publication is read by all the data scientists you work with!
Paper discussed: Boegershausen, J., Datta, H., Borah, A., & Stephen, A. (2022). Fields of gold: Scraping web data for marketing insights. Journal of Marketing.