Recent & Upcoming Talks

How to Approach Data Science on Large-Scale Data

2016-11-08

A data engineering and data science platform based on Hadoop/Spark

2017-02-07

Using Cloudera Enterprise, it is possible to build and operate an enterprise-grade Hadoop/Spark platform. To make use of big data, what kind of platform is needed, and how do you get the most out of it? From the perspective of data engineering and data science, I will introduce machine learning that uses SQL-on-Hadoop, Spark, and Python.

Using your favorite Python libraries in a distributed fashion with Cloudera Data Science Workbench and PySpark

2017-06-02

An introduction to using arbitrary Python packages on PySpark with Cloudera Data Science Workbench

2017-06-02

Invited talk: The benefits of an integrated data analysis platform from a data scientist's perspective

2017-06-27

In this session, we introduce the benefits of an integrated data analysis platform, which is important for using data in the enterprise, and explain how Cloudera prevents analysis environments from becoming siloed.

Deployment patterns for machine learning systems

2017-11-07

Getting started with machine learning at work

2018-05-17

How do you debug/test your Workflow?

2019-08-30

Developing and testing workflows productively is hard. In this session, I talk about how to develop heavy, data-dependent workflows with Digdag.

Challenges for Machine Learning Systems toward Continuous Improvement

2019-11-22

When machine learning pipelines are run for training and inference, the systems and machine learning infrastructure vary with requirements such as the purpose of the application, data volume, and latency. Meanwhile, many companies in industry have built machine learning infrastructure using their own in-house knowledge. That knowledge has not yet been organized: it consists of engineering effort, and engineers have little motivation to publish it, so there are few papers on system design or on the problems characteristic of machine learning infrastructure and systems. In this presentation, we introduce challenges for machine learning systems, especially for continuous prediction in production environments, and approaches to them.

Managing Machine Learning workflows on Treasure Data

2018-10-17

Train, predict, and serve: How to put your machine learning model into production

2017-12-06

Adopting a machine learning system is an essential step for enterprise companies to progress to the next stage of their business. However, machine learning systems tend to be complex because they depend on different languages, libraries, and frameworks, such as scikit-learn, TensorFlow, and XGBoost. As a result, there are many challenges in building a machine learning system in production, including determining which architecture is best for which use case, how to deploy your predictive models, and how to move from development to a production environment. I explain how to put your machine learning model into production, discuss common issues and obstacles you may encounter, and share best practices and typical architecture patterns for deploying ML models, with example designs from the Hadoop and Spark ecosystem using Cloudera Data Science Workbench.
