
Aki Ariga

Machine Learning Engineer

Arm Treasure Data

Biography

Aki Ariga is a Machine Learning Engineer at Arm Treasure Data. His interests include developing production Machine Learning systems, Machine Learning products, and ML Ops. He aims to leverage the power of Machine Learning and related technologies for business and social good.

He leads several communities in Tokyo, such as Machine Learning Casual Talks and kawasaki.rb, and he is also one of the organizers of the “Working Group of Machine Learning systems and operations for productionization” in the Special Interest Group on Machine Learning System Engineering.

Interests

  • Machine Learning
  • ML Ops
  • Natural Language Processing

Education

  • MEng in Electrical Engineering and Computer Science, 2008

    Nagoya University

  • BSc in Electrical Engineering and Computer Science, 2006

    Nagoya University

Recent Posts

py> operator development guide for Python users

This article shows how to develop a digdag Python workflow task efficiently.

How to release Python package from GitHub Actions

Recently, I changed my CI from Travis to GitHub Actions. GitHub Actions is handy for testing and publishing Python packages. Testing Python code on GitHub Actions: migrating from Travis is super easy, just write a simple workflow like https://github.com/chezou/tabula-py/blob/master/.github/workflows/pythontest.yml. The benefits of GitHub Actions for Python are: we can use a build matrix (e.g., OS and Python versions) like Travis; the launch time of GitHub Actions is faster than Travis; and additional dependencies are easy to install with the uses syntax, which uses another workflow. For example, installing JDK can be written as:

How to test a new Docker image for digdag workflow on CircleCI?

Testing whether a workflow can run is important when we build a complex workflow. digdag is a workflow engine whose syntax is simple and which can run tasks written in SQL, Python, Ruby, shell script, etc. digdag has a Docker executor, and it works like a charm with the py>, rb>, and sh> operators. How can we ensure that a new Docker image runs with an existing digdag workflow? I’ll show a way to check this on CircleCI.

The first conference of Operational Machine Learning: OpML ‘19

I attended OpML ’19, a conference for “Operational Machine Learning” held in Santa Clara on May 20th (https://www.usenix.org/conference/opml19). The scope of this conference is broad and does not seem to be clearly defined yet, even after attending it. I’ll borrow the description from the OpML website: The 2019 USENIX Conference on Operational Machine Learning (OpML ’19) provides a forum for both researchers and industry practitioners to develop and bring impactful research advances and cutting edge solutions to the pervasive challenges of ML production lifecycle management.

Ruby for Data Science and Machine Learning

I attended RubyKaigi 2019, held in Fukuoka from Apr 18 to Apr 21. This year’s RubyKaigi was a great opportunity for me to learn about the possibilities of Data Science and Machine Learning in Ruby. Data Science and Ruby: As many of you may know, Ruby is widely known for web applications with frameworks such as Ruby on Rails, but there is also growing momentum around Ruby and other non-Python languages for Data Science. Here is the list of the sessions about Data Science.

Recent Posts (in Japanese)

In which phases of ML do you use Jupyter Notebook/Lab?

Jupyter Notebook is widely used in machine learning, but I find it very hard to write code that runs in production with it, and I wondered how everyone …

digdag py> operator development guide for Pythonistas

Table of Contents: How to develop and test in a local environment before pushing a workflow. Basic strategy: group Python tasks at a reasonable granularity. 1. Treasure Da …

I spoke at the Machine Learning Engineering organized session at IBIS 2019

I talked about the challenges of ML in production.

R and Treasure Data

Let's access Treasure Data from R using RTD and RPresto

I spoke at the launch party for “n月刊ラムダノート”

On May 30, we held a coroutine festival under the name of the “n月刊ラムダノート” launch party. Shikano-san looked the happiest at the event. My colleague Toru Takahashi …

OSS / notebooks

tabula-py

Extract tables from PDFs into pandas DataFrames

Machine Learning in Production Wiki

Machine Learning infrastructure/architecture/operation for productionization

Docker Sphinx Recommonmark

Sphinx documentation toolchain, including LaTeX and recommonmark, in an Ubuntu Docker container

Notebooks

Tutorials on machine learning and data science, written in Japanese

cookiecutter-digdag

A template that generates digdag workflows for SQL and Python

tdworkflow

Unofficial Treasure Workflow Client

RTD

Simple R client for Treasure Data

Mykytea

Python/Ruby wrapper for KyTea

Recent & Upcoming Talks

Challenges for Machine Learning Systems toward Continuous Improvement

When executing machine learning pipelines for training and inference, the systems and machine learning infrastructures vary depending …

How do you debug/test your Workflow?

Developing and testing workflows productively is hard. In this session, I talk about how to develop a heavily data-dependent workflow …

Train, predict, and serve: How to put your machine learning model into production

Adopting a machine learning system is an essential step for enterprise companies to progress to the next stage of their business. …

Recent Publications


MLOpsの歩き方 (Beginners Guide to MLOps)

This article is a beginner's guide to MLOps, covering questions such as: What is MLOps? How do tech giants build Machine Learning systems? What …

仕事ではじめる機械学習 (Machine Learning for Business)

A first book on how to design Machine Learning systems and how to proceed with Machine Learning projects. This book was originally written in …