Sales Engineer / Field Data Scientist at Cloudera.
I’m interested in natural language processing and machine learning, and in applying those technologies in production. I love thinking about how to leverage data and technology for business. I also love writing Ruby, Python, and Julia for ML.
I host the technical podcast rubyist.club.
Code & Notebooks
- Julia implementation of a compact Japanese tokenizer
- Python and Ruby wrappers for KyTea, a Japanese morphological analyzer
- Julia binding of Japanese morphological analyzer MeCab
- Julia version of 100 numpy exercises
- Simple wrapper of tabula-java: extract table from PDF into pandas DataFrame
- Load Avro data into Spark with sparklyr
- Docker image for Sphinx with recommonmark
- Build script and demo for Cloudera Director with sparklyr
- Example code for distributing a Python package with pip
- Example project for Cloudera Data Science Workbench
- Dockerized version of cdsw-simple-serving-python, an API server example for Cloudera Data Science Workbench
- Homebrew formulae for Cloudera tools
- Cloudera parcel for R
- Demo for spaCy on R with sparklyr
For the full code list, see GitHub
- Demo notebooks for ibis, a Python productivity framework for the Apache Hadoop ecosystem
- Tutorials on machine learning and data science, written in Japanese
- Founder of kawasaki.rb, a regional Ruby community
- Founder of Machine Learning Casual Talks
- Co-founder of Julia Tokyo
- 「仕事ではじめる機械学習」オライリー・ジャパン ("Machine Learning at Work", O'Reilly Japan; in Japanese)
See Google Scholar