How to run Cloudera Director on your macOS/Windows 10

Cloudera Director is a provisioning tool for CDH and Cloudera Enterprise. You can launch a cluster with the Web GUI or the CLI tool. Using Cloudera…

Aki Ariga

Simple way to distribute your private Python packages within your organization

This article is a translation of this article, originally written by aodag in Japanese. I translated it with his permission. This article…

Aki Ariga

tabula-py now able to extract remote PDF and multiple tables at once

(Note: Oct 7th, 2019) As of Oct. 2019, I launched a documentation site and Google Colab notebook for tabula-py. The FAQ would be a good place…

Aki Ariga

An easy way to get URL list of your Medium publication

I imported blog posts from my own WordPress site, but I have to redirect old articles to Medium manually. There is a WordPress plugin which enables…

Aki Ariga

sparkavro: Manipulate Apache Avro files with sparklyr

I created a simple sparklyr extension to handle Apache Avro files. It is just a simple wrapper around Databricks' spark-avro. It is listed in…

Aki Ariga

How to connect secure Impala cluster from RStudio on macOS with implyr

Impala is a very fast SQL-on-Hadoop engine, and it will enhance your R experience with implyr, a dplyr-based interface for Apache Impala…

Aki Ariga

Visualize your massive data with Impala and Redash

Redash is a well-known OSS visualization tool that lets you visualize your data with SQL. It supports Apache Impala (incubating), fast…

Aki Ariga

tabula-py: Extract table from PDF into Python DataFrame

(Note: Oct 7th, 2019) As of Oct. 2019, I launched a documentation site and Google Colab notebook for tabula-py. The FAQ would be a good place…

Aki Ariga

Livy & Jupyter Notebook & Sparkmagic = Powerful & Easy Notebook for Data Scientist

Livy is a REST server for Spark. As shown in a talk at Spark Summit 2016, Microsoft uses Livy for HDInsight with Jupyter notebook and…

Aki Ariga

Text-to-speech based on deep learning for Web site using Amazon Polly and Ruby

Amazon Polly, a text-to-speech service from AWS, was announced at today's re:Invent. Amazon Polly is a speech synthesis system based on deep…

Aki Ariga