Impala: A Modern, Open-Source SQL Engine for Hadoop

Implementation of an MPP SQL query engine for the Hadoop environment:
• Designed for performance: brand-new engine, written in C++
• Maintains Hadoop flexibility by utilizing standard Hadoop components (HDFS, HBase, Metastore, YARN)
• Reads widely used Hadoop file formats (e.g. Parquet, Avro, RC, …)
• Runs on the same nodes that run Hadoop processes
• Plays well with traditional BI tools: exposes/interacts with industry-standard interfaces (ODBC/JDBC, Kerberos and LDAP, ANSI SQL)
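
As a rough illustration of what "MPP" means in the summary above, the toy sketch below evaluates something like SELECT key, SUM(value) ... GROUP BY key in two phases: each node aggregates only its local rows, and a coordinator merges the partial results. This is not Impala's code or API; the data in PARTITIONS and the names partial_sum, merge, and run_query are hypothetical, and threads stand in for separate cluster nodes.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list stands in for the rows stored on one node.
PARTITIONS = [
    [("us", 10), ("eu", 5)],
    [("us", 7), ("asia", 3)],
    [("eu", 1), ("asia", 4), ("us", 2)],
]


def partial_sum(rows):
    """Per-node step: aggregate only the locally stored rows."""
    acc = defaultdict(int)
    for key, value in rows:
        acc[key] += value
    return dict(acc)


def merge(partials):
    """Coordinator step: combine the per-node partial results."""
    total = defaultdict(int)
    for part in partials:
        for key, value in part.items():
            total[key] += value
    return dict(total)


def run_query(partitions):
    """Roughly SELECT key, SUM(value) ... GROUP BY key, one worker per partition."""
    # Threads are an assumption of this sketch; a real MPP engine runs
    # the per-node step in separate processes on separate machines.
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(partial_sum, partitions))
    return merge(partials)


if __name__ == "__main__":
    print(run_query(PARTITIONS))
```

The point of the two-phase shape is that the per-node step touches only local data, so only small partial results need to travel to the coordinator. That is one reason running the engine on the same nodes that hold the data matters.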

http://cidrdb.org/cidr2015/Slides/28_CIDR15_Slides_Paper28.pdf