Querying S3 with Presto
This post assumes you have an AWS account and a running Presto instance (standalone or cluster). We'll use the Presto CLI to run queries against the Yelp dataset, a JSON dump of a subset of Yelp's data covering businesses, reviews, check-ins, users and tips.
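Before defining any tables, it can help to check which fields each file actually carries. Here is a minimal sketch, assuming the files hold one JSON object per line (worth verifying against your own download). The sample records and field names below are made up for illustration and are not the real Yelp schema.

```python
import json

# Hypothetical sample lines standing in for a few records from one dataset file.
# The field names are illustrative assumptions, not the actual Yelp schema.
SAMPLE = """\
{"business_id": "b1", "name": "Example Cafe", "stars": 4.5}
{"business_id": "b2", "name": "Example Diner", "stars": 3.0, "city": "Springfield"}
"""


def field_types(lines):
    """Map each top-level field name to the set of Python type names seen for it."""
    seen = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        for key, value in record.items():
            seen.setdefault(key, set()).add(type(value).__name__)
    return seen


if __name__ == "__main__":
    for name, types in sorted(field_types(SAMPLE.splitlines()).items()):
        print(name, sorted(types))
```

To run this against a real file, pass an open file handle instead of the sample lines, for example `field_types(open(path))`, where `path` is wherever you saved the file. Fields that show up with more than one type, or only in some records, are the ones to watch when declaring column types later.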
Configure Hive metastore
Configure the Hive metastore to point at our data in S3. We are using the…