org.apache.hudi.hadoop.HoodieParquetInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' LOCATION 's3://s3 org.apache.hudi.hadoop.HoodieParquetInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' LOCATION 's3://s3 LOCATION 's3://s3-bucket/prefix/partition-path' Apache Hudi最早被AWS EMR官方集成,然后原生集成到AWS上不同云产品,如Athena、Redshift
s3.amazonaws.com' \ --s3-access-key='YOUR-ACCESSKEYID' \ --s3-secret-key='YOUR-SECRETACCESSKEY' \ --s3
--packages org.apache.hadoop:hadoop-aws:3.2.0 --conf spark.kubernetes.file.upload.path=s3a://<s3-bucket