Hive Lateral View、视图与索引-APISpace

Hive Lateral View、视图与索引

1.Hive Lateral View

Lateral View 用于和 UDTF 函数（explode、split）结合来使用。首先通过 UDTF 函数拆分成多行，再将多行结果组合成一个支持别名的虚拟表。主要解决在 select 使用 UDTF 做查询过程中，查询只能包含单个 UDTF，不能包含其他字段、以及多个 UDTF 的问题语法：

LATERAL VIEW udtf(expression) tableAlias AS columnAlias (',' columnAlias)

hive> select explode(likes) from person7;OKlolbookmovielolbookmovielolbookmovielolbookhive> select id,name,mycol1,mycol2,mycol3 from person7 lateral view explode(likes) mytb1 as mycol1 lateral view explode(address) mytb2 as mycol2,mycol3;OK1 小明 1 lol beijing xisanqi1 小明 1 lol shanghai pudong 1 小明 1 book beijing xisanqi1 小明 1 book shanghai pudong 1 小明 1 movie beijing xisanqi1 小明 1 movie shanghai pudong 2 小明 2 lol beijing xisanqi2 小明 2 lol shanghai pudong 2 小明 2 book beijing xisanqi2 小明 2 book shanghai pudong 2 小明 2 movie beijing xisanqi2 小明 2 movie shanghai pudong 3 小明 3 lol beijing xisanqi3 小明 3 lol shanghai pudong 3 小明 3 book beijing xisanqi3 小明 3 book shanghai pudong 3 小明 3 movie beijing xisanqi3 小明 3 movie shanghai pudong 4 小明 4 lol beijing xisanqi4 小明 4 lol shanghai pudong 4 小明 4 book beijing xisanqi4 小明 4 book shanghai pudong 4 小明 4 movie beijing xisanqi4 小明 4 movie shanghai pudong 5 小明 5 lol beijing xisanqihive> select count(distinct(mycol2)),count(distinct(mycol3)) from person7 lateral view explode(address) mytb2 as mycol2,mycol3;Query ID = root_20211112124842_626f1f8e-0764-4306-ab9d-dc404771b732Total jobs = 1Launching Job 1 out of 1Number of reduce tasks determined at compile time: 1In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=In order to limit the maximum number of reducers: set hive.exec.reducers.max=In order to set a constant number of reducers: set mapreduce.job.reduces=Starting Job = job_1636690643978_0001, Tracking URL = Command = /opt/hadoop-2.6.5/bin/hadoop job -kill job_1636690643978_0001Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 12021-11-12 12:49:07,938 Stage-1 map = 0%, reduce = 0%2021-11-12 12:49:29,804 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 3.54 sec2021-11-12 12:49:46,263 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 5.42 secMapReduce Total cumulative CPU time: 5 seconds 420 msecEnded Job = job_1636690643978_0001MapReduce Jobs Launched: Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 5.42 sec HDFS Read: 12719 HDFS Write: 4 SUCCESSTotal MapReduce CPU Time Spent: 5 seconds 420 msecOK2 3Time taken: 66.565 seconds, Fetched: 1 row(s)hive> create table yyy( > id int, > name string);OKTime taken: 2.269 seconds

2.Hive 视图

#mysql 中支持视图删除： CREATE VIEW v_users AS SELECT * FROM myusers; DELETE FROM v_users WHERE id = '1316403900579872';

创建视图：

hive> create view v_psn as select * from person5;OKTime taken: 0.275 seconds

hive> drop view v_psn;OKTime taken: 1.22 seconds

3.Hive 索引

目的：优化查询以及检索性能创建索引：

hive> create index t1_index > on table person5(name) > as "org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler" > with deferred rebuild > in table t1_index_table > comment "table person5 name index comment";OKTime taken: 1.732 seconds

as：指定索引器； in table：指定索引表，若不指定默认生成在 default__person5_t1_index__表中

重建索引（建立索引之后必须重建索引才能生效）

ALTER INDEX t1_index ON person5 REBUILD;

重建完毕之后，再次查询有索引数据：select * from t1_index_table;

hive> select * from t1_index_table;OKNULL hdfs://mycluster/user/hive_remote/warehouse/person5/age=10/person01.txt [513] 10小明 1 hdfs://mycluster/user/hive_remote/warehouse/person5/age=10/person01.txt [0] 10小明 2 hdfs://mycluster/user/hive_remote/warehouse/person5/age=10/person01.txt [59] 10小明 3 hdfs://mycluster/user/hive_remote/warehouse/person5/age=10/person01.txt [118] 10小明 4 hdfs://mycluster/user/hive_remote/warehouse/person5/age=10/person01.txt [177] 10小明 5 hdfs://mycluster/user/hive_remote/warehouse/person5/age=10/person01.txt [236] 10小明 6 hdfs://mycluster/user/hive_remote/warehouse/person5/age=10/person01.txt [290] 10小明 7 hdfs://mycluster/user/hive_remote/warehouse/person5/age=10/person01.txt [349] 10小明 8 hdfs://mycluster/user/hive_remote/warehouse/person5/age=10/person01.txt [402] 10小明 9 hdfs://mycluster/user/hive_remote/warehouse/person5/age=10/person01.txt [455] 10

删除索引

DROP INDEX IF EXISTS t1_index ON person5;

python怎么过滤字符串中的英文字母

252 2022-11-16

Hive Lateral View、视图与索引

c语言sscanf函数的用法是什么

r语言清空数组的方法是什么

python怎么过滤字符串中的英文字母

推荐文章

api接口有哪几种分类及功能

什么是API接口?API接口简单介绍

短信API接口概述，短信API接口的优势

7款快递物流的物流查询API工具，物流快递查询API接口怎么对接？

企业四要素: 了解企业经营成功的关键

什么是语音验证码?,语音验证码平台有哪些

全国工商查询系统怎么查企业名录

哪些平台提供实名认证的接口？

PHP如何调用API接口?

如何使用百度天气预报API接口?

最近发表

热评文章

数据接口api（数据接口API开发平台）

数据开放接口api（数据服务api开发）

Python爬虫教程：爬取酷狗音乐（python爬取

hbuilder怎么更改字体大小和颜色

直播平台api接口 - 构建卓越的直播平台

实时股票数据api接口（股票实时行情api接口）