Flink + Iceberg: writing data to HDFS, with synchronized reads from Hive

Contents

1. Component versions
Environment variable configuration
2. Hadoop configuration
hadoop-env.sh
core-site.xml
hdfs-site.xml
mapred-site.xml
yarn-site.xml
3. Hive configuration
hive-env.sh
hive-site.xml
Original JARs in the Hive lib directory
4. Flink configuration: integrating with HDFS and YARN
Patching the Iceberg source
For compatibility with Hive 4.0.1
config.yaml
Flink lib directory
TEZ
/cluster/tez/conf/tez-site.xml
TEZ LIB
Starting the services
Starting a PostgreSQL database with Docker
Running commands from the Hive CLI
Flink SQL commands


1. Component versions

Name              Version
hadoop            3.4.1
flink             1.20.1
hive              4.0.1
kafka             3.9.0
zookeeper         3.9.3
tez               0.10.4
spark (hadoop3)   3.5.4
jdk               11.0.13
maven             3.9.9

Environment variable configuration

After editing and saving /etc/profile with vim, run source /etc/profile to apply the changes.

LD_LIBRARY_PATH=/usr/local/lib

export LD_LIBRARY_PATH

# Java environment

export JAVA_HOME=/cluster/jdk

export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar

export TEZ_HOME=/cluster/tez/

export TEZ_CONF_DIR=$TEZ_HOME/conf

export TEZ_JARS=$TEZ_HOME/*.jar:$TEZ_HOME/lib/*.jar

# Hadoop ecosystem

export HADOOP_HOME=/cluster/hadoop3

export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

HADOOP_CLASSPATH=`hadoop classpath`

export HADOOP_CLASSPATH=$TEZ_CONF_DIR:$TEZ_JARS:$HADOOP_CLASSPATH

export HDFS_NAMENODE_USER=root

export HDFS_DATANODE_USER=root

export HDFS_SECONDARYNAMENODE_USER=root

export YARN_RESOURCEMANAGER_USER=root

export YARN_NODEMANAGER_USER=root

# Hive configuration

export HIVE_HOME=/cluster/hive

export HIVE_CONF_DIR=$HIVE_HOME/conf

# Spark configuration

export SPARK_HOME=/cluster/spark

export SPARK_LOCAL_IP=10.10.10.99

export SPARK_CONF_DIR=$SPARK_HOME/conf

# Flink configuration

export FLINK_HOME=/cluster/flink

# ZooKeeper/Kafka

export ZOOKEEPER_HOME=/cluster/zookeeper

export KAFKA_HOME=/cluster/kafka

# Other tools

export FLUME_HOME=/cluster/flume

export M2_HOME=/cluster/maven

# Native (shared) libraries

export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native/:$LD_LIBRARY_PATH

# Merge everything into PATH

export PATH=$PATH:$HIVE_HOME/bin:$JAVA_HOME/bin:$SPARK_HOME/bin:$SPARK_HOME/sbin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$M2_HOME/bin:$FLINK_HOME/bin:$ZOOKEEPER_HOME/bin

export LC_ALL=zh_CN.UTF-8

export LANG=zh_CN.UTF-8
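Once the profile has been sourced, a quick sanity check is to confirm each tool resolves from the new PATH and reports the versions listed in section 1. A minimal shell sketch (the exact output wording varies between tools):

# Sanity check after: source /etc/profile
java -version            # expect 11.0.13
hadoop version           # expect Hadoop 3.4.1
hive --version           # expect Hive 4.0.1
echo $HADOOP_CLASSPATH   # should start with the Tez conf dir and Tez jars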

2. Hadoop configuration

hadoop-env.sh

#

# Licensed to the Apache Software Foundation (ASF) under one

# or more contributor license agreements.  See the NOTICE file

# distributed with this work for additional information

# regarding copyright ownership.  The ASF licenses this file

# to you under the Apache License, Version 2.0 (the

# "License"); you may not use this file except in compliance

# with the License.  You may obtain a copy of the License at

#

#     http://www.apache.org/licenses/LICENSE-2.0

#

# Unless required by applicable law or agreed to in writing, software

# distributed under the License is distributed on an "AS IS" BASIS,

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

# See the License for the specific language governing permissions and

# limitations under the License.

# Set Hadoop-specific environment variables here.

##

## THIS FILE ACTS AS THE MASTER FILE FOR ALL HADOOP PROJECTS.

## SETTINGS HERE WILL BE READ BY ALL HADOOP COMMANDS.  THEREFORE,

## ONE CAN USE THIS FILE TO SET YARN, HDFS, AND MAPREDUCE

## CONFIGURATION OPTIONS INSTEAD OF xxx-env.sh.

##

## Precedence rules:

##

## {yarn-env.sh|hdfs-env.sh} > hadoop-env.sh > hard-coded defaults

##

## {YARN_xyz|HDFS_xyz} > HADOOP_xyz > hard-coded defaults

##

# Many of the options here are built from the perspective that users

# may want to provide OVERWRITING values on the command line.

# For example:

#

#  JAVA_HOME=/usr/java/testing hdfs dfs -ls

#

# Therefore, the vast majority (BUT NOT ALL!) of these defaults

# are configured for substitution and not append.  If append

# is preferable, modify this file accordingly.

###

# Generic settings for HADOOP

###

# Technically, the only required environment variable is JAVA_HOME.

# All others are optional.  However, the defaults are probably not

# preferred.  Many sites configure these options outside of Hadoop,

# such as in /etc/profile.d

# The java implementation to use. By default, this environment

# variable is REQUIRED on ALL platforms except OS X!

# export JAVA_HOME=

# Location of Hadoop.  By default, Hadoop will attempt to determine

# this location based upon its execution path.

# export HADOOP_HOME=

# Location of Hadoop's configuration information.  i.e., where this

# file is living. If this is not defined, Hadoop will attempt to

# locate it based upon its execution path.

#

# NOTE: It is recommend that this variable not be set here but in

# /etc/profile.d or equivalent.  Some options (such as

# --config) may react strangely otherwise.

#

# export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop

# The maximum amount of heap to use (Java -Xmx).  If no unit

# is provided, it will be converted to MB.  Daemons will

# prefer any Xmx setting in their respective _OPT variable.

# There is no default; the JVM will autoscale based upon machine

# memory size.

# export HADOOP_HEAPSIZE_MAX=

# The minimum amount of heap to use (Java -Xms).  If no unit

# is provided, it will be converted to MB.  Daemons will

# prefer any Xms setting in their respective _OPT variable.

# There is no default; the JVM will autoscale based upon machine

# memory size.

# export HADOOP_HEAPSIZE_MIN=

# Enable extra debugging of Hadoop's JAAS binding, used to set up

# Kerberos security.

# export HADOOP_JAAS_DEBUG=true

# Extra Java runtime options for all Hadoop commands. We don't support

# IPv6 yet/still, so by default the preference is set to IPv4.

# export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true"

# For Kerberos debugging, an extended option set logs more information

# export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true -Dsun.security.krb5.debug=true -Dsun.security.spnego.debug"

# Some parts of the shell code may do special things dependent upon

# the operating system.  We have to set this here. See the next

# section as to why....

export HADOOP_OS_TYPE=${HADOOP_OS_TYPE:-$(uname -s)}

# Extra Java runtime options for some Hadoop commands

# and clients (i.e., hdfs dfs -blah).  These get appended to HADOOP_OPTS for

# such commands.  In most cases, # this should be left empty and

# let users supply it on the command line.

# export HADOOP_CLIENT_OPTS=""

#

# A note about classpaths.

#

# By default, Apache Hadoop overrides Java's CLASSPATH

# environment variable.  It is configured such

# that it starts out blank with new entries added after passing

# a series of checks (file/dir exists, not already listed aka

# de-deduplication).  During de-deduplication, wildcards and/or

# directories are *NOT* expanded to keep it simple. Therefore,

# if the computed classpath has two specific mentions of

# awesome-methods-1.0.jar, only the first one added will be seen.

# If two directories are in the classpath that both contain

# awesome-methods-1.0.jar, then Java will pick up both versions.

# An additional, custom CLASSPATH. Site-wide configs should be

# handled via the shellprofile functionality, utilizing the

# hadoop_add_classpath function for greater control and much

# harder for apps/end-users to accidentally override.

# Similarly, end users should utilize ${HOME}/.hadooprc .

# This variable should ideally only be used as a short-cut,

# interactive way for temporary additions on the command line.

# export HADOOP_CLASSPATH="/some/cool/path/on/your/machine"

# Should HADOOP_CLASSPATH be first in the official CLASSPATH?

# export HADOOP_USER_CLASSPATH_FIRST="yes"

# If HADOOP_USE_CLIENT_CLASSLOADER is set, the classpath along

# with the main jar are handled by a separate isolated

# client classloader when 'hadoop jar', 'yarn jar', or 'mapred job'

# is utilized. If it is set, HADOOP_CLASSPATH and

# HADOOP_USER_CLASSPATH_FIRST are ignored.

# export HADOOP_USE_CLIENT_CLASSLOADER=true

# HADOOP_CLIENT_CLASSLOADER_SYSTEM_CLASSES overrides the default definition of

# system classes for the client classloader when HADOOP_USE_CLIENT_CLASSLOADER

# is enabled. Names ending in '.' (period) are treated as package names, and

# names starting with a '-' are treated as negative matches.

# export HADOOP_CLIENT_CLASSLOADER_SYSTEM_CLASSES=
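The listing above reproduces the stock header of hadoop-env.sh as shipped with Hadoop 3.x; none of the commented-out defaults need to change for this setup. The one value typically set explicitly in hadoop-env.sh is JAVA_HOME. A minimal sketch, reusing the /cluster/jdk path from the environment variables in section 1 (an assumption on my part; adjust if your JDK lives elsewhere):

# Appended to $HADOOP_HOME/etc/hadoop/hadoop-env.sh (assumes the same JDK path as /etc/profile)
export JAVA_HOME=/cluster/jdk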
