[Spark][Python] Example: accessing MySQL from Spark to create a DataFrame

mydf001=sqlContext.read.format("jdbc").option("url","jdbc:mysql://localhost/loudacre")\
.option("dbtable","accounts").option("user","training").option("password","training").load()

In [10]: mydf001=sqlContext.read.format("jdbc").option("url","jdbc:mysql://localhost/loudacre")\
....: .option("dbtable","accounts").option("user","training").option("password","training").load()
17/10/03 05:59:53 INFO hive.HiveContext: default warehouse location is /user/hive/warehouse
17/10/03 05:59:53 INFO hive.HiveContext: Initializing metastore client version 1.1.0 using Spark classes.
17/10/03 05:59:53 INFO client.ClientWrapper: Inspected Hadoop version: 2.6.0-cdh5.7.0
17/10/03 05:59:53 INFO client.ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0-cdh5.7.0
17/10/03 05:59:56 INFO hive.metastore: Trying to connect to metastore with URI thrift://localhost.localdomain:9083
17/10/03 05:59:56 INFO hive.metastore: Opened a connection to metastore, current connections: 1
17/10/03 05:59:56 INFO hive.metastore: Connected to metastore.
17/10/03 05:59:56 INFO session.SessionState: Created local directory: /tmp/c2d22d09-7425-4bb3-94c3-39cb32267c7d_resources
17/10/03 05:59:56 INFO session.SessionState: Created HDFS directory: /tmp/hive/training/c2d22d09-7425-4bb3-94c3-39cb32267c7d
17/10/03 05:59:56 INFO session.SessionState: Created local directory: /tmp/training/c2d22d09-7425-4bb3-94c3-39cb32267c7d
17/10/03 05:59:56 INFO session.SessionState: Created HDFS directory: /tmp/hive/training/c2d22d09-7425-4bb3-94c3-39cb32267c7d/_tmp_space.db
17/10/03 05:59:56 INFO session.SessionState: No Tez session required at this point. hive.execution.engine=mr.

In [11]: type(mydf001)
Out[11]: pyspark.sql.dataframe.DataFrame

In [12]: mydf001.count()
17/10/03 06:00:29 INFO spark.SparkContext: Starting job: count at NativeMethodAccessorImpl.java:-2
17/10/03 06:00:29 INFO scheduler.DAGScheduler: Registering RDD 2 (count at NativeMethodAccessorImpl.java:-2)
17/10/03 06:00:29 INFO scheduler.DAGScheduler: Got job 0 (count at NativeMethodAccessorImpl.java:-2) with 1 output partitions
17/10/03 06:00:29 INFO scheduler.DAGScheduler: Final stage: ResultStage 1 (count at NativeMethodAccessorImpl.java:-2)
17/10/03 06:00:29 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 0)
17/10/03 06:00:29 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 0)
17/10/03 06:00:29 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[2] at count at NativeMethodAccessorImpl.java:-2), which has no missing parents
17/10/03 06:00:30 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 11.0 KB, free 11.0 KB)
17/10/03 06:00:31 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 5.2 KB, free 16.1 KB)
17/10/03 06:00:31 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:36793 (size: 5.2 KB, free: 208.8 MB)
17/10/03 06:00:31 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1006
17/10/03 06:00:31 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[2] at count at NativeMethodAccessorImpl.java:-2)
17/10/03 06:00:31 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
17/10/03 06:00:31 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, partition 0,PROCESS_LOCAL, 1911 bytes)
17/10/03 06:00:31 INFO executor.Executor: Running task 0.0 in stage 0.0 (TID 0)
17/10/03 06:00:32 INFO codegen.GenerateMutableProjection: Code generated in 425.82589 ms
17/10/03 06:00:32 INFO codegen.GenerateUnsafeProjection: Code generated in 78.278589 ms
17/10/03 06:00:33 INFO codegen.GenerateMutableProjection: Code generated in 84.676206 ms
17/10/03 06:00:33 INFO codegen.GenerateUnsafeRowJoiner: Code generated in 60.144399 ms
17/10/03 06:00:33 INFO codegen.GenerateUnsafeProjection: Code generated in 95.977074 ms
17/10/03 06:00:34 INFO jdbc.JDBCRDD: closed connection
17/10/03 06:00:34 INFO executor.Executor: Finished task 0.0 in stage 0.0 (TID 0). 1334 bytes result sent to driver
17/10/03 06:00:34 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 3081 ms on localhost (1/1)
17/10/03 06:00:34 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
17/10/03 06:00:34 INFO scheduler.DAGScheduler: ShuffleMapStage 0 (count at NativeMethodAccessorImpl.java:-2) finished in 3.163 s
17/10/03 06:00:34 INFO scheduler.DAGScheduler: looking for newly runnable stages
17/10/03 06:00:34 INFO scheduler.DAGScheduler: running: Set()
17/10/03 06:00:34 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 1)
17/10/03 06:00:34 INFO scheduler.DAGScheduler: failed: Set()
17/10/03 06:00:34 INFO scheduler.DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[5] at count at NativeMethodAccessorImpl.java:-2), which has no missing parents
17/10/03 06:00:34 INFO storage.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 12.1 KB, free 28.3 KB)
17/10/03 06:00:34 INFO storage.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 5.6 KB, free 33.9 KB)
17/10/03 06:00:34 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on localhost:36793 (size: 5.6 KB, free: 208.8 MB)
17/10/03 06:00:34 INFO spark.SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006
17/10/03 06:00:34 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 1 (MapPartitionsRDD[5] at count at NativeMethodAccessorImpl.java:-2)
17/10/03 06:00:34 INFO scheduler.TaskSchedulerImpl: Adding task set 1.0 with 1 tasks
17/10/03 06:00:34 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, localhost, partition 0,NODE_LOCAL, 1999 bytes)
17/10/03 06:00:34 INFO executor.Executor: Running task 0.0 in stage 1.0 (TID 1)
17/10/03 06:00:34 INFO storage.ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
17/10/03 06:00:34 INFO storage.ShuffleBlockFetcherIterator: Started 0 remote fetches in 32 ms
17/10/03 06:00:35 INFO codegen.GenerateMutableProjection: Code generated in 52.636353 ms
17/10/03 06:00:35 INFO codegen.GenerateMutableProjection: Code generated in 49.757505 ms
17/10/03 06:00:35 INFO executor.Executor: Finished task 0.0 in stage 1.0 (TID 1). 1666 bytes result sent to driver
17/10/03 06:00:35 INFO scheduler.DAGScheduler: ResultStage 1 (count at NativeMethodAccessorImpl.java:-2) finished in 0.795 s
17/10/03 06:00:35 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 789 ms on localhost (1/1)
17/10/03 06:00:35 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
17/10/03 06:00:35 INFO scheduler.DAGScheduler: Job 0 finished: count at NativeMethodAccessorImpl.java:-2, took 6.451521 s
Out[12]: 129761
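
The chained `.option(...)` calls above can also be collected into a dict and passed in one go with `DataFrameReader.options(**opts)`. The following is only a minimal sketch, not the method used in the session: the helper names `jdbc_options` and `load_table` are hypothetical, and the explicit port 3306 is an assumption (it is the MySQL default; the original URL omits the port). The MySQL JDBC driver must also be on Spark's classpath for the read to work.

```python
def jdbc_options(host, database, table, user, password, port=3306):
    """Build the option dict for a Spark JDBC read of one MySQL table.

    Hypothetical helper; port 3306 is assumed (the MySQL default).
    """
    return {
        "url": "jdbc:mysql://{}:{}/{}".format(host, port, database),
        "dbtable": table,
        "user": user,
        "password": password,
    }


def load_table(sql_context, opts):
    """Return a DataFrame for the table described by opts.

    Hypothetical helper; equivalent to chaining .option(key, value)
    once per entry in opts.
    """
    return sql_context.read.format("jdbc").options(**opts).load()


# Same connection settings as the session above.
opts = jdbc_options("localhost", "loudacre", "accounts", "training", "training")
print(opts["url"])

# Inside the PySpark shell, where sqlContext already exists:
# mydf001 = load_table(sqlContext, opts)
# mydf001.count()
```

Keeping the settings in a dict makes it easier to reuse the same connection for several tables by changing only `dbtable`.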


