You’re Living in 1985 If You Don’t Use Docker for Your Data Science Projects

One of the hardest problems new programmers face is understanding the concept of an ‘environment’. An environment is, roughly, the system you code within. In principle it sounds simple, but later in your career you begin to understand just how difficult an environment is to maintain.

The reason is that libraries, IDEs, and even Python itself go through updates and version changes. Sometimes you update one library and a separate piece of code fails, so you need to go back and fix it.

Moreover, if multiple projects are being developed at the same time, there can be dependency conflicts. This is when things really get ugly: code in one project fails directly because of another piece of code.

Also, say you want to share a project with a teammate working on a different OS, or ship a project that you built on your Mac to a production server running a different OS. Would you have to reconfigure your code? Yes, you probably would.

To mitigate these issues, containers were proposed as a way to separate projects from the environments they run in. A container is basically a place where an environment can run, isolated from everything else on the system. Once you define what’s in your container, it becomes much easier to recreate the environment and even share the project with teammates.

Requirements

To get started, we need to install a few things:

  • Windows or macOS: Install Docker Desktop

  • Linux: Install Docker and then Docker Compose

Containerise a Python service

Let’s imagine we’re creating a Flask service in a file called server.py, with the following contents:

from flask import Flask

server = Flask(__name__)

@server.route("/")
def hello():
    return "Hello World!"

if __name__ == "__main__":
    # Listen on all interfaces so the app is reachable from outside the container
    server.run(host="0.0.0.0")

As mentioned above, we need to keep a record of the dependencies for our code. For this, we can create a requirements.txt file containing the following requirement:

Flask==1.1.1
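
Pinning an exact version is what makes the environment reproducible, and it also makes conflicts between projects easy to spot. The following is a minimal sketch, not part of the original project: the helpers parse_pinned and find_conflicts are hypothetical names, the parser only understands simple name==version lines, and the second project with its older Flask pin is invented for illustration.

```python
def parse_pinned(requirements_text):
    """Return {package_name: version} for every 'name==version' line."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if "==" not in line:
            raise ValueError(f"requirement is not pinned: {line!r}")
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_conflicts(pins_a, pins_b):
    """Return packages that two projects pin to different versions."""
    return {
        name: (pins_a[name], pins_b[name])
        for name in pins_a.keys() & pins_b.keys()
        if pins_a[name] != pins_b[name]
    }


this_project = parse_pinned("Flask==1.1.1\n")
# Hypothetical second project sharing the same interpreter
other_project = parse_pinned("Flask==0.12.4  # older service\n")

print(this_project)
print(find_conflicts(this_project, other_project))
```

If both projects were installed into the same interpreter, only one of the two Flask versions could win. That is the conflict a container per project avoids.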

So our package has the following structure:

app
├─── requirements.txt
└─── src
     └─── server.py

The structure is pretty logical (source code is kept in a separate directory). To execute our Python program, all that is left to do is install a Python interpreter and run it.

We could run the program locally, but suppose we have 15 projects we’re working on. It makes sense to run it in a container to avoid conflicts with any other project.

Let’s move on to containerisation.

[Image: photo by Victoire Joncheray on Unsplash]

Dockerfile

To run Python code in a container, we package the application as a Docker image and then run a container based on it. The steps are as follows:

  1. Create a Dockerfile that contains the instructions needed to build the image

  2. Build an image from it with the Docker builder

  3. Run docker run <image>, which creates a container running the app
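
The build and run steps can also be scripted. The sketch below is one possible way to drive the docker CLI from Python and makes several assumptions: the helper names are hypothetical, the image tag myimage matches the one used later in this article, and the -p 5000:5000 port mapping assumes Flask is listening on its default port 5000, which the article itself does not state.

```python
import shutil
import subprocess


def build_command(tag, context="."):
    """argv for `docker build -t <tag> <context>`."""
    return ["docker", "build", "-t", tag, context]


def run_command(image, host_port=5000, container_port=5000):
    """argv for `docker run` that removes the container on exit and publishes a port."""
    return ["docker", "run", "--rm", "-p", f"{host_port}:{container_port}", image]


def build_and_run(tag="myimage"):
    """Build the image from the current directory, then start a container from it."""
    if shutil.which("docker") is None:
        raise RuntimeError("docker CLI not found on PATH")
    subprocess.run(build_command(tag), check=True)
    subprocess.run(run_command(tag), check=True)


# Show the commands without executing them
print(" ".join(build_command("myimage")))
print(" ".join(run_command("myimage")))
```

Calling build_and_run() from the app directory would execute both commands. Building the argument lists separately keeps them easy to inspect before anything is run.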

Analysis of a Dockerfile

A Dockerfile is a file that contains instructions for assembling a Docker image (tagged myimage in the build below):

# set base image (host OS)
FROM python:3.8

# set the working directory in the container
WORKDIR /code

# copy the dependencies file to the working directory
COPY requirements.txt .

# install dependencies
RUN pip install -r requirements.txt

# copy the content of the local src directory to the working directory
COPY src/ .

# command to run on container start
CMD [ "python", "./server.py" ]

The builder processes a Dockerfile instruction by instruction, generating an image layer for each one and stacking it on top of the previous layers.
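
To make the instruction-by-instruction view concrete, here is a small sketch that splits the Dockerfile above into its instructions, the same units docker build reports as steps. It is a simplification rather than Docker’s real parser: the instructions helper is a hypothetical name, and it ignores line continuations and parser directives.

```python
DOCKERFILE = """\
# set base image (host OS)
FROM python:3.8
# set the working directory in the container
WORKDIR /code
# copy the dependencies file to the working directory
COPY requirements.txt .
# install dependencies
RUN pip install -r requirements.txt
# copy the content of the local src directory to the working directory
COPY src/ .
# command to run on container start
CMD [ "python", "./server.py" ]
"""


def instructions(dockerfile_text):
    """Return (KEYWORD, arguments) pairs, skipping blank lines and comments."""
    steps = []
    for raw_line in dockerfile_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        keyword, _, arguments = line.partition(" ")
        steps.append((keyword.upper(), arguments.strip()))
    return steps


steps = instructions(DOCKERFILE)
for number, (keyword, arguments) in enumerate(steps, start=1):
    print(f"Step {number}/{len(steps)} : {keyword} {arguments}")
```

The comment lines do not count as instructions, which is why this Dockerfile yields six steps.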

In the output of the build command, we can see the Dockerfile instructions being executed as steps.

$ docker build -t myimage .
Sending build context to Docker daemon 6.144kB
Step 1/6 : FROM python:3.8
3.8.3-alpine: Pulling from library/python

Status: Downloaded newer image for python:3.8.3-alpine
---> 8ecf5a48c789
Step 2/6 : WORKDIR /code
---> Running in 9313cd5d834d
Removing intermediate container 9313cd5d834d
---> c852f099c2f9
Step 3/6 : COPY requirements.txt .
---> 2c375052ccd6
Step 4/6 : RUN pip install -r requirements.txt
---> Running in 3ee13f767d05

Removing intermediate container 3ee13f767d05
---> 8dd7f46dddf0
Step 5/6 : COPY ./src .
---> 6ab2d97e4aa1
Step 6/6 : CMD python server.py
---> Running in fbbbb21349be
Removing intermediate container fbbbb21349be
---> 27084556702b
Successfully built 70a92e92f3b5
Successfully tagged myimage:latest

The image is now in the local image store:

$ docker images
REPOSITORY   TAG      IMAGE ID       CREATED         SIZE
myimage      latest   70a92e92f3b5   8 seconds ago   991MB

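During development, we may need to rebuild the image for our Python service many times, and we want each rebuild to take as little time as possible. This is why the Dockerfile copies requirements.txt and installs the dependencies before copying src/: Docker can reuse the cached dependency layer when only the source code has changed.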
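
The effect of instruction order on rebuild time can be illustrated with a toy model of the build cache. This is only a sketch of the idea, not Docker’s actual implementation: it assumes each step’s cache key depends on the previous step’s key, the instruction text and, for COPY, the contents of the copied files. The function names and file contents are made up for the example.

```python
import hashlib


def layer_keys(steps, file_contents):
    """Chain one cache key per step: parent key + instruction + copied file contents."""
    keys = []
    parent = ""
    for keyword, arguments in steps:
        digest = hashlib.sha256()
        digest.update(parent.encode())
        digest.update(f"{keyword} {arguments}".encode())
        if keyword == "COPY":
            source = arguments.split()[0]
            digest.update(file_contents[source].encode())
        parent = digest.hexdigest()
        keys.append(parent)
    return keys


def rebuilt_steps(old_keys, new_keys):
    """1-based numbers of the steps whose cache key changed between two builds."""
    return [
        number
        for number, (old, new) in enumerate(zip(old_keys, new_keys), start=1)
        if old != new
    ]


STEPS = [
    ("FROM", "python:3.8"),
    ("WORKDIR", "/code"),
    ("COPY", "requirements.txt ."),
    ("RUN", "pip install -r requirements.txt"),
    ("COPY", "src/ ."),
    ("CMD", '[ "python", "./server.py" ]'),
]

# Only the source code changes between the two builds (placeholder contents)
before = layer_keys(STEPS, {"requirements.txt": "Flask==1.1.1", "src/": "server.py v1"})
after = layer_keys(STEPS, {"requirements.txt": "Flask==1.1.1", "src/": "server.py v2"})

print(rebuilt_steps(before, after))  # → [5, 6]
```

In this model, editing server.py invalidates only the last two steps, so the slow pip install step is served from the cache. If the source were copied before the dependencies were installed, every code change would force a reinstall.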

Note: Docker and virtualenv serve a similar purpose but work at different levels. Virtualenv only lets you switch between sets of Python dependencies; you are still tied to your host OS. With Docker, you can swap out the entire OS and install and run Python on any of them (think Ubuntu, Debian, Alpine, even Windows Server Core). So if you work in a team and want to future-proof your technology, use Docker. If that doesn’t matter to you, venv is fine, but remember it’s not future-proof. Please refer to this if you still want more information.
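
The difference can be checked from inside Python. The sketch below uses only the standard library; the function names are hypothetical, and the venv check relies on sys.prefix differing from sys.base_prefix, which is how the standard venv module and recent virtualenv releases behave.

```python
import platform
import sys


def in_virtualenv():
    """True when the interpreter is running inside a venv/virtualenv environment."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def environment_summary():
    """Collect what a virtual environment changes and what it leaves alone."""
    return {
        "python": platform.python_version(),
        "isolated_packages": in_virtualenv(),
        # A venv never changes this value; it is always the host OS
        "operating_system": platform.system(),
    }


print(environment_summary())
```

Run on a Mac, with or without a venv, operating_system stays Darwin. Run inside a container built from a Linux-based image, it reports Linux, whatever the host is.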

There you have it! We’ve shown how to containerise a Python service. Hopefully, this process makes things a lot easier and gives your project a longer shelf life, as it will be less likely to break when dependencies change.

Thanks for reading, and please let me know if you have any questions!

Keep up to date with my latest articles here!

Translated from: https://towardsdatascience.com/youre-living-in-1985-if-you-don-t-use-docker-for-your-data-science-projects-858264db0082
