如何用python32位開發詞典軟件_Python如何實現字典?

I was wondering how python dictionaries work under the hood, particularly the dynamic aspect?

When we create a dictionary, what is its initial size?

If we update it with a lot of elements, I suppose we need to enlarge the hash table. I suppose we need to recompute the hash function to adapt the size of the new bigger hash table while keeping a kind of logic with the previous hash table?

As you can see, I do not fully understand the internal of this structure.

解決方案When we create a dictionary what is its initial size?

As can be seen in the source code:

/* PyDict_MINSIZE is the starting size for any new dict.

* 8 allows dicts with no more than 5 active entries; experiments suggested

* this suffices for the majority of dicts (consisting mostly of usually-small

* dicts created to pass keyword arguments).

* Making this 8, rather than 4 reduces the number of resizes for most

* dictionaries, without any significant extra memory use.

*/

#define PyDict_MINSIZE 8

Imagine we update with a lot of key value pairs, i suppose we need to externe the hash table. I suppose we need to recompute the hash function to adapt the size of the new bigger hash table while keeping a kind of logic with the previous hash table....

CPython checks the hash table size every time we add a key. If the table is two-thirds full, it would resize the hash table by GROWTH_RATE (which is currently set to 3), and insert all elements:

/* GROWTH_RATE. Growth rate upon hitting maximum load.

* Currently set to used*3.

* This means that dicts double in size when growing without deletions,

* but have more head room when the number of deletions is on a par with the

* number of insertions. See also bpo-17563 and bpo-33205.

*

* GROWTH_RATE was set to used*4 up to version 3.2.

* GROWTH_RATE was set to used*2 in version 3.3.0

* GROWTH_RATE was set to used*2 + capacity/2 in 3.4.0-3.6.0.

*/

#define GROWTH_RATE(d) ((d)->ma_used*3)

The USABLE_FRACTION is the two thirds I mentioned above:

/* USABLE_FRACTION is the maximum dictionary load.

* Increasing this ratio makes dictionaries more dense resulting in more

* collisions. Decreasing it improves sparseness at the expense of spreading

* indices over more cache lines and at the cost of total memory consumed.

*

* USABLE_FRACTION must obey the following:

* (0 < USABLE_FRACTION(n) < n) for all n >= 2

*

* USABLE_FRACTION should be quick to calculate.

* Fractions around 1/2 to 2/3 seem to work well in practice.

*/

#define USABLE_FRACTION(n) (((n) << 1)/3)

Furthermore, the index calculation is:

i = (size_t)hash & mask;

where mask is HASH_TABLE_SIZE-1.

Here's how hash collisions are dealt:

perturb >>= PERTURB_SHIFT;

i = (i*5 + perturb + 1) & mask;

Explained in the source code:

The first half of collision resolution is to visit table indices via this

recurrence:

j = ((5*j) + 1) mod 2**i

For any initial j in range(2**i), repeating that 2**i times generates each

int in range(2**i) exactly once (see any text on random-number generation for

proof). By itself, this doesn't help much: like linear probing (setting

j += 1, or j -= 1, on each loop trip), it scans the table entries in a fixed

order. This would be bad, except that's not the only thing we do, and it's

actually *good* in the common cases where hash keys are consecutive. In an

example that's really too small to make this entirely clear, for a table of

size 2**3 the order of indices is:

0 -> 1 -> 6 -> 7 -> 4 -> 5 -> 2 -> 3 -> 0 [and here it's repeating]

If two things come in at index 5, the first place we look after is index 2,

not 6, so if another comes in at index 6 the collision at 5 didn't hurt it.

Linear probing is deadly in this case because there the fixed probe order

is the *same* as the order consecutive keys are likely to arrive. But it's

extremely unlikely hash codes will follow a 5*j+1 recurrence by accident,

and certain that consecutive hash codes do not.

The other half of the strategy is to get the other bits of the hash code

into play. This is done by initializing a (unsigned) vrbl "perturb" to the

full hash code, and changing the recurrence to:

perturb >>= PERTURB_SHIFT;

j = (5*j) + 1 + perturb;

use j % 2**i as the next table index;

Now the probe sequence depends (eventually) on every bit in the hash code,

and the pseudo-scrambling property of recurring on 5*j+1 is more valuable,

because it quickly magnifies small differences in the bits that didn't affect

the initial index. Note that because perturb is unsigned, if the recurrence

is executed often enough perturb eventually becomes and remains 0. At that

point (very rarely reached) the recurrence is on (just) 5*j+1 again, and

that's certain to find an empty slot eventually (since it generates every int

in range(2**i), and we make sure there's always at least one empty slot).

本文來自互聯網用戶投稿,該文觀點僅代表作者本人,不代表本站立場。本站僅提供信息存儲空間服務,不擁有所有權,不承擔相關法律責任。
如若轉載,請注明出處:http://www.pswp.cn/news/537311.shtml
繁體地址,請注明出處:http://hk.pswp.cn/news/537311.shtml
英文地址,請注明出處:http://en.pswp.cn/news/537311.shtml

如若內容造成侵權/違法違規/事實不符,請聯系多彩編程網進行投訴反饋email:809451989@qq.com,一經查實,立即刪除!

相關文章

信息系統項目管理師:軟件測試、調試及其管理

1&#xff0e;4&#xff0e;5軟件測試及其管理 1、軟件測試方法可分為靜態測試和動態測試。 靜態測試是指被測試程序不在機器上運行&#xff0c;而采用人工檢測和計算機輔助靜態分析的手段對程序進行檢測。靜態測試包括對文檔的靜態測試和對代碼的靜態測試。對文檔的靜態測試…

項目驗收材料整合流程

目標&#xff1a;多份word整合成一份項目驗收材料 第一步&#xff1a;編寫好word&#xff1b;準備好一份驗收材料的封面與目錄word 第二步&#xff1a;用WPS的word轉PDF&#xff0c;批量轉成PDF&#xff1b; 第三步&#xff1a;用Adobe Acrobat DC 合并轉成的多個PDF成為一個…

python調用接口獲取文件_python接口文件使用說明

首先&#xff0c;python接口文件在安裝好的darknet目錄下的python文件夾&#xff0c;打開就可以看到這里的darknet.py文件就是python接口用編輯器打開查看最后部分代碼&#xff1a;使用十分簡單&#xff0c;先將網絡配置加載進去&#xff0c;然后進行檢測就行了。但其實現在還不…

[譯]Kube Router Documentation

體系結構 Kube路由器是圍繞觀察者和控制器的概念而建立的。 觀察者使用Kubernetes監視API來獲取與創建&#xff0c;更新和刪除Kubernetes對象有關的事件的通知。 每個觀察者獲取與特定API對象相關的通知。 在從API服務器接收事件時&#xff0c;觀察者廣播事件。 控制器注冊以獲…

windows11 22H2資源管理器開啟多標簽頁

效果 步驟 windows11 22H2后續可能會推送該功能&#xff0c;現在是隱藏的&#xff0c;需要借助工具把這個隱藏功能開啟 工具&#xff1a;vivetool 下載&#xff1a;Releases thebookisclosed/ViVe GitHub 步驟1&#xff1a;右鍵開始菜單&#xff0c;選擇“終端&#xff08;…

python像素處理_Python 處理圖片像素點的實例

###在做爬蟲的時候有時需要識別驗證碼,但是驗證碼一般都有干擾物,這時需要對驗證碼進行預處理,效果如下:from PIL import Imageimport itertoolsimg Image.open(C:/img.jpg).convert(L) #打開圖片,convert圖像類型有L,RGBA# 轉化為黑白圖def blackWrite(img):blackXY []# 遍歷…

Mysql更改表名大小寫不敏感

編輯配置文件 vi /etc/my.cnf 在[mysqld]后添加添加 lower_case_table_names1 重啟服務 service mysqld stop service mysqld start 部署會遇到的問題&#xff1a; MySQL在Linux下數據庫名、表名、列名、別名大小寫規則是這樣的&#xff1a;   1、數據庫名與表名是嚴格區分大…

遇到“我覺得行才算行”的業主怎么辦?

目錄 案例 分析 案例 項目初期UI設計需求不確定,我們設計了幾稿,業主還是不滿意,沒有確定最終稿。后來呢,業主安排了一位內部的美工A過來。美工A給出了很多修改意見,我們根據美工A的意見進行了修改,又反反復復改了好幾版,最后業主不算滿意地確定了。 后來項目要收尾…

python讀取多個文件夾下所有txt_Python實現合并同一個文件夾下所有txt文件的方法示例...

本文實例講述了Python實現合并同一個文件夾下所有txt文件的方法。分享給大家供大家參考&#xff0c;具體如下&#xff1a;一、需求分析合并一個文件夾下所有txt文件二、合并效果三、python實現代碼# -*- coding:utf-8*-import sysreload(sys)sys.setdefaultencoding(utf-8)impo…

項目是臨時的,那項目組成員也是臨時的嗎?

在PMBOK定義項目屬性&#xff0c;“臨時性”是項目的三大屬性之一。 在“結束項目或階段”過程里的活動&#xff0c;重新分配人員&#xff1a;釋放團隊資源&#xff0c;在一些合同里面&#xff0c;項目結束后&#xff0c;需要給客戶提供培訓和一段時間的維護保修&#xff0c;那…

ceph安裝配置

簡介 ceph是一個開源分布式存儲系統&#xff0c;支持PB級別的存儲&#xff0c;支持對 象存儲&#xff0c;塊存儲和文件存儲&#xff0c;高性能&#xff0c;高可用&#xff0c;可擴展。 部署網絡建議架構圖 部署 部署架構圖&#xff0c;本次實驗部署jewel版本 實驗環境的Vagrant…

推薦好用的JavaScript模塊

2019獨角獸企業重金招聘Python工程師標準>>> 譯者按&#xff1a; 作者將自己常用的JavaScript模塊分享給大家。 原文&#xff1a;? JavaScript Modules Worth Using ?譯者: Fundebug為了保證可讀性&#xff0c;本文采用意譯而非直譯。另外&#xff0c;本文版權歸原…

python直接連接oracle_python連接oracle

一&#xff1a;弄清版本&#xff0c;最重要&#xff01;&#xff01;&#xff01;首先安裝配置時&#xff0c;必須把握一個點&#xff0c;就是版本一致&#xff01;包括&#xff1a;系統版本&#xff0c;python版本&#xff0c;oracle客戶端的版本&#xff0c;cx_Oracle的版本&…

項目計劃不要拖,要趕緊排

目錄 案例 復盤 應對 總結 案例 業主:這個項目很急,趕緊干活吧,明天就安排人來干活。 于是,項目經理問公司要來資源,第二天就投入到項目里。 公司只有一個項目,這樣搞,項目能順利實施,業主滿意,公司老板感覺這種方法不錯哦。 當公司項目越來越多了,員工也越來…

select函數_SQL高級功能:窗口函數

一、窗口函數有什么用&#xff1f;在日常生活中&#xff0c;經常會遇到需要在每組內排名&#xff0c;比如下面的業務需求&#xff1a;排名問題&#xff1a;每個部門按業績來排名topN問題&#xff1a;找出每個部門排名前N的員工進行獎勵面對這類需求&#xff0c;就需要使用sql的…

客戶端C++與前端js交互

客戶端與前端交互 qwebchannel.js文件引入建立通信// c發送消息給js new QWebChannel(qt.webChannelTransport, function(channel){var content channel.objects.jsContext;// 建立通信后&#xff0c;客戶端通過調用 sMsg 方法來執行后面的回調函數&#xff0c;從而實現c與j…

python動態映射_sqlalchemy動態映射

似乎您可以直接使用屬性&#xff0c;而不是使用columnsdict。考慮以下設置&#xff1a;from sqlalchemy import Table, Column, Integer, Unicode, MetaData, create_enginefrom sqlalchemy.orm import mapper, create_sessionclass Word(object):passwordColumns [english, k…

linux外接顯示屏,關掉本身的筆記本電腦

https://blog.csdn.net/a2020883119/article/details/79561035 先用xrandr命令查看&#xff1a; eDP-1 connected eDP-1是連接著的 關掉&#xff1a;sudo xrandr --output eDP-1 --off 打開&#xff1a;sudo xrandr --output eDP-1 --auto

發揮項目“臨時性”威力,讓項目順利實施

所謂臨時性,就是要有明確的“開始”和“結束”。雖然大家都知道項目一定會有開始和結束的,但要更多地關注“明確“。 問題1:問商務(售前)或業主,這個項目什么時候結束? 答:商務或業主他們有時候也不知道,因為國內的項目大多數是提前開始交付,是一邊交付,一邊把里程…

上拉加載更多后臺數據_6-7【微信小程序全棧開發課程】記錄頁面(七)--分頁加載記錄數據...

現在是一次性加載所有的記錄數據&#xff0c;數據多的時候&#xff0c;會加載比較慢&#xff0c;所以我們改成分頁加載&#xff0c;一次最多加載15條數據每次拉倒底部都會自動加載下一頁的數據&#xff0c;知道所有的數據加載完成1、添加data變量編輯record.vue文件&#xff0c…