linux uniq命令_如何在Linux上使用uniq命令

linux uniq命令

linux uniq命令

A shell prompt on a Linux computer.
Fatmawati Achmad Zaenuri/ShutterstockFatmawati Achmad Zaenuri / Shutterstock

The Linux uniq command whips through your text files looking for unique or duplicate lines. In this guide, we cover its versatility and features, as well as how you can make the most of this nifty utility.

Linux uniq命令在您的文本文件中快速查找唯一或重復的行。 在本指南中,我們介紹了它的多功能性和功能,以及如何充分利用這個漂亮的實用程序。

在Linux上查找文本的匹配行 (Finding Matching Lines of Text on Linux)

The uniq command is fast, flexible, and great at what it does. However, like many Linux commands, it has a few quirks—which is fine, as long as you know about them. If you take the plunge without a bit of insider know-how, you could well be left scratching your head at the results. We’ll point out these quirks as we go.

uniq命令快速,靈活,并且功能強大。 但是,就像許多Linux命令一樣,它也有一些怪癖-只要您了解它們就可以了。 如果您在沒有任何內幕專業知識的情況下進行嘗試,那么您很可能會為結果感到撓頭。 我們將指出這些怪癖。

The uniq command is perfect for those in the single-minded, designed-to-do-one-thing-and-do-it-well camp. That’s why it’s also particularly well-suited to work with pipes and play its part in command pipelines. One of its most frequent collaborators is sort?because uniq?has to have sorted input on which to work.

uniq命令非常適合那些專一,一勞永逸的人。 這就是為什么它也特別適合與管道一起工作并在命令管道中發揮作用的原因。 它的一個的最頻繁的合作者是sort因為uniq必須有排序的文件在其上工作。

Let’s fire it up!

讓我們點火!

不帶選項運行uniq (Running uniq with No Options)

We’ve got a text file that contains the lyrics to Robert Johnson’s song I Believe I’ll Dust My Broom. Let’s see what uniq makes of it.

我們有一個文本文件,其中包含羅伯特·約翰遜(Robert Johnson)的歌曲《我相信我會除塵我的掃帚》的歌詞。 讓我們看看uniq是如何構成的。

We’ll type the following to pipe the output into less:

我們將鍵入以下內容以將輸出傳遞到less

uniq dust-my-broom.txt | less
The "uniq dust-my-broom.txt | less" command in a terminal window.

We get the entire song, including duplicate lines, in?less:

我們得到的整首歌曲,包括重復的線條, less

The output from the "uniq dust-my-broom.txt | less" command in less in a terminal window.

That doesn’t seem to be either the unique lines nor the duplicate lines.

似乎既不是唯一的行也不是重復的行。

Right—because this is the first quirk. If you run uniq with no options, it behaves as though you used the -u (unique lines) option. This tells uniq to print only the unique lines from the file. The reason you see duplicate lines is because, for uniq?to consider a line a duplicate, it must be adjacent to its duplicate, which is where sort comes in.

正確-因為這是第一個怪癖。 如果運行不帶任何選項的uniq ,它的行為就好像您使用了-u (唯一行)選項一樣。 這告訴uniq僅打印文件中的唯一行。 之所以會看到重復的行,是因為,為了使uniq認為行是重復的,它必須與它的重復相鄰,這就是sort來源。

When we sort the file, it groups the duplicate lines, and uniq?treats them as duplicates. We’ll use sort?on the file, pipe the sorted output into uniq, and then pipe the final output into less.

當我們對文件進行排序時,它會將重復的行進行分組,并且uniq將它們視為重復的行。 我們將對文件使用sort ,將排序后的輸出uniquniq ,然后將最終輸出uniqless

To do so, we type the following:

為此,我們鍵入以下內容:

sort dust-my-broom.txt | uniq | less
The "sort dust-my-broom.txt | uniq | less" command in a terminal window.

A sorted list of lines appears in less.

行的排序列表出現在less

Output from sort dust-my-broom.txt | uniq | less in less in a terminal window

The line, “I believe I’ll dust my broom,” definitely appears in the song more than once. In fact, it’s repeated twice within the first four lines of the song.

這首歌“我相信我會把掃帚除塵”肯定在歌曲中多次出現。 實際上,它在歌曲的前四行中重復了兩次。

So, why is it showing up in a list of unique lines? Because the first time a line appears in the file, it’s unique; only the subsequent entries are duplicates. You can think of it as listing the first occurrence of each unique line.

那么,為什么它顯示在唯一行列表中? 因為第一次在文件中出現一行,所以它是唯一的。 僅后續條目重復。 您可以將其視為列出每個唯一行的第一個匹配項。

Let’s use sort again and redirect the output into a new file. This way, we don’t have to use sort in every command.

讓我們再次使用sort并將輸出重定向到一個新文件。 這樣,我們不必在每個命令中都使用sort

We type the following command:

我們輸入以下命令:

sort dust-my-broom.txt > sorted.txt
The "sort dust-my-broom.txt > sorted.txt" command in a terminal window.

Now, we have a presorted file to work with.

現在,我們有了一個預分類的文件可以使用。

計數重復 (Counting Duplicates)

You can use the -c (count) option to print the number of times each line appears in a file.

您可以使用-c (計數)選項來打印每行出現在文件中的次數。

Type the following command:

鍵入以下命令:

uniq -c sorted.txt | less
The "uniq -c sorted.txt | less" command in a terminal window.

Each line begins with the number of times that line appears in the file. However, you’ll notice the first line is blank. This tells you there are five blank lines in the file.

每行以該行出現在文件中的次數開頭。 但是,您會注意到第一行是空白。 這告訴您文件中有五個空白行。

Output from the "uniq -c sorted.txt | less" command in less in a terminal window.

If you want the output sorted in numerical order, you can feed the output from uniq into sort. In our example, we’ll use the -r (reverse) and?-n (numeric sort) options, and pipe the results into less.

如果要按數字順序對輸出進行排序,可以將uniq的輸出輸入sort 。 在我們的示例中,我們將使用-r (反向)和-n (數字排序)選項,并將結果傳遞給less

We type the following:

我們輸入以下內容:

uniq -c sorted.txt | sort -rn | less
The "uniq -c sorted.txt | sort -rn | less" command in a terminal window.

The list is sorted in descending order based on the frequency of each line’s appearance.

該列表根據每行出現的頻率以降序排列。

Output from uniq -c sorted.txt | sort -rn | less in less in a terminal window

只列出重復的行 (Listing Only Duplicate Lines)

If you want to see only the lines that are repeated in a file, you can use the -d (repeated) option. No matter how many times a line is duplicated in a file, it’s listed only once.

如果只想查看文件中重復的行,則可以使用-d (重復)選項。 無論文件中一行被重復多少次,它只會被列出一次。

To use this option, we type the following:

要使用此選項,我們輸入以下內容:

uniq -d sorted.txt
The "uniq -d sorted.txt" command in a terminal window.

The duplicated lines are listed for us. You’ll notice the blank line at the top, which means the file contains duplicate blank lines—it isn’t a space left by uniq to cosmetically offset the listing.

為我們列出了重復的行。 您會注意到頂部的空白行,這意味著該文件包含重復的空白行-這不是uniq用來修飾列表的空白。

Output from the "uniq -d sorted.txt" command in a terminal window.

We can also combine the -d (repeated) and -c (count) options and pipe the output through sort. This gives us a sorted list of the lines that appear at least twice.

我們還可以結合使用-d (重復)和-c (計數)選項,并通過sort傳遞輸出。 這給了我們至少出現兩次的行的排序列表。

Type the following to use this option:

輸入以下內容以使用此選項:

uniq -d -c sorted.txt | sort -rn
The "uniq -d -c sorted.txt | sort -rn" command in a terminal window.

列出所有重復的行 (Listing All Duplicated Lines)

If you want to see a list of every duplicated line, as well as an entry for each time a line appears in the file, you can use the -D (all duplicate lines) option.

如果要查看每個重復行的列表,以及每次在文件中出現一行的條目,則可以使用-D (所有重復行)選項。

To use this option, you type the following:

要使用此選項,請鍵入以下內容:

uniq -D sorted.txt | less
The "uniq -D sorted.txt | less" command in a terminal window.

The listing contains an entry for each duplicated line.

該清單包含每個重復行的條目。

Output from uniq -D sorted.txt | less in less in a terminal window

If you use the --group?option, it prints every duplicated line with a blank line either before (prepend) or after each group (append), or both before and after (both) each group.

如果使用--group選項,它將在每個組之前( prepend )或之后( append ),或在每個組之前和之后( both )都在每行重復的行上打印空白行。

We’re using append?as our modifier, so we type the following:

我們使用append作為修飾符,因此我們輸入以下內容:

uniq --group=append sorted.txt | less
The "uniq --group=append sorted.txt | less" command in a terminal window.

The groups are separated by blank lines to make them easier to read.

這些組用空行分隔,以使它們更易于閱讀。

Output from the "uniq --group=append sorted.txt | less" command in less in a terminal window.

檢查一定數量的字符 (Checking a Certain Number of Characters)

By default, uniq checks the entire length of each line. If you want to restrict the checks to a certain number of characters, however, you can use the -w (check chars) option.

默認情況下, uniq檢查每行的整個長度。 但是,如果要將檢查限制為一定數量的字符,則可以使用-w (檢查字符)選項。

In this example, we’ll repeat the last command, but limit the comparisons to the first three characters. To do so, we type the following command:

在此示例中,我們將重復最后一個命令,但將比較限制為前三個字符。 為此,我們鍵入以下命令:

uniq -w 3 --group=append sorted.txt | less
The "uniq -w 3 --group=append sorted.txt | less" command in a terminal window.

The results and groupings we receive are quite different.

我們收到的結果和分組是完全不同的。

Output from the "uniq -w 3 --group=append sorted.txt | less" command in a terminal window.

All lines that start with “I b” are grouped together because those portions of the lines are identical, so they’re considered to be duplicates.

所有以“ I b”開頭的行都被分組在一起,因為這些行的那些部分是相同的,因此它們被認為是重復的。

Likewise, all lines that start with “I’m” are treated as duplicates, even if the rest of the text is different.

同樣,以“ I'm”開頭的所有行都被視為重復行,即使文本的其余部分不同。

忽略一定數量的字符 (Ignoring a Certain Number of Characters)

There are some cases in which it might be beneficial to skip a certain number of characters at the beginning of each line, such as when lines in a file are numbered. Or, say you need uniq to jump over a timestamp and start checking the lines from character six instead of from the first character.

在某些情況下,在每一行的開頭略過一定數量的字符可能是有益的,例如對文件中的行進行編號時。 或者,說您需要uniq跳過時間戳并開始檢查從字符6而不是從第一個字符開始的行。

Below is a version of our sorted file with numbered lines.

下面是帶有編號行的已排序文件的版本。

A numbered and sorted file of duplicate lines in less in a terminal window.

If we want?uniq to start its comparison checks at character three, we can use the -s (skip chars) option by typing the following:

如果我們希望uniq在第三個字符處開始比較檢查,則可以通過鍵入以下命令使用-s (跳過字符)選項:

uniq -s 3 -d -c numbered.txt
The "uniq -s 3 -d -c numbered.txt" command in a terminal window.

The lines are detected as duplicates and counted correctly. Notice the line numbers displayed are those of the first occurrence of each duplicate.

將這些行檢測為重復行并正確計數。 注意,顯示的行號是每個重復項第一次出現的行號。

You can also skip fields (a run of characters and some white space) instead of characters.?We’ll use the -f (fields) option to tell uniq which fields to ignore.

您也可以跳過字段(一系列字符和一些空白)來代替字符。 我們將使用-f (字段)選項來告訴uniq要忽略哪些字段。

We type the following to tell uniq to ignore the first field:

我們輸入以下內容告訴uniq忽略第一個字段:

uniq -f 1 -d -c? numbered.txt
The "uniq -f 1 -d -c? numbered.txt" command in a terminal window.

We get the same results we did when we told?uniq to skip three characters at the start of each line.

當我們告訴uniq在每一行的開頭跳過三個字符時,我們得到的結果相同。

忽略大小寫 (Ignoring Case)

By default,?uniq is case-sensitive. If the same letter appears capped and in lowercase, uniq?considers the lines to be different.

默認情況下, uniq區分大小寫。 如果相同的字母顯示為大寫且小寫,則uniq認為行是不同的。

For example, check out the output from the following command:

例如,檢查以下命令的輸出:

uniq -d -c sorted.txt | sort -rn
The "uniq -d -c sorted.txt | sort -rn" command and output in a terminal window.

The lines “I Believe I’ll dust my broom” and?“I believe I’ll dust my broom” aren’t treated as duplicates because of the difference in case on the “B” in “believe.”

由于“相信”中“ B”的大小寫不同,因此“我相信我會掃帚”和“我相信我會掃帚”這行不會被重復。

If we include the -i (ignore case) option, though, these lines will be treated as duplicates. We type the following:

但是,如果我們包括-i (忽略大小寫)選項,則這些行將被視為重復行。 我們輸入以下內容:

uniq -d -c -i sorted.txt | sort -rn
The "uniq -d -c -i sorted.txt | sort -rn" command in a terminal window.

The lines are now treated as duplicates and grouped together.

現在,這些行被視為重復行并分組在一起。



Linux puts a multitude of special utilities at your disposal. Like many of them, uniq isn’t a tool you’ll use every day.

Linux提供了許多特殊實用程序供您使用。 像其中許多工具一樣, uniq并不是您每天都會使用的工具。

That’s why a big part of becoming proficient in Linux is remembering which tool will solve your current problem, and where you can find it again. If you practice, though, you’ll be well on your way.

這就是為什么精通Linux的很大一部分原因是要記住哪種工具可以解決您當前的問題,以及在哪里可以找到它。 但是,如果您練習的話,將會很順利。

Or, you can always just search?How-To Geek—we probably have an article on it.

或者,您始終可以只搜索How-To Geek-我們可能在上面有一篇文章。

翻譯自: https://www.howtogeek.com/533406/how-to-use-the-uniq-command-on-linux/

linux uniq命令

本文來自互聯網用戶投稿,該文觀點僅代表作者本人,不代表本站立場。本站僅提供信息存儲空間服務,不擁有所有權,不承擔相關法律責任。
如若轉載,請注明出處:http://www.pswp.cn/news/278353.shtml
繁體地址,請注明出處:http://hk.pswp.cn/news/278353.shtml
英文地址,請注明出處:http://en.pswp.cn/news/278353.shtml

如若內容造成侵權/違法違規/事實不符,請聯系多彩編程網進行投訴反饋email:809451989@qq.com,一經查實,立即刪除!

相關文章

解決 display 和 transition 沖突的問題

問題: 既需要“顯示、隱藏”’效果,也需要動畫效果。此時使用了xxx.style.display "none / block" 之后,我們發現 transition 動畫效果就沒有了。 解決辦法一:用定時器(這種方法并不好) btn2.on…

win10任務欄和開始菜單_如何將網站固定到Windows 10任務欄或開始菜單

win10任務欄和開始菜單Having quick access to frequently-used or hard to remember websites can save you time and frustration. Whether you use Chrome, Firefox, or Edge, you can add a shortcut to any site right to your Windows 10 taskbar or Start menu. 快速訪問…

智能家居的尷尬:概念比用戶火

智能家居概念的走俏與用戶的接受程度成鮮明的對比,如何才能撬開這個市場,這是整個行業都需要思考的問題。 追溯起源,智能家居已經有20年的歷史,但由于技術缺陷、價格昂貴,實用性差、安裝復雜及產品同質化嚴重等原因&a…

WEB_矛盾

題目鏈接:http://123.206.87.240:8002/get/index1.php 題解: 打開題目,看題目信息,本題首先要弄清楚 is_numeric() 函數的作用 作用如下圖: 即想要輸出flag,num既不能是數字字符,不能為數1&…

如何在Windows上解決藍牙問題

Bluetooth gives you the freedom to move without a tether, but it isn’t always the most reliable way to use wireless devices. If you’re having trouble with Bluetooth on your Windows machine, you can follow the steps below to troubleshoot it. 藍牙使您可以不…

Multicast注冊中心

1234提供方啟動時廣播自己的地址。   消費方啟動時廣播訂閱請求。   提供方收到訂閱請求時&#xff0c;單播自己的地址給訂閱者&#xff0c;如果設置了unicastfalse&#xff0c;則廣播給訂閱者。   消費方收到提供方地址時&#xff0c;連接該地址進行RPC調用。 <du…

阻止a鏈接跳轉方法總結

總結下a標簽阻止默認行為的幾種簡單方法(1) <a href"javascript:void(0);" > 點我 </a> onclick方法負責執行js函數&#xff0c;而void是一個操作符&#xff0c;void(0)返回undefined&#xff0c;地址不發生跳轉。 <a href"javascript:;&qu…

美味奇緣_輕松訪問和管理您的美味書簽

美味奇緣Looking for an easy way to access and manage your Delicious Bookmarks collection with minimal UI impact? Now you can with SimpleDelicious for Firefox. 是否正在尋找一種簡單的方法來訪問和管理您的Delicious Bookmarks收藏&#xff0c;而對UI的影響最小&am…

談談如何使用Netty開發實現高性能的RPC服務器

RPC&#xff08;Remote Procedure Call Protocol&#xff09;遠程過程調用協議&#xff0c;它是一種通過網絡&#xff0c;從遠程計算機程序上請求服務&#xff0c;而不必了解底層網絡技術的協議。說的再直白一點&#xff0c;就是客戶端在不必知道調用細節的前提之下&#xff0c…

寒假萬惡之源3:抓老鼠啊~虧了還是賺了?

1.代碼&#xff1a; #include<iostream>using namespace std;int main(){ char a/*操作*/; int i/*計數工具*/,b0/*老鼠會開心幾天*/; int e/*正常的來*/,f/*老鼠會悲傷幾天*/; int c1/*老鼠來不來*/,d0/*奶酪數目*/,g0/*老鼠數目*/; for (i1;;i) { …

在Firefox中結合Wolfram Alpha和Google搜索結果

Do you wish there was a way to combine all that Wolfram Alpha and Google goodness together when you search for something? Now you can with the Wolfram Alpha Google extension for Firefox. 您是否希望有一種方法可以在搜索某些內容時將Wolfram Alpha和Google的所有…

Docker容器中開始.NETCore之路

一、引言  開始寫這篇博客前&#xff0c;已經嘗試練習過好多次Docker環境安裝,.Net Core環境安裝了&#xff0c;在這里替騰訊云做一個推廣,假如我們想學習、練手.net core 或是Docker卻苦于沒有開發環境&#xff0c;服務器也不想買&#xff0c;那么我們可以使用騰訊云提供的開…

分布式的數據一致性

一.前序 數據的一致性和系統的性能是每個分布式系統都需要考慮和權衡的問題。一致性的級別如下&#xff1a;1.強一致性這種一致性級別是最符合用戶直覺的&#xff0c;它要求系統寫入什么&#xff0c;讀出來的也會是什么&#xff0c;用戶體驗好&#xff0c;但實現起來往往對系統…

kompozer如何啟動_使用KompoZer創建網站

kompozer如何啟動Are you looking for a way to easily start creating your own webpages? KompoZer is a nice basic website editor that will allow you to quickly get started and become familiar with the process. 您是否正在尋找一種輕松創建自己的網頁的方法&#…

我也說說宏定義likely()和unlikely()

作者&#xff1a;gfree.windgmail.com 博客&#xff1a;blog.focus-linux.net linuxfocus.blog.chinaunix.net 本文的copyleft歸gfree.windgmail.com所有&#xff0c;使用GPL發布&#xff0c;可以自由拷貝&#xff0c;轉載。但轉載請保持文檔的完整性&#xff0c;注明原作者及…

圖片懶加載與預加載

預加載 常用的是new Image();&#xff0c;設置其src來實現預載&#xff0c;再使用onload方法回調預載完成事件。function loadImage(url, callback) {var img new Image(); //創建一個Image對象&#xff0c;實現圖片的預下載img.src url;if (img.complete){ // 如果圖片已經存…

電腦pin重置_如果忘記了如何重置Windows PIN

電腦pin重置A good password or PIN is difficult to crack but can be difficult to remember. If you forgot or lost your Windows login PIN, you won’t be able to retrieve it, but you can change it. Here’s how. 好的密碼或PIN很難破解&#xff0c;但很難記住。 如果…

android.support不統一的問題

今天supprt28遇到的問題&#xff0c;由于28還是預覽版&#xff0c;還存在一些bug 都是因為如果程序內出現不同的&#xff0c;support或者其他外部引用庫的多個版本&#xff0c;Gradle在進行合并的時候會使用本地持有的&#xff0c;最高版本的來進行編譯&#xff0c;所以25的sup…

輕松查看Internet Explorer緩存文件

Sometimes you may need a quick and easy way to access Internet Explorer’s cache. Today we take a look at IECacheView which is a great application to get the job done. 有時&#xff0c;您可能需要一種快速簡便的方法來訪問Internet Explorer的緩存。 今天&#xf…

洛谷P1019 單詞接龍

題目描述 單詞接龍是一個與我們經常玩的成語接龍相類似的游戲&#xff0c;現在我們已知一組單詞&#xff0c;且給定一個開頭的字母&#xff0c;要求出以這個字母開頭的最長的“龍”&#xff08;每個單詞都最多在“龍”中出現兩次&#xff09;&#xff0c;在兩個單詞相連時&…