一個單詞搜索的相關度排名,這個例子演示了一個單詞搜索的相關度排名計算。
mysql>?CREATE?TABLE?articles?(
->?id?INT?UNSIGNED?AUTO_INCREMENT?NOT?NULL?PRIMARY?KEY,
->?title?VARCHAR(200),
->?body?TEXT,
->?FULLTEXT?(title,body)
->?)?ENGINE=InnoDB;
Query?OK,?0?rows?affected?(0.56?sec)
mysql>?INSERT?INTO?articles?(title,body)?VALUES
->?('MySQL?Tutorial','This?database?tutorial?...'),
->?("How?To?Use?MySQL",'After?you?went?through?a?...'),
->?('Optimizing?Your?Database','In?this?database?tutorial?...'),
->?('MySQL?vs.?YourSQL','When?comparing?databases?...'),
->?('MySQL?Security','When?configured?properly,?MySQL?...'),
->?('Database,?Database,?Database','database?database?database'),
->?('1001?MySQL?Tricks','1.?Never?run?mysqld?as?root.?2.?...'),
->?('MySQL?Full-Text?Indexes',?'MySQL?fulltext?indexes?use?a?..');
Query?OK,?8?rows?affected?(0.02?sec)
Records:?8??Duplicates:?0??Warnings:?0
mysql>?SELECT?id,?title,?body,?MATCH?(title,body)?AGAINST?('database'?IN?BOOLEAN?MODE)?AS?score?FROM?articles?ORDER?BY?score?DESC;
+----+------------------------------+-------------------------------------+---------------------+
|?id?|?title????????????????????????|?body????????????????????????????????|?score???????????????|
+----+------------------------------+-------------------------------------+---------------------+
|??6?|?Database,?Database,?Database?|?database?database?database??????????|??1.0886961221694946?|
|??3?|?Optimizing?Your?Database?????|?In?this?database?tutorial?...???????|?0.36289870738983154?|
|??1?|?MySQL?Tutorial???????????????|?This?database?tutorial?...??????????|?0.18144935369491577?|
|??2?|?How?To?Use?MySQL?????????????|?After?you?went?through?a?...????????|???????????????????0?|
|??4?|?MySQL?vs.?YourSQL????????????|?When?comparing?databases?...????????|???????????????????0?|
|??5?|?MySQL?Security???????????????|?When?configured?properly,?MySQL?...?|???????????????????0?|
|??7?|?1001?MySQL?Tricks????????????|?1.?Never?run?mysqld?as?root.?2.?...?|???????????????????0?|
|??8?|?MySQL?Full-Text?Indexes??????|?MySQL?fulltext?indexes?use?a?..?????|???????????????????0?|
+----+------------------------------+-------------------------------------+---------------------+
8?rows?in?set?(0.00?sec)
總共有8條記錄,其中3條與數據庫搜索詞匹配。第一條記錄(id 6)包含搜索詞6次,相關度排名為1.0886961221694946。這個排名值計算使用TF的價值6(數據庫搜索詞出現6次記錄id 6)和IDF值為0.42596873216370745,計算如下(8是記錄的總數和3是包含搜索詞的記錄數量)
${IDF}?=?log10(?8?/?3?)?=?0.42596873216370745
mysql>?select?log10(?8?/?3?)?;
+---------------------+
|?log10(?8?/?3?)??????|
+---------------------+
|?0.42596873216370745?|
+---------------------+
1?row?in?set?(0.00?sec)
然后將TF和IDF值輸入到排名公式中
${rank}?=?${TF}?*?${IDF}?*?${IDF}
在MySQL命令行客戶端執行計算將返回一個排名值1.088696164686938。
mysql>?SELECT?6*log10(8/3)*log10(8/3);
+-------------------------+
|?6*log10(8/3)*log10(8/3)?|
+-------------------------+
|???????1.088696164686938?|
+-------------------------+
1?row?in?set?(0.00?sec)
您可能會注意到SELECT ... MATCH ... AGAINST語句和MySQL命令行客戶端所計算的排名值有差別(1.0886961221694946對1.088696164686938)。區別在于InnoDB內部是如何執行整數和浮點數/雙精度類型轉換的(以及相關的精度和四舍五的決定),以及它們在其他地方是如何執行的,比如在MySQL命令行客戶端或其他類型的計算器中。
多詞搜索的相關度排名
這個示例演示了基于前面示例中使用的articles表和數據計算多單詞全文搜索的相關度排名。
如果你搜索的是一個以上的單詞,那么相關度排名值就是每個單詞相關度排名值的總和,如下公式所示:
${rank}?=?${TF}?*?${IDF}?*?${IDF}?+?${TF}?*?${IDF}?*?${IDF}
執行兩個搜索詞('mysql tutorial')搜索將返回以下結果:
mysql>?SELECT?id,?title,?body,?MATCH?(title,body)?AGAINST?('mysql?tutorial'?IN?BOOLEAN?MODE)?AS?score?FROM?articles?ORDER?BY?score?DESC;
+----+------------------------------+-------------------------------------+----------------------+
|?id?|?title????????????????????????|?body????????????????????????????????|?score????????????????|
+----+------------------------------+-------------------------------------+----------------------+
|??1?|?MySQL?Tutorial???????????????|?This?database?tutorial?...??????????|???0.7405621409416199?|
|??3?|?Optimizing?Your?Database?????|?In?this?database?tutorial?...???????|???0.3624762296676636?|
|??5?|?MySQL?Security???????????????|?When?configured?properly,?MySQL?...?|?0.031219376251101494?|
|??8?|?MySQL?Full-Text?Indexes??????|?MySQL?fulltext?indexes?use?a?..?????|?0.031219376251101494?|
|??2?|?How?To?Use?MySQL?????????????|?After?you?went?through?a?...????????|?0.015609688125550747?|
|??4?|?MySQL?vs.?YourSQL????????????|?When?comparing?databases?...????????|?0.015609688125550747?|
|??7?|?1001?MySQL?Tricks????????????|?1.?Never?run?mysqld?as?root.?2.?...?|?0.015609688125550747?|
|??6?|?Database,?Database,?Database?|?database?database?database??????????|????????????????????0?|
+----+------------------------------+-------------------------------------+----------------------+
8?rows?in?set?(0.00?sec)
在第一條記錄(id 1)中,“mysql”出現一次,“tutorial”出現兩次。“mysql”有六條匹配記錄,“tutorial”有兩條匹配記錄。當將這些值插入到用于多個單詞搜索的排名公式中時,MySQL命令行客戶端返回預期的排名值
mysql>?SELECT?(1*log10(8/6)*log10(8/6))?+?(2*log10(8/2)*log10(8/2));
+-------------------------------------------------------+
|?(1*log10(8/6)*log10(8/6))?+?(2*log10(8/2)*log10(8/2))?|
+-------------------------------------------------------+
|????????????????????????????????????0.7405621541938003?|
+-------------------------------------------------------+
1?row?in?set?(0.00?sec)
與單個單詞搜索一樣,使用select ... match ... against語句和MySQL命令行工具執行的結果有差