using mysql from xaamp v8.1.10 along with laravel v5.5.3.
Full text Search Index enabled in table 'ex1' for column 'des' and 'ft_min_word_len = 1'.
laravel query:
$string='pink item product';
$result = ex::active()->selectRaw("*,match(des) against('$string' in boolean mode) as relevance")->whereRaw("match(des) against('$string' in boolean mode)")->orderBy('relevance', 'desc')->get();
I am getting the result as
( results are truncated )
id | Relevance | des |
---|---|---|
64 | 1.3543391227722 | amazing pink petite lace item is item. |
182 | 1.3543391227722 | this pink tall polyester item is item. |
598 | 1.3543391227722 | which pink large polyester item is item. |
653 | 1.3543391227722 | amazing pink plus size linen item is item. |
808 | 1.3543391227722 | what pink tall cashmere item is item. |
19 | 1.344024181366 | different pink large silk product is item. |
134 | 1.344024181366 | that pink large wool product is item. |
140 | 1.344024181366 | this pink tall lace item is product. |
233 | 1.344024181366 | new pink extra-small lace item is product. |
273 | 1.344024181366 | this pink tall cashmere item is product. |
625 | 1.344024181366 | amazing pink medium lace product is item. |
893 | 1.344024181366 | pink extra-small leather item is product. |
908 | 1.344024181366 | which pink tall lace product is item. |
992 | 1.344024181366 | this pink large cotton product is item. |
21 | 1.3337094783783 | pink extra-small cashmere product is product. |
My question is :
id | Relevance | des |
---|---|---|
64 | 1.3543391227722 | amazing pink petite lace item is item. |
21 | 1.3337094783783 | that pink extra-small cashmere product is product. |
though having the same number of occurences:
column 64 has : item-2 times, pink-1 time and column 21 has : product-2 times pink-1 time.
Initial Thoughts
I thought the calculation is like this:
(refrence from MySQL FUll Text Search Doc, which says:)
${rank} = ${TF} * ${IDF} * ${IDF} + ${TF} * ${IDF} * ${IDF}
SELECT (1*log10(8/6)*log10(8/6)) + (2*log10(8/2)*log10(8/2));
which is :
(number of occurences of word 1* log10(total records in table/total matching records where the word is found) + (number of occurences of word 2* log10(total records in table/total matching records where the word is found) + ... + (number of occurences of word n * log10(total records in table/total matching records where the word n is found)
my question is how is the relevance calulated ?
also, even though column 64 and column 182 has same set of occureneces, both has pink-1 time and item-2 times, the relevance for column 64 is higher. why ?
initially i though this was based on row number in database, the order in which it was created. But in that case, my first question should be invalid as column 21 should come before column 64. I cant understand how it is calculated.
i did calculation this far with MySQL doc on full text search and i cant find a documnetation covering this kind of scenario.
Thanks in Advance.
via Chebli Mohamed
Aucun commentaire:
Enregistrer un commentaire