[mod] rerank plugin: hide if bm25s is not installed & add to docs
This commit is contained in:
parent
9f9ae5427d
commit
7fa0fbd93d
9
docs/src/searx.plugins.rerank.rst
Normal file
9
docs/src/searx.plugins.rerank.rst
Normal file
@ -0,0 +1,9 @@
|
|||||||
|
.. _rerank plugin:
|
||||||
|
|
||||||
|
================
|
||||||
|
Rerank plugin
|
||||||
|
================
|
||||||
|
|
||||||
|
.. automodule:: searx.plugins.rerank
|
||||||
|
:members:
|
||||||
|
|
@ -1,8 +1,14 @@
|
|||||||
# SPDX-License-Identifier: AGPL-3.0-or-later
|
# SPDX-License-Identifier: AGPL-3.0-or-later
|
||||||
"""Plugin which reranks the search results using the Okapi BM25 algorithm.
|
"""Plugin which reranks the search results using the Okapi BM25 algorithm.
|
||||||
Before enabling the Rerank plugin, you must the install the pip package ``bm25s``.
|
|
||||||
|
|
||||||
Enable in ``settings.yml``:
|
This plugin utilizes the `bm25s` library to reorder search results based on their relevance to the search query,
|
||||||
|
potentially improving the quality of results. Before enabling this plugin,
|
||||||
|
ensure you have installed the ``bm25s`` pip package. e.g. by installing it directly via pip or
|
||||||
|
by adding it to the project's `requirements.txt` file.
|
||||||
|
|
||||||
|
Configuration:
|
||||||
|
--------------
|
||||||
|
To enable the Rerank plugin, add it to the `enabled_plugins` list in your `settings.yml` file:
|
||||||
|
|
||||||
.. code:: yaml
|
.. code:: yaml
|
||||||
|
|
||||||
@ -10,9 +16,13 @@ Enable in ``settings.yml``:
|
|||||||
..
|
..
|
||||||
- 'Rerank plugin'
|
- 'Rerank plugin'
|
||||||
|
|
||||||
By default, the engine list is retained, so results found by multiple engines receive a score boost.
|
By default, the plugin retains the information about which engines found a particular result.
|
||||||
The following setting can be used to ensure that the engine list only contains the first engine.
|
Results that appear in multiple engine results will receive a score boost.
|
||||||
This will prevent overlapping search engine results from affecting the ranking:
|
This approach might be relevant if you wish results found by different engines to be prioritized.
|
||||||
|
You can modify this behaviour by configuring the ``remove_extra_engines`` setting.
|
||||||
|
If ``remove_extra_engines`` is set to ``true``, the original engine list is reduced to only the first engine.
|
||||||
|
This is useful when you prefer the reranking to not be affected by any potential overlap
|
||||||
|
of results from different engines.
|
||||||
|
|
||||||
.. code:: yaml
|
.. code:: yaml
|
||||||
|
|
||||||
@ -26,8 +36,8 @@ from searx import settings
|
|||||||
try:
|
try:
|
||||||
import bm25s
|
import bm25s
|
||||||
except ImportError:
|
except ImportError:
|
||||||
# Import error is ignored because the admin has to install bm25s manually to use the engine
|
# Import error is ignored because the admin has to install bm25s manually to use the plugin
|
||||||
pass
|
bm25s = None
|
||||||
|
|
||||||
name = 'Rerank plugin'
|
name = 'Rerank plugin'
|
||||||
description = 'Rerank search results, ignoring original engine ranking'
|
description = 'Rerank search results, ignoring original engine ranking'
|
||||||
@ -41,13 +51,16 @@ remove_extra_engines = settings.get('rerank', {}).get('remove_extra_engines')
|
|||||||
|
|
||||||
|
|
||||||
def post_search(_request, search):
|
def post_search(_request, search):
|
||||||
|
if not bm25s:
|
||||||
|
return True
|
||||||
|
|
||||||
# pylint: disable=protected-access
|
# pylint: disable=protected-access
|
||||||
results = search.result_container._merged_results
|
results = search.result_container._merged_results
|
||||||
query = search.search_query.query
|
query = search.search_query.query
|
||||||
locale = search.search_query.locale
|
locale = search.search_query.locale
|
||||||
|
|
||||||
# Determine the stopwords based on the selected locale
|
# Determine the stopwords based on the selected locale
|
||||||
stopwords = locale.language if locale and locale.language in stopword_langs else True
|
stopwords = locale.language if locale and locale.language in stopword_langs else 'en'
|
||||||
|
|
||||||
retriever = bm25s.BM25()
|
retriever = bm25s.BM25()
|
||||||
result_tokens = bm25s.tokenize(
|
result_tokens = bm25s.tokenize(
|
||||||
@ -75,3 +88,7 @@ def post_search(_request, search):
|
|||||||
results[index]['positions'] = [position + 1] * len(results[index]['positions'])
|
results[index]['positions'] = [position + 1] * len(results[index]['positions'])
|
||||||
|
|
||||||
return True
|
return True
|
||||||
|
|
||||||
|
|
||||||
|
def is_allowed():
|
||||||
|
return bm25s is not None
|
||||||
|
@ -249,6 +249,7 @@ outgoing:
|
|||||||
# - 'Hostnames plugin' # see 'hostnames' configuration below
|
# - 'Hostnames plugin' # see 'hostnames' configuration below
|
||||||
# - 'Open Access DOI rewrite'
|
# - 'Open Access DOI rewrite'
|
||||||
# - 'Tor check plugin'
|
# - 'Tor check plugin'
|
||||||
|
# - 'Rerank plugin' # requires the bm25s python dependency to be installed
|
||||||
|
|
||||||
# Configuration of the "Hostnames plugin":
|
# Configuration of the "Hostnames plugin":
|
||||||
#
|
#
|
||||||
|
Loading…
x
Reference in New Issue
Block a user