A local development server can be launched by one of these command lines::
$ flask --app searx.webapp run
$ python -m searx.webapp
These ways of starting the server should lead to the same result, which is
generally the case. However, if modules are reloaded after code changes
(reload option), the application must not be initialized twice at startup.
We have already discussed this in 2022 [1][2].
Further information on this topic can be found in [3][4][5].
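A minimal sketch of one way to guard against double initialization, not the
code of this patch: it assumes Werkzeug's reloader, which sets the environment
variable ``WERKZEUG_RUN_MAIN`` to ``true`` in the child process that serves
requests. The helper name ``should_initialize`` and its parameters are
hypothetical.

```python
import os


def should_initialize(use_reloader: bool, environ=os.environ) -> bool:
    """Return True if this process should initialize the application.

    Hypothetical helper for illustration, not part of SearXNG.
    """
    if not use_reloader:
        # Without the reloader there is only one process: initialize here.
        return True
    # With the reloader, the parent process only watches files and spawns a
    # child; only the child (marked by WERKZEUG_RUN_MAIN) should initialize.
    return environ.get("WERKZEUG_RUN_MAIN") == "true"
```

With such a guard, the watching parent process skips the initialization and
only the serving child performs it.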
To test, a bash shell was started in the ./local environment and the following
commands were executed::
$ ./manage pyenv.cmd bash --norc --noprofile
(py3) SEARXNG_DEBUG=1 flask --app searx.webapp run --reload
(py3) SEARXNG_DEBUG=1 python -m searx.webapp
Since the generic parts of the docs also initialize the app to generate
documentation from it, the build of the docs was also tested::
$ make docs.clean docs.live
[1] https://github.com/searxng/searxng/pull/1656#issuecomment-1214198941
[2] https://github.com/searxng/searxng/pull/1616#issuecomment-1206137468
[3] https://flask.palletsprojects.com/en/stable/api/#flask.Flask.run
[4] https://github.com/pallets/flask/issues/5307#issuecomment-1774646119
[5] https://stackoverflow.com/a/25504196
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
BTW: fix a bug with sys.path: the repo root (not util) needs to be added to
generate autodoc from scripts in ./searxng_extra
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Do a DNS lookup of 'all.api.radio-browser.info', add a reverse lookup and
randomly select a URL from the available servers.
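The lookup steps can be sketched roughly as follows. This is an illustration
using only the standard library, not the engine's implementation; the function
names ``resolve_servers`` and ``pick_server``, the port 443 and the ``https://``
scheme are assumptions. The resolver functions are passed as parameters so the
logic can be tested without network access.

```python
import random
import socket


def resolve_servers(hostname, getaddrinfo=socket.getaddrinfo,
                    gethostbyaddr=socket.gethostbyaddr):
    """Return the sorted base URLs of the servers found behind ``hostname``."""
    servers = set()
    # Forward lookup: one entry per IP address registered for the host name.
    for info in getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP):
        ip_addr = info[4][0]  # sockaddr is (address, port, ...)
        try:
            # Reverse lookup: IP address -> canonical host name.
            name = gethostbyaddr(ip_addr)[0]
        except OSError:
            continue  # no reverse record for this address, skip it
        servers.add(f"https://{name}")
    return sorted(servers)


def pick_server(servers, rng=random):
    """Randomly select one URL; return None if no server is available."""
    return rng.choice(servers) if servers else None
```

Usage would be along the lines of
``pick_server(resolve_servers('all.api.radio-browser.info'))``.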
Closes: https://github.com/searxng/searxng/issues/4576
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
On a long-running server, the tracebacks below can be found (albeit rarely),
which indicate problems with NoneType where a string or another data type is
expected.
result.img_src::
File "/usr/local/searxng/searxng-src/searx/templates/simple/result_templates/images.html", line 13, in top-level template code
<img src="" data-src="{{ image_proxify(result.img_src) }}" alt="{{ result.title|striptags }}">{{- "" -}}
^
File "/usr/local/searxng/searxng-src/searx/webapp.py", line 284, in image_proxify
if url.startswith('//'):
^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'startswith'
result.content::
File "/usr/local/searxng/searxng-src/searx/result_types/_base.py", line 105, in _normalize_text_fields
result.content = WHITESPACE_REGEX.sub(" ", result.content).strip()
~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
TypeError: expected string or bytes-like object, got 'NoneType'
html_to_text, when html_str is a NoneType::
File "/usr/local/searxng/searxng-src/searx/engines/wikipedia.py", line 190, in response
title = utils.html_to_text(api_result.get('titles', {}).get('display') or api_result.get('title'))
File "/usr/local/searxng/searxng-src/searx/utils.py", line 158, in html_to_text
html_str = html_str.replace('\n', ' ').replace('\r', ' ')
^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'replace'
presearch engine, when json_resp is a NoneType::
File "/usr/local/searxng/searxng-src/searx/engines/presearch.py", line 221, in response
results = parse_search_query(json_resp.get('results'))
File "/usr/local/searxng/searxng-src/searx/engines/presearch.py", line 161, in parse_search_query
for item in json_results.get('specialSections', {}).get('topStoriesCompact', {}).get('data', []):
^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'get'
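The tracebacks above share one pattern: a value that may be ``None`` is used
as a string or a dict. A minimal sketch of defensive normalization follows; it
is not the code of this patch. The helpers ``text_or_empty``,
``normalize_text`` and ``dict_or_empty`` are hypothetical, and the pattern
assigned to ``WHITESPACE_REGEX`` is an assumption.

```python
import re

# Assumed pattern; the real WHITESPACE_REGEX in SearXNG may differ.
WHITESPACE_REGEX = re.compile(r"\s+")


def text_or_empty(value) -> str:
    """Map None to an empty string, convert anything else to str."""
    return "" if value is None else str(value)


def normalize_text(value) -> str:
    """Collapse whitespace runs; safe to call with None."""
    return WHITESPACE_REGEX.sub(" ", text_or_empty(value)).strip()


def dict_or_empty(value) -> dict:
    """Return value if it is a dict, otherwise an empty dict."""
    return value if isinstance(value, dict) else {}
```

With such helpers, ``text_or_empty(url).startswith('//')`` and
``dict_or_empty(json_resp).get('results')`` no longer raise when the input is
``None``.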
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
This patch brings two major changes:
- ``Result.filter_urls(..)`` to pass a filter function for URL fields
- The ``enabled_plugins:`` section in SearXNG's settings no longer exists.
To understand plugin development, build the documentation:
$ make docs.clean docs.live
and read http://0.0.0.0:8000/dev/plugins/development.html
There is no longer a distinction between built-in and external plugins; all
plugins are registered via the ``plugins:`` section of the settings.
In SearXNG, plugins are registered by their fully qualified class name. A
configuration (``PluginCfg``) can be passed to the plugin, e.g. to set whether
it is active by default, i.e. *opt-out* or *opt-in* from the user's point of view.
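How a registration by fully qualified class name might be resolved can be
sketched with ``importlib``. This is an illustration, not SearXNG's loader:
``load_class``, ``load_plugins`` and ``DemoPluginCfg`` are hypothetical names,
and only the ``active`` flag is taken from the settings format.

```python
import importlib
from dataclasses import dataclass


@dataclass
class DemoPluginCfg:
    """Hypothetical stand-in for a plugin configuration."""
    active: bool = False


def load_class(fqn: str):
    """Import ``package.module.ClassName`` and return the class object."""
    module_name, _, class_name = fqn.rpartition(".")
    if not module_name:
        raise ValueError(f"not a fully qualified class name: {fqn!r}")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def load_plugins(settings: dict) -> dict:
    """Map each configured name to a (class, configuration) tuple."""
    loaded = {}
    for fqn, options in settings.items():
        # A missing or empty options mapping falls back to the defaults.
        cfg = DemoPluginCfg(**(options or {}))
        loaded[fqn] = (load_class(fqn), cfg)
    return loaded
```

Because the class is looked up by name at load time, a plugin shipped in a
separate package is registered the same way as one from ``searx.plugins``.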
built-in plugins
================
The built-in plugins are all located in the namespace ``searx.plugins``.

.. code:: yaml

   plugins:
     searx.plugins.calculator.SXNGPlugin:
       active: true
     searx.plugins.hash_plugin.SXNGPlugin:
       active: true
     searx.plugins.self_info.SXNGPlugin:
       active: true
     searx.plugins.tracker_url_remover.SXNGPlugin:
       active: true
     searx.plugins.unit_converter.SXNGPlugin:
       active: true
     searx.plugins.ahmia_filter.SXNGPlugin:
       active: true
     searx.plugins.hostnames.SXNGPlugin:
       active: true
     searx.plugins.oa_doi_rewrite.SXNGPlugin:
       active: false
     searx.plugins.tor_check.SXNGPlugin:
       active: false

external plugins
================
SearXNG supports *external plugins*, but there is no need to install one;
SearXNG runs out of the box.
- Only show green hosted results: https://github.com/return42/tgwf-searx-plugins/
To get a developer installation in a SearXNG developer environment:
.. code:: sh
$ git clone git@github.com:return42/tgwf-searx-plugins.git
$ ./manage pyenv.cmd python -m \
pip install -e tgwf-searx-plugins
To register the plugin in SearXNG, add ``only_show_green_results.SXNGPlugin`` to
the ``plugins:`` section:

.. code:: yaml

   plugins:
     # ...
     only_show_green_results.SXNGPlugin:
       active: false

Result.filter_urls(..)
======================
The ``Result.filter_urls(..)`` method can be used to filter and/or modify URL fields.
In the following example, the filter function ``my_url_filter``:
.. code:: python

   def my_url_filter(result, field_name, url_src) -> bool | str:
       if "google" in url_src:
           return False              # remove URL field from result
       if "facebook" in url_src:
           new_url = url_src.replace("facebook", "fb-dummy")
           return new_url            # return modified URL
       return True                   # leave URL in field unchanged

is applied to all URL fields in the :py:obj:`Plugin.on_result` hook:
.. code:: python

   class MyUrlFilter(Plugin):
       ...
       def on_result(self, request, search, result) -> bool:
           result.filter_urls(my_url_filter)
           return True

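To illustrate the return convention of such a filter function (``False``
removes the field, a string replaces the URL, ``True`` keeps it), here is a
sketch that applies a filter to a plain dict of URL fields. The helper
``apply_url_filter`` and the use of a dict instead of a ``Result`` object are
assumptions for illustration, not the implementation of
``Result.filter_urls(..)``.

```python
from typing import Callable, Union

UrlFilter = Callable[[dict, str, str], Union[bool, str]]


def my_url_filter(result, field_name, url_src):
    if "google" in url_src:
        return False  # remove URL field from result
    if "facebook" in url_src:
        return url_src.replace("facebook", "fb-dummy")  # modified URL
    return True  # leave URL in field unchanged


def apply_url_filter(fields: dict, url_filter: UrlFilter) -> dict:
    """Apply ``url_filter`` to every non-empty URL field (hypothetical helper)."""
    # Iterate over a copy, since fields may be deleted while looping.
    for field_name, url_src in list(fields.items()):
        if not url_src:
            continue
        outcome = url_filter(fields, field_name, url_src)
        if outcome is True:
            continue  # keep the URL as it is
        if outcome is False:
            del fields[field_name]  # drop the field
        else:
            fields[field_name] = outcome  # replace with the modified URL
    return fields
```

The explicit ``is True`` / ``is False`` checks keep a returned string from
being mistaken for a truthy "keep" result.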
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Fix the issues reported by the Sphinx build::
docstring of searx.engines.google.max_page:1: ERROR: Unknown target name: "google: max 50 pages".
docstring of searx.engines.google_images.max_page:1: ERROR: Unknown target name: "google: max 50 pages".
docstring of searx.engines.google_scholar.max_page:1: ERROR: Unknown target name: "google: max 50 pages".
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Hostname "www" in URL results can't be normalized to an empty string:
- https://www.tu-darmstadt.de/
- https://tu-darmstadt.de/
Reported-By: @Bnyro <bnyro@tutanota.com>
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>