[fix] engine archlinux: avoid Anubis challenge by User-Agent "SearXNG"

Of the Arch Linux wikis, only wiki.archlinux.org has an Anubis challenge.

About Anubis[1]:

> Anubis decides to present a challenge using this logic:
>
> - User-Agent contains "Mozilla"
> ...
> This should ensure that git clients, RSS readers, and other low-harm clients
> can get through without issue ..

[1] 6c0ff3f4d5/docs/docs/design/how-anubis-works.mdx (challenge-presentation)
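
A minimal sketch of the idea, modeling only the one condition quoted above
(User-Agent contains "Mozilla"); the elided conditions are not modeled. The
names `MAIN_WIKI`, `may_trigger_challenge` and `headers_for` are hypothetical
and exist only for this illustration; they are not part of the engine or of
Anubis:

```python
# Illustration only: hypothetical helper names, not SearXNG or Anubis code.
MAIN_WIKI = "wiki.archlinux.org"


def may_trigger_challenge(user_agent: str) -> bool:
    """Model only the quoted condition: the User-Agent contains "Mozilla"."""
    return "Mozilla" in user_agent


def headers_for(netloc: str, headers: dict) -> dict:
    """Return a copy of the headers, overriding User-Agent for the main wiki only."""
    result = dict(headers)
    if netloc == MAIN_WIKI:
        # A User-Agent without "Mozilla" does not match the quoted condition.
        result["User-Agent"] = "SearXNG"
    return result
```

Under this assumption, a browser-like User-Agent matches the quoted condition,
while "SearXNG" does not; requests to the other wikis keep their headers
unchanged.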

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Suggested-by: @unixfox https://github.com/searxng/searxng/issues/4646#issuecomment-2855322406
Closes: https://github.com/searxng/searxng/issues/4646
Markus Heiser 2025-05-13 09:19:15 +02:00
parent 5d99373bc6
commit 2e44a10dbf

@@ -51,6 +51,9 @@ def request(query, params):
     if netloc == main_wiki:
         eng_lang: str = traits.get_language(sxng_lang, 'English')  # type: ignore
         query += ' (' + eng_lang + ')'
+        # wiki.archlinux.org is protected by anubis
+        # - https://github.com/searxng/searxng/issues/4646#issuecomment-2817848019
+        params['headers']['User-Agent'] = "SearXNG"
     elif netloc == 'wiki.archlinuxcn.org':
         base_url = 'https://' + netloc + '/wzh/index.php?'