Create initial dev doc for Kagi: kagi.rst
parent
44f5c299be
commit
2a421825be
105
docs/dev/engines/online/kagi.rst
Normal file
@ -0,0 +1,105 @@
.. _kagi engine:

Kagi
====

The Kagi engine scrapes search results from Kagi's HTML search interface.

Example
-------

Configuration
~~~~~~~~~~~~~

.. code:: yaml

  - name: kagi
    engine: kagi
    shortcut: kg
    categories: [general, web]
    timeout: 4.0
    api_key: "YOUR-KAGI-TOKEN"  # required
    about:
      website: https://kagi.com
      use_official_api: false
      require_api_key: true
      results: HTML

Parameters
~~~~~~~~~~

``api_key`` : required
  The Kagi API token used for authentication. It can be obtained from your Kagi account settings.

``pageno`` : optional
  The page number for paginated results. Defaults to 1.

Example Request
~~~~~~~~~~~~~~~

.. code:: python

  # Assumption: the engine module is importable as searx.engines.kagi
  from searx.engines import kagi

  params = {
      'api_key': 'YOUR-KAGI-TOKEN',
      'pageno': 1,
      'headers': {
          'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
          'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
          'Accept-Language': 'en-US,en;q=0.5',
          'DNT': '1'
      }
  }
  query = 'test query'
  request_params = kagi.request(query, params)

Example Response
~~~~~~~~~~~~~~~~

.. code:: python

  [
      # Search result
      {
          'url': 'https://example.com/',
          'title': 'Example Title',
          'content': 'Example content snippet...',
          'domain': 'example.com'
      }
  ]

Implementation
--------------

The engine performs the following steps:

1. Constructs a GET request to ``https://kagi.com/html/search`` with:

   - ``q`` parameter for the search query
   - ``token`` parameter for authentication
   - ``batch`` parameter for pagination

2. Parses the HTML response using XPath to extract:

   - Result titles
   - URLs
   - Content snippets
   - Domain information

3. Handles various error cases:

   - 401: Invalid API token
   - 429: Rate limit exceeded
   - Other non-200 status codes
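
Steps 1 and 3 can be sketched with the standard library alone. This is a minimal
illustration rather than the engine's actual code: the helper names
``build_search_url`` and ``check_status`` are hypothetical, the exception types
are placeholders, and it assumes that ``batch`` carries the page number
unchanged.

.. code:: python

  from urllib.parse import urlencode

  SEARCH_URL = "https://kagi.com/html/search"  # endpoint named in step 1


  def build_search_url(query, token, pageno=1):
      """Return the GET URL for one page of results (hypothetical helper)."""
      # Assumption: ``batch`` is simply the page number.
      args = {"q": query, "token": token, "batch": pageno}
      return SEARCH_URL + "?" + urlencode(args)


  def check_status(status_code):
      """Raise a descriptive error for the cases listed in step 3."""
      if status_code == 401:
          raise PermissionError("Invalid API token")
      if status_code == 429:
          raise RuntimeError("Rate limit exceeded")
      if status_code != 200:
          raise RuntimeError(f"Unexpected HTTP status {status_code}")


  print(build_search_url("test query", "YOUR-KAGI-TOKEN", pageno=2))
  # https://kagi.com/html/search?q=test+query&token=YOUR-KAGI-TOKEN&batch=2

``urlencode`` takes care of escaping the query, so spaces and reserved
characters in the search terms do not need manual handling.
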

Dependencies
------------

- lxml: For HTML parsing and XPath evaluation
- urllib.parse: For URL handling and encoding
- searx.utils: For text extraction and XPath helpers
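
As a rough illustration of how ``lxml`` and ``urllib.parse`` might combine to
produce the result fields described earlier, the sketch below parses a made-up
HTML fragment. The markup, the class names in the XPath expressions, and the
``parse_results`` helper are all assumptions for demonstration; Kagi's real
markup and the engine's actual selectors are not shown here, and ``searx.utils``
is left out so the snippet runs on its own.

.. code:: python

  from urllib.parse import urlparse

  from lxml import html

  # Hypothetical markup: the class names are placeholders, not Kagi's real ones.
  SAMPLE = """
  <html><body>
    <div class="result">
      <a class="title" href="https://example.com/">Example Title</a>
      <p class="snippet">Example content snippet...</p>
    </div>
  </body></html>
  """


  def parse_results(page):
      """Extract url, title, content and domain from each result block."""
      results = []
      for node in html.fromstring(page).xpath('//div[@class="result"]'):
          links = node.xpath('.//a[@class="title"]')
          if not links:
              continue  # skip blocks without a title link
          url = links[0].get("href", "")
          snippets = node.xpath('.//p[@class="snippet"]')
          results.append({
              "url": url,
              "title": links[0].text_content().strip(),
              "content": snippets[0].text_content().strip() if snippets else "",
              "domain": urlparse(url).netloc,
          })
      return results


  print(parse_results(SAMPLE))

Deriving ``domain`` from the URL with ``urlparse`` keeps the example independent
of any domain element in the page, which may or may not match how the engine
obtains it.
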

Notes
-----

- The engine requires a valid Kagi API token to function
- Results are scraped from Kagi's HTML interface rather than using an official API
- Rate limiting may apply based on your Kagi subscription level
- The engine sets specific browser-like headers to ensure reliable scraping