Merge pull request #109 from return42/sql-engines
Merge MySQL and PostgreSQL from searx
This commit is contained in:
		
						commit
						e239369f1f
					
				| @ -13,3 +13,4 @@ Blog | |||||||
|    private-engines |    private-engines | ||||||
|    command-line-engines |    command-line-engines | ||||||
|    search-indexer-engines |    search-indexer-engines | ||||||
|  |    sql-engines | ||||||
|  | |||||||
							
								
								
									
										160
									
								
								docs/blog/sql-engines.rst
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
							| @ -0,0 +1,160 @@ | |||||||
|  | =========== | ||||||
|  | SQL engines | ||||||
|  | =========== | ||||||
|  | 
 | ||||||
|  | .. sidebar:: further reading | ||||||
|  | 
 | ||||||
|  |    - `SQLite <https://www.sqlite.org/index.html>`_ | ||||||
|  |    - `PostgreSQL <https://www.postgresql.org>`_ | ||||||
|  |    - `MySQL <https://www.mysql.com>`_ | ||||||
|  | 
 | ||||||
|  | With the *SQL engines* you can bind SQL databases into SearxNG.  The following | ||||||
|  | Relational Database Management Systems (RDBMS) are supported: | ||||||
|  | 
 | ||||||
|  | - :ref:`engine sqlite` | ||||||
|  | - :ref:`engine postgresql` | ||||||
|  | - :ref:`engine mysql_server` | ||||||
|  | 
 | ||||||
|  | All of the engines above are just commented out in the :origin:`settings.yml | ||||||
|  | <searx/settings.yml>`, as you have to set the required attributes for the | ||||||
|  | engines, e.g. ``database:`` ... | ||||||
|  | 
 | ||||||
|  | .. code:: yaml | ||||||
|  | 
 | ||||||
|  |    - name: ... | ||||||
|  |      engine: {sqlite|postgresql|mysql_server} | ||||||
|  |      database: ... | ||||||
|  |      result_template: {template_name} | ||||||
|  |      query_str: ... | ||||||
|  | 
 | ||||||
|  | By default, the engines use the ``key-value`` template for displaying results | ||||||
|  | (see the :origin:`oscar <searx/templates/oscar/result_templates/key-value.html>` and | ||||||
|  | :origin:`simple <searx/templates/simple/result_templates/key-value.html>` | ||||||
|  | themes).  If you are not satisfied with the original result layout, you can use | ||||||
|  | your own template: set the ``result_template`` attribute to ``{template_name}`` and | ||||||
|  | place the template at:: | ||||||
|  | 
 | ||||||
|  |   searx/templates/{theme_name}/result_templates/{template_name} | ||||||
|  | 
 | ||||||
|  | As mentioned in previous blog posts, if you do not wish to expose these engines | ||||||
|  | on a public instance, you can still add them and limit the access by setting | ||||||
|  | ``tokens`` as described in section :ref:`private engines`. | ||||||
|  | 
 | ||||||
|  | Configure the engines | ||||||
|  | ===================== | ||||||
|  | 
 | ||||||
|  | The configuration of the new database engines is similar.  You must put a valid | ||||||
|  | SQL ``SELECT`` query in ``query_str``.  At the moment you can bind at most one | ||||||
|  | parameter in your query.  By setting the attribute ``limit`` you can define how | ||||||
|  | many results you want from the SQL server.  It has the same effect as the | ||||||
|  | ``LIMIT`` keyword in SQL. | ||||||
|  | 
 | ||||||
|  | Please do not include ``LIMIT`` or ``OFFSET`` in your SQL query, as the engines | ||||||
|  | append these keywords themselves during paging.  If you want to configure the | ||||||
|  | number of returned results, use the option ``limit``. | ||||||
|  | 
 | ||||||
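The paging rule above can be illustrated with a short sketch. It mirrors the expression used in ``search()`` of the PostgreSQL and MySQL engine modules in this commit; the helper name ``build_paged_query`` is hypothetical and exists only for illustration.

```python
def build_paged_query(query_str, limit, pageno):
    """Append LIMIT/OFFSET for the requested result page (pages start at 1)."""
    offset = (pageno - 1) * limit
    return query_str + ' LIMIT {0} OFFSET {1}'.format(limit, offset)


# With limit=5, the third page skips the first ten rows.
paged = build_paged_query('SELECT * FROM my_table WHERE my_column = %(query)s', 5, 3)
print(paged)
```

A ``LIMIT`` or ``OFFSET`` already present in ``query_str`` would appear a second time in the final statement, which is why both keywords should be left out of the configured query.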
|  | .. _engine sqlite: | ||||||
|  | 
 | ||||||
|  | SQLite | ||||||
|  | ------ | ||||||
|  | 
 | ||||||
|  | .. _MediathekView: https://mediathekview.de/ | ||||||
|  | 
 | ||||||
|  | SQLite is a small, fast and reliable SQL database engine.  It does not require | ||||||
|  | any extra dependency.  To demonstrate the power of database engines, here is a | ||||||
|  | more complex example which reads from a MediathekView_ (DE) movie database.  For | ||||||
|  | this example of the SQLite engine, download the database: | ||||||
|  | 
 | ||||||
|  | - https://liste.mediathekview.de/filmliste-v2.db.bz2 | ||||||
|  | 
 | ||||||
|  | and unpack it into ``searx/data/filmliste-v2.db``.  With the configuration below, | ||||||
|  | you can test the engine with a query such as ``!mediathekview concert``: | ||||||
|  | 
 | ||||||
|  | .. code:: yaml | ||||||
|  | 
 | ||||||
|  |    - name: mediathekview | ||||||
|  |      engine: sqlite | ||||||
|  |      disabled: False | ||||||
|  |      categories: general | ||||||
|  |      result_template: default.html | ||||||
|  |      database: searx/data/filmliste-v2.db | ||||||
|  |      query_str:  >- | ||||||
|  |        SELECT title || ' (' || time(duration, 'unixepoch') || ')' AS title, | ||||||
|  |               COALESCE( NULLIF(url_video_hd,''), NULLIF(url_video_sd,''), url_video) AS url, | ||||||
|  |               description AS content | ||||||
|  |          FROM film | ||||||
|  |         WHERE title LIKE :wildcard OR description LIKE :wildcard | ||||||
|  |         ORDER BY duration DESC | ||||||
|  | 
 | ||||||
|  | 
 | ||||||
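The ``:wildcard`` placeholder in the query above is a named SQLite parameter. The following is a minimal sketch of how such a query behaves, using Python's standard ``sqlite3`` module and a small in-memory table that only imitates the ``film`` table. The sample rows, the ``search`` helper and the way the search term is wrapped in ``%`` are assumptions for illustration, since the SQLite engine module is not part of this diff.

```python
import sqlite3

# Hypothetical in-memory stand-in for the MediathekView ``film`` table.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE film (title TEXT, description TEXT, duration INTEGER)")
conn.executemany(
    "INSERT INTO film VALUES (?, ?, ?)",
    [
        ("Open air concert", "Live recording", 5400),
        ("News", "Daily news show", 900),
        ("Documentary", "A concert hall is rebuilt", 2700),
    ],
)

QUERY_STR = (
    "SELECT title || ' (' || time(duration, 'unixepoch') || ')' AS title, "
    "description AS content "
    "FROM film "
    "WHERE title LIKE :wildcard OR description LIKE :wildcard "
    "ORDER BY duration DESC"
)


def search(query):
    # Assumption: the term is wrapped in % so that LIKE matches substrings.
    params = {"wildcard": "%" + query + "%"}
    return [dict(row) for row in conn.execute(QUERY_STR, params)]


results = search("concert")
for row in results:
    print(row["title"], "|", row["content"])
```

The same named parameter can appear several times in the statement and is bound once, which is how the example matches the term against both ``title`` and ``description``.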
|  | Extra Dependencies | ||||||
|  | ------------------ | ||||||
|  | 
 | ||||||
|  | To use :ref:`engine postgresql` or :ref:`engine mysql_server` you need to | ||||||
|  | install additional packages in the Python virtual environment of your SearxNG | ||||||
|  | instance.  To switch into the environment (:ref:`searx-src`) you can use | ||||||
|  | :ref:`searx.sh`:: | ||||||
|  | 
 | ||||||
|  |   $ sudo utils/searx.sh shell | ||||||
|  |   (searx-pyenv)$ pip install ... | ||||||
|  | 
 | ||||||
|  | 
 | ||||||
|  | .. _engine postgresql: | ||||||
|  | 
 | ||||||
|  | PostgreSQL | ||||||
|  | ---------- | ||||||
|  | 
 | ||||||
|  | .. _psycopg2: https://www.psycopg.org/install | ||||||
|  | 
 | ||||||
|  | .. sidebar:: requirements | ||||||
|  | 
 | ||||||
|  |    ``pip install`` psycopg2_ | ||||||
|  | 
 | ||||||
|  | PostgreSQL is a powerful and robust open source database.  Before configuring | ||||||
|  | the PostgreSQL engine, you must install the dependency ``psycopg2``.  You can | ||||||
|  | find an example configuration below: | ||||||
|  | 
 | ||||||
|  | .. code:: yaml | ||||||
|  | 
 | ||||||
|  |    - name: my_database | ||||||
|  |      engine: postgresql | ||||||
|  |      database: my_database | ||||||
|  |      username: searx | ||||||
|  |      password: password | ||||||
|  |      query_str: 'SELECT * from my_table WHERE my_column = %(query)s' | ||||||
|  | 
 | ||||||
|  | .. _engine mysql_server: | ||||||
|  | 
 | ||||||
|  | MySQL | ||||||
|  | ----- | ||||||
|  | 
 | ||||||
|  | .. _mysql-connector-python: https://pypi.org/project/mysql-connector-python | ||||||
|  | 
 | ||||||
|  | .. sidebar:: requirements | ||||||
|  | 
 | ||||||
|  |    ``pip install`` mysql-connector-python_ | ||||||
|  | 
 | ||||||
|  | MySQL is said to be the most popular open source database.  Before enabling the | ||||||
|  | MySQL engine, you must install the package ``mysql-connector-python``. | ||||||
|  | 
 | ||||||
|  | The authentication plugin is configurable by setting ``auth_plugin`` in the | ||||||
|  | attributes.  By default it is set to ``caching_sha2_password``.  This is an | ||||||
|  | example configuration for querying a MySQL server: | ||||||
|  | 
 | ||||||
|  | .. code:: yaml | ||||||
|  | 
 | ||||||
|  |    - name: my_database | ||||||
|  |      engine: mysql_server | ||||||
|  |      database: my_database | ||||||
|  |      username: searx | ||||||
|  |      password: password | ||||||
|  |      limit: 5 | ||||||
|  |      query_str: 'SELECT * from my_table WHERE my_column=%(query)s' | ||||||
|  | 
 | ||||||
|  | 
 | ||||||
|  | Acknowledgement | ||||||
|  | =============== | ||||||
|  | 
 | ||||||
|  | This development was sponsored by the `Search and Discovery Fund | ||||||
|  | <https://nlnet.nl/discovery>`_ of the `NLnet Foundation <https://nlnet.nl/>`_. | ||||||
|  | 
 | ||||||
							
								
								
									
										57
									
								
								searx/engines/mysql_server.py
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
							| @ -0,0 +1,57 @@ | |||||||
|  | # SPDX-License-Identifier: AGPL-3.0-or-later | ||||||
|  | # lint: pylint | ||||||
|  | # pylint: disable=missing-function-docstring | ||||||
|  | """MySQL database (offline) | ||||||
|  | 
 | ||||||
|  | """ | ||||||
|  | 
 | ||||||
|  | # import error is ignored because the admin has to install | ||||||
|  | # mysql-connector-python manually to use the engine | ||||||
|  | import mysql.connector  # pylint: disable=import-error | ||||||
|  | 
 | ||||||
|  | engine_type = 'offline' | ||||||
|  | auth_plugin = 'caching_sha2_password' | ||||||
|  | host = "127.0.0.1" | ||||||
|  | database = "" | ||||||
|  | username = "" | ||||||
|  | password = "" | ||||||
|  | query_str = "" | ||||||
|  | limit = 10 | ||||||
|  | paging = True | ||||||
|  | result_template = 'key-value.html' | ||||||
|  | _connection = None | ||||||
|  | 
 | ||||||
|  | def init(engine_settings): | ||||||
|  |     global _connection  # pylint: disable=global-statement | ||||||
|  | 
 | ||||||
|  |     if 'query_str' not in engine_settings: | ||||||
|  |         raise ValueError('query_str cannot be empty') | ||||||
|  | 
 | ||||||
|  |     if not engine_settings['query_str'].lower().startswith('select '): | ||||||
|  |         raise ValueError('only SELECT query is supported') | ||||||
|  | 
 | ||||||
|  |     _connection = mysql.connector.connect( | ||||||
|  |         database = database, | ||||||
|  |         user = username, | ||||||
|  |         password = password, | ||||||
|  |         host = host, | ||||||
|  |         auth_plugin=auth_plugin, | ||||||
|  |     ) | ||||||
|  | 
 | ||||||
|  | def search(query, params): | ||||||
|  |     query_params = {'query': query} | ||||||
|  |     query_to_run = query_str + ' LIMIT {0} OFFSET {1}'.format(limit, (params['pageno'] - 1) * limit) | ||||||
|  | 
 | ||||||
|  |     with _connection.cursor() as cur: | ||||||
|  |         cur.execute(query_to_run, query_params) | ||||||
|  | 
 | ||||||
|  |         return _fetch_results(cur) | ||||||
|  | 
 | ||||||
|  | def _fetch_results(cur): | ||||||
|  |     results = [] | ||||||
|  |     for res in cur: | ||||||
|  |         result = dict(zip(cur.column_names, map(str, res))) | ||||||
|  |         result['template'] = result_template | ||||||
|  |         results.append(result) | ||||||
|  | 
 | ||||||
|  |     return results | ||||||
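The row conversion in ``_fetch_results`` can be tried without a database server. The sketch below repeats the same ``dict(zip(...))`` step on plain tuples; the function name ``rows_to_results`` and the sample data are hypothetical.

```python
def rows_to_results(column_names, rows, result_template='key-value.html'):
    """Turn database rows into result dicts, one key per column."""
    results = []
    for row in rows:
        # Every value is converted to str, as in the engine module.
        result = dict(zip(column_names, map(str, row)))
        result['template'] = result_template
        results.append(result)
    return results


sample = rows_to_results(('id', 'name'), [(1, 'alice'), (2, None)])
print(sample)
```

Two consequences follow from this conversion: ``NULL`` values arrive as the string ``'None'``, and a column that is itself named ``template`` is overwritten by the template name.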
							
								
								
									
										69
									
								
								searx/engines/postgresql.py
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
							| @ -0,0 +1,69 @@ | |||||||
|  | # SPDX-License-Identifier: AGPL-3.0-or-later | ||||||
|  | # lint: pylint | ||||||
|  | # pylint: disable=missing-function-docstring | ||||||
|  | """PostgreSQL database (offline) | ||||||
|  | 
 | ||||||
|  | """ | ||||||
|  | 
 | ||||||
|  | # import error is ignored because the admin has to install psycopg2 manually | ||||||
|  | # to use the engine | ||||||
|  | import psycopg2  # pylint: disable=import-error | ||||||
|  | 
 | ||||||
|  | engine_type = 'offline' | ||||||
|  | host = "127.0.0.1" | ||||||
|  | port = "5432" | ||||||
|  | database = "" | ||||||
|  | username = "" | ||||||
|  | password = "" | ||||||
|  | query_str = "" | ||||||
|  | limit = 10 | ||||||
|  | paging = True | ||||||
|  | result_template = 'key-value.html' | ||||||
|  | _connection = None | ||||||
|  | 
 | ||||||
|  | def init(engine_settings): | ||||||
|  |     global _connection  # pylint: disable=global-statement | ||||||
|  | 
 | ||||||
|  |     if 'query_str' not in engine_settings: | ||||||
|  |         raise ValueError('query_str cannot be empty') | ||||||
|  | 
 | ||||||
|  |     if not engine_settings['query_str'].lower().startswith('select '): | ||||||
|  |         raise ValueError('only SELECT query is supported') | ||||||
|  | 
 | ||||||
|  |     _connection = psycopg2.connect( | ||||||
|  |         database = database, | ||||||
|  |         user = username, | ||||||
|  |         password = password, | ||||||
|  |         host = host, | ||||||
|  |         port = port, | ||||||
|  |     ) | ||||||
|  | 
 | ||||||
|  | def search(query, params): | ||||||
|  |     query_params = {'query': query} | ||||||
|  |     query_to_run = ( | ||||||
|  |         query_str | ||||||
|  |         + ' LIMIT {0} OFFSET {1}'.format(limit, (params['pageno'] - 1) * limit) | ||||||
|  |     ) | ||||||
|  | 
 | ||||||
|  |     with _connection: | ||||||
|  |         with _connection.cursor() as cur: | ||||||
|  |             cur.execute(query_to_run, query_params) | ||||||
|  |             return _fetch_results(cur) | ||||||
|  | 
 | ||||||
|  | def _fetch_results(cur): | ||||||
|  |     results = [] | ||||||
|  |     titles = [] | ||||||
|  | 
 | ||||||
|  |     try: | ||||||
|  |         titles = [column_desc.name for column_desc in cur.description] | ||||||
|  | 
 | ||||||
|  |         for res in cur: | ||||||
|  |             result = dict(zip(titles, map(str, res))) | ||||||
|  |             result['template'] = result_template | ||||||
|  |             results.append(result) | ||||||
|  | 
 | ||||||
|  |     # no results to fetch | ||||||
|  |     except psycopg2.ProgrammingError: | ||||||
|  |         pass | ||||||
|  | 
 | ||||||
|  |     return results | ||||||
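Both engine modules validate ``query_str`` in ``init()`` with the same two checks. The sketch below isolates them so their behavior can be observed; the helper names ``check_query_str`` and ``is_accepted`` are hypothetical.

```python
def check_query_str(engine_settings):
    """Raise ValueError unless the settings contain a SELECT query."""
    if 'query_str' not in engine_settings:
        raise ValueError('query_str cannot be empty')
    if not engine_settings['query_str'].lower().startswith('select '):
        raise ValueError('only SELECT query is supported')


def is_accepted(engine_settings):
    """Return True if the settings pass the checks, False otherwise."""
    try:
        check_query_str(engine_settings)
    except ValueError:
        return False
    return True


accepted = is_accepted({'query_str': 'SELECT * FROM my_table'})
rejected_update = is_accepted({'query_str': 'UPDATE my_table SET a = 1'})
rejected_indented = is_accepted({'query_str': '  SELECT * FROM my_table'})
print(accepted, rejected_update, rejected_indented)
```

The check only looks at the start of the string, so a query with leading whitespace is rejected, and it is a sanity check rather than a complete guard against unwanted statements.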
| @ -871,6 +871,16 @@ engines: | |||||||
|     url: https://thepiratebay.org/ |     url: https://thepiratebay.org/ | ||||||
|     timeout : 3.0 |     timeout : 3.0 | ||||||
| 
 | 
 | ||||||
|  | # Required dependency: psycopg2 | ||||||
|  | #  - name : postgresql | ||||||
|  | #    engine : postgresql | ||||||
|  | #    database : postgres | ||||||
|  | #    username : postgres | ||||||
|  | #    password : postgres | ||||||
|  | #    limit : 10 | ||||||
|  | #    query_str : 'SELECT * from my_table WHERE my_column = %(query)s' | ||||||
|  | #    shortcut : psql | ||||||
|  | 
 | ||||||
|   - name : pubmed |   - name : pubmed | ||||||
|     engine : pubmed |     engine : pubmed | ||||||
|     shortcut : pub |     shortcut : pub | ||||||
| @ -1252,6 +1262,16 @@ engines: | |||||||
|     # See : http://mymemory.translated.net/doc/usagelimits.php |     # See : http://mymemory.translated.net/doc/usagelimits.php | ||||||
|     # api_key : '' |     # api_key : '' | ||||||
| 
 | 
 | ||||||
|  | # Required dependency: mysql-connector-python | ||||||
|  | #  - name : mysql | ||||||
|  | #    engine : mysql_server | ||||||
|  | #    database : mydatabase | ||||||
|  | #    username : user | ||||||
|  | #    password : pass | ||||||
|  | #    limit : 10 | ||||||
|  | #    query_str : 'SELECT * from mytable WHERE fieldname=%(query)s' | ||||||
|  | #    shortcut : mysql | ||||||
|  | 
 | ||||||
|   - name : 1337x |   - name : 1337x | ||||||
|     engine : 1337x |     engine : 1337x | ||||||
|     shortcut : 1337x |     shortcut : 1337x | ||||||
|  | |||||||
	 Alexandre Flament