Search plugin development

Full text search, including documents. It’s impossible to do that right with a relational DB.

I partially agree with @ThiefMaster, if it’s cheap to get fresh information from the DB, then it may make sense to get it. The important part about querying ElasticSearch is knowing what to display, not how. OTOH, if we’re confident that the data will be relevant almost all the time, then I also don’t understand why we would need to spend the extra milliseconds to go to Postgres.

Don’t we want to do another access check on the Indico side as well? For that we already need to get the object from the DB.

And yeah, I’d say it’s pretty cheap. If you have maybe 100 IDs you can just filter using Event.id.in_(...) to get those…
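To illustrate the idea of re-fetching the search hits by ID and re-checking access, here is a minimal sketch using only the standard library's sqlite3 instead of Indico's SQLAlchemy models. The table layout, the `is_protected` column and the `can_access` rule are assumptions made up for the example; they are not Indico's actual schema or access logic.

```python
import sqlite3


def fetch_events_by_ids(conn, event_ids):
    """Fetch all rows for the given IDs in one query, keeping the search ranking."""
    if not event_ids:
        return []
    placeholders = ', '.join('?' for _ in event_ids)
    rows = conn.execute(
        'SELECT id, title, is_protected FROM events WHERE id IN ({})'.format(placeholders),
        list(event_ids),
    ).fetchall()
    by_id = {row[0]: row for row in rows}
    # IN (...) gives no ordering guarantee, so re-apply the order of the search hits.
    # IDs that no longer exist in the DB are silently dropped.
    return [by_id[i] for i in event_ids if i in by_id]


def can_access(row, user_is_manager):
    # Hypothetical access rule standing in for the real access check
    _id, _title, is_protected = row
    return user_is_manager or not is_protected


conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE events (id INTEGER PRIMARY KEY, title TEXT, is_protected INTEGER)')
conn.executemany('INSERT INTO events VALUES (?, ?, ?)',
                 [(1, 'Workshop', 0), (2, 'Board meeting', 1), (3, 'Seminar', 0)])

# Order as returned by the search service; 99 stands for an event deleted since indexing
search_hit_ids = [3, 2, 99, 1]
visible = [row[1] for row in fetch_events_by_ids(conn, search_hit_ids)
           if can_access(row, user_is_manager=False)]
print(visible)
```

The point of the sketch is that one IN query per result page is enough, and that stale or protected hits can be filtered out before anything is rendered.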

Anyway, for the first version I don’t think there’s anything wrong with just showing the data directly using the JSON returned by the search service - changing it to query fresh objects from the DB + do access checks again wouldn’t be much work.

It’s true that doing an additional access check wouldn’t hurt. Anyway, let’s discuss that after the first prototype is done, as you suggest.

BTW, which version of Indico should we be working with?
The installation we have at BNL is 2.1.7, which comes with marshmallow 2.x.
Latest version available for pip is 2.1.8, also using marshmallow 2.
IIRC, you upgraded to marshmallow 3 in version 2.2, right?
However, I just tried to build the RPM for 2.2 via distutils (after adding requirements.txt to MANIFEST.in), but it failed [*].
So:

  • are we OK working with 2.1.7 or should we upgrade that?
  • if we must upgrade to 2.2, how should it be done?

[*]

 $ python setup.py bdist_rpm

... 
... 

+ python setup.py build
/usr/lib64/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'long_description_content_type'
  warnings.warn(msg)
/tmp/indico/venv/lib/python2.7/site-packages/setuptools/dist.py:331: UserWarning: Normalizing '2.2-dev' to '2.2.dev0'
  normalized_version,
running build
compiling catalog indico/translations/en_GB/LC_MESSAGES/messages.po to indico/translations/en_GB/LC_MESSAGES/messages.mo
error: indico/translations/es_ES/LC_MESSAGES/messages.po:2640: placeholders are incompatible
error: indico/translations/es_ES/LC_MESSAGES/messages.po:2648: placeholders are incompatible
compiling catalog indico/translations/es_ES/LC_MESSAGES/messages.po to indico/translations/es_ES/LC_MESSAGES/messages.mo
compiling catalog indico/translations/fr_FR/LC_MESSAGES/messages.po to indico/translations/fr_FR/LC_MESSAGES/messages.mo
compiling catalog indico/translations/pt_BR/LC_MESSAGES/messages.po to indico/translations/pt_BR/LC_MESSAGES/messages.mo
2 errors encountered.
error: No such file or directory
error: Bad exit status from /tmp/indico/venv/indico/build/bdist.linux-x86_64/rpm/TMP/rpm-tmp.CG3yWO (%build)


RPM build errors:
    Bad exit status from /tmp/indico/venv/indico/build/bdist.linux-x86_64/rpm/TMP/rpm-tmp.CG3yWO (%build)
error: command 'rpmbuild' failed with exit status 1

replying to myself… I forgot you don’t install with RPMs, but with pip.
So I found some recipe in my email inbox, and tried it. First I had to install npm and npx.

    $ git clone https://github.com/indico/indico
    $ cd indico
    $ pip install -e .
    $ npm install
    $ pip install -r requirements.dev.txt
    $ mkdir dist
    $ python bin/maintenance/build-wheel.py --target-dir=./dist indico

    building assets
    Error: building assets failed
    Object.entries is not a function
    Error: running webpack failed

I’ll keep investigating…

outdated nodejs; get something recent like node 10 and it’ll work

Now that we released 2.2, please make sure to use the 2.2-maintenance branch of Indico and not master anymore.

Using something like nvm would be a good choice, since you can easily change versions regardless of your distro.

Hi,

just out of curiosity, I tried deploying the CERN Search via the Docker container. The command “docker-compose up” fails with this error message:

ERROR: Service 'cern-search-api' failed to build: Error: image webservices/cern-search/cern-search-rest-api/cern-search-rest-api-base:cfe1fe3d1aba819d240acbb6a7bfe79678f82ee5 not found

Is the URL wrong? Or maybe it is a private repo and I need to be added somehow?

New question: recommended way to add an extra field to the URLs, needed for pagination.

In order to implement pagination, the code (class JSONSearchEngine @ indico_search_json/engine.py) needs to know which page is the current one, so that it can pass the right value to the next call to the CERN Search API.

I assume that can be done by adding a new field to the URLs: &search-current_page=number

I have been investigating how, in that case, to change the code to recognize that new field and use it. I see two options here, at least:

  • straightforward one would be to simply import request:

    from flask import request

    class JSONSearchEngine(SearchEngine):
        def process(self):
            # A missing or non-numeric value falls back to page 1
            current_page = request.args.get('search-current_page', 1, type=int)
    

    That is simple and easy, but it breaks consistency with the rest of the code. The other fields are stored in self.values, so it would be odd to have only some of them in that attribute.

  • second option is adding a new field in indico_search/forms.py

    class SearchForm(IndicoForm):
       phrase = StringField(_('Phrase'))
       field = SelectField(_('Search in'), choices=FIELD_CHOICES, default='')
       start_date = IndicoDateField('Start Date', [Optional()])
       end_date = IndicoDateField('End Date', [Optional()])
       
       # new field
       current_page = StringField()     
    

    But this requires touching code outside the JSON plugin, and I am not sure we are allowed to do that. Could that have unexpected side effects?

If it is safe, I would go for option 2. Comments?

I’d go for the easiest option, i.e. option 1. We’ll eventually add a new react-based frontend anyway, so chances are good the whole SearchForm will go away at that point.

But if you prefer adding a field to the form, please use an IntegerField and not a StringField :wink:
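Whichever option is chosen, the page value arrives as a string and should be treated as an integer. A minimal sketch of that validation using only the standard library; the page size, the example URL and the idea of turning the page into an offset are assumptions for illustration, since the thread does not say what the CERN Search API expects.

```python
from urllib.parse import parse_qs, urlsplit

PAGE_PARAM = 'search-current_page'  # parameter name proposed in the question
PAGE_SIZE = 10  # hypothetical page size


def get_current_page(url, default=1):
    """Return the requested page as a positive int, or the default if missing/invalid."""
    values = parse_qs(urlsplit(url).query).get(PAGE_PARAM)
    if not values:
        return default
    try:
        page = int(values[0])
    except ValueError:
        return default
    return page if page >= 1 else default


def page_to_offset(page, size=PAGE_SIZE):
    """Translate a 1-based page number into a 0-based result offset."""
    return (page - 1) * size


page = get_current_page('https://indico.example/search?search-phrase=higgs&search-current_page=3')
offset = page_to_offset(page)
bad = get_current_page('https://indico.example/search?search-current_page=abc')
print(page, offset, bad)
```

Falling back to the first page on bad input keeps a hand-edited URL from producing an error page.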

I guess you meant option 1, right?

yes, that’s what i meant :wink:

Hello,

I am trying to test the livesync_json plugin (https://github.com/penelopec/indico-plugins/tree/Elasticsearch/livesync_json). I am using the latest indico from master (I updated it yesterday).
I ran into problems when trying to create a revision for the plugin, following https://docs.getindico.io/en/latest/plugins/models/, by executing:

$ indico db --plugin livesync_json migrate -m 'Create table - searchapp_id_map'

Traceback (most recent call last):
  File "/opt/indico/.venv/bin/indico", line 10, in <module>
    sys.exit(cli())
  File "/opt/indico/.venv/lib/python2.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/opt/indico/.venv/lib/python2.7/site-packages/flask/cli.py", line 586, in main
    return super(FlaskGroup, self).main(*args, **kwargs)
  File "/opt/indico/.venv/lib/python2.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/opt/indico/.venv/lib/python2.7/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/indico/.venv/lib/python2.7/site-packages/indico/cli/util.py", line 110, in invoke
    return self._impl.invoke(ctx)
  File "/opt/indico/.venv/lib/python2.7/site-packages/click/core.py", line 1134, in invoke
    Command.invoke(self, ctx)
  File "/opt/indico/.venv/lib/python2.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/indico/.venv/lib/python2.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/opt/indico/.venv/lib/python2.7/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/opt/indico/.venv/lib/python2.7/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/opt/indico/.venv/lib/python2.7/site-packages/flask/cli.py", line 425, in decorator
    with __ctx.ensure_object(ScriptInfo).load_app().app_context():
  File "/opt/indico/.venv/lib/python2.7/site-packages/flask/cli.py", line 381, in load_app
    app = call_factory(self, self.create_app)
  File "/opt/indico/.venv/lib/python2.7/site-packages/flask/cli.py", line 117, in call_factory
    return app_factory(script_info)
  File "/opt/indico/.venv/lib/python2.7/site-packages/indico/cli/util.py", line 28, in _create_app
    return make_app(set_path=True)
  File "/opt/indico/.venv/lib/python2.7/site-packages/indico/web/flask/app.py", line 345, in make_app
    raise Exception('Could not load some plugins: {}'.format(', '.join(plugin_engine.get_failed_plugins(app))))
Exception: Could not load some plugins: livesync_json

I would appreciate some help…

Thank you
Penelope

check indico.log for the real reason why the plugin couldn’t be loaded

The log file has the following:

 2019-10-09 16:35:48,058  ERROR    0000000000000000  indico.plugins            Could not load plugin livesync_json
 Traceback (most recent call last):
   File "/opt/indico/.venv/lib/python2.7/site-packages/flask_pluginengine/engine.py", line 76, in _import_plugins
     plugin_class = entry_point.load()
   File "/opt/indico/.venv/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2434, in load
     return self.resolve()
   File "/opt/indico/.venv/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2440, in resolve
     module = __import__(self.module_name, fromlist=['__name__'], level=0)
   File "/opt/indico/.venv/lib/python2.7/site-packages/indico_livesync_json/plugin.py", line 11, in <module>
     from indico_livesync_json.backend import livesyncjson_backend
   File "/opt/indico/.venv/lib/python2.7/site-packages/indico_livesync_json/backend.py", line 26, in <module>
     from indico_livesync_json.plugin import LiveSyncJsonPlugin
 ImportError: cannot import name LiveSyncJsonPlugin

I checked that the names are correct and rebuilt the plugin, but the issue was not resolved.
Thank you

You have a circular dependency between plugin.py and backend.py. Remove the top-level import (from indico_livesync_json.plugin import LiveSyncJsonPlugin) from backend.py and move it inside the function where you use it.
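The deferred-import fix can be sketched in a self-contained way. The two module names (demo_plugin, demo_backend) and their contents are made-up stand-ins for plugin.py and backend.py, not the real plugin code; the script writes them to a temporary directory and imports them, using Python 3 syntax.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Stand-in for plugin.py: imports the backend at module level
PLUGIN_SRC = textwrap.dedent('''
    from demo_backend import describe_backend

    class DemoPlugin:
        name = 'demo'

    def describe():
        return describe_backend()
''')

# Stand-in for backend.py: the import of the plugin class is deferred
BACKEND_SRC = textwrap.dedent('''
    def describe_backend():
        # Resolved at call time, after demo_plugin has finished loading,
        # so there is no import cycle while the modules are being imported
        from demo_plugin import DemoPlugin
        return 'backend for ' + DemoPlugin.name
''')


def load_demo():
    """Write both modules to a temp dir and import the plugin module."""
    tmp = tempfile.mkdtemp()
    Path(tmp, 'demo_plugin.py').write_text(PLUGIN_SRC)
    Path(tmp, 'demo_backend.py').write_text(BACKEND_SRC)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    return importlib.import_module('demo_plugin')


plugin = load_demo()
result = plugin.describe()
print(result)
```

If the import in the backend module were at the top level instead, loading the plugin module would fail with the same kind of "cannot import name" error shown in the log, because the plugin class would not be defined yet when the backend asks for it.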

Perfect, it worked. Now I need to fix the rest of the errors that I created by moving things around…

Now I get the following, with no errors reported in indico.log. The line it complains about is the following, which is similar to other field definitions in plugin settings forms:

search_app_url = URLField(_('Search app URL'), [DataRequired(), URL(require_tld=False)],
                      description=_("URL <url:port> of search app import endpoint"))

$ indico db --plugin livesync_json migrate -m 'Create table - searchapp_id_map'
Traceback (most recent call last):
  File "/opt/indico/.venv/bin/indico", line 10, in <module>
    sys.exit(cli())
  File "/opt/indico/.venv/lib/python2.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/opt/indico/.venv/lib/python2.7/site-packages/flask/cli.py", line 586, in main
    return super(FlaskGroup, self).main(*args, **kwargs)
  File "/opt/indico/.venv/lib/python2.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/opt/indico/.venv/lib/python2.7/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/indico/.venv/lib/python2.7/site-packages/indico/cli/util.py", line 110, in invoke
    return self._impl.invoke(ctx)
  File "/opt/indico/.venv/lib/python2.7/site-packages/click/core.py", line 1134, in invoke
    Command.invoke(self, ctx)
  File "/opt/indico/.venv/lib/python2.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/indico/.venv/lib/python2.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/opt/indico/.venv/lib/python2.7/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/opt/indico/.venv/lib/python2.7/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/opt/indico/.venv/lib/python2.7/site-packages/flask/cli.py", line 425, in decorator
    with __ctx.ensure_object(ScriptInfo).load_app().app_context():
  File "/opt/indico/.venv/lib/python2.7/site-packages/flask/cli.py", line 381, in load_app
    app = call_factory(self, self.create_app)
  File "/opt/indico/.venv/lib/python2.7/site-packages/flask/cli.py", line 117, in call_factory
    return app_factory(script_info)
  File "/opt/indico/.venv/lib/python2.7/site-packages/indico/cli/util.py", line 28, in _create_app
    return make_app(set_path=True)
  File "/opt/indico/.venv/lib/python2.7/site-packages/indico/web/flask/app.py", line 344, in make_app
    if not plugin_engine.load_plugins(app):
  File "/opt/indico/.venv/lib/python2.7/site-packages/flask_pluginengine/engine.py", line 46, in load_plugins
    plugins = self._import_plugins(state.app)
  File "/opt/indico/.venv/lib/python2.7/site-packages/flask_pluginengine/engine.py", line 76, in _import_plugins
    plugin_class = entry_point.load()
  File "/opt/indico/.venv/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2434, in load
    return self.resolve()
  File "/opt/indico/.venv/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2440, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/opt/indico/.venv/lib/python2.7/site-packages/indico_livesync_json/plugin.py", line 20, in <module>
    class livesyncjson_settingsform(IndicoForm):
  File "/opt/indico/.venv/lib/python2.7/site-packages/indico_livesync_json/plugin.py", line 21, in livesyncjson_settingsform
    searchapp_url = URLField(_('Search app URL'), [DataRequired()],
  File "/opt/indico/.venv/lib/python2.7/site-packages/indico/util/i18n.py", line 185, in ugettext
    self._check_stack()
  File "/opt/indico/.venv/lib/python2.7/site-packages/indico/util/i18n.py", line 180, in _check_stack
    raise RuntimeError(msg)
RuntimeError: Using the gettext function (_) patched into the builtins is disallowed.
Please import it from indico.util.i18n instead.
The offending code was found in this location:
  File "/opt/indico/.venv/lib/python2.7/site-packages/indico_livesync_json/plugin.py", line 21, in livesyncjson_settingsform
    searchapp_url = URLField(_('Search app URL'), [DataRequired()],

you are missing an import for the _ function
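The error message itself names the module to import from. A minimal sketch of what the top of plugin.py might look like with the import added; the fallback to the standard library's gettext is an assumption added only so the snippet runs outside an Indico environment, and would not be part of the real plugin.

```python
try:
    # The module named in the RuntimeError above
    from indico.util.i18n import _
except ImportError:
    # Fallback for running this sketch without Indico installed (illustration only)
    from gettext import gettext as _

# With an explicit import, `_` is a normal module-level name and no longer
# relies on the function patched into the builtins
label = _('Search app URL')
print(label)
```

With no translation catalog loaded, the fallback simply returns the original string.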