Search plugin development


#41

I had a dev server that used to work, but it no longer does. Redis and PostgreSQL are running, and I have reinstalled the packages and re-created the SSL keys. I still get the same errors when I go to 127.0.0.1:8000. Do they look familiar to you?

127.0.0.1 - - [26/Apr/2019 11:23:41] code 400, message Bad request syntax ('\x16\x03\x01\x02\x00\x01\x00\x01\xfc\x03\x030\xcf\x07J\xa7\xb1\x84\xe6-\x92<1Uy\x8f\xab\x04t\xe0\x15\x01dD\xa5\x1c \xe0\x0c\xb5\x97,\xc5 \x0c`\xc8\x1b\x13\xc7\x98\x83D\xd0KD\xd2a\xef3(\xdd\xb4\xbd\xb5\x9bT36\xe3\xb4\x11\x01\x02H_\x00"\xba\xba\x13\x01\x13\x02\x13\x03\xc0+\xc0/\xc0,\xc00\xcc\xa9\xcc\xa8\xc0\x13\xc0\x14\x00\x9c\x00\x9d\x00/\x005\x00')
127.0.0.1 - - [26/Apr/2019 11:23:41] code 400, message Bad HTTP/0.9 request type ("\x16\x03\x01\x02\x00\x01\x00\x01\xfc\x03\x03-\xa8\xd2\xa0*\xe7\xbc\x7f5\xc4\xe7\x1eD\x99\x12Oi\xa9\x833)\x11'\xb8hE\xf8\x9a\xd5\xf62\x9e")
--------------------------------------------------------------------------------
Exception happened during processing of request from 
(Exception happened during processing of request from'127.0. 0('127..1', 0.0.1'56228)
Traceback (most recent call last):
, 56226)
  File "/usr/lib64/python2.7/SocketServer.py", line 593, in process_request_thread
Traceback (most recent call last):
  File "/usr/lib64/python2.7/SocketServer.py", line 593, in process_request_thread
    self.finish_request(request, client_address)
    self.finish_request(request, client_address)
  File "/usr/lib64/python2.7/SocketServer.py", line 334, in finish_request
  File "/usr/lib64/python2.7/SocketServer.py", line 334, in finish_request
    self.RequestHandlerClass(request, client_address, self)
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/lib64/python2.7/SocketServer.py", line 649, in __init__
  File "/usr/lib64/python2.7/SocketServer.py", line 649, in __init__
    self.handle()
    self.handle()
  File "/home/fakeusername/dev/indico/env/lib/python2.7/site-packages/werkzeug/serving.py", line 293, in handle
  File "/home/fakeusername/dev/indico/env/lib/python2.7/site-packages/werkzeug/serving.py", line 293, in handle
    rv = BaseHTTPRequestHandler.handle(self)
    rv = BaseHTTPRequestHandler.handle(self)
  File "/usr/lib64/python2.7/BaseHTTPServer.py", line 340, in handle
  File "/usr/lib64/python2.7/BaseHTTPServer.py", line 340, in handle
    self.handle_one_request()
  File "/home/fakeusername/dev/indico/env/lib/python2.7/site-packages/werkzeug/serving.py", line 327, in handle_one_request
    self.handle_one_request()
    elif self.parse_request():
  File "/home/fakeusername/dev/indico/env/lib/python2.7/site-packages/werkzeug/serving.py", line 327, in handle_one_request
  File "/usr/lib64/python2.7/BaseHTTPServer.py", line 286, in parse_request
    elif self.parse_request():
    self.send_error(400, "Bad request syntax (%r)" % requestline)
  File "/usr/lib64/python2.7/BaseHTTPServer.py", line 281, in parse_request
  File "/usr/lib64/python2.7/BaseHTTPServer.py", line 368, in send_error
    "Bad HTTP/0.9 request type (%r)" % command)
    self.send_response(code, message)
  File "/usr/lib64/python2.7/BaseHTTPServer.py", line 368, in send_error
  File "/home/fakeusername/dev/indico/env/lib/python2.7/site-packages/werkzeug/serving.py", line 332, in send_response
    self.log_request(code)
    self.send_response(code, message)
  File "/home/fakeusername/dev/indico/src/indico/cli/devserver.py", line 161, in log_request
  File "/home/fakeusername/dev/indico/env/lib/python2.7/site-packages/werkzeug/serving.py", line 332, in send_response
    self.log_request(code)
  File "/home/fakeusername/dev/indico/src/indico/cli/devserver.py", line 161, in log_request
    super(QuietWSGIRequestHandler, self).log_request(code, size)
  File "/home/fakeusername/dev/indico/env/lib/python2.7/site-packages/werkzeug/serving.py", line 373, in log_request
    self.log('info', '"%s" %s %s', msg, code, size)
  File "/home/fakeusername/dev/indico/env/lib/python2.7/site-packages/werkzeug/serving.py", line 384, in log
    message % args))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xfc in position 18: ordinal not in range(128)
    super(QuietWSGIRequestHandler, self).log_request(code, size)
----------------------------------------
  File "/home/fakeusername/dev/indico/env/lib/python2.7/site-packages/werkzeug/serving.py", line 373, in log_request
    self.log('info', '"%s" %s %s', msg, code, size)
  File "/home/fakeusername/dev/indico/env/lib/python2.7/site-packages/werkzeug/serving.py", line 384, in log
    message % args))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xfc in position 18: ordinal not in range(128)
----------------------------------------
127.0.0.1 - - [26/Apr/2019 11:23:41] code 400, message Bad request syntax ("\x16\x03\x01\x00\xb5\x01\x00\x00\xb1\x03\x03m\x90\xd6\xda\x16!'3;\x03\xd8_\xa1\xf80\xcd\xe6\xe9\xd0\xd9\x055\xe78F\x9e\xfb\xf5Vv\xca\xb5\x00\x00\x1cJJ\xc0+\xc0/\xc0,\xc00\xcc\xa9\xcc\xa8\xc0\x13\xc0\x14\x00\x9c\x00\x9d\x00/\x005\x00")
----------------------------------------
Exception happened during processing of request from ('127.0.0.1', 56230)
Traceback (most recent call last):
  File "/usr/lib64/python2.7/SocketServer.py", line 593, in process_request_thread
    self.finish_request(request, client_address)
  File "/usr/lib64/python2.7/SocketServer.py", line 334, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/lib64/python2.7/SocketServer.py", line 649, in __init__
    self.handle()
  File "/home/fakeusername/dev/indico/env/lib/python2.7/site-packages/werkzeug/serving.py", line 293, in handle
    rv = BaseHTTPRequestHandler.handle(self)
  File "/usr/lib64/python2.7/BaseHTTPServer.py", line 340, in handle
    self.handle_one_request()
  File "/home/fakeusername/dev/indico/env/lib/python2.7/site-packages/werkzeug/serving.py", line 327, in handle_one_request
    elif self.parse_request():
  File "/usr/lib64/python2.7/BaseHTTPServer.py", line 286, in parse_request
    self.send_error(400, "Bad request syntax (%r)" % requestline)
  File "/usr/lib64/python2.7/BaseHTTPServer.py", line 368, in send_error
    self.send_response(code, message)
  File "/home/fakeusername/dev/indico/env/lib/python2.7/site-packages/werkzeug/serving.py", line 332, in send_response
    self.log_request(code)
  File "/home/fakeusername/dev/indico/src/indico/cli/devserver.py", line 161, in log_request
    super(QuietWSGIRequestHandler, self).log_request(code, size)
  File "/home/fakeusername/dev/indico/env/lib/python2.7/site-packages/werkzeug/serving.py", line 373, in log_request
    self.log('info', '"%s" %s %s', msg, code, size)
  File "/home/fakeusername/dev/indico/env/lib/python2.7/site-packages/werkzeug/serving.py", line 384, in log
    message % args))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xb5 in position 14: ordinal not in range(128)
----------------------------------------

Thanks,
Jose


#42

Looks like you’re accessing a dev server that’s running in http mode via https.
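
For reference, the giveaway is how the logged request lines start: \x16\x03\x01 is the beginning of a TLS record (content type 0x16 = handshake, followed by a 0x03 major version byte), not an HTTP request line. A minimal Python 3 sketch of that check follows; the function name is made up for illustration and is not part of Indico or Werkzeug:

```python
def looks_like_tls_handshake(first_bytes: bytes) -> bool:
    """Return True if the data starts like a TLS handshake record.

    Hypothetical helper for illustration only.
    """
    # A TLS record header is: 1 byte content type, 2 bytes version, 2 bytes length.
    if len(first_bytes) < 3:
        return False
    content_type = first_bytes[0]
    major_version = first_bytes[1]
    # 0x16 is the "handshake" content type; 0x03 is the major version byte
    # shared by SSL 3.0 and all TLS versions.
    return content_type == 0x16 and major_version == 0x03


# First bytes of a request line from the log above, then a plain HTTP request.
print(looks_like_tls_handshake(b"\x16\x03\x01\x02\x00\x01\x00"))  # True
print(looks_like_tls_handshake(b"GET / HTTP/1.1\r\n"))            # False
```

A plain-HTTP server that receives these bytes tries to parse them as a request line, which is where the "Bad request syntax" messages come from.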


#43

Hmm, my notes included https… But yes, using http works.


#44

The dev server has some options to use https, or you can put e.g. nginx in front of it (the dev setup docs mention this as an option). But by default it’s http-only since that’s the easiest way to use it in development.


#45

As agreed during yesterday’s meeting, I have created a docker-compose file that sets up the CERN Search microservice alongside Nginx, Postgres, Redis, ElasticSearch and Tika. This should be enough to get us started with the development of the plugin:

In order to run it, you should download the file to the root folder of the cern-search repo. You will also have to generate the test certificates by hand (we could have it in a separate Dockerfile for nginx, though…)

$ sh scripts/gen-cert.sh
$ mv nginx.crt nginx/tls/tls.crt
$ mv nginx.key nginx/tls/tls.key
$ rm nginx.csr

If OpenSSL complains about the password being too short, just replace pass:x with pass:12345 in gen-cert.sh (I’ll send a PR to fix that upstream).
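
If you would rather not edit the script by hand, something like this rough Python 3 sketch should do the replacement. It assumes the script lives at scripts/gen-cert.sh and that the passphrase appears literally as pass:x; the function name is just for illustration:

```python
import re
from pathlib import Path


def lengthen_passphrase(script_text: str, new_pass: str = "12345") -> str:
    """Replace every 'pass:x' passphrase argument with 'pass:<new_pass>'."""
    # \b keeps us from touching a longer passphrase that merely starts with "x".
    return re.sub(r"pass:x\b", "pass:" + new_pass, script_text)


if __name__ == "__main__":
    script = Path("scripts/gen-cert.sh")  # assumed location, relative to the repo root
    if script.exists():
        script.write_text(lengthen_passphrase(script.read_text()))
```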

Then do docker-compose up and you should have your development cluster running.
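
To see whether the containers actually came up, a quick TCP check like the sketch below can help. It only covers the two ports that come up in this post (8080 for Invenio behind Nginx, 9998 for Tika); the service labels are mine, and the ports of the other services depend on the compose file, so add them as needed:

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False


# Ports taken from this post; the labels are only for display.
SERVICES = {"invenio (via nginx)": 8080, "tika": 9998}

if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "up" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} (port {port}): {state}")
```

An open port only means something is listening there; it does not prove the service behind it is healthy.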

I managed to log in to Invenio (https://localhost:8080)

(username: test@example.com, password: test1234)

Retrieving records through the REST API results in an error, probably because I haven’t set up the ElasticSearch indices properly. In any case, it’s a start.

Apache Tika seems to work fine when I connect to it using tika-python:

In [16]: from tika import parser

In [17]: parser.from_file('/tmp/test.docx', serverEndpoint="http://localhost:9998")
Out[17]:
{'content': u'\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nTEST2\n',
 'metadata': {u'Application-Name': u'LibreOffice/5.3.6.1$Linux_X86_64 LibreOffice_project/30$Build-1',
...
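
As the output shows, the extracted text comes back padded with a lot of newlines. A small helper along these lines (Python 3, hypothetical name) could tidy it up before indexing. It assumes only the result shape shown above, a dict with a 'content' key, and treats a missing or empty value as an empty string:

```python
def extract_text(parsed: dict) -> str:
    """Return the parsed document's text with surrounding whitespace removed."""
    # Fall back to "" if 'content' is missing or empty (e.g. no text extracted).
    return (parsed.get("content") or "").strip()


# Sample shaped like the result above (shortened); with a running Tika server
# you would pass the dict returned by parser.from_file(...) instead.
sample = {"content": "\n\n\n\n\nTEST2\n", "metadata": {}}
print(extract_text(sample))  # TEST2
```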

#46

@pferreir I would appreciate it if you could provide a bit more information about docker (I have never used docker before, apart from an initial test to run the HelloWorld and list the docker images…).
I installed docker (version: 1.13.1, API version 1.26) and docker-compose (version: 1.24.0)
I created a directory with the downloaded docker-compose.yml file you created and tried to run docker-compose up but this required the Dockerfile. What should the Dockerfile contain? And obviously I am missing the commands to initialize docker and the “development cluster”.
Also, you are using gen-cert.sh to create the certificates. What is the content of this file? A simple openssl command?


#47

I think this answers your question :wink:

“In order to run it, you should download the file to the root folder of the cern-search repo.”

The repo to clone is https://github.com/inveniosoftware-contrib/cern-search - it includes the Dockerfile and the gen-cert.sh script.


#48

THANK YOU! Yes it does.