DB deadlock during 2.1 upgrade + miscellaneous other problems

Hi,

I just upgraded to 2.1 using the usual procedure (pip install -U + indico db upgrade) without stopping the Indico service. I have never had a problem doing this before, but this time the DB upgrade failed with the following error during the step "Running upgrade c820455976ba -> 813ea74ce8dc, Add AttachmentFolder.is_hidden":

sqlalchemy.exc.OperationalError: (psycopg2.extensions.TransactionRollbackError) deadlock detected
DETAIL: Process 13182 waits for AccessExclusiveLock on relation 78502 of database 77071; blocked by process 13099.
Process 13099 waits for AccessShareLock on relation 78947 of database 77071; blocked by process 13182.
HINT: See server log for query details.
[SQL: "ALTER TABLE attachments.folders ADD COLUMN is_hidden BOOLEAN DEFAULT 'false' NOT NULL"] (Background on this error at: http://sqlalche.me/e/e3q8)

I re-ran indico db upgrade after the error and it completed successfully. Do I need to worry about anything, or was it just a transient error, with everything that remained to be done handled by the second run of the DB upgrade?

Cheers,

Michel

Schema changes in PostgreSQL are transactional, so the step that hit the deadlock was rolled back completely rather than left half-applied. If the upgrade succeeded when you re-ran it, everything is fine.
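
To illustrate what "transactional" means for a schema change, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in, since SQLite also rolls back DDL and needs no server; it is not PostgreSQL and not Indico's migration code. The table layout and the helper names (column_names, demo_rollback) are made up for the example.

```python
import sqlite3


def column_names(conn, table):
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    return [row[1] for row in conn.execute("PRAGMA table_info({})".format(table))]


def demo_rollback():
    # isolation_level=None puts the sqlite3 module in autocommit mode, so the
    # explicit BEGIN/ROLLBACK below are the only transaction boundaries.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    # Hypothetical, simplified table; the real attachments.folders has more columns.
    conn.execute("CREATE TABLE folders (id INTEGER PRIMARY KEY, title TEXT)")
    before = column_names(conn, "folders")

    conn.execute("BEGIN")
    conn.execute("ALTER TABLE folders ADD COLUMN is_hidden BOOLEAN DEFAULT 0 NOT NULL")
    during = column_names(conn, "folders")
    # Simulate the migration step failing: the whole transaction is undone.
    conn.execute("ROLLBACK")

    after = column_names(conn, "folders")
    conn.close()
    return before, during, after


if __name__ == "__main__":
    before, during, after = demo_rollback()
    print("before:", before)
    print("during:", during)
    print("after: ", after)
```

After the rollback the table has the same columns as before the ALTER TABLE, which is why a failed step can simply be run again.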

Good to get the confirmation! It was my guess…

BTW, if you still had pip<10 (where all dependencies are updated, not just the ones absolutely required), the upgrade to 2.1 may have broken Celery.
Just in case, you can run pip install kombu==4.1.0 and restart Celery after that.

After the upgrade I have kombu v4.2.0. Is that the problem? How can I tell whether Celery is broken?

Michel

If you have kombu 4.2, it’s broken (we’ll soon release a 2.1.1 that will also include an update to celery to avoid this problem altogether).

Unfortunately this error happens at a point where it won’t go to celery.log, but the systemd service doesn’t fully crash either. So it usually only shows up because emails won’t be sent and other tasks won’t run either.
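
Since the failure is silent, checking the installed kombu version is the quickest test. A small sketch of that check follows; the function names are made up for the example, and it only flags 4.2.x because that is the only broken version mentioned in this thread. Run it inside the virtualenv that Indico and Celery use.

```python
def parse_version(version):
    # Keep only the leading numeric components, e.g. "4.2.0" -> (4, 2, 0).
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def kombu_needs_downgrade(version):
    # Per this thread, kombu 4.2 breaks Celery with Indico 2.1 and 4.1.0 works.
    # Other releases are not discussed here, so only 4.2.x is flagged.
    return parse_version(version)[:2] == (4, 2)


if __name__ == "__main__":
    try:
        import kombu
    except ImportError:
        print("kombu is not installed in this environment")
    else:
        installed = kombu.__version__
        if kombu_needs_downgrade(installed):
            print("kombu {} found: run 'pip install kombu==4.1.0' "
                  "and restart Celery".format(installed))
        else:
            print("kombu {} found: not the version flagged in this thread".format(installed))
```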

I have another problem: I no longer see how to define the event managers. Did I do something wrong? Is it related to the new role management?

It seems to be connected to the new role management. But on the CERN instance I see the old permissions displayed as roles (MANAGE, CHR, SUBMISSION…), with the possibility to assign or change permissions for each user, whereas on my instance, when I click on Protection, I don’t see any role assigned to any user, even though I’m logged in as the Indico admin (or as an event manager).

Indeed, there is now just a single place for the ACL, and someone with “manage” permissions there is the equivalent of someone who was previously in the list of event managers.

But how do you explain that I don’t see the current permissions of an event (the Permissions list on the Protection page is empty)?

Do you have any errors in your browser’s JS console?

OK, this is a problem with my preferred browser, Opera! With Firefox it works just fine. It’s not clear why CERN Indico works with Opera but my instance doesn’t…

Activating the console in Opera, I got the following error:

utils_ac34573e.min.js:17 Uncaught TypeError: Cannot read property 'type' of undefined
    at build_url (utils_ac34573e.min.js:17)
    at $.(anonymous function).(anonymous function)._renderEditBtn (https://indico.lal.in2p3.fr/static/assets/core/js/indico_jquery_6154eeaf.min.js:216:629)
    at $.(anonymous function).(anonymous function)._renderEditBtn (https://indico.lal.in2p3.fr/static/assets/core/js/jquery_f35ae1d0.min.js:1033:334)
    at $.(anonymous function).(anonymous function)._renderPermissionsButtons (https://indico.lal.in2p3.fr/static/assets/core/js/indico_jquery_6154eeaf.min.js:216:160)
    at $.(anonymous function).(anonymous function)._renderPermissionsButtons (https://indico.lal.in2p3.fr/static/assets/core/js/jquery_f35ae1d0.min.js:1033:334)
    at $.(anonymous function).(anonymous function)._renderPermissions (https://indico.lal.in2p3.fr/static/assets/core/js/indico_jquery_6154eeaf.min.js:215:488)
    at $.(anonymous function).(anonymous function)._renderPermissions (https://indico.lal.in2p3.fr/static/assets/core/js/jquery_f35ae1d0.min.js:1033:334)
    at $.(anonymous function).(anonymous function)._renderItem (https://indico.lal.in2p3.fr/static/assets/core/js/indico_jquery_6154eeaf.min.js:217:263)
    at $.(anonymous function).(anonymous function)._renderItem (https://indico.lal.in2p3.fr/static/assets/core/js/jquery_f35ae1d0.min.js:1033:334)
    at indico_jquery_6154eeaf.min.js:218

Is Indico.Urls.PermissionsDialog defined if you enter it in your browser’s JS console?

If not, try clearing your cache.

Thanks. Clearing the cache fixed the issue. This may be worth an entry in the upgrade documentation or in the blog… I guess I may not be the only one to hit this problem…

A typical end user won’t read the upgrade docs since they’re not really relevant for them… Maybe we should add a cache buster to the URL like we do for normal static assets…
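
A cache buster along those lines could be sketched as follows. This is only an illustration, not how Indico builds its asset URLs: the helper name, the "v" query parameter, and the example path are all made up. The idea is that the URL changes whenever the file content changes, so a browser cannot reuse a stale cached copy after an upgrade.

```python
import hashlib


def cache_busted_url(url_path, content):
    # Derive a short fingerprint from the file content (bytes). Any change to
    # the content yields a different fingerprint, hence a different URL.
    digest = hashlib.sha256(content).hexdigest()[:8]
    return "{}?v={}".format(url_path, digest)


if __name__ == "__main__":
    # Hypothetical path and contents, for illustration only.
    old = cache_busted_url("/static/urls.js", b"var Urls = {};")
    new = cache_busted_url("/static/urls.js", b"var Urls = {PermissionsDialog: 1};")
    print(old)
    print(new)
```

The two printed URLs differ only in the fingerprint, which is enough for the browser to treat them as different resources.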

Good point! Is this a problem that every event manager can expect to hit? Do you know what tends to trigger it?

The file is cached by the browser for up to 12 hours - after that the problem will be gone in any case. CTRL-F5 to reload without cache is the easiest fix for someone affected.

I guess in our case this didn’t really show up since we deployed in the evening, so when people came back to work in the morning the cached file had already expired.

Thanks for the clarification. Under these conditions it should not be a major problem for us either. I may have been the only one affected!