Dear Indico developers,
I have some questions regarding the migration of Indico 2.3.5 to 3.x:
I would like to do a fresh installation of Indico 3 and run some tests before migrating our Indico 2.3 data. To migrate the Indico 2.3 data to the Indico 3 server, which already has a database set up with indico db prepare and has been used for testing, can I just do the following?
su - postgres
pg_restore -C -d postgres indico.sql
Assuming indico.sql is the output file from
pg_dump -U postgres -d indico > indico.sql
on the old Indico 2.3 server.
I guess I do not need to set the ownership of the database or create database extensions after using pg_restore, since these should be restored from the dump.
Then I would delete everything in /opt/indico/archive and copy the old data there.
Is there anything else to consider? E.g. emptying cache/tmp folders?
Thanks a lot!
Hi, looks fine!
I’d rather run the pg_restore as the indico user to make sure there are no ownership screwups, but generally what you do should work as well, since the dump should contain everything important including the correct ownership (it would only be a problem when using different postgres users on the different systems).
FWIW, this is how I would do it:
su - postgres -c 'dropdb indico'
su - postgres -c 'createdb -O indico indico'
su - postgres -c 'psql indico -c "CREATE EXTENSION unaccent; CREATE EXTENSION pg_trgm;"'
and then as the indico user:
pg_restore -d indico -O /path/to/indico.dump
For creating the backup I’d use pg_dump -Fc -f /tmp/indico.dump indico to get the custom postgres dump format; it’s compressed and AFAIK also more efficient (and you don’t need a raw readable SQL file anyway if you don’t plan to modify any data in the SQL dump).
Yes, that’s a good idea.
Thank you also for this!
I guess you mean pg_restore -d indico -O /path/to/indico.dump ?
Yes… that happens when copying commands from your own shell history and forgetting to fully change it
@ThiefMaster: I am now in the middle of the migration process and have run into a problem accessing some old timetable files in the legacy archive whose filenames contain special characters. Everything else works fine, so I would prefer not to undo everything and am hoping for some quick help.
Accessing the file on the old Indico 2.3.5 server is no problem, but accessing them on the Indico 3.1 server fails.
The apache error log on the new server contains this (I replaced some content with XXX):
[XXX] [:error] [pid XXX:tid XXX] (2)No such file or directory: [client XXX] xsendfile: cannot open file: /data/indico/legacy-archive/XXX/Pr\xe4sentation_Mainz_end2.pdf
I try to access the file using this URL (which is from the timetable, again replaced part of URL by XXX): https://XXX/Praesentation_Mainz_end2.pdf
What can I do? I did the database dump and restore in the way you suggested.
Uploading a new file with special characters and downloading it again works. Most of the old files also work.
Edit: The name of the file in the filesystem on the server is “Präsentation_Mainz_end2.pdf”.
Check what Attachment.get(ID).file.storage_file_id gives you in indico shell (the ID is in the last path segment before the filename in the URL), and compare it with what you have in the file system.
I have no idea what exactly is going wrong there. It feels like a charset difference between the local file system and what’s in the database, but here are some ideas:
- Is the filesystem using UTF8 now, while it used latin1/iso-8859-1 before? Or is that \xe4 actually part of the storage_file_id? (You may want to print() the value in indico shell, otherwise you see a Python repr that may contain the escape sequence.)
- If it’s just very few files you could rename them on disk and update the storage_file_id in the DB accordingly.
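To make the first idea concrete, here is a minimal sketch (plain Python, nothing Indico-specific) of the bytes the same filename maps to under the two encodings mentioned above. The helper name encode_name is just an illustration. A lone \xe4 is what ä looks like in latin1, whereas in UTF-8 it would be the two bytes \xc3\xa4, so the escaped path in the Apache log may hint at which encoding was used along the way.

```python
def encode_name(name):
    """Return the byte sequences a filename maps to under UTF-8 and latin1.

    Hypothetical helper for illustration only. The latin1 encode raises
    UnicodeEncodeError for characters outside iso-8859-1.
    """
    return {
        "utf-8": name.encode("utf-8"),
        "latin-1": name.encode("latin-1"),
    }


# \u00e4 is the umlaut from the filename discussed in this thread.
encodings = encode_name("Pr\u00e4sentation_Mainz_end2.pdf")
for label, raw in encodings.items():
    print(f"{label:8} {raw!r}")
# utf-8    b'Pr\xc3\xa4sentation_Mainz_end2.pdf'
# latin-1  b'Pr\xe4sentation_Mainz_end2.pdf'
```

This only shows what the two encodings look like; it does not tell you where in the chain between Indico and Apache the conversion happens.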
@ThiefMaster: The last part of the URL is /1747/Praesentation_Mainz_end2.pdf. So the ID is 1747. I am sorry but I never used indico shell. How can I start/use it? I did not find information in the doc.
You mean I should enter “print(Attachment.get(1747).file.storage_file_id)” somewhere?
When I type “locale” in the linux shell the output on the old and the new system are the same.
I have no idea how many files have that problem.
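One way to get a rough count of potentially affected files is to walk the archive and list every file whose name contains non-ASCII characters. This is only a sketch: find_non_ascii_names is a made-up helper, and the root path is taken from the Apache log above, so adjust it to your setup. Not every file it lists is necessarily broken, since most old files reportedly work.

```python
import os


def find_non_ascii_names(root):
    """Yield paths below root whose file name contains non-ASCII characters."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.isascii():
                yield os.path.join(dirpath, filename)


if __name__ == "__main__":
    # Assumed location of the legacy archive (from the Apache error log).
    for path in find_non_ascii_names("/data/indico/legacy-archive"):
        # ascii() escapes the special characters so they are easy to spot.
        print(ascii(path))
```

It can be run with any Python 3.7+ interpreter on the server, or pasted into indico shell.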
Log in as the indico user on the server, then run indico shell. This basically gives you an interactive Python interpreter where you can execute code snippets.
In : print(Attachment.get(1747).file.storage_file_id)
Same folder and file name as in the file system. What to do?
Can you show me the repr() instead of print() as well?
Also, the output of this:
(ie the path to the file without the file name)
In : repr(Attachment.get(1747).file.storage_file_id)
In : import os
In : os.listdir('/data/indico/legacy-archive/2015/XXX/')
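Comparing the directory listing with the stored name by eye can hide differences, because two names may render identically but differ in their bytes or Unicode normalization. The following is a rough sketch of a byte-level comparison; match_stored_name is a hypothetical helper, and the stored name would be the filename part of storage_file_id.

```python
import os
import unicodedata


def match_stored_name(directory, stored_name):
    """Compare a stored file name against the directory listing.

    Returns (exact, similar): exact is True if a byte-identical entry
    exists, similar lists entries that are equal after NFC normalization.
    """
    wanted = os.fsencode(stored_name)
    # Passing bytes to os.listdir returns the raw names as bytes.
    entries = os.listdir(os.fsencode(directory))
    exact = wanted in entries
    wanted_nfc = unicodedata.normalize("NFC", stored_name)
    similar = [
        entry for entry in entries
        if unicodedata.normalize("NFC", os.fsdecode(entry)) == wanted_nfc
    ]
    return exact, similar


# Example call (directory as used in this thread, XXX left as-is):
# match_stored_name('/data/indico/legacy-archive/2015/XXX/',
#                   'Pr\u00e4sentation_Mainz_end2.pdf')
```

If exact is False but similar is not empty, the name on disk and the name in the database differ only in normalization.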
How about this one?
Does open('/data/indico/legacy-archive/2015/XXX/Präsentation_Mainz_end2.pdf', 'r') work?
In : import sys
In : sys.getfilesystemencoding()
In : open('/data/indico/legacy-archive/2015/XXX/Präsentation_Mainz_end2.pdf', 'r')
Out: <_io.TextIOWrapper name='/data/indico/legacy-archive/2015/XXX/Präsentation_Mainz_end2.pdf' mode='r' encoding='UTF-8'>
Now the filesystem encoding would be interesting to see on the old Indico server (assuming you moved to a new server/VM as recommended in the docs) as well…
PS: Please wrap your output in triple backticks; that way it’s more readable and I don’t need to edit your posts to add this
Yes, I moved to a new VM. On the old server:
Indico v2.3.5 is ready for your commands
In : import sys
In : sys.getfilesystemencoding()
I am sorry
hm so the main question is… where does this escaping happen between indico and apache… and without an environment where it happens it’s also kind of hard to debug
BTW as a workaround you could remove/comment out the STATIC_FILE_METHOD line in indico.conf. That way files would be sent directly by Indico instead of being handed off to Apache; for a small instance that’d probably be OK, even though it’s not ideal of course (apache/nginx are much better at serving static file content).
Without the STATIC_FILE_METHOD line it works. Is it only a question of performance? I need to decide whether to continue the migration and work with the new server or roll back (which I really do not want to do). The problem is: if I continue to work with the migrated data, I will be unable to get back to the old state in case it turns out there was some kind of error when migrating the data. But it seems this is not the case, right?
In case you will have some ideas in future I am happy to try them out.
Thanks a lot for your help!
Yes, it’s just a matter of performance, so no need to roll back. The problem only occurs when passing the path to Apache; the open() test showed that the files can be read just fine.
PS: I think theoretically downgrading the DB to the 2.3 state would be possible, but it’s something we did not test and of course do not recommend anyone to do.
I don’t know if it’s correlated, but I find these errors in my syslog.all:
XXX indico-uwsgi[XXX]: uwsgi_response_write_body_do() TIMEOUT !!!
XXX indico-uwsgi[XXX]: OSError: write error
Is there some kind of timeout time I need to change? Where do these errors come from?
I think that happens if a client disconnects while data is being sent to them. Should be fine to ignore.