Runtime Architecture Refactor

Hello Indico,

We have spent the last couple of weeks analyzing your database from a runtime architecture perspective. Our analysis focuses on two key transactions in the Indico project: creating an event and viewing an event. We chose these to represent a write workflow and a read-only workflow, respectively. We examine how the backend components interact during each process and highlight the runtime behavior, data flow, and potential scalability challenges within the system. Although we are not yet familiar with the full extent of this project, we would still love to share what we have discovered.

Potential Refactor:

While running load tests for event creation, we believe we identified some potential improvements that could help under higher usage. We do not know what day-to-day server traffic Indico faces, and we acknowledge that our test files may have issues of their own. With that in mind, the image below shows what we observed when placing a higher user load on Indico.

The above image was taken from the Locust web UI. Each simulated user logs in and then creates events en masse. As the number of users ramps towards 1000, the response-time percentiles rise with it.
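To make the percentile figures easier to read, here is a minimal sketch of how a response-time percentile can be computed. It uses the nearest-rank method, which is not necessarily the exact calculation Locust performs, and the sample timings are hypothetical values chosen for illustration, not measurements from Indico.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample that is greater than
    or equal to pct percent of all samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical response times in milliseconds (illustration only).
light_load = [100, 110, 120, 130, 140, 150, 160, 170, 180, 190]
heavy_load = [100, 110, 120, 130, 140, 150, 400, 900, 1500, 3000]

for name, samples in (("light", light_load), ("heavy", heavy_load)):
    print(name, "p50:", percentile(samples, 50), "p95:", percentile(samples, 95))
```

In this made-up data the median is the same for both runs, while the 95th percentile differs widely. That is why the higher percentiles are worth watching as the user count grows: they expose the slow requests that an average or median hides.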

Based on our current understanding of the Indico project and these potential runtime problems, we have drafted a possible refactor. Once again, this suggestion comes with the acknowledgement that we are unaware of Indico’s current user traffic and lack the experience of seasoned Indico developers.

Below is a modified version of create_event from operations.py in the events directory. It removes most of the db.session.flush() calls that were present in the previous version of the file. Our reasoning is that each flush in SQLAlchemy sends the pending changes to the database, so flushing less often should reduce the number of database round trips, save time for users, and lower server and database load.

# Excerpt from operations.py; the module-level imports are unchanged.
@no_autoflush
def create_event(category, event_type, data, add_creator_as_manager=True, features=None, cloning=False):
    """
    Create a new event with batched DB flushes to reduce round-trips and ensure atomicity.
    """
    event = Event(category=category, type_=event_type)
    data.setdefault('creator', session.user)
    theme = data.pop('theme', None)
    create_booking = data.pop('create_booking', False)
    person_link_data = data.pop('person_link_data', {})
    if category is None:
        data.pop('protection_mode', None)

    event.populate_from_dict(data)
    event.person_link_data = person_link_data

    if theme is not None:
        layout_settings.set(event, 'timetable_theme', theme)
    if add_creator_as_manager:
        with event.logging_disabled:
            event.update_principal(event.creator, full_access=True)
    if features is not None:
        features_event_settings.set(event, 'enabled', features)

    # Single flush for the event and its related rows. This assumes none of the
    # calls above need event.id, which is only assigned once the flush runs.
    db.session.flush()

    signals.event.created.send(event, cloning=cloning)
    logger.info('Event %r created in %r by %r', event, category, session.user)
    sep = ' » '
    event.log(EventLogRealm.event, LogKind.positive, 'Event', 'Event created',
              session.user, data={'Category': sep.join(category.chain_titles) if category else None})
    if category:
        category.log(CategoryLogRealm.events, LogKind.positive, 'Content',
                     f'Event created: "{event.title}"', session.user,
                     data={'ID': event.id, 'Type': orig_string(event.type_.title)})

    if create_booking:
        room_id = data.get('location_data', {}).pop('room_id', None)
        if room_id:
            booking = create_booking_for_event(room_id, event)
            if booking:
                logger.info('Booking %r created for event %r', booking, event)
                log_data = {
                    'Room': booking.room.full_name,
                    'Date': booking.start_dt.strftime('%d/%m/%Y'),
                    'Times': f"{booking.start_dt.strftime('%H:%M')} - {booking.end_dt.strftime('%H:%M')}"
                }
                event.log(EventLogRealm.event, LogKind.positive, 'Event',
                          'Room booked for the event', session.user, data=log_data)
                # Flush again only when a booking was actually created.
                db.session.flush()
    return event

If this solution does not suit Indico's design, we would still suggest looking for ways to reduce the number of database operations. We observed that frequent storage and network calls may be causing delays, and some of that work could instead be accumulated in memory and written to the database in fewer batches.
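The idea of accumulating work in memory and writing it in fewer batches can be illustrated with a small, self-contained sketch. This is a toy model built on the standard library's sqlite3 module, not SQLAlchemy or Indico code; the CountingSession class, the events table, and the titles are all hypothetical names introduced for illustration.

```python
import sqlite3


class CountingSession:
    """Toy unit of work: buffers rows in memory and writes them on flush()."""

    def __init__(self, conn):
        self.conn = conn
        self.pending = []      # rows waiting to be written
        self.flush_count = 0   # number of flushes that actually wrote rows

    def add(self, title):
        self.pending.append((title,))

    def flush(self):
        if not self.pending:
            return
        self.conn.executemany("INSERT INTO events (title) VALUES (?)", self.pending)
        self.pending.clear()
        self.flush_count += 1


def make_session():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, title TEXT)")
    return CountingSession(conn)


def row_count(session):
    return session.conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]


titles = ["Workshop", "Seminar", "Lecture"]

eager = make_session()
for title in titles:
    eager.add(title)
    eager.flush()          # one write per change

batched = make_session()
for title in titles:
    batched.add(title)
batched.flush()            # one write for all accumulated changes

print("eager:", eager.flush_count, "flushes,", row_count(eager), "rows")
print("batched:", batched.flush_count, "flushes,", row_count(batched), "rows")
```

Both sessions end up with the same rows, but the batched one reaches the database once instead of three times. The trade-off is that values generated by the database, such as an auto-incremented primary key, are not available until a flush has happened, so any code that needs them still has to run after at least one flush.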

We would appreciate it if you could review the refactoring that we have proposed and let us know if you have any suggestions. We are excited about the opportunity to contribute and collaborate with you. We look forward to learning from this community and sharing our insights.
Thank you!

Best regards,
Sam Gorun, Bryce Jensenius, Alec Moore, Jesse Williams, Zach Kehoe, Maggie Sullivan