Jakoma02
Contributor

@Jakoma02 Jakoma02 commented Jul 31, 2025

Summary

This PR ensures that if an older channel that does not have a versioned database file yet is added to the community library, the versioned database file is created.

References

Solves #5191.

This PR depends on changes from #5228, and must be merged after it. (Done)

Reviewer guidance

After merging #5228, this PR should first be rebased onto the merged changes and only then reviewed and merged. (Done)

mapper.run()


def _possibly_migrate_unversioned_database(
Member

I haven't read it in depth yet 😅. But, a general comment is that we should make this copy when we create the submission, not after it's approved and mapped to the public models.

Mainly for two reasons:

  1. If the user has published more recent versions between submission creation and submission approval, then it will no longer be true that the current channel database is the database for that channel version.
  2. In the future, we'll need to create a way to preview the channel version related to the submission, and for that, we'll need to ensure that the channel-versioned database exists, and this preview would happen before approving the submission.

If there are arguments for doing this copy at export time instead, I'm happy to hear them too 😄

Contributor Author

My idea for doing this at export time was to deal with content databases at a place that was already dealing with content databases, without complicating the viewset logic with something I thought it did not need to care about. I was trying to solve 1 by checking whether the database contains the channel metadata with the given version, but 2 alone is a good reason for actually doing this at submission creation time.
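The check mentioned above (whether a database contains the channel metadata with the given version) could look roughly like the sketch below. It assumes the content database is SQLite and exposes a `content_channelmetadata` table with `id` and `version` columns; the function name and the sample values are hypothetical and not taken from the PR.

```python
import sqlite3


def database_has_channel_version(connection, channel_id, version):
    """Return True if the database holds metadata for this channel at this version."""
    # Assumed schema: table `content_channelmetadata` with `id` and `version` columns.
    row = connection.execute(
        "SELECT 1 FROM content_channelmetadata WHERE id = ? AND version = ?",
        (channel_id, version),
    ).fetchone()
    return row is not None


# Build a tiny in-memory database to exercise the check (sample values are made up).
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE content_channelmetadata (id TEXT PRIMARY KEY, version INTEGER)"
)
connection.execute(
    "INSERT INTO content_channelmetadata VALUES (?, ?)", ("abc123", 4)
)

print(database_has_channel_version(connection, "abc123", 4))
print(database_has_channel_version(connection, "abc123", 3))
```

A check like this only detects a mismatch; it cannot recover the older database once a newer version has been published, which is why copying at submission creation time is the safer option.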

I am thinking that I could create a create_versioned_database_if_needed method inside contentcuration/utils/publish.py and use it from the submission viewset -- or is there a better place for it?

Also, I think that the using_temp_migrated_database helper is fairly useful and makes the export_channel_to_kolibri_public implementation (arguably) more readable, but my motivation for creating it was to avoid reimplementing its logic in _possibly_migrate_unversioned_database, and it is no longer valid. Should I scratch this, or should I keep this change anyway since it is already done?

Member

For sure! I agree on not complicating the viewset logic, I think this function can perfectly live in the publish.py module.

> Should I scratch this, or should I keep this change anyway since it is already done?

I also think it is more readable now, and we can reuse it if we ever need it, so I'm fine with keeping this change!

Contributor Author

Implemented in 3f83c80.

@Jakoma02 Jakoma02 force-pushed the ensure-channel-version-database-exists branch from 30b24d0 to 3f83c80 Compare August 6, 2025 17:40
Contributor Author

Jakoma02 commented Aug 6, 2025

I have just rebased this PR onto the current community-channels branch.

@Jakoma02 Jakoma02 requested a review from AlexVelezLl August 6, 2025 17:46
Member

@AlexVelezLl AlexVelezLl left a comment

Thanks @Jakoma02! The code changes look mostly correct. I just found a small bug in how we are copying the versioned database, and noticed that we should probably run this process as an async task. Apart from that, the code changes look good, and the tests provide a lot of confidence.

)

with storage.open(unversioned_db_storage_path, "rb") as unversioned_db_file:
    with storage.open(versioned_db_storage_path, "wb") as versioned_db_file:
Member

I am getting this error when I try to create a submission for a channel that doesn't have a versioned channel database:

  File "/home/alexvelezll/.pyenv/versions/studio-py3.10/lib/python3.10/site-packages/django_s3_storage/storage.py", line 318, in _open
    raise ValueError("S3 files can only be opened in read-only mode")
ValueError: S3 files can only be opened in read-only mode

So, it seems like a better way to go here is just to save the same database in the new path just like we do in the publish_channel method:

        with storage.open(unversioned_db_storage_path, "rb") as unversioned_db_file:
            storage.save(versioned_db_storage_path, unversioned_db_file)
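
To illustrate why the `open(..., "rb")` plus `save(...)` pattern avoids the error, here is a minimal, self-contained sketch. `ReadOnlyOpenStorage` is a hypothetical in-memory stand-in that mimics only the `open`/`save`/`exists` methods of the Django storage API and the read-only restriction from the traceback; `copy_database` and the paths are illustrative names, not the PR's actual code.

```python
import io


class ReadOnlyOpenStorage:
    """Hypothetical in-memory stand-in for a backend that rejects write-mode open()."""

    def __init__(self):
        self._files = {}

    def open(self, name, mode="rb"):
        # Mirrors the restriction reported in the traceback above.
        if mode != "rb":
            raise ValueError("S3 files can only be opened in read-only mode")
        return io.BytesIO(self._files[name])

    def save(self, name, content):
        # Writing goes through save(), which takes a readable file-like object.
        self._files[name] = content.read()
        return name

    def exists(self, name):
        return name in self._files


def copy_database(storage, source_path, destination_path):
    """Copy a stored file without ever opening the destination for writing."""
    with storage.open(source_path, "rb") as source_file:
        storage.save(destination_path, source_file)


# Illustrative paths only; the real storage layout may differ.
unversioned_path = "databases/abc123.sqlite3"
versioned_path = "databases/abc123-4.sqlite3"

storage = ReadOnlyOpenStorage()
storage.save(unversioned_path, io.BytesIO(b"fake sqlite bytes"))
copy_database(storage, unversioned_path, versioned_path)
print(storage.exists(versioned_path))
```

The destination is only ever passed to `save`, so the copy works on backends that support reading and saving but not write-mode file handles.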

Contributor Author

You are right, this slipped through. It should be fixed in 054c89d, and I did more thorough manual testing this time.

# When creating a new submission, ensure the channel has a versioned database
# (it might not have if the channel was published before versioned databases
# were introduced).
ensure_versioned_database_exists(self.channel)
Member

I just realized that this should probably happen in an async task, since downloading the databases may take some time and we should not keep the connection open for that long. Could you please create a new task in contentcuration/tasks.py that just calls the ensure_versioned_database_exists method (so we don't have all this logic in the tasks module) and then enqueue it here? Apologies for not catching this earlier.
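
The shape of that request (a thin task that delegates to the helper, enqueued from the request path) can be sketched as follows. This is only an illustration: a `ThreadPoolExecutor` stands in for the project's real task queue, the helper body is a placeholder, and `create_submission`, the `PENDING` status, and the channel ID argument are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the real task queue: one background worker thread.
executor = ThreadPoolExecutor(max_workers=1)

# Records which channels were processed, so the effect is observable here.
processed_channels = []


def ensure_versioned_database_exists(channel_id):
    """Placeholder for the real helper, which would copy the database in storage."""
    processed_channels.append(channel_id)


def ensure_versioned_database_exists_task(channel_id):
    """Thin task wrapper: the logic stays in the helper, not in the tasks module."""
    ensure_versioned_database_exists(channel_id)


def create_submission(channel_id):
    """Hypothetical request handler: enqueue the copy and return without waiting."""
    future = executor.submit(ensure_versioned_database_exists_task, channel_id)
    return {"channel": channel_id, "status": "PENDING"}, future


response, future = create_submission("abc123")
future.result(timeout=5)  # Only the demo waits; a real request would not block here.
print(response["status"], processed_channels)
```

The sketch passes a channel ID rather than a channel object because task arguments usually need to be serializable; whether the actual task does the same is an assumption.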

Contributor Author

Done in 611b641.

@rtibbles rtibbles changed the base branch from community-channels to unstable August 28, 2025 00:00
@Jakoma02 Jakoma02 requested a review from AlexVelezLl September 2, 2025 21:22

Successfully merging this pull request may close these issues.

ESoCC: Ensure that channel version's database exists when a community library submission is created