
Create a RESTBase script to purge (Parsoid) content in the event of a train rollback
Open, Medium, Public

Description

While the RESTBase + Parsoid combo has content versioning and respects content version headers, the versioning is semantic, so the headers only matter for major version bumps. For minor version bumps, Parsoid may occasionally lack support for the right minor version after a train rollback, unless special support for forward compatibility has been built.
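To make the major/minor distinction concrete, here is a minimal sketch in Python. It is not RESTBase or Parsoid code: the function names, the labels it returns, and the "2.3.0" value used in the usage line are hypothetical and only illustrate the comparison described above.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a 'major.minor.patch' string into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def classify_mismatch(stored: str, supported: str) -> str:
    """Compare a stored HTML version with the newest version Parsoid supports.

    Returns one of three illustrative labels (assumptions, not real API values):
      'major'       - major versions differ; content version headers cover this
      'minor-ahead' - same major, but stored content has a newer minor version,
                      which is the rollback case described in this task
      'compatible'  - stored content is at or below the supported minor version
    """
    stored_major, stored_minor, _ = parse_version(stored)
    supported_major, supported_minor, _ = parse_version(supported)
    if stored_major != supported_major:
        return "major"
    if stored_minor > supported_minor:
        return "minor-ahead"
    return "compatible"


if __name__ == "__main__":
    # "2.3.0" is a made-up example of what a rolled-back Parsoid might support.
    print(classify_mismatch("2.4.0", "2.3.0"))
```

Content classified as "minor-ahead" is what a purge after a rollback would need to remove, since version negotiation on the major number alone would not catch it.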

Next week, we anticipate rolling out a Parsoid 2.4.0 HTML version that Parsoid v0.15.0-a10 (shipped with MW 1.38.0-wmf.9) doesn't support. If the train is rolled back for whatever reason, RESTBase will hold 2.4.0 content from the timeframe when the new train was active on wikis. On train rollback, that 2.4.0 content would need to be purged from RESTBase for the wikis that had the train rolled back. My understanding is that RESTBase cannot purge by HTML version (since it probably doesn't record that info), but can purge based on a time window.

If it is not possible to purge this only for wikis that had the train rolled back, it is okay to purge unconditionally. That may lead to a temporary spike in HTML parse requests to Parsoid to fulfill requests for content that would otherwise have been served from RESTBase.
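The selection logic described above (purge by time window, optionally restricted to the wikis that were rolled back) can be sketched roughly as follows. This is an illustration in Python, not the actual RESTBase script: the `StoredRender` record, its fields, and both function names are hypothetical, and it assumes each stored render carries a timestamp and a wiki domain.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class StoredRender:
    """Hypothetical stand-in for one stored HTML render."""
    domain: str            # wiki the content belongs to, e.g. "en.wikipedia.org"
    title: str
    rendered_at: datetime  # when the content was generated (timezone-aware)


def should_purge(
    render: StoredRender,
    start: datetime,
    end: datetime,
    domains: Optional[Set[str]] = None,
) -> bool:
    """True if the render falls in [start, end) and, when a domain set is
    given, belongs to one of those wikis. domains=None purges unconditionally."""
    in_window = start <= render.rendered_at < end
    in_scope = domains is None or render.domain in domains
    return in_window and in_scope


def select_for_purge(
    renders: Iterable[StoredRender],
    start: datetime,
    end: datetime,
    domains: Optional[Set[str]] = None,
) -> List[StoredRender]:
    """Collect every render that matches the purge criteria."""
    return [r for r in renders if should_purge(r, start, end, domains)]


if __name__ == "__main__":
    # The window below is made up for illustration only.
    start = datetime(2021, 11, 30, 19, 0, tzinfo=timezone.utc)
    end = datetime(2021, 12, 2, 19, 0, tzinfo=timezone.utc)
    sample = [
        StoredRender("en.wikipedia.org", "Foo", datetime(2021, 12, 1, tzinfo=timezone.utc)),
        StoredRender("de.wikipedia.org", "Bar", datetime(2021, 12, 1, tzinfo=timezone.utc)),
    ]
    for render in select_for_purge(sample, start, end, {"en.wikipedia.org"}):
        print(render.domain, render.title)
```

Leaving `domains` unset corresponds to the unconditional purge, which trades precision for simplicity at the cost of the extra parse load mentioned above.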

Ideally, we need this script in place before the train rolls out on Tuesday next week so we aren't scrambling at the last minute if the train rolls back later in the week.

Event Timeline

Arlolra triaged this task as High priority. Nov 24 2021, 9:34 PM
Arlolra moved this task from Needs Triage to Feature requests on the Parsoid board.

@Pchelolo created https://github.com/wikimedia/restbase/pull/1297 for this. He says: "If it's needed and I'm not around, just force-merge it even if tests fail. The actual start/end time needs to go to the restbase deploy repo. If I'm not available, ask Clara Andrew-Wani for help."

@Clarakosi @Pchelolo Is it possible to purge storage only for some wikis vs. all wikis?

ssastry lowered the priority of this task from High to Medium. Dec 7 2021, 5:14 PM

@Clarakosi @Pchelolo Is it possible to purge storage only for some wikis vs. all wikis?

Yes, we'd need to modify the filter a bit, but it should be possible. As for deployment, I've mostly done it with the help of @hnowlan, and we'd need some coordination in the event of a rollback.

Thanks! For this train, we eliminated the need for a RESTBase purge even on train rollbacks, but this is useful to have in place for any future deploy where things go wrong and we need to clear RESTBase of bad content from the time of the bad deploy.

Aklapper added a subscriber: Pchelolo.

Removing inactive assignee (Platform Engineering: Please unassign tasks of previous team members.)