Description
Event Timeline
Pushing the limits is basically the point of these maintenance pages. Whenever something like this is done, a few extreme outliers are pretty much guaranteed. The first example provided loads 300+ Wikidata items and, for each:
- Loads all labels in all languages
- Checks whether 13 specific properties are present and filled in properly
The configuration for this can be seen in https://www.wikidata.org/wiki/Module:Roadtable/models.
The main code is in https://www.wikidata.org/wiki/Module:Roadtable. It contains a problematic mw.wikibase.getEntityObject call that might be worth replacing with more specific methods such as getBestStatements or the new getAllStatements (T166056).
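To illustrate the difference, here is a minimal Lua sketch (not the module's actual code; the item ID and property ID are illustrative) contrasting loading a whole entity with fetching only what a row needs:

```lua
local p = {}

-- Expensive: deserializes the entire item, including all labels,
-- descriptions, sitelinks, and every statement.
function p.viaEntityObject(frame)
    local entity = mw.wikibase.getEntityObject('Q400522')
    return entity and entity:getLabel('en') or ''
end

-- Cheaper in principle: fetch only the label and the statements
-- actually used. P31 stands in for one of the model's properties.
function p.viaBestStatements(frame)
    local label = mw.wikibase.getLabelByLang('Q400522', 'en')
    local statements = mw.wikibase.getBestStatements('Q400522', 'P31')
    return label or ''
end

return p
```

Whether the targeted calls actually come out ahead depends on how many properties must be fetched per item, as the comparison later in this thread shows.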
As the author of the module in question, I should add that there were no such time issues when the module was written: the expensive function limit was hit well before the time limit was.
The third page linked in the report uses a test module specifically written to test these limits (https://www.wikidata.org/wiki/Module:Roadtable/section/sandbox). This module only loads the entity object and retrieves its properties and English label.
https://www.wikidata.org/wiki/User:Happy5214/sandbox/California4 replaces that sandbox with https://www.wikidata.org/wiki/Module:Roadtable/section/sandbox2, which replaces the getEntityObject call with a call to getLabelByLang and one getAllStatements call for each property in the "full" model. As you can see, it is significantly worse than the getEntityObject version linked in my last comment.
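The sandbox2 approach described above can be sketched as follows (the names and property IDs are illustrative, not the actual module code): one getAllStatements call per property in the "full" model, plus a getLabelByLang call, instead of a single getEntityObject call.

```lua
-- Stand-in for the 13 properties of the "full" model.
local fullModel = { 'P31', 'P17', 'P131' }

-- Build one table row's worth of data via targeted lookups.
local function loadViaStatements(itemId)
    local row = { label = mw.wikibase.getLabelByLang(itemId, 'en') }
    for _, pid in ipairs(fullModel) do
        row[pid] = mw.wikibase.getAllStatements(itemId, pid)
    end
    return row
end
```

With 300+ items and a dozen-plus properties each, this issues thousands of individual statement lookups, which is consistent with it performing worse than a single entity load per item.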
In fact, https://www.wikidata.org/w/index.php?title=Wikidata:WikiProject_Roads/United_States/California&oldid=276427363 was working as of November 2015. Something changed in Wikibase between then and now, more likely closer to now (I only noticed this in the last few weeks).
If Phase 3 (auto-generated lists from Wikidata) is ever to become a reality, this is not promising.
What has generally changed since November 2015:
- The Items are all bigger now. Here are the diffs for two randomly picked example Items from the California maintenance list: https://www.wikidata.org/w/index.php?title=Q400522&diff=626131991&oldid=310722154, https://www.wikidata.org/w/index.php?title=Q805375&diff=594954459&oldid=275088526.
- Way more pages use way more information from Wikidata, which means the available CPU resources need to be shared between way more processes often running at the same time.
The next step in this task should be to recreate an example use case on a local system, and profile the code involved to find bottlenecks. Maybe it's wb_terms again? Quite possible.
I took a look at this, and it seems this task is a symptom rather than a problem of its own. If performance is degrading gradually, we are hitting scalability issues, and there are several ways to handle that; one of them is optimizing the wb_terms table, which we are working on, but it will take months to become fully useful. If, on the other hand, we suddenly start returning many errors like this, it is a regression bug. The slow-parse graph in Logstash shows an increase around February 15 that started to decrease around March 1, and last week was the lowest in the past 30 days. That drop started exactly when I enabled reading from term_full_entity_id instead of term_entity_id. It helps because it removes an additional WHERE clause from the query (term_entity_type = 'item', etc.).
For the historical record:
Working again now. I assume this was connected to the performance hit we encountered and fixed yesterday.
So far, so good: some of the pages are working again, and others work after a purge. I'm still monitoring the script errors category.
Thanks for fixing this!