Page MenuHomePhabricator

Special:GlobalUsers varies between claiming a user is or isn't attached
Closed, ResolvedPublic

Description

Reload the below urls and find that it sometimes claims the user is attached and sometimes that the user isn't attached the local wiki:

https://en.wikipedia.org/w/index.php?title=Special:GlobalUsers&limit=1&username=Glaisher
https://en.wikipedia.org/w/index.php?title=Special:GlobalUsers&username=PiRSquared17&limit=1

It appears to keep changing between attached and unattached.

Glaisher (unattached or doesn't exist locally, .. )

OR

Glaisher (exists locally, .. )

Event Timeline

Glaisher raised the priority of this task from to Needs Triage.
Glaisher updated the task description. (Show Details)
Glaisher subscribed.
Krinkle renamed this task from Special:GlobalUsers says unattached while it's attached to Special:GlobalUsers varies between claiming a user is or isn't attached.Jun 18 2015, 6:13 AM
Krinkle updated the task description. (Show Details)
Krinkle added a project: acl*sre-team.
Krinkle set Security to None.

Relevant code:
https://github.com/wikimedia/mediawiki-extensions-CentralAuth/blob/f9ed32d1e/includes/specials/SpecialGlobalUsers.php#L102-L141

I checked all six currently registered centralauth (s7 cluster) slaves and master (db1033, db1028, db1041, db1034, db1039, db1062). They all return the same data for this query:

SELECT * from localuser WHERE lu_name='PiRSquared17' AND lu_wiki='enwiki';

I figured maybe some have row missing or with a different lu_attached_method, but nope. All the same from what I can tell. Any ideas what it could be?

Here's the full query it runs for Glaisher:

SELECT  gu_id,gu_name,gu_locked,lu_attached_method,COUNT(gug_group) AS gug_numgroups,MAX(gug_group) AS gug_singlegroup  FROM `globaluser` LEFT JOIN `localuser` ON ((gu_name = lu_name) AND lu_wiki = 'enwiki') LEFT JOIN `global_user_groups` ON ((gu_id = gug_user))  WHERE gu_hidden = '' AND (gu_name >= 'Glaisher')  GROUP BY gu_name ORDER BY gu_name LIMIT 1;

Works as expected against db1041, however lu_attached_method ends up being NULL on db1062 and db1039...

Simplified query to get rid of irrelevant parts (extra selected fields and groups):

mysql:wikiadmin@db1062 [centralauth]> SELECT lu_attached_method FROM `globaluser` LEFT JOIN `localuser` ON ((gu_name = lu_name) AND lu_wiki = 'enwiki') WHERE gu_hidden = '' AND (gu_name >= 'Glaisher')  GROUP BY gu_name ORDER BY gu_name LIMIT 1;
lu_attached_method
NULL
mysql:wikiadmin@db1062 [centralauth]> SELECT lu_attached_method FROM `globaluser` LEFT JOIN `localuser` ON ((gu_name = lu_name) AND lu_wiki = 'enwiki') WHERE gu_hidden = '' AND (gu_name = 'Glaisher')  GROUP BY gu_name ORDER BY gu_name LIMIT 1;
lu_attached_method
login

Difference is the removal of the > in the gu_name condition, same thing happens on db1039. On db1041 (and the master, db1033) these both return login.

jcrespo triaged this task as High priority.
jcrespo moved this task from Triage to In progress on the DBA board.

Results per database:

db1033: login
db1028: NULL
db1034: login
db1039: NULL
db1041: login
db1062: NULL
db2029: login
db2040: NULL
db2047: login
db2054: login
dbstore1001: login
dbstore1002: login
dbstore2001: login

But a closer look at the query shows that this is not a data-related issue:

mysql> set sql_mode='ONLY_FULL_GROUP_BY';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT lu_attached_method FROM `globaluser` LEFT JOIN `localuser` ON ((gu_name = lu_name) AND lu_wiki = 'enwiki') WHERE gu_hidden = '' AND (gu_name >= 'Glaisher')  GROUP BY gu_name ORDER BY gu_name LIMIT 1;
ERROR 1055 (42000): 'centralauth.localuser.lu_attached_method' isn't in GROUP BY

The result of this query is undefined, as we are selecting fields that we do not GROUP BY with. A stricter SQL mode disables this query, but we have some blockers to enable this features: T108255.

To confirm that this is the issue (returning a random/undefined row)- dbstore2001 has been recently imported from db2040, but they return different results (because exact same logical content, but rows are returned in different order due to its different physical layout). I will however perform a checksum of the table, as it was programmed T104459.

I need a dev with centralauth knowledge and with time to make sure I am rewriting it well.

This query returns the same results on all hosts:

   SELECT MAX(gu_id),
          gu_name,
          MAX(gu_locked),
          MAX(lu_attached_method),
          GROUP_CONCAT(gug_group)
     FROM globaluser
LEFT JOIN localuser
       ON gu_name = lu_name AND lu_wiki = 'enwiki'
LEFT JOIN global_user_groups
       ON gu_id = gug_user
    WHERE gu_hidden = '' AND (gu_name = 'Glaisher')
 GROUP BY gu_name;

But I am unsure if we want the GROUP_CONCAT, and what to show if this returns more than one gu_id; if we should return the max of locked, etc.

I "fixed" the query but, as I said before, I need some mediawiki feedback now to leave it in the intended state.

  1. Original query when no username is specified:
SELECT gu_id,gu_name,gu_locked,lu_attached_method,COUNT(gug_group) AS gug_numgroups,MAX(gug_group) AS gug_singlegroup FROM `globaluser` LEFT JOIN `localuser` ON ((gu_name = lu_name) AND lu_wiki = 'enwiki') LEFT JOIN `global_user_groups` ON ((gu_id = gug_user)) WHERE gu_hidden = '' GROUP BY gu_name ORDER BY gu_name LIMIT 51
  1. When a username is specified:
SELECT gu_id,gu_name,gu_locked,lu_attached_method,COUNT(gug_group) AS gug_numgroups,MAX(gug_group) AS gug_singlegroup FROM `globaluser` LEFT JOIN `localuser` ON ((gu_name = lu_name) AND lu_wiki = 'enwiki') LEFT JOIN `global_user_groups` ON ((gu_id = gug_user)) WHERE gu_hidden = '' AND (gu_name >= 'Glaisher') GROUP BY gu_name ORDER BY gu_name LIMIT 51

You can get the intended result by selecting MAX(lu_attached_method) instead of just lu_attached_method in the original query but afaict, it doesn't solve the underlying issue.

Also, I don't see gug_singlegroup being used anywhere now in that page now so we can probably remove that now. Also, I think we want to retrieve the groups by the way which you suggested instead of what we do currently. Right now, for retreiving the global groups, it does separate queries for each user if gug_numgroups > 0.

Looks like GROUP_CONCAT would break if we end up with a group name containing a comma

You can get the intended result by selecting MAX(lu_attached_method) instead of just lu_attached_method in the original query but afaict, it doesn't solve the underlying issue.

How does it not solve the issue? What am I missing here?

jcrespo lowered the priority of this task from High to Medium.Aug 18 2015, 9:35 AM

Lowering priority, as it is confirmed to not be a data issue.

Open for grabbing, but feel free to ping me if you need me in a support role.

I highly recommend that you test it with SET SESSION sql_mode='ONLY_FULL_GROUP_BY'; (T108255#1539055)

How does it not solve the issue? What am I missing here?

I meant that it will fail when 'ONLY_FULL_GROUP_BY' sql mode is enabled. I will try to submit a patch which uses the query provided by jcrespo here.

Change 232492 had a related patch set uploaded (by Glaisher):
Improve querying on GlobalUsersPager

https://gerrit.wikimedia.org/r/232492

Change 232492 merged by jenkins-bot:
Improve querying on GlobalUsersPager

https://gerrit.wikimedia.org/r/232492

It will also probably be a bit faster now.