Wikidata:Requests for comment/Disallow merging into newer entity: Difference between revisions


Revision as of 21:30, 16 June 2020

An editor has requested the community to provide input on "Disallow merging into newer entity" via the Requests for comment (RFC) process. This is the discussion page regarding the issue.

If you have an opinion regarding this issue, feel free to comment below. Thank you!

Proposal: Disallow merging into newer entity

Some people merge into the newer entity, probably by unchecking:

  • Always merge into the older entity (uncheck to merge into the "Merge with" entity)

This

  1. goes against ID stability
  2. can be used for vandalism
  3. can break queries

This option should be removed. MrProperLawAndOrder (talk) 01:52, 13 May 2020 (UTC)[reply]

Merging into newer IDs can happen endlessly because newer IDs can be created at any time, whereas older IDs cannot. This bad behavior can be seen in VIAF, ISNI, GND. Wikidata can show that it does better. MrProperLawAndOrder (talk) 02:20, 14 May 2020 (UTC)[reply]
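As an illustration of the rule under discussion, here is a minimal sketch of how a merge direction could be chosen from two item IDs. It assumes that a lower numeric Q-ID means an older item, which is how the proposal treats IDs; the function names `qid_number` and `pick_merge_direction` and the `prefer_older` flag are hypothetical and do not correspond to the actual merge gadget or Special:MergeItems code.

```python
import re
from typing import Tuple

# Item IDs look like "Q11765029"; the number after "Q" is compared numerically.
_QID = re.compile(r"^Q([1-9][0-9]*)$")


def qid_number(qid: str) -> int:
    """Return the numeric part of an item ID, e.g. 'Q42' -> 42."""
    match = _QID.match(qid)
    if match is None:
        raise ValueError(f"not an item ID: {qid!r}")
    return int(match.group(1))


def pick_merge_direction(a: str, b: str, prefer_older: bool = True) -> Tuple[str, str]:
    """Return (source, target) for a merge of items a and b.

    Assumption: the lower numeric ID is the older item.
    With prefer_older=True the newer item is merged into the older one,
    mirroring the "Always merge into the older entity" checkbox.
    With prefer_older=False the direction is reversed.
    """
    if qid_number(a) == qid_number(b):
        raise ValueError("cannot merge an item into itself")
    older, newer = sorted((a, b), key=qid_number)
    return (newer, older) if prefer_older else (older, newer)
```

Comparing the numeric part rather than the raw string matters because, as strings, "Q9" would sort after "Q10".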

Discussion

  • There are good reasons to keep the option to merge into the newer item, although this process should of course be the exception. There are cases where the newer item has a better definition, almost all sitelinks, and all backlinks. In that case, it does not help to merge into the older item, as all the backlinks would have to be corrected as well. ---MisterSynergy (talk) 10:55, 13 May 2020 (UTC)[reply]
    What backlinks? Bots fix redirects and sitelinks and as you say "better definitions" are transferred during merge. It completely goes against ID stability to merge to newer ones - it can happen endlessly because newer IDs can be created at any time. If someone created a duplicate then this should not be honored. On top of it, vandals can abuse it. How would one check whether a merge to a later ID was one under "good reasons"? The older an ID is, the more incoming external links it will have ceteris paribus. MrProperLawAndOrder (talk) 02:16, 14 May 2020 (UTC)
    Bot fixes are not ideal. For example if the newer ID item is used widely on a large number (hundreds?) of "large" items, those edits to fix the redirect can cause a backlog in WDQS synchronization. Also I don't know what you mean about "better definitions" being transferred during merge - if there's no definition already then yes the other definition is merged in, but if there's an existing definition then a possibly "better" one might be lost. Anyway, most of this is pretty exceptional and I do agree that in the vast majority of cases merging into the older entity is better. ArthurPSmith (talk) 17:39, 18 May 2020 (UTC)[reply]
  •  Support I agree with the reasons of the proposal. If not complete disabling, merging into newer entity should at least be more difficult (e.g. requiring two clicks), so that the user has to reflect if it is really the best solution and doesn't do it without much reflection. --Epìdosis 15:35, 18 May 2020 (UTC)[reply]
  •  Strong oppose In most cases it is sensible to merge into the older entity. But sometimes it can be counterproductive, and it is very important that we retain the ability to keep the newer ID. For example, Member of Parliament (Q16707842) (heavily linked and coded into many queries) was discovered to be the same as Member of Parliament (Q11765029) (not used at all except for one sitelink). A merge from the newer into the older item would have broken all the existing queries and reports (which would have had to be tracked down and fixed off-wiki), and would have meant that large numbers of links became redirects needing to be fixed.
By contrast, a merge from the older into the newer item had none of these negative impacts, but a hard "must always merge to the older ID" rule would have meant hours of completely unnecessary work to fix things, and links to queries would have stopped working indefinitely. I don't see the benefit to prohibiting it. Andrew Gray (talk) 20:06, 18 May 2020 (UTC)[reply]
Andrew Gray, where is the proof for "not used at all except for one sitelink"? Re "and meant that large numbers of links would become redirects needing fixed" - bots fix redirects. Re "which would have to be tracked down and fixed off-wiki" - how was that done for Q11765029? MrProperLawAndOrder (talk) 01:27, 13 June 2020 (UTC)[reply]
I have explained a problem that I personally worked on, and have cited the items involved. I have gone back to check and found that:
a) no other pages (save one maintenance page created on the same day) now link to that item - I missed that one, but it's fixed now
b) at the time I did not have to correct any inbound links to that item from other items, suggesting there were not any, and I am pretty sure I'd have checked. It had no metadata and only a Polish label and sitelink. It was as unused as a Wikidata item can be, pretty much.
In terms of broken queries, I was not aware of any queries using Q11765029, and as the item was not being used I cannot imagine there were any (what useful answers would they have returned?) I was, however, aware of a large number of queries using Q16707842 because I had written many of them and discussed them extensively with people off-wiki. Your proposal would mean that all those queries would be broken forever. You yourself said that the goals for this proposal are ID stability and not breaking queries - so here is an example of how it would go against stability and break queries. Andrew Gray (talk) 12:10, 13 June 2020 (UTC)[reply]
Andrew Gray, so, no proof no one had a query using the other. "so here is an example of how it would go against stability and break queries" - merging always breaks the queries that use the merge source. But the proposal is, that careful people, using the item with the lowest ID, the one that is the one longest in use, don't have their queries broken due to other people creating duplicates. If one can merge into a higher ID there is no limit, since new IDs are created all the time. Perfect target for vandals. But there is only one lowest ID. MrProperLawAndOrder (talk) 22:42, 15 June 2020 (UTC)[reply]
I have explained the problem twice already, and don't want to do it a third time. This RFC seems to be quite pointless now. No-one has given any new opinions on this RFC for two weeks, they have only responded to your increasingly aggressive comments, and you are now accusing other users who disagree with you of supporting vandalism. I will ask for it to be closed by a neutral third party. Andrew Gray (talk) 12:05, 16 June 2020 (UTC)[reply]
  •  Oppose for the same reasons. Suggest making the merge tool available only for autoconfirmed users. --SCIdude (talk) 08:54, 19 May 2020 (UTC)[reply]
    SCIdude you support vandals by giving them a chance to break queries by creating a higher ID item. MrProperLawAndOrder (talk) 22:42, 15 June 2020 (UTC)[reply]
    If queries can really be "broken" this way they should be fixed. --SCIdude (talk) 06:51, 16 June 2020 (UTC)[reply]
  •  Oppose Cannot be enforced, software we use is not just for Wikidata. Merging to the older entity is the default option in the gadget; merging to the newer entity using it is already "two-factor": navigate to the older entity AND untick the option. By the way, I am thinking about implementing a warning in the gadget when you try to redirect an entity which has many uses to an entity which for example has one link and a few or no statements. That there can be good reasons to merge to the newer entity is a really good reason to keep this possibility. --Matěj Suchánek (talk) 11:46, 20 May 2020 (UTC)[reply]
    Matěj Suchánek That warning sounds like a great idea! Andrew Gray (talk) 13:50, 20 May 2020 (UTC)[reply]
    Andrew Gray, but it doesn't prevent a user from doing it anyway, against WD consensus. And it may conflict with the proposer's own reasoning "software we use is not just for Wikidata". MrProperLawAndOrder (talk) 01:22, 13 June 2020 (UTC)[reply]
    Matěj Suchánek "software we use is not just for Wikidata" - so no software change proposal at all to be made to improve Wikidata, because of that? And are you just suggesting changing the consensus to merge into the older item via showing warnings to users? MrProperLawAndOrder (talk) 01:29, 13 June 2020 (UTC)[reply]
  •  Support. However, the ability to merge into newer entity should be allowed for users such as administrator, other can request it at page such as Wikidata:Requests for merging into newer entity. Hddty (talk) 14:02, 20 May 2020 (UTC)[reply]
    If most users can only merge into the older entity some will do that when it would be better not to. Peter James (talk) 07:50, 13 June 2020 (UTC)[reply]
  •  Oppose as a hard and fast rule, although I'd accept Hddty's idea of letting only admins etc. do that. Sometimes the old item is vacuous or nearly so, and the new item well fleshed out. - Jmabel (talk) 14:42, 20 May 2020 (UTC)[reply]
    @Jmabel: you support vandals by giving them a chance to break queries by creating a higher ID item. MrProperLawAndOrder (talk) 22:46, 15 June 2020 (UTC)[reply]
    • @MrProperLawAndOrder: Would you care to either expand on that charge or retract it? In particular, are you saying that the admins are likely to be vandals? And if not, precisely what "vandals" am I supporting, and how? - Jmabel (talk) 04:20, 16 June 2020 (UTC)[reply]
      @Jmabel: you opposed disabling the ability to merge into higher IDs. Vandals can create a duplicate under a higher ID and they or others merge into that one, breaking stability of the old IDs. RE "are you saying that the admins are likely to be vandals" - ??? Why do you ask this? MrProperLawAndOrder (talk) 04:41, 16 June 2020 (UTC)[reply]
      • @MrProperLawAndOrder: Apparently, you are not only being insulting, you didn't even bother reading what I wrote. I wrote that I oppose completely disabling the ability to merge into the higher ID, but have no problem with restricting it to admins, etc. If it were restricted to admins (and other similarly privileged users), just how would a non-admin vandal exploit this.
      • But, besides that: you view this as an anti-vandalism measure. Fine. But you are way out of line saying that those who disagree with you therefore "support vandals".
      • So: I'll give you another chance to retract your charge that I (and others: I see you said the same to someone else) "support vandals." - Jmabel (talk) 04:51, 16 June 2020 (UTC)[reply]
      RE "you didn't even bother reading what I wrote" - respect WD:NPA. stop claiming things about others that you have no proof for. You cannot know if I read it or not. And less so can you know if I bothered reading. You are way out of line with such claims.
      RE "you are not only being insulting" - respect WD:NPA. Please substantiate your charge that I was being insulting.
      RE "If it were restricted to admins (and other similarly privileged users), just how would a non-admin vandal exploit this." - by creating a higher ID item, and tricking a privileged user into merging the older ID item into the new one.
      RE "But you are way out of line saying that those who disagree with you therefore "support vandals"." - I wrote "you support vandals by giving them a chance to break queries by creating a higher ID item." - I substantiated it and confined it to a very specific case.
      RE "I see you said the same to someone else" - you mean "you support vandals by giving them a chance to break queries by creating a higher ID item."?
      IMO vandals should be deprived as far as possible of chances to disturb this WMF project. It's a moral obligation towards all the users that try to improve Wikidata and all the donors of money.
      Why would "you support vandals by giving them a chance to break queries by creating a higher ID item." or any other statement that I made be an insult? It was already written into my proposal text that I see the proposal as an anti-vandalism measure. If it truly would be an anti-vandalism measure - and since it restricts editing it must - then not implementing it means leaving this attack vector open for vandals. Anyway, I withdraw, since I have no single proof that a vandal is actively using this vector. If it is not used by a vandal, leaving it open cannot be a support of any vandal. MrProperLawAndOrder (talk) 21:27, 16 June 2020 (UTC)[reply]
  •  Oppose, but Special:MergeItems could be changed so the default is to merge to the older item, and a warning could be displayed when merging to a newer item. Peter James (talk) 20:23, 21 May 2020 (UTC)[reply]
    Peter James, where is the substantiation to that oppose vote? MrProperLawAndOrder (talk) 01:13, 13 June 2020 (UTC)[reply]
    The same reasons mentioned above - new items with more use (here and externally) or more accuracy - also that some items are used by the software and should not be deleted or redirected. Newer item isn't the same as new item that has just been created. Peter James (talk) 07:37, 13 June 2020 (UTC)[reply]
    @Peter James: you support vandals by giving them a chance to break queries by creating a higher ID item. What is the relevance of accuracy? How do you measure "use"+ "externally"? How will the merging person know the result of such measurement? How will it be ensured that the merging person merges in the item with more use? MrProperLawAndOrder (talk) 22:54, 15 June 2020 (UTC)[reply]
    It isn't usually vandalism; it is more likely to be editors who don't know this shouldn't usually be done and merge to the new item just because it has a better label or description. Also there are other ways vandals could merge items in ways that are not so easy to fix and in most cases these would not be prevented. An edit filter could warn editors when merging to items that have recently been created. Peter James (talk) 13:20, 16 June 2020 (UTC)[reply]
    Actually I have no proof any vandal is using the ability. So I withdraw my statement regarding vandal support. My error. MrProperLawAndOrder (talk) 21:29, 16 June 2020 (UTC)[reply]
  •  Oppose Wikidata has existed for more than 6 years, but there are still some items with low Q numbers and one sitelink to some small wiki, which can be merged with newer items with more sitelinks or wide usage. E.g. some important items are without sitelinks, but a corresponding article should exist in some wiki and have an item. JAn Dudík (talk) 07:08, 25 May 2020 (UTC)[reply]
    JAn Dudík, where is the logic in oppose? What does that have to do with 6 years etc? MrProperLawAndOrder (talk) 01:12, 13 June 2020 (UTC)[reply]
    @MrProperLawAndOrder: I meant, there are cases where it is better to merge to the newer item. So I oppose disallowing. JAn Dudík (talk) 19:27, 14 June 2020 (UTC)[reply]
    @JAn Dudík: you support vandals by giving them a chance to break queries by creating a higher ID item. MrProperLawAndOrder (talk) 22:46, 15 June 2020 (UTC)[reply]
  •  Support, per Hddty. --Slade (talk) 12:47, 25 May 2020 (UTC)[reply]
  •  Oppose both merging into newer and merging into older items can break some queries and cause instability when linking from external sources. While in most cases I would expect a higher chance of damage when merging into a newer item, that won't always be the case.
There's a lot that can theoretically be used for vandalism. If this feature gets actually used for vandalism, we should investigate that practical usage and think about how to reduce that vandalism but I don't think that judgement should be made based on theoretical possibility. We might for example have a page that lists merges that happen into the newer items and undo any vandalism that happens. ChristianKl 13:04, 16 June 2020 (UTC)[reply]
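The warning suggested above by Matěj Suchánek (warn when a heavily used entity is about to be redirected to an entity with, for example, one link and few or no statements) could be sketched roughly as follows. This is only an illustration under stated assumptions: the `ItemStats` record, the `should_warn` function and all threshold values are hypothetical, the counts in the example are invented, and nothing here reflects how the gadget is or would be implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemStats:
    """Hypothetical usage summary for one item; all fields are plain counts."""
    qid: str
    backlinks: int   # other pages or items linking to this item
    sitelinks: int   # linked wiki articles
    statements: int  # statements on the item


def should_warn(source: ItemStats, target: ItemStats,
                many_uses: int = 50,
                max_sitelinks: int = 1,
                max_statements: int = 3) -> bool:
    """Return True if redirecting `source` to `target` deserves a warning.

    The thresholds are illustrative assumptions, not agreed values:
    warn when the item that would become a redirect is heavily used
    while the surviving item is nearly empty.
    """
    source_heavily_used = source.backlinks >= many_uses
    target_nearly_empty = (target.sitelinks <= max_sitelinks
                           and target.statements <= max_statements)
    return source_heavily_used and target_nearly_empty


# Invented counts, loosely modelled on the Member of Parliament example above.
busy = ItemStats("Q16707842", backlinks=500, sitelinks=10, statements=12)
sparse = ItemStats("Q11765029", backlinks=0, sitelinks=1, statements=0)

print(should_warn(source=busy, target=sparse))   # True: busy item would become a redirect
print(should_warn(source=sparse, target=busy))   # False: sparse item becomes the redirect
```

A check like this only warns and does not block the merge, so it leaves the final decision with the editor, in line with the comments above that merging into the newer item should remain possible as an exception.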