Talk:Spam blacklist

From Meta, a Wikimedia project coordination wiki
This is an archived version of this page, as edited by Beetstra (talk | contribs) at 10:20, 3 June 2023 (→google.*/url: c). It may differ significantly from the current version.

Latest comment: 1 year ago by Beetstra in topic Discussion
Shortcut:
WM:SPAM
WM:SBL
The associated page is used by the MediaWiki Spam Blacklist extension, and lists regular expressions which cannot be used in URLs in any page in Wikimedia Foundation projects (as well as many external wikis). Any Meta administrator can edit the spam blacklist; either manually or with SBHandler. For more information on what the spam blacklist is for, and the processes used here, please see Spam blacklist/About.
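To illustrate how such a list of regular expressions blocks links, here is a minimal sketch in Python. The entry shown is invented for the example, and the real extension is written in PHP with more elaborate matching logic; this only demonstrates the basic idea of matching blacklist regexes against URLs.

```python
import re

# Hypothetical sketch: each line of the spam blacklist is a regular
# expression matched against every URL added to a page. The entry below
# is invented for illustration only.
BLACKLIST = [r"\bexample-spam\.com\b"]

def blocked(url: str) -> bool:
    """Return True if any blacklist entry matches the URL."""
    return any(re.search(entry, url) for entry in BLACKLIST)

print(blocked("http://www.example-spam.com/page"))  # True
print(blocked("http://example.org/"))               # False
```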

Proposed additions
Please provide evidence of spamming on several wikis. Spam that only affects a single project should go to that project's local blacklist. Exceptions include malicious domains and URL redirector/shortener services. Please follow this format. Please check back after submitting your report; there may be questions regarding your request.
Proposed removals
Please check our list of requests which repeatedly get declined. Typically, we do not remove domains from the spam blacklist in response to site-owners' requests. Instead, we de-blacklist sites when trusted, high-volume editors request the use of blacklisted links because of their value in support of our projects. Please consider whether requesting whitelisting on a specific wiki for a specific use is more appropriate - that is very often the case.
Other discussion
Troubleshooting and problems - If there is an error in the blacklist (e.g. a regex error) which is causing problems, please raise the issue here.
Discussion - Meta-discussion concerning the operation of the blacklist and related pages, and communication among the spam blacklist team.
#wikimedia-external-links - Real-time IRC chat for co-ordination of activities related to maintenance of the blacklist.
Whitelists
There is no global whitelist, so if you are seeking to whitelist a URL on a particular wiki, please address such matters via the respective MediaWiki talk:Spam-whitelist page at that wiki, and consider using the template {{edit protected}} or its local equivalent to draw attention to your edit.

Please sign your posts with ~~~~ after your comment. This leaves a signature and timestamp so conversations are easier to follow.


Completed requests are marked as {{added}}/{{removed}} or {{declined}}, and are generally archived quickly. Additions and removals are logged · current log 2024/07.

SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 7 days and sections whose most recent comment is older than 10 days.

Proposed additions

This section is for proposing that a website be blacklisted; add new entries at the bottom of the section, using the basic URL so that there is no link (example.com, not http://www.example.com). Provide links demonstrating widespread spamming by multiple users on multiple wikis. Completed requests will be marked as {{added}} or {{declined}} and archived.

psikologline.com



crosswiki linkspam [1][2] --Johannnes89 (talk) 06:00, 25 May 2023 (UTC)Reply

@Johannnes89: Added Added to Spam blacklist. --Johannnes89 (talk) 06:01, 25 May 2023 (UTC)Reply

Hair product spam





Xwiki ref spam. EstrellaSuecia (talk) 09:33, 26 May 2023 (UTC)Reply

@EstrellaSuecia: Added Added to Spam blacklist. --Johannnes89 (talk) 09:31, 27 May 2023 (UTC)Reply

acemedboards.com



Cross-wiki link spam by different accounts: [3], [4], [5], [6], [7], [8], [9]. ~StyyxTalk? 22:10, 26 May 2023 (UTC)Reply

@Styyx: Added Added to Spam blacklist. --Johannnes89 (talk) 09:31, 27 May 2023 (UTC)Reply

chatgptonline.co



Cross-wiki spam by IPs. Not a legit ChatGPT link. [10], [11], [12], [13]. ~StyyxTalk? 22:44, 28 May 2023 (UTC)Reply

@Styyx: Added Added to Spam blacklist. -- — billinghurst sDrewth 03:55, 29 May 2023 (UTC)Reply

udanarandka.com



xwiki link spam. Actioning this myself...making a section so I can use the script. Vermont (🐿️🏳️‍🌈) 02:37, 29 May 2023 (UTC)Reply

Done Vermont (🐿️🏳️‍🌈) 02:38, 29 May 2023 (UTC)Reply
@Vermont: Added Added to Spam blacklist. --Vermont (🐿️🏳️‍🌈) 02:42, 29 May 2023 (UTC)Reply

onlinebettingapps.org



spambotting  — billinghurst sDrewth 21:47, 29 May 2023 (UTC)Reply

@Billinghurst: Added Added as \bonlinebettingapps\.org\b to Spam blacklist. -- — billinghurst sDrewth 21:48, 29 May 2023 (UTC)Reply

chinafxj.cn



Spam links that clearly defame or attack race, group, or religious belief. --СлаваУкраїні! 23:48, 29 May 2023 (UTC)Reply

These websites are not obviously spam websites. And please don't forum-shop while these websites are currently being discussed at the zhwiki Village pump. Thanks for your understanding. SCP-2000 12:53, 2 June 2023 (UTC)Reply
@Fumikas Sagisavas:  Declined no obvious global misuse; set to be monitored for abuse, report being generated  — billinghurst sDrewth 02:26, 3 June 2023 (UTC)Reply

guangxifxj.cn



Spam links that clearly defame or attack race, group, or religious belief. --СлаваУкраїні! 23:48, 29 May 2023 (UTC)Reply

@Fumikas Sagisavas please provide links demonstrating widespread spamming by multiple users on multiple wikis. If it's just a zhwiki issue, please discuss this locally at zh:MediaWiki talk:Spam-blacklist. Johannnes89 (talk) 14:09, 2 June 2023 (UTC)Reply
@Fumikas Sagisavas:  Declined as no evidence of global misuse; noting that I have added the domain to be monitored and ordered a report to be generated  — billinghurst sDrewth 02:23, 3 June 2023 (UTC)Reply

kaiwind.com



Spam links that clearly defame or attack race, group, or religious belief. --СлаваУкраїні! 23:48, 29 May 2023 (UTC)Reply

@Fumikas Sagisavas kaiwind.com seems to be used across multiple projects, especially at zhwiki [14]. It seems like local blacklisting has been discussed at zhwiki multiple times and declined (most recently in 2021 [15][16]). Considering the global use of this link, I don't think your request can be granted. Johannnes89 (talk) 14:03, 2 June 2023 (UTC)Reply
@Johannnes89: The reason the zhwiki discussion about disabling this source ultimately failed was that User:MINQI used threats, intimidation and other means to prevent the administrators from adding it to the blacklist. The Chinese government conducts propaganda campaigns to spread a positive image of itself and has long threatened and intimidated Wikipedians who disagree with its political views. СлаваУкраїні! 14:20, 2 June 2023 (UTC)Reply
@Johannnes89: Sorry to bother you, but do you know what I could do to stop User:Fumikas Sagisavas's defamation of me?
According to his claim, I have attacked and harassed others by using warning templates or reporting someone's long-running edit war. Oh, I am even hired by kaiwind.com, the China Anti-Cult Association, the Publicity Department of the Chinese Communist Party… and do paid edits. He has done this many times on zh.wiki and meta.wiki. Does that mean I can file a global ban request against him because of his behavior? MINQI (talk) 14:42, 2 June 2023 (UTC)Reply
MINQI, this page exists to discuss whether certain links should be added to or removed from the spam blacklist. Your comment seems to be out of scope. Vermont (🐿️🏳️‍🌈) 17:40, 2 June 2023 (UTC)Reply
@Vermont, I know, but Fumikas_Sagisavas cites me as a reason that "kaiwind.com" should be on the spam blacklist, which is a defamation of me. I really do not know where or to whom I can report this problem on meta.wiki. MINQI (talk) 17:46, 2 June 2023 (UTC)Reply
@Fumikas Sagisavas:  Declined Other issues are not for this forum.  — billinghurst sDrewth 02:21, 3 June 2023 (UTC)Reply

shop links





xwiki linkspam, e.g. [17][18][19][20][21][22] --Johannnes89 (talk) 06:17, 2 June 2023 (UTC)Reply

@Johannnes89: Added Added to Spam blacklist. --Johannnes89 (talk) 06:18, 2 June 2023 (UTC)Reply

Proposed additions (Bot reported)

This section is for domains which have been added to multiple wikis as observed by a bot.

These are automated reports; please check the records and the link thoroughly, as the bot may report good links! For some more info, see Spam blacklist/Help#COIBot_reports. Reports will automatically be archived by the bot when they become stale (fewer than 5 links reported, none of which have been edited in the last 7 days, and where the last editor is COIBot).

Sysops
  • If the report contains links to fewer than 5 wikis, then only add it when it is really spam
  • Otherwise just revert the link additions and close the report; closed reports will be reopened if spamming continues
  • To close a report, change the LinkStatus template to closed ({{LinkStatus|closed}})
  • Please place any notes in the discussion section below the HTML comment

COIBot

The LinkWatchers report domains meeting the following criteria:

  • When a user mainly adds this link, the link has not been used much, and the user adds it to more than 2 wikis
  • When a user mainly adds links on one server, links on that server have not been used much, and the user adds them to more than 2 wikis
  • If ALL links are added by IPs and the link is added to more than 1 wiki
  • If a small range of IPs has a preference for this link (though it may also have been added by other users) and the link is added to more than 1 wiki.
COIBot's currently open XWiki reports
List Last update By Site IP R Last user Last link addition User Link User - Link User - Link - Wikis Link - Wikis
vrsystems.ru 2023-06-27 15:51:16 COIBot 195.24.68.17 192.36.57.94, 193.46.56.178, 194.71.126.227, 93.99.104.93 2070-01-01 05:00:00 4 4

Proposed removals

This section is for proposing that a website be unlisted; please add new entries at the bottom of the section. Use a suitable 3rd level heading and display the domain name as per this example {{LinkSummary|targetdomain.com}}.

Remember to provide the specific domain blacklisted, links to the articles they are used in or useful to, and arguments in favour of unlisting. Completed requests will be marked as {{removed}} or {{declined}} and archived.

See also recurring requests for repeatedly proposed (and refused) removals.

Notes:

  • The addition or removal of a domain from the blacklist is not a vote; please do not bold the first words in statements.
  • This page is for the removal of domains from the global blacklist, not for removal of domains from the blacklists of individual wikis. For those requests please take your discussion to the pertinent wiki, where such requests would be made at Mediawiki talk:Spam-blacklist at that wiki. Search spamlists — remember to enter any relevant language code

wikirank.net



Czeva approached me directly, stating she wanted to link Wikirank from cs:Wikipedie:Klub/České Budějovice/2023 to record a related discussion. However, this was not possible because the tool is listed on the blacklist. It was added based on [23] and "Cross-wiki refspamming by multiple throwaway socks", but I don't quite understand how e.g. wikirank.net/en/%C4%8Cesk%C3%A9%20Bud%C4%9Bjovice could be interpreted as refspamming. Can this be clarified? Ping JzG as the requestor and Beetstra as the admin performing the blacklisting. Thanks! --Martin Urbanec (talk) 08:00, 17 May 2023 (UTC)Reply

It was a spam campaign by a sockpuppeteer. See Talk:Spam_blacklist/Archives/2020-08. JzG (talk) 17:50, 17 May 2023 (UTC)Reply
I think Wikipedia has chosen 1000cs domains for internal linking to Wikipedia and they don't take domains other than their own. They instead spam blacklist any domain that they try to enter and they falsely claim to be free online encyclopedias and get donations so they can't be seen doing business. Sunosaw (talk) 18:11, 18 May 2023 (UTC)Reply
@Martin Urbanec it is not that the content of a link is refspamming; it is the collective behaviour of editors that we consider refspamming. Spamming is a behaviour that needs to be stopped; we don't blacklist material that many people consider spam if it has not been spammed. I see this is related to mdpi, which we have seen more often in campaigns solely aimed at putting their links wherever they want, and that is not how we build an encyclopedia. Dirk Beetstra T C (en: U, T) 04:17, 21 May 2023 (UTC)Reply

Comment Comment Typically we say to manage it locally first through whitelisting and we can see how it progresses. To me, this looks like a case where a subdomain of the whole may be whitelisted, or the regex for the specific allowed subdomain may suffice.  — billinghurst sDrewth 06:51, 19 May 2023 (UTC)Reply

@Martin Urbanec:  Declined  — billinghurst sDrewth 03:57, 29 May 2023 (UTC)Reply

Manning and Manning



In the w:en:Savile Row tailoring article, in the section "Other companies on Savile Row", there is a block on the website of Manning and Manning. M&M are bespoke tailors and it is a mystery why the website has been blocked - it is a normal business with a history of tailoring. I have not been able to find out when it was blacklisted. I am not concerned with the company, but I am interested in updating references for that article. Richard Nowell (talk) 12:37, 29 May 2023 (UTC)Reply

@Richard Nowell: Defer to w:en:Mediawiki talk:spam-blacklist. This is blocked only at English Wikipedia, by the regex \bmanning\.com\b; you will need to take it up with them.  — billinghurst sDrewth 22:14, 29 May 2023 (UTC)Reply
OK, thank you for your help :) Richard Nowell (talk) 06:46, 30 May 2023 (UTC)Reply
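The word-boundary behaviour of the en.wikipedia entry quoted above can be checked with a quick Python sketch (illustrative only; the live entry at English Wikipedia may since have changed):

```python
import re

# Quick check of the word-boundary regex \bmanning\.com\b mentioned above.
# \b only matches at a transition between word and non-word characters.
pattern = re.compile(r"\bmanning\.com\b")

print(bool(pattern.search("http://www.manning.com/")))    # True
print(bool(pattern.search("http://oldmanning.com/")))     # False: no boundary inside "oldmanning"
print(bool(pattern.search("http://manning.community/")))  # False: "com" runs on into "community"
```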

Troubleshooting and problems

This section is for comments related to problems with the blacklist (such as incorrect syntax or entries not being blocked), or problems saving a page because of a blacklisted link. This is not the section to request that an entry be unlisted (see Proposed removals above).

Discussion

This section is for discussion of Spam blacklist issues among other users.

google.*/url

Hi!
Currently \bgoogle\..*?/url\?.* blocks URLs such as

https://www.google.com/url?url=https%3A%2F%2Fexample.org.

IIRC, the reason was to avoid a circumvention of the SBL, e.g. if "example.org" was a blacklisted domain. But this bug seems to be fixed now.
So I am considering locally whitelisting the google.*/url URLs (and replacing them with the original, Google-free URL via a bot) in order to reduce the annoyance/frustration on the users' side.
What do you think? Does this make sense? -- seth (talk) 22:10, 24 May 2023 (UTC)Reply
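The blacklist entry and the proposed bot-side conversion can be sketched as follows. The helper below is hypothetical (the real bot's logic may differ), and the fallback to a `q` parameter is an assumption about Google's redirect URLs:

```python
import re
from urllib.parse import urlsplit, parse_qs

# Sketch of the blacklist entry under discussion and of the proposed
# conversion: replace a google.*/url?... redirect with its decoded target.
# Hypothetical helper; the production bot's logic may differ.
BLOCKED = re.compile(r"\bgoogle\..*?/url\?.*")

def google_free(url: str) -> str:
    """If url is a google.*/url redirect, return the decoded target URL."""
    if not BLOCKED.search(url):
        return url
    params = parse_qs(urlsplit(url).query)
    # The redirect target sits in the 'url' (sometimes 'q') query
    # parameter; parse_qs already percent-decodes it.
    target = params.get("url") or params.get("q")
    return target[0] if target else url

print(google_free("https://www.google.com/url?url=https%3A%2F%2Fexample.org"))
# https://example.org
```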

Comment Comment Why would we allow the redirects? We don't allow redirects in any other sense. If we allow these, what does that say about the existing policy approach? No to anyone else, but okay because it is Google? Shouldn't people learn to reference properly? Shouldn't Google not be rewarded for corrupting URLs?  — billinghurst sDrewth 22:43, 24 May 2023 (UTC)Reply
Hi!
  • "Why would we allow the redirects?"
It would just be a help for those who are overwhelmed by the SBL messages -- and unfortunately that seems to be quite a lot of people, although w:de:MediaWiki:Spamprotectiontext contains an explicit link to https://url-converter.toolforge.org/ for converting the URLs. For some users that's just too technical, and some give up on their edit without saving and without understanding the problem.
  • "We don't allow redirects in any other sense?"
At dewiki we already do this with other URLs. For example, f*c*book and yasni redirects and short URLs are converted.[24]
  • "If we allow these what does that say about the existing policy approach."
I don't think a real change of the policy is necessary for this. The redirects are still unwanted, but a bot handles the conversion (in order to eliminate them).
  • "No to anyone else, but okay b/c it is Google?"
It's not about Google only, but about frequently used redirects where the blocking leads to frustration for users who are not very technically experienced.
  • "Shouldn't people learn to reference properly?"
Indeed, that's a small disadvantage of soft-allowing those URLs. But the bot could additionally notify the users on their talk pages. The bot already does this, e.g. if anybody uses a file:/// link.[25] (See w:de:user:CamelBot/notice-local-file, in German.)
  • "Shouldn't Google not be rewarded for corrupting urls?"
Definitely those URLs should not remain in the articles, if only for data-protection reasons. The redirects would stay there only temporarily (~30 minutes), because a bot would clean the URLs soon afterwards.
-- seth (talk) 23:37, 24 May 2023 (UTC)Reply
@Lustiger seth: It seems that deWP is proactive and has a solution; they also have the means to fix the problem through whitelisting. If no one else has the fix in place, then until that occurs, I don't see that removal really helps. If someone comes forward with an xwiki bot, then cool, we can progress.  — billinghurst sDrewth 23:43, 28 May 2023 (UTC)Reply
@billinghurst: Maybe there was a misunderstanding. I did not want to request the global unblocking of the pattern, but only wanted to know whether anybody sees a problem with local whitelisting (at dewiki) with the mentioned workflow. -- seth (talk) 08:17, 29 May 2023 (UTC)Reply
Oh, apologies for misunderstanding. I am not sure that it really is our issue, and I hesitate to have an opinion about another wiki's approach to whitelisting, beyond saying that it is great when someone has the ability to be able to do a pilot scale test and tell us how it went.  — billinghurst sDrewth 12:22, 29 May 2023 (UTC)Reply
Ok, thanks for the clarification. (And sorry if I was unclear at the beginning.)
Let's hope that I am right in the sense that such links really cannot be used for circumventing the blacklist any longer. :-) -- seth (talk) 13:32, 29 May 2023 (UTC)Reply
@Lustiger seth: it has been my experience that full domains with http/s qualifiers mentioned in a URL redirect have been blocked for a while. I cannot say that this has been the case where they are just domain names in the remainder of the URI component. It has been my approach to watch the logs of Special:AbuseFilter/68 and use COIBot to track the targets, and occasionally blacklist them.  — billinghurst sDrewth 22:22, 29 May 2023 (UTC)Reply
I don't think that these links were blacklisted only because they can be used to circumvent the blacklist; they were also blacklisted because they are the core of SEO: clicking such a link shows Google that someone follows it, hence is interested, and hence it improves the target's Google ranking. Now I know that you have a bot removing them ASAP, but the bot then has to deal with the blacklisted targets (eradicate them?), and the bot (and subsequently an admin) has to deal with persistent editors (spammers) who use this to circumvent the blacklist or do SEO.
You also have to take into account possible bot downtime, in which case these links stay in place for a long time.
On the other hand, the problem is sizeable: on en.wiki over 10% of the hits are this URL …
Maybe we should entertain again the idea of a gadget that executes a script that rewrites certain things on a page before saving it. That could be used on these links (and e.g. youtu.be redirects (>25%)) to rewrite them before they get saved. Dirk Beetstra T C (en: U, T) 05:53, 3 June 2023 (UTC)Reply
Hi!
  • "but the bot then has to deal with the blacklisted targets [...]"
Is that so? I thought (see above) that you just cannot bypass the SBL with the Google links anymore.
  • "You also have to take into account possible bot downtime, in which case these links stay for long."
Indeed. But I guess that's a minor problem. The downtime of CamelBot is quite low, and it can cope with replication lag of the db.
  • "Maybe we should entertain the idea again of a gadget that executes a script that rewrites certain thing on a page before saving it."
I agree, that would be better. The thing is, in my experience it might take years for something like this to be rolled out.
-- seth (talk) 07:42, 3 June 2023 (UTC)Reply
Ah, yes, I see it now: the url parameter is now a full link in itself. It is the inverse of the problem that you cannot point to archive links of about.com pages. I agree with the other two points. —Dirk Beetstra T C (en: U, T) 10:20, 3 June 2023 (UTC)Reply
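The pre-save rewrite gadget discussed in this thread could work along these lines, sketched here for the youtu.be case. An actual gadget would be JavaScript running in the editor, and the 11-character video-id pattern and canonical target form are assumptions for illustration:

```python
import re

# Minimal sketch of the pre-save rewrite idea: expand youtu.be short links
# into canonical youtube.com/watch URLs before an edit is saved. (A real
# gadget would be JavaScript in the editor; the 11-character video-id
# pattern is an assumption.)
YOUTU_BE = re.compile(r"https?://youtu\.be/([\w-]{11})")

def expand_short_links(wikitext: str) -> str:
    """Rewrite youtu.be short links in wikitext to their full form."""
    return YOUTU_BE.sub(r"https://www.youtube.com/watch?v=\1", wikitext)

print(expand_short_links("Source: https://youtu.be/dQw4w9WgXcQ"))
# Source: https://www.youtube.com/watch?v=dQw4w9WgXcQ
```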