Talk:Spam blacklist

From Meta, a Wikimedia project coordination wiki
This is an archived version of this page, as edited by Beetstra (talk | contribs) at 04:25, 13 June 2023 (→‎Bots can now override the blacklist on Wikimedia projects?: reply). It may differ significantly from the current version.

Latest comment: 1 year ago by Beetstra in topic Discussion
Shortcut:
WM:SPAM
WM:SBL
The associated page is used by the MediaWiki Spam Blacklist extension, and lists regular expressions which cannot be used in URLs in any page in Wikimedia Foundation projects (as well as many external wikis). Any Meta administrator can edit the spam blacklist; either manually or with SBHandler. For more information on what the spam blacklist is for, and the processes used here, please see Spam blacklist/About.

Proposed additions
Please provide evidence of spamming on several wikis. Spam that only affects a single project should go to that project's local blacklist. Exceptions include malicious domains and URL redirector/shortener services. Please follow this format. Please check back after submitting your report, there could be questions regarding your request.
Proposed removals
Please check our list of requests which repeatedly get declined. Typically, we do not remove domains from the spam blacklist in response to site-owners' requests. Instead, we de-blacklist sites when trusted, high-volume editors request the use of blacklisted links because of their value in support of our projects. Please consider whether requesting whitelisting on a specific wiki for a specific use is more appropriate - that is very often the case.
Other discussion
Troubleshooting and problems - If there is an error in the blacklist (i.e. a regex error) which is causing problems, please raise the issue here.
Discussion - Meta-discussion concerning the operation of the blacklist and related pages, and communication among the spam blacklist team.
#wikimedia-external-links (connect) - Real-time IRC chat for co-ordination of activities related to maintenance of the blacklist.
Whitelists
There is no global whitelist, so if you are seeking a whitelisting of a url at a wiki then please address such matters via use of the respective Mediawiki talk:Spam-whitelist page at that wiki, and you should consider the use of the template {{edit protected}} or its local equivalent to get attention to your edit.

Please sign your posts with ~~~~ after your comment. This leaves a signature and timestamp so conversations are easier to follow.


Completed requests are marked as {{added}}/{{removed}} or {{declined}}, and are generally archived quickly. Additions and removals are logged · current log 2024/07.

SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 7 days and sections whose most recent comment is older than 10 days.

Proposed additions

This section is for proposing that a website be blacklisted; add new entries at the bottom of the section, using the basic URL so that there is no link (example.com, not http://www.example.com). Provide links demonstrating widespread spamming by multiple users on multiple wikis. Completed requests will be marked as {{added}} or {{declined}} and archived.

Yacht spam





Spam creations at ami:Marina, bm:Marina-ye, ch:Marina, chy:Marina, din:Marina, ee:Ƒudzimɔzɔƒe, pih:Marina, ss:Umarina and ti:ጃልባ-ምዝዋር. Note that I have deleted the pages (as they were machine translations of the same text), so only global sysops can verify the content of the pages. ~Styyx Talk? 21:57, 3 June 2023 (UTC)

@Styyx: Added to Spam blacklist. --Johannnes89 (talk) 07:33, 10 June 2023 (UTC)

apkmodct.com



Xwiki refspam. EstrellaSuecia (talk) 08:52, 4 June 2023 (UTC)

@EstrellaSuecia: Added to Spam blacklist. -- billinghurst sDrewth 11:05, 5 June 2023 (UTC)

idealecasinos.nl etc.







Cross-wiki spamming, see [1][2][3][4].

Further spamming is documented at w:Wikipedia:Sockpuppet investigations/Foxy Spanks and Talk:Wikiproject:Antispam#AvaTrade, Bitcasino, etc. MER-C 11:13, 4 June 2023 (UTC)

@MER-C: Added to Spam blacklist. -- billinghurst sDrewth 06:06, 5 June 2023 (UTC)
@Billinghurst the links were not added to the SBL, was that intended or a mistake? Johannnes89 (talk) 06:06, 11 June 2023 (UTC)
@MER-C: Added to Spam blacklist. <shrug> what happened -- billinghurst sDrewth 08:12, 11 June 2023 (UTC)

avada.io





See this on enwiki: https://en.wikipedia.org/wiki/Special:Contributions/Tiachopxanh33 LuGusDeclanBibaElodieBarnaby 03:01, 8 June 2023 (UTC)

Recent xwiki linkspam, e.g. [5][6][7][8][9][10] --Johannnes89 (talk) 07:23, 10 June 2023 (UTC)
@LuGusDeclanBibaElodieBarnaby: Added to Spam blacklist. --Johannnes89 (talk) 07:24, 10 June 2023 (UTC)

ilovemarketing.in





See https://en.wikipedia.org/wiki/Special:Contributions/Ilovemarketing LuGusDeclanBibaElodieBarnaby 05:44, 8 June 2023 (UTC)Reply

@LuGusDeclanBibaElodieBarnaby: Declined. The additions seem limited to a single wiki and have stopped, so a global blacklisting is premature. -- billinghurst sDrewth 04:34, 9 June 2023 (UTC)

Nanașu47, Dellmatron et. al

Some of the reports are missing data - the key users are Nanașu47 and Dellmatron. See Talk:Wikiproject:Antispam#Nanașu47, Dellmatron et. al. MER-C 10:27, 11 June 2023 (UTC)Reply

Proposed additions (Bot reported)

This section is for domains which have been added to multiple wikis as observed by a bot.

These are automated reports. Please check the records and the link thoroughly, as the bot may report good links! For some more info, see Spam blacklist/Help#COIBot_reports. Reports will automatically be archived by the bot when they get stale (fewer than 5 links reported, which have not been edited in the last 7 days, and where the last editor is COIBot).

Sysops
  • If the report contains links to fewer than 5 wikis, then only add it when it is really spam
  • Otherwise just revert the link-additions, and close the report; closed reports will be reopened when spamming continues
  • To close a report, change the LinkStatus template to closed ({{LinkStatus|closed}})
  • Please place any notes in the discussion section below the HTML comment

COIBot

The LinkWatchers report domains meeting the following criteria:

  • When a user mainly adds this link, and the link has not been used too much, and this user adds the link to more than 2 wikis
  • When a user mainly adds links on one server, and links on the server have not been used too much, and this user adds the links to more than 2 wikis
  • If ALL links are added by IPs, and the link is added to more than 1 wiki
  • If a small range of IPs have a preference for this link (but it may also have been added by other users), and the link is added to more than 1 wiki.
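The first and third criteria above can be sketched as a simple predicate. This is only an illustration, not COIBot's or the LinkWatchers' actual logic: the `LinkStats` fields and the `MAINLY` and `TOO_MUCH` thresholds are hypothetical, since the criteria only say "mainly" and "not been used too much" without giving numbers.

```python
from dataclasses import dataclass


@dataclass
class LinkStats:
    """Hypothetical per-link statistics; field names are assumptions."""
    user_link_share: float   # fraction of the user's link additions using this link
    total_uses: int          # how often the link has been added overall
    wikis_by_user: int       # number of wikis where this user added the link
    wikis_total: int         # number of wikis where anyone added the link
    all_added_by_ips: bool   # True if every addition came from an IP editor


# Hypothetical thresholds: the criteria above give no concrete values.
MAINLY = 0.5
TOO_MUCH = 100


def should_report(s: LinkStats) -> bool:
    # Criterion 1: a user mainly adds this link, it is not widely used,
    # and the user added it to more than 2 wikis.
    if s.user_link_share > MAINLY and s.total_uses < TOO_MUCH and s.wikis_by_user > 2:
        return True
    # Criterion 3: all additions are by IPs, on more than 1 wiki.
    if s.all_added_by_ips and s.wikis_total > 1:
        return True
    return False
```

The server-based and IP-range criteria would need additional inputs (per-server counts, IP range grouping) and are left out of this sketch.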
COIBot's currently open XWiki reports
List Last update By Site IP R Last user Last link addition User Link User - Link User - Link - Wikis Link - Wikis
vrsystems.ru 2023-06-27 15:51:16 COIBot 195.24.68.17 192.36.57.94, 193.46.56.178, 194.71.126.227, 93.99.104.93 2070-01-01 05:00:00 4 4

Proposed removals

This section is for proposing that a website be unlisted; please add new entries at the bottom of the section. Use a suitable 3rd level heading and display the domain name as per this example {{LinkSummary|targetdomain.com}}.

Remember to provide the specific domain blacklisted, links to the articles they are used in or useful to, and arguments in favour of unlisting. Completed requests will be marked as {{removed}} or {{declined}} and archived.

See also recurring requests for repeatedly proposed (and refused) removals.

Notes:

  • The addition or removal of a domain from the blacklist is not a vote; please do not bold the first words in statements.
  • This page is for the removal of domains from the global blacklist, not for removal of domains from the blacklists of individual wikis. For those requests please take your discussion to the pertinent wiki, where such requests would be made at Mediawiki talk:Spam-blacklist at that wiki. Search spamlists — remember to enter any relevant language code

language.metaproject.frl



I have been an occasional contributor to Wikipedia for many years, usually just fixing typos and updating links for projects I follow. I've never had problems before, but now I have triggered a blacklisting of this link, and I feel in over my head. I'm not versed in larger Wikipedia editing, but this time I added a link to a programming language in the lists of other programming languages that have influenced this language.

When I update links, I am used to doing it for all the human languages a lemma has, because it is factual information that I can edit without being a native speaker. This time, however, I saw that several entries were removed, based on triggering a spam detection bot, so I discussed it with the editor:

https://de.wikipedia.org/wiki/Benutzer_Diskussion:מקף#Your_removal_of_the_Meta_programming_language_as_influenced_by_the_Logo_programming_language

I was helpfully advised to enter the links as a footnote in a reference. Today, I wanted to do that, but found the link to be blacklisted. Following the references in the info, I had this discussion:

https://en.wikipedia.org/wiki/MediaWiki_talk:Spam-whitelist

Finally, I was referred here.

I can understand that Wikipedia prefers to have links in specific places, I just needed to learn how to do that. However, Wikipedia format allowed the links there, so I didn't expect anything like a global blacklisting.

I can understand that a spam detection would see my pattern as suspect, however, it is relevant encyclopedia information. Programming languages are usually influenced by multiple other programming languages, and Wikipedia tracks this, in a formalised way in the info box. Just a reference is the same in all human languages, so it seems useful to me to enter it in all translations of a lemma.

Notability wasn't an issue in the first advice I got, and I didn't add a lemma, just entries of influence. The Meta programming language is certainly notable in the REBOL community as a modern successor:

https://en.wikipedia.org/wiki/Rebol

Several other descended programming languages are mentioned there that don't have a lemma, or not even a website anymore.

The Meta programming language is more like Logo than REBOL was:

https://en.wikipedia.org/wiki/Logo_(programming_language)

The lemma on Logo lists several other influenced programming languages or even just programming environments without a lemma, so it seemed okay to me to add the Meta programming language.

Further, the same site hosts other projects that have had lemmas and links on Wikipedia for many years:

https://en.wikipedia.org/wiki/Syllable_Desktop

https://syllable.metaproject.frl

https://de.wikipedia.org/wiki/AtheOS

https://atheos.metaproject.frl

Even if the Meta programming language is not notable enough now for influence entries, it is likely to become notable in a few years, and it is relevant now in the lemmas related to REBOL. I am sorry I did not foresee the global blacklisting, but I don't think it is warranted.

I would like to know how to proceed. If the blacklisting is lifted, is it okay if I follow the advice I was given first, to enter links as references with footnotes, or will that also trigger spam detection? If not okay, I will refrain from adding links.

Is it okay if I add mentions as a non-existing future lemma? — The preceding unsigned comment was added by 2a02:a420:67:84e3:b4dd:9fff:fe0f:9cb5 (talk)

Comment Note that this is the IP range that canvassed this link across wikis which resulted in the blacklisting. Ohnoitsjamie (talk) 19:42, 5 June 2023 (UTC)Reply
Yes, I explain above that I made the edits and why they are filling in encyclopedic information that Wikipedia routinely tracks in articles on programming languages. 2A02:A420:67:84E3:B4DD:9FFF:FE0F:9CB5 21:01, 5 June 2023 (UTC)Reply
There's no indication that your project meets notability criteria, and links to it certainly don't need to be spammed across all of our projects. Ohnoitsjamie (talk) 22:20, 5 June 2023 (UTC)Reply
It's not my project, and as I explained, I understand that a different form is required. I would like to home in on what form is acceptable. I am not deeply familiar with Wikipedia's inner workings and didn't expect this reaction. 2A02:A420:67:84E3:B4DD:9FFF:FE0F:9CB5 22:29, 5 June 2023 (UTC)Reply

Comment: As the person who added it to the blacklist, here is some advice. I would suggest that you work with a community to learn how to edit appropriately, especially when it comes to adding links. Your linking in that format is just link spam and does not clearly add value to the projects or to the articles. Pick one language project and work with them, as your additions across many languages look like typical vested-interest editing. When working with that language wiki, they can locally whitelist the domain and see how it can be a useful contribution.

If you are going to be working tactically on such a project having an account where someone can contact you and have conversations is important. When we face a range of spam-like edits, we have no point of contact to have a conversation to ask what is happening, whether you have a conflict of interest, etc. So in the end we fall to the binary decision of let it happen or force a discussion in a place like this. Here we are.  — billinghurst sDrewth 22:38, 5 June 2023 (UTC)Reply

Further, there may be more use in looking to populate wikidata if there is a standard reference that is occurring here, rather than the additions that took place.  — billinghurst sDrewth 22:40, 5 June 2023 (UTC)Reply
I always see isolated references when I look at article sources. How would a wikidata reference work? 2A02:A420:67:84E3:B4DD:9FFF:FE0F:9CB5 23:02, 5 June 2023 (UTC)Reply
Thank you for your reply. That's why I am monitoring what happens to my edits and discussing it here.
The way I have been contributing to Wikipedia for about two decades is to look at what is already there, and correct or extend that example. Concretely, in this list of projects descended from REBOL:
https://en.wikipedia.org/wiki/Rebol#Legacy
There is an entry with a direct external link, and an entry without any reference, with a website that's gone. Can you tell me what the criterion is for allowing those, but removing my addition?
Would it be acceptable if I make the addition without a link? (It seemed to me that a reference would be better than none.) 2A02:A420:67:84E3:B4DD:9FFF:FE0F:9CB5 22:55, 5 June 2023 (UTC)
We were collectively reviewing the link addition, not reviewing articles, so they were reset to the status quo prior to editing. If you have specific issues with articles, please address them on local wikis.  — billinghurst sDrewth 06:44, 6 June 2023 (UTC)Reply

greenbalance.se



I hope this message finds you well. My name is Christopher and I am representing Greenbalance.se. Our domain was recently placed on the Wikimedia Spam Blacklist, which we believe happened due to the unscrupulous actions of a third-party SEO agency. We are writing to humbly request the removal of our domain from the blacklist.

We regret and apologize for the inappropriate usage of our domain by the SEO company, which was done without our knowledge or consent. We recognize the importance of the integrity of Wikimedia and its content and we respect the measures that you have put in place to protect it. We want to assure you that since discovering this issue, we have terminated our relationship with the said SEO agency. Moreover, we have conducted an internal audit and implemented new policies to prevent such incidents in the future. These actions include stricter vetting of third-party services, as well as enhanced monitoring of all activities pertaining to our brand's online presence.

Our domain, Greenbalance.se, serves as a comprehensive platform focused on the enlightening subjects of environmental wellness and consciousness, with a specialized focus on the field of essential oils and cannabinoids. Many articles of which are very brief (even marked as snippets on the Swedish version of Wikipedia in which we operate). Our commitment to these areas of research is evident in the breadth and depth of content we regularly publish. All these articles, carefully researched and meticulously presented, showcase our commitment to enhancing knowledge around these important topics. We believe this rich content could greatly contribute to and enhance various Wikimedia projects, particularly those related to holistic wellness, botanical sciences, and environmental sustainability.

Thank you for your understanding and consideration. We look forward to a positive resolution of this issue and we are ready to take any further steps required.
Snuffleupagus33 (talk) 10:35, 6 June 2023 (UTC)

@Snuffleupagus33:  Declined per Spam blacklist/About#Requests for delisting we „de-blacklist sites when trusted, high-volume editors request the use of blacklisted links because of their value in support of our projects“ and typically „do not remove domains from the spam blacklist in response to site-owners' requests“. There was significant crosswiki spam [11] and your website doesn't seem to add value to Wikipedia. No matter if it was you or another company spamming these links, they are unwanted and therefore blacklisted (if local communities decide to use the link, it could still be locally whitelisted). --Johannnes89 (talk) 15:40, 6 June 2023 (UTC)Reply
Not certain how or why you are even here Snuffleupagus33. This is your first edit, and you are here to talk about our spamblacklist??? This just sounds like a total conflict of interest in your approach, esp. to how we manage our sites with regard to problematic editing. It would be worthwhile you reading some COI guidance on our sites.

FWIW these are our internally managed blacklists for our internal use, and none of our sites appear to use your domain as a reliable reference, and yet here you are. How does that look? <shrug>  — billinghurst sDrewth 03:21, 7 June 2023 (UTC)Reply

Troubleshooting and problems

This section is for comments related to problems with the blacklist (such as incorrect syntax or entries not being blocked), or problems saving a page because of a blacklisted link. This is not the section to request that an entry be unlisted (see Proposed removals above).

Discussion

This section is for discussion of Spam blacklist issues among other users.

google.*/url

Hi!
Currently \bgoogle\..*?/url\?.* blocks URLs such as

https://www.google.com/url?url=https%3A%2F%2Fexample.org.

IIRC, the reason was to avoid a circumvention of the SBL, e.g. if "example.org" was a blacklisted domain. But this bug seems to be fixed now.
So I am considering locally whitelisting the google.*/url URLs (and replacing the results with the original, Google-free URL via a bot) in order to reduce the annoyance/frustration on the users' side.
What do you think? Does this make sense? -- seth (talk) 22:10, 24 May 2023 (UTC)Reply
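As an illustration of what the pattern quoted above catches, here is a minimal sketch using Python's `re` module. The blacklist itself is evaluated by the MediaWiki extension (PHP/PCRE), so this only approximates its behaviour; the helper name `is_blocked` is made up for the example.

```python
import re

# The pattern as quoted in the comment above. The real blacklist runs under
# PHP/PCRE; Python's re is used here only to illustrate the match.
GOOGLE_REDIRECT = re.compile(r"\bgoogle\..*?/url\?.*")


def is_blocked(url: str) -> bool:
    """Return True if the pattern matches anywhere in the URL."""
    return GOOGLE_REDIRECT.search(url) is not None


# The redirect URL from the comment matches the pattern.
print(is_blocked("https://www.google.com/url?url=https%3A%2F%2Fexample.org"))
# An ordinary Google search URL has no "/url?" part and does not match.
print(is_blocked("https://www.google.com/search?q=example"))
```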

Comment Comment Why would we allow the redirects? We don't allow redirects in any other sense? If we allow these what does that say about the existing policy approach. No to anyone else, but okay b/c it is Google? Shouldn't people learn to reference properly? Shouldn't Google not be rewarded for corrupting urls?  — billinghurst sDrewth 22:43, 24 May 2023 (UTC)Reply
Hi!
  • "Why would we allow the redirects?"
It would just be a help for those who are overwhelmed by the SBL messages -- and unfortunately that seems to be quite a lot of people, although w:de:MediaWiki:Spamprotectiontext contains an explicit link to https://url-converter.toolforge.org/ for converting the URLs. For some users that's just too technical, and some give up their editing without saving and without understanding the problem.
  • "We don't allow redirects in any other sense?"
At dewiki we do this already with other URLs. For example f*c*book and yasni redirs and short urls are converted.[12]
  • "If we allow these what does that say about the existing policy approach."
I don't think a real change of the policy is necessary for this. The redirects are still unwanted, but a bot copes with the conversion (in order to eliminate them).
  • "No to anyone else, but okay b/c it is Google?"
It's not about only Google, but about often used redirects where the blocking leads to frustration for users that are not so technically experienced.
  • "Shouldn't people learn to reference properly?"
Indeed, that's a small disadvantage of soft-allowing those URLs. But the bot could additionally notify the users on their talk pages. The bot does this already e.g. if anybody uses a file:/// link.[13] (See w:de:user:CamelBot/notice-local-file, German)
  • "Shouldn't Google not be rewarded for corrupting urls?"
Definitely those URLs should not remain in the articles, if only because of data protection. The redirects would stay there only temporarily (~30 minutes), because a bot would clean the URLs soon afterwards.
-- seth (talk) 23:37, 24 May 2023 (UTC)Reply
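The conversion step described above (replacing the redirect with the original, Google-free URL) could look roughly like the following. This is a hedged sketch, not the actual code of CamelBot or the url-converter tool: the function name, the host check, and the assumption that the target may also arrive in a `q` parameter are all illustrative.

```python
from urllib.parse import parse_qs, urlsplit


def unwrap_google_redirect(url: str) -> str:
    """Return the target of a google.*/url redirect, or the URL unchanged."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Hypothetical host check: any host label equal to "google" counts.
    if "google" not in host.split(".") or parts.path != "/url":
        return url
    query = parse_qs(parts.query)  # parse_qs percent-decodes the values
    # "url" is the parameter in the example above; "q" is an assumed variant.
    for key in ("url", "q"):
        values = query.get(key)
        if values and values[0].startswith(("http://", "https://")):
            return values[0]
    return url


print(unwrap_google_redirect(
    "https://www.google.com/url?url=https%3A%2F%2Fexample.org"
))
```

A real bot would still need to check the unwrapped target against the blacklist before saving it.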
@Lustiger seth: Seems that deWP is proactive and has a solution, they also have the means to fix the problem through a whitelisting. If no one else has the fix in place, then until that occurs, I don't see that removal really helps. If someone comes forward with a xwiki bot, then cool, we can progress.  — billinghurst sDrewth 23:43, 28 May 2023 (UTC)Reply
@billinghurst: Maybe there was a misunderstanding. I did not want to request the global unblocking of the pattern, but only wanted to know whether anybody sees a problem in local whitelisting (at dewiki) with the mentioned workflow. -- seth (talk) 08:17, 29 May 2023 (UTC)Reply
Oh, apologies for misunderstanding. I am not sure that it really is our issue, and I hesitate to have an opinion about another wiki's approach to whitelisting, beyond saying that it is great when someone has the ability to be able to do a pilot scale test and tell us how it went.  — billinghurst sDrewth 12:22, 29 May 2023 (UTC)Reply
Ok, thanks for clarification. (And sorry, if i was unclear at the beginning.)
Let's hope that I am right in the sense that such links really cannot be used for circumventing the blacklist any longer. :-) -- seth (talk) 13:32, 29 May 2023 (UTC)Reply
@Lustiger seth: It has been my experience that full domains with http/s qualifiers mentioned in a URL redirect have been blocked for a while. I cannot say that this has been the case where only the domain names appear in the remainder of the URI component. My approach has been to watch the logs of Special:AbuseFilter/68 and use COIBot to track the targets, and occasionally blacklist them. -- billinghurst sDrewth 22:22, 29 May 2023 (UTC)
I don't think that these links were blacklisted only because they can be used to circumvent the blacklist. They were blacklisted because they are the core of SEO: clicking such a link shows Google that someone follows it and hence is interested, which improves the target's Google ranking. Now I know that you have a bot removing them ASAP, but the bot then has to deal with the blacklisted targets (eradicate them?), and the bot (and subsequently an admin) has to deal with persistent editors (spammers) who use this to circumvent/SEO.
You also have to take into account possible bot downtime, in which case these links stay for long.
On the other hand, the problem is sizeable: on en.wiki over 10% of the hits are this URL …
Maybe we should entertain the idea again of a gadget that executes a script that rewrites certain thing on a page before saving it. That could be used on these links (and e.g. youtu.be redirects (>25%)) to rewrite them before they get saved. Dirk Beetstra T C (en: U, T) 05:53, 3 June 2023 (UTC)Reply
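The rewrite-before-save idea can be sketched for the youtu.be case. A real gadget would run as JavaScript in the editor; Python is used here only to show the text transformation. The function name, the allowed video ID characters, and the handling of extra query parameters are assumptions for illustration.

```python
import re

# Short link such as https://youtu.be/VIDEOID, optionally followed by a query
# string (for example ?t=30). The allowed ID characters are an assumption.
YOUTU_BE = re.compile(r"https?://youtu\.be/([A-Za-z0-9_-]+)(?:\?([^\s\]|<>]*))?")


def _expand(match: re.Match) -> str:
    url = "https://www.youtube.com/watch?v=" + match.group(1)
    if match.group(2):
        # Carry over any extra parameters, e.g. a timestamp.
        url += "&" + match.group(2)
    return url


def expand_short_links(text: str) -> str:
    """Rewrite youtu.be short links in page text to full YouTube URLs."""
    return YOUTU_BE.sub(_expand, text)


print(expand_short_links("See https://youtu.be/abc123?t=30 for details."))
```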
Hi!
  • "but the bot then has to deal with the blacklisted targets [...]"
Is that so? I thought (see above) that you just can not (anymore) bypass the SBL with the Google links.
  • "You also have to take into account possible bot downtime, in which case these links stay for long."
Indeed. But I guess, that's a minor problem. The downtime of CamelBot is quite low and it can cope with replags of the db.
  • "Maybe we should entertain the idea again of a gadget that executes a script that rewrites certain thing on a page before saving it."
I agree, that would be better. The thing is, in my experience, it might take years for something like this to be rolled out.
-- seth (talk) 07:42, 3 June 2023 (UTC)Reply
Ah, yes, I see now, the url parameter is now a full link in itself. It is the inverse of the problem that you cannot point to archive links of about.com pages. I agree with the other two points. —Dirk Beetstra T C (en: U, T) 10:20, 3 June 2023 (UTC)Reply

FYI: Replacing the SBL with Special:BlockedExternalDomains

Tracked in Phabricator:
Task T337431

There is work in progress in phab:T337431 which is thought to eventually lead to the replacement of the SBL with just an exclusion list of domains. Anything regexp-based would need to be handled in abuse filters. --Count Count (talk) 11:03, 6 June 2023 (UTC)Reply

how can i unblock my website

how can i unblock my website in wikipedia Mj985 (talk) 13:18, 7 June 2023 (UTC)Reply

Mj985: If you are asking that question here, then you are in the wrong place, and you are probably doomed to failure. Please ask the question at the Wikipedia where you are trying to add a URL. Also, and more importantly, please read pages about "paid contributors", "conflict of interest editing" and general editing. -- billinghurst sDrewth 23:58, 7 June 2023 (UTC)

Bots can now override the blacklist on Wikimedia projects?

You may have already seen this MediaWiki change but in case not, see:

I think this is a bad idea. --A. B. (talkcontribsglobal count) 02:25, 11 June 2023 (UTC)Reply

I replied there, this is a very bad idea. @Lustiger seth: however, if implemented correctly, it could help with the redirect issue discussed above, give users with given rights the right to override SBL rules for redirects and have a bot replace them with expanded links (and there weed out blacklisted domains as redirect targets). LiWa3/XLinkBot can already detect redirect sites and revert based on that. Dirk Beetstra T C (en: U, T) 04:25, 13 June 2023 (UTC)Reply

xoso.cc



Spam backlink, cross-wiki spam [14] TuanUt (talk) 03:33, 13 June 2023 (UTC)Reply