Revision as of 04:16, 14 June 2023

Welcome here! The Security team is calling for feedback on a proposed Third-party resources policy from June 05 to July 17, 2023. Your suggestions and comments are warmly encouraged below.

05 June 2023: Start of the policy conversation

Hello, feedback regarding the Third-party resources policy is welcome below this message. Feel free to use the initial questions below as a starting point for the conversation or bring your own. Thank you!

On behalf of the Wikimedia Foundation’s Security team — Samuel (WMF) (talk) 00:50, 5 June 2023 (UTC)

Are risks sufficiently explained and relevant?

A user with less security background probably cannot understand the scope of the risks from these explanations alone (for example, it is not obvious what an attacker can do with harvested cookies and session tokens). I suggest adding links to somewhere where the concepts can be discussed a bit more in depth (perhaps a tailored set of security pages). It could also be worth pointing out that anything the owner of the script can do can be done by an attacker who succeeds in bypassing their defences – a benevolent owner may still use naïvely coded building blocks for their own scripts or otherwise neglect security.

The section on user privacy and safety does not differentiate between what a tool does and what it could do when interacting with a third-party resource. I think the section should concentrate on information that leaks regardless of how the tool is coded, as that's the (implicit) rationale for the proposal. Other leakage can be mentioned, but as this may include anything that the third party could do on a direct connection, it doesn't need to be thoroughly covered here. The latter theme could be covered on a page on WWW safety.

–LPfi (talk) 14:25, 5 June 2023 (UTC)

Hi @LPfi: and thanks for these suggestions. I am taking good note of your point about including explanatory resources for less security-savvy readers and illustrating how code used "naïvely" can lead to security issues. On the other hand, the portion about leakage is not that clear to me. When you suggest that the section should emphasize "information that leaks regardless of how the tool is coded", do you mean that the content should describe what personal information could be leaked or are you suggesting something else? — Samuel (WMF) (talk) 17:41, 5 June 2023 (UTC)

My point is that the gadget forwarding session tokens, user names and the like is likely pure bad programming, while you cannot avoid sharing your IP address, operating system etc. if the gadget makes your browser connect to a third-party resource (previously stored third-party cookies are also an issue, I don't know how those typically are or could be handled). The claim that "scripts connecting to third-party resources may also share [...] the device they are using, their browser information, and location" seems wrong. Scripts run in one's browser usually cannot connect to the third-party resource other than directly, which means these titbits are shared automatically, regardless of how the script is coded (I assume also one's location is automatically shared through the IP in most cases).

Now, those facts cleared, what are the consequences? Is the device fingerprint enough to pair you with a user account on Google or Facebook? Does Google get that fingerprint through common designs of such resource interfaces? Does that mean that Google can (and does?) get to know your Wikimedia username by comparing your interaction with third-party resources to actions at Wikimedia sites? More generally: what should the user be afraid of when using such third-party resources (knowingly or unknowingly)? (Few sites have enough info on their own that leaked IP and device fingerprint would be an issue.)

–LPfi (talk) 08:41, 6 June 2023 (UTC)
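To make the point about automatic sharing concrete, here is a minimal sketch of what a third-party server could read from a single request before the calling script has deliberately sent anything. It assumes a WSGI-style request dictionary; the function name `automatically_exposed` and the sample values are hypothetical, and whether a referrer is present depends on the browser's referrer policy.

```python
def automatically_exposed(environ):
    """Return what a web server learns from one request,
    regardless of how the calling script was written."""
    return {
        "ip_address": environ.get("REMOTE_ADDR"),
        "user_agent": environ.get("HTTP_USER_AGENT"),
        "languages": environ.get("HTTP_ACCEPT_LANGUAGE"),
        "referrer": environ.get("HTTP_REFERER"),
    }


# Hypothetical request as seen by a third-party server (illustrative values).
sample_request = {
    "REMOTE_ADDR": "203.0.113.7",
    "HTTP_USER_AGENT": "ExampleBrowser/1.0 (ExampleOS)",
    "HTTP_ACCEPT_LANGUAGE": "fi,en;q=0.8",
    "HTTP_REFERER": "https://en.wikipedia.org/",
}

exposed = automatically_exposed(sample_request)
for field, value in exposed.items():
    print(f"{field}: {value}")
```

None of these fields requires the gadget to forward anything on purpose; they arrive with the connection itself, which is the distinction drawn above between unavoidable leakage and bad programming.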

@LPfi: many thanks for taking the time to illustrate your previous comment. I think those are good ideas for making that section of the policy much easier to understand and informative. — Samuel (WMF) (talk) 20:44, 6 June 2023 (UTC)

Do the definitions and required precautions make sense?

It seems the proposed policy would require all interaction with third-party resources to be done through interfaces offered by WMF production servers. For it not to cause disruption, these need to be offered. My understanding is that OSM tiles are nowadays fetched and cached by these. Are there other resources needing similar treatment?

For the restrictions to make sense, WMF needs to guarantee privacy when using production servers or possible external resources needed e.g. for participating in elections and surveys. I am afraid WMF shouldn't trust third parties in such matters, but instead guarantee that no sensitive information is sent to third parties.

–LPfi (talk) 14:25, 5 June 2023 (UTC)

@LPfi: I hear your point and I agree that, generally speaking, safeguards should be in place to ensure the privacy of users. Regarding OSM, I'd like to note that the scope of this policy is purposely limited to User Scripts and Gadgets that load third-party scripts. Unless I am wrong, OpenStreetMap does not fall within those categories. — Samuel (WMF) (talk) 19:09, 5 June 2023 (UTC)

Yes. On OSM and the like: if there is a valuable resource out there, which is not available through official means, there is a big temptation to use it through gadgets and user scripts, which may be used by many and become a de facto standard (such as many of the Toolserver tools). If those tools are disallowed, there needs to be a mechanism to include access to the resources or a substitute through official means. I don't know what third-party resources are in such use, but this proposal suggests there is a non-trivial amount of such scripts and gadgets – unless the third-party resources are used just out of sloppiness.

Loading third-party scripts, fonts and stylesheets is a different issue. I detest that the practice is common on the web, and I have often argued that using free equivalents hosted on the web site itself should be the norm. If there is a real need for such resources, effort should be put in replacing them, probably in cooperation with the free software/open source movement in any non-trivial cases.

–LPfi (talk) 09:32, 6 June 2023 (UTC)

@LPfi: echoing some observations made on Phabricator, the external resources most frequently loaded in User Scripts and Gadgets are translation-related resources (including Google Translate and Yandex), fonts, and a variety of WMCS-hosted applications. —Samuel (WMF) (talk) 17:15, 6 June 2023 (UTC)

How do you think the policy should be enforced?

I have no idea how to enforce it, because it's not clear what this proposal tries to achieve. Judging from the discussions below, most people are confused about it as well, so this proposal won't be very effective if adopted as is. Since forever, stewards and global interface admins have been removing on sight any occurrence of code which would transfer our users' IP address to non-WMF servers in violation of their privacy.

By the way, if the Wikimedia Foundation is suddenly very interested in doing something to defend Wikimedia users' privacy from a growing influence of third-party software, it could start by stopping its reliance on proprietary software and SaaSS, such as GAFAM's and Interpol's mass surveillance/upload filtering software. Nemo 17:44, 7 June 2023 (UTC)Reply

When would it make sense to start enforcing the policy?

What would that even mean? If you delete the fluff, all this policy says is "Gadgets and user scripts should not load third-party resources". There are no requirements. Maybe this is just a language misunderstanding and you don't mean "should" in the RFC 2119 sense, but regardless it doesn't seem like there is anything to enforce here. Bawolff (talk) 16:39, 5 June 2023 (UTC)

@Bawolff: "should" here is meant as a requirement rather than a suggestion. So, I understand how this may be confusing. As asked a bit below, I'm definitely open to considering language tweaks to make it sound more forceful. With respect to enforcement, there are at least a few things worth considering. How do we make sure all the gadgets and user scripts currently involved in privacy issues comply with the policy requirements? What would enforcement look like (eg: page blanking, CSP automatic block, etc)? Should there be a grace period before enforcement? Should enforcement be done in a phased way, with most critical gadgets and scripts having a longer grace period etc? Those are some questions that could be explored. As a side note, I don't believe that referring to most of the draft policy as "fluff" brings any improvement to the content. Instead, it is just dismissive to the efforts put in proposing something to be improved :). — Samuel (WMF) (talk) 19:37, 5 June 2023 (UTC)Reply
I agree that explaining risks and suggesting best practices is worthwhile. As the actual policy is already determined through the terms of use, one could indeed see all this as fluff, but I don't think we should nitpick on the status of this document.

For the timing, I think the first thing to do is to check what scripts and gadgets there are in widespread use. Is there any sense in them not being MediaWiki features instead (Commons has a big problem in essential features whose maintainers would like to retire)? Does the WMF have the resources to take responsibility for their maintenance, and in what timeframe can the code be reviewed and any offending parts be fixed? Truly personal ones are probably less of a problem, even when they might compromise the privacy of their author – it is difficult to protect people from themselves.

The scripts and gadgets that this policy should target are the ones that spread like DOS/Windows viruses: users sharing a sloppily written piece of code that nobody cares (or knows how) to check. In Windows this is a huge social problem rather than a technological one – best practices aren't socially acceptable. Here I hope the best practices are or can be made part of the culture.

–LPfi (talk) 09:58, 6 June 2023 (UTC)
@LPfi: I am following up here on my earlier comment (Special:Diff/25119150), where I shared that fonts and translation utilities are among the external resources loaded the most in User Scripts and Gadgets. There are also smaller categories like Facebook Connect, Google Analytics, etc. and some WMCS-hosted tools (see tables). For most of those resources, there are reasons why they are not currently part of MediaWiki core (if this is what you meant by MediaWiki features). For example, fonts have been associated with performance issues in the past (Phab:T166138#7223384), while Facebook Connect and Google Analytics are associated with user tracking and would be incompatible with Wikimedia policies. Of course, not all external resources fall within those top categories, in particular WMCS-hosted resources. In that sense, what you mentioned about enforcing the policy by focusing first on the most harmful external resources sounds quite interesting. In line with that, would it make sense to have some criteria for establishing what makes one external resource more harmful than another? — Samuel (WMF) (talk) 17:58, 6 June 2023 (UTC)

Should WMCS-hosted resources be considered third-party resources?

I assume that some of the resources are essential tools for part of the community, while coding tools isn't restricted to highly trusted users. For the proposed policy to make sense, tools that aren't scrutinised and controlled by trusted users must be regarded as third-party ones, while regarding all the tools as untrusted would cause severe disruption. –LPfi (talk) 14:25, 5 June 2023 (UTC)Reply

I think you need to draw a distinction between tools that are on a separate website (that you choose to go to) and tools that get embedded into some sort of user script. The former is 99% of tools, and the privacy analyses for the two situations are quite different. Bawolff (talk) 17:19, 5 June 2023 (UTC)

True. There is a big difference. The problem is where there is no real choice, as you need to use the tools to get things done, which means that the editors who use them aren't protected by the policy. That exposure should be minimised, and the true extent of exposure you get by using a tool should be made clear. If I make a Google search, I know Alphabet will use the info. When I search for a book on the library website, I know I tell the library, but I don't like them having me load scripts from Google just because they couldn't be bothered to use a free platform. –LPfi (talk) 10:08, 6 June 2023 (UTC)Reply

@Bawolff and LPfi: I agree that the distinction is important here. Tools that are on a separate website but not embedded in gadgets or user scripts are outside the scope of this policy. For external resources that are embedded into gadgets and user scripts, "scrutinised and controlled by trusted users" seems like an interesting criterion (others will be needed as well) for identifying resources that are not too risky and could therefore be policed differently. That being said, I am not aware of any process for actively monitoring user scripts and gadgets. Maybe Special:Gadgets? — Samuel (WMF) (talk) 19:49, 6 June 2023 (UTC)

Disallowing all WMCS tools would kill innovation. There are things that cannot be coded directly in gadgets (e.g. those that require access to the Wiki Replicas or can be released only under a free license incompatible with CC BY-SA), or are much harder to do in gadgets (e.g. those that need libraries that are written in a language different from JavaScript). Getting these to be extensions is probably prohibitively hard in most cases. Maybe there could be a site where only “trusted users” (e.g. all local interface admins and global interface editors, as well as users considered trusted who don’t happen to have one of these two rights) can put code, but in a quick self-service manner, similarly to how gadgets and Toolforge tools work. Or instead of relying on people being trusted, it could be behind a WMF-controlled proxy that strips all data that could breach users’ privacy (e.g. don’t include the X-Forwarded-For header, don’t forward the User-Agent header and cookies). (Being a new site, maybe it could be required that all code MUST be on gitlab.wikimedia.org to provide an edit history similar to wiki pages, but with code review being only an OPTIONAL part of the process.) —Tacsipacsi (talk) 10:05, 7 June 2023 (UTC)
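The proxy idea above can be sketched as a small header filter. This is only an illustration under stated assumptions: the function name `strip_identifying_headers` is hypothetical, and the set of removed headers is just the three named in the comment, not an exhaustive or official list.

```python
# Headers named in the comment above; a real deployment would likely need more.
IDENTIFYING_HEADERS = {"x-forwarded-for", "user-agent", "cookie"}


def strip_identifying_headers(headers):
    """Return a copy of the request headers without the identifying ones.

    Header names are compared case-insensitively, as HTTP requires.
    """
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in IDENTIFYING_HEADERS
    }


# Hypothetical incoming request headers (illustrative values only).
incoming = {
    "Accept": "application/json",
    "User-Agent": "ExampleBrowser/1.0",
    "Cookie": "session=abc123",
    "X-Forwarded-For": "203.0.113.7",
}

forwarded = strip_identifying_headers(incoming)
print(forwarded)
```

Because the proxy opens its own connection to the tool, the tool would see the proxy's address rather than the user's, which is the part of the exposure that header filtering alone cannot cover.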

There is wikitech:Wikitech:Cloud Services Terms of use which imposes restrictions on what tool developers are allowed to do on WMCS. Has this been taken into account? —MisterSynergy (talk) 23:31, 7 June 2023 (UTC)

Hi @MisterSynergy: and thanks for joining this conversation. Although wikitech:Wikitech:Cloud Services Terms of use is a distinct policy with its own scope, its context was taken into account as part of the work on this TPR policy. — Samuel (WMF) (talk) 23:50, 7 June 2023 (UTC)

Okay thanks.

I am still trying to figure out why WMCS is considered to be a third-party resource in this context. If WMCS is used properly, minimal to no PII is being exposed to tools hosted on WMCS; furthermore, WMF clearly has control over that platform and can, at least theoretically, monitor what tool developers are doing over there.

It seems difficult to draw a line that separates harmful from acceptable usage of WMCS via onwiki JS scripts, particularly as long as it is not clear what the actual problem with WMCS is. Given how important this platform is for editing Wikimedia projects, there should definitely be a path to make use of it. —MisterSynergy (talk) 08:14, 8 June 2023 (UTC)

@MisterSynergy: There are many reasons why WMCS-hosted apps are not viewed as production sites by this policy. I could probably mention that (a) they are not subjected to the same level of control as sites in all.dblist (eg: production sites go through a CI/CD pipeline with strict tests and checks), (b) the level of scrutiny and dedicated maintenance isn't the same as production websites, (c) the code ownership isn’t on par, which means security risks and issues aren’t necessarily owned and mitigated in a timely fashion. Of course not all WMCS-hosted resources collect tons of PII but we should acknowledge that they can at least capture IP addresses and User Agents (eg: using server variables). Also, because a WMCS-hosted resource is usually less secure than production, any malicious actors compromising that external resource could repurpose it and modify the DOM to display a UI prompting users for sensitive PII such as real name, ethnicity, or financial details. When combined, these details can identify users very precisely and give way to all sorts of real-life abuse: harassment, identity theft, physical harm (cf. bawolff example). Not to mention that this kind of compromise can turn the victim’s wiki account into a bot, result in privileged accounts takeover, etc. So, unless we decide on some criteria for what resources could be allowed or not, the status quo would be to allow all existing and future external resources, no matter how harmful they can be to the security of users and the platform. — Samuel (WMF) (talk) 19:24, 8 June 2023 (UTC)

Thank you for the detailed answer. Let me address some aspects:

In the beginning, you are assuming inferior code quality at WMCS compared to production sites. This is very likely correct, but this policy is not about code quality. It is surely a poor user experience if things break and are not fixed quickly, but this in itself is not a threat. Any real threat can also be immediately mitigated by deactivating the gadget onwiki.
Hostile tool takeover situations are indeed something to consider that I have not thought of until now. Has this already happened in the past?
In order to turn a victim's account into a bot, the user needs to actively authorize the tool via a metawiki (?) dialogue that requests detailed access. This might indeed be a criterion to consider. However, a gadget that e.g. just requests data from an API hosted on WMCS without authorization cannot take over the account.
I am not familiar with all of WMCS's services, but at least for Toolforge, some PII is systematically being hidden via a proxy. PHP server variables do not contain IP addresses, for instance; some PII such as user agents and language settings of the user's browser are probably exposed to tools, though.

My impression is that this policy proposal grossly underestimates the importance of WMCS for many editorial workflows. At the same time, some of your reasoning seems a bit far-fetched to be honest, and not directly applicable for the implementation of such a restrictive policy with regard to WMCS. —MisterSynergy (talk) 20:22, 8 June 2023 (UTC)

I think people are talking past each other due to ambiguous language. My impression is that Samuel is referencing the situation where a gadget doesn't just access an API but loads JavaScript hosted on Toolforge. In such a situation a hostile takeover could take over the user's account on wiki. However, I think it is fairly rare (but not unheard of) for such setups to be used. I think the OAuth takeover, while a potential security risk, is not really a privacy risk, as the tool's website is the primary entry point, so Toolforge's privacy policy should be controlling instead of the main privacy policy. Bawolff (talk) 21:09, 8 June 2023 (UTC)

MisterSynergy, bawolff is right that I am talking about gadgets and UserJS that load JavaScript hosted on Toolforge. Sorry for any confusion. To answer your question about account takeover, there have been serious issues involving UserJS and third parties leading to account compromises of privileged users. Obviously, I am not allowed to discuss specifics publicly. However, don't get me wrong: as a TF maintainer myself, I highly value the importance of WMCS within the realms of Gadgets, UserJS, as well as the broader Wikimedia ecosystem. — Samuel (WMF) (talk) 19:26, 9 June 2023 (UTC)

What am I missing here? The policy proposal is not restricted to executable scripts from third-party resources (including WMCS). To my understanding, accessing any sort of third-party resources is about to be forbidden due to potential privacy issues, and this should by default also include WMCS. Am I wrong here? —MisterSynergy (talk) 20:07, 9 June 2023 (UTC)

I would agree with MisterSynergy that it is confusing to deal with supply-chain risk in user scripts/gadgets in the same breath as the privacy risks of loading external resources. I think these should be dealt with separately and it should be clear which type of risk we are trying to deal with. (As an aside, is the serious issue we are indirectly referencing without talking about T194204 or something else? Because I don't think that had much to do with third-party resources, but maybe that is not what you are referencing.) Bawolff (talk) 23:01, 9 June 2023 (UTC)

Bawolff, T194204 is not what I was referring to. On the other hand, T296855 is an example of a security incident that was caused by a TPR being embedded in a UserJS or Gadget. — Samuel (WMF) (talk) 18:42, 13 June 2023 (UTC)

I disagree with your characterization of T296855. AntiCompositeNumber (talk) 21:29, 13 June 2023 (UTC)

I don't have access to T296855, but the publicly available data is this: a user (three, actually) was banned for something related to T296855. One of those users created a user-space page, then sent a message to an admin asking for advice on that page. 40 minutes later the admin's account was taken over and edited common.js. Over the next several hours, the user's subpage was deleted for having "Potentially dangerous code", the user was blocked for "Attempting to hijack other people's accounts", and one of the admin's local user scripts, which previously hadn't been touched in months, was rewritten out of the blue in a way that happened to fix an XSS vulnerability in that script. I don't have all the details, but this seems to paint a pretty clear picture of a situation that would not be fixed by banning third-party resources. Bawolff (talk) 21:54, 13 June 2023 (UTC)

Regarding T296855, step 2 within the task description very clearly indicates how a third-party resource (hosted at github.io) was used as part of that attack. That really isn't up for debate IMO. One could potentially argue that it wasn't the _primary_ method used in the attack, but to argue that it wasn't a key means of obfuscating a payload incidental in the attack is incorrect IMO. And I personally believe it is dubious to suggest that simply because an attack _could possibly_ be carried out via other means, that attempting to better contain certain avenues for abuse - which we've literally witnessed as methods within real-world attacks against the projects - isn't worthwhile. @Bawolff - I would imagine the events of the flea incident from several years ago are still somewhat fresh in your mind, and many of those attacks involved the usage of external resources to obfuscate certain attack vectors, so there are indeed other real-world examples. SBassett (WMF) (talk) 01:35, 14 June 2023 (UTC)Reply

I agree that previous attackers have utilized externally hosted resources for their second-stage command & control infrastructure. However, I'm not aware of any that have used that as their way in. I'm also mindful that the industry in general has seen an uptick in supply chain attacks (SolarWinds and the npm ecosystem issues come to mind). However, I don't think that, as written, this policy does a good job of preventing those previous attacks Wikimedia has experienced. The external resources weren't the way in but things people used after they took control. By definition, malicious people are malicious, and don't follow policies. I don't see why they would follow this one. Now, perhaps the plan for this policy is to be a lead-in to a restrictive CSP, which, if implemented properly, would reduce the risk of external command & control. If so, then I think that should be talked about more in the policy. (Just look at how many words have been spent over whether this policy should apply to user scripts, site JS, MW extensions, WMCS etc. If the real goal here is to enforce technical measures to prevent external communication, then that settles all of those issues.) Ultimately I don't think the aforementioned attacks are the most compelling motivating examples: once the attacker is in, making their life a tad more difficult, while nice if free, doesn't quite seem worth all this. At the very least it seems like a much better first thing to focus on for gadget security would be "unsafe-inline" removal. As far as third-party resources go, I personally find arguments related to privacy or even supply chain risk much more compelling, as the policy directly impacts those risks. P.S. I should clarify that I'm not opposed to this policy in general; I just think the phrasing of it is so ambiguous that there is no shared understanding of what it all means. Bawolff (talk) 04:02, 14 June 2023 (UTC)
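For readers unfamiliar with the restrictive CSP discussed above, here is a minimal sketch of how such a Content-Security-Policy header value could be assembled. The helper `build_csp` is hypothetical and the allowed host `https://*.wikimedia.org` is an illustrative assumption, not the policy Wikimedia actually serves. The sketch deliberately omits `'unsafe-inline'`, in line with the removal suggested in the comment.

```python
def build_csp(script_hosts, connect_hosts):
    """Build a Content-Security-Policy header value.

    script-src limits where scripts may be loaded from;
    connect-src limits where scripts may send requests to.
    """
    directives = {
        "default-src": ["'self'"],
        "script-src": ["'self'", *script_hosts],
        "connect-src": ["'self'", *connect_hosts],
    }
    return "; ".join(
        f"{name} {' '.join(values)}" for name, values in directives.items()
    )


# Illustrative allowlist only (an assumption for this example).
allowed = ["https://*.wikimedia.org"]
policy = build_csp(script_hosts=allowed, connect_hosts=allowed)
print("Content-Security-Policy:", policy)
```

With a header like this, a browser would refuse to load scripts from, or open connections to, hosts outside the allowlist, so enforcement would not depend on script authors following the policy voluntarily.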

@MisterSynergy: Yes, the current scope of the policy covers all UserJS and gadgets that connect to non-production websites, including WMCS-hosted tools since those have traditionally not been considered part of “production”. However, I acknowledge the many concerns raised about forbidding all external resources and the burden it would place on the community. This is why I am calling for ideas about exemptions. For WMCS specifically, the point is specifying under what conditions we could allow gadgets and user scripts to connect to WMCS-hosted resources. What would those conditions (aka, exemptions criteria) be? Do those criteria provide users of Gadgets and UserJS with adequate security and privacy safeguards? As said earlier, the aim with those questions is to identify exemptions that could become part of the policy. — Samuel (WMF) (talk) 18:44, 13 June 2023 (UTC)Reply
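As a rough illustration of how the three categories in this thread (production, WMCS-hosted, other third party) might be told apart when discussing exemptions, here is a minimal sketch that classifies a URL by hostname. The suffix lists are assumptions made for the example; the discussion above defines production by membership in all.dblist, which a simple suffix match does not reproduce. The function names are hypothetical.

```python
from urllib.parse import urlsplit

# Illustrative suffix lists (assumptions, not an authoritative inventory).
PRODUCTION_SUFFIXES = ("wikipedia.org", "wikimedia.org", "wikivoyage.org")
WMCS_SUFFIXES = ("toolforge.org", "wmcloud.org")


def _matches(host, suffixes):
    """True if host equals a suffix or is a subdomain of it."""
    return any(host == s or host.endswith("." + s) for s in suffixes)


def classify_origin(url):
    """Classify a URL as 'wmcs', 'production' or 'third-party'."""
    host = (urlsplit(url).hostname or "").lower()
    if _matches(host, WMCS_SUFFIXES):
        return "wmcs"
    if _matches(host, PRODUCTION_SUFFIXES):
        return "production"
    return "third-party"


for url in (
    "https://en.wikipedia.org/w/api.php",
    "https://example.toolforge.org/api",
    "https://translate.google.com/",
):
    print(url, "->", classify_origin(url))
```

Matching on a leading dot keeps look-alike hosts such as `evilwikipedia.org` out of the production category. Any real exemption criteria would need more than a hostname check, for instance the review and ownership conditions raised in this thread.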

IMO the special situation about WMCS is that things over there are pretty transparent. It is a hosting platform where we (WMF staff as well as more than 2000 volunteer tool developers) can see which data is collected and how it is being processed. The ToU of WMCS that restricts data collection is also under our control. This is a fundamental difference to actual third-party resources outside Wikimedia servers where we really have no clue what is going on.
What matters most in my opinion is that the tool sources at WMCS are readable by other tool developers. This enables a large group of users to access tool sources for a review. By default this is the case (at Toolforge), but tool developers can restrict read access to themselves using chmod if they want to. However, except for secret files, logfiles, and possibly a few more cases, this should really be discouraged and possibly forbidden if the tool is accessed by an onwiki script/gadget. Source control is nice to have, but since there is no guarantee that the actual tool is using the exact sources from the repo, I am not sure whether this is an actual criterion here. —MisterSynergy (talk) 19:20, 13 June 2023 (UTC)

Examples of what this will end

It would be good to clarify what features this policy will end. For example, we have the option for relief maps on Wikivoyage, which notifies users that activating it will share details with an external service.[1] I imagine this policy would end that? These relief maps are excellent, especially for hiking.

Additionally we have a great deal of functionality on the wmcloud which could then become less usable, which IMO would be unfortunate. Doc James (talk · contribs · email) 10:57, 5 June 2023 (UTC)

Yes, I'm confused on how OpenStreetMap falls into the "Risks" category. SHB2000 (talk | contribs) 11:48, 5 June 2023 (UTC)

Hello @Doc James and SHB2000: and thanks for joining the conversation. The scope of the policy is purposely limited to User Scripts and Gadgets that load third-party scripts. It is my understanding that the Wikivoyage and OpenStreetMap examples you mentioned do not fall within those categories but I am glad to be proven wrong. That being said, it's worth noting that gadgets and scripts sharing user information with third parties has always been a practice in conflict with both the Terms of Use and Privacy Policy (see Purpose section of the draft policy).

With respect to Wikimedia Cloud Services-hosted resources, you are raising a valid concern. Recent data-driven observations have surfaced that many of those resources conflict with the Wikimedia Privacy policy and Terms of use but are also helpful to various editing activities. This is why this policy conversation also explores the question of whether some WMCS-hosted resources should still be considered "third-parties" (see WMCS section above). The aim there is to evaluate avenues to enable some of those important editing-related resources while providing decent security and privacy guarantees to end-users. Any thoughts you may have in that direction are obviously welcome.

Samuel (WMF) (talk) 15:20, 5 June 2023 (UTC)

I think it is a mistake to only include user scripts and gadgets here (presumably you are also including common.js, which is neither a user script nor a gadget). "Rules for thee but not for me" type policies tend to trigger resentment, and the privacy concerns don't change depending on who is writing the code. Bawolff (talk) 16:27, 5 June 2023 (UTC)

@Samuel (WMF) I read the tickets, and the collected statistics do not seem to differentiate between site resources, gadgets and user scripts. Notably MediaWiki:Kartographer.js, used on Wikivoyage exactly as described, seems to be included in this collected dataset. —TheDJ (talk • contribs) 09:02, 6 June 2023 (UTC)

Hi @TheDJ: as part of the methodology used to gather that data, gadgets were looked up within the User and MediaWiki namespaces. Due to the multilingual nature of Wikimedia projects, filtering by namespace and looking up the page content were the most functional approaches found. This can explain why some noise may show up in the data. Also, the point was to highlight the external resources involved in top recurrent CSP violations that involve gadgets and user scripts. So, a certain level of noise was acceptable. Would you have any tips for differentiating between those types of scripts? — Samuel (WMF) (talk) 16:33, 6 June 2023 (UTC)Reply
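One possible way to separate the three kinds of scripts discussed here is by page title. The sketch below is only an illustration: it assumes canonical English namespace names and the common `MediaWiki:Gadget-` prefix for gadget pages, so it would not handle localized namespaces or gadgets defined under other page names. The function `classify_script_page` and the sample titles are hypothetical, apart from MediaWiki:Kartographer.js, which is mentioned above.

```python
def classify_script_page(title):
    """Classify a wiki page title as a gadget, user script or site script."""
    namespace, _, name = title.partition(":")
    if not name.endswith(".js"):
        return "not-a-script"
    # User scripts live on subpages, e.g. User:Example/common.js
    if namespace == "User" and "/" in name:
        return "user-script"
    # Gadget pages conventionally carry the Gadget- prefix.
    if namespace == "MediaWiki" and name.startswith("Gadget-"):
        return "gadget"
    # Other JS pages in the MediaWiki namespace are site-wide scripts.
    if namespace == "MediaWiki":
        return "site-script"
    return "other"


for title in (
    "MediaWiki:Kartographer.js",
    "MediaWiki:Gadget-Example.js",
    "User:Example/common.js",
):
    print(title, "->", classify_script_page(title))
```

Under these assumptions, MediaWiki:Kartographer.js would be counted as a site script rather than a gadget, which matches the kind of noise described in the reply above.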

Exactly what passage of the Privacy policy or the Terms of use do gadgets and scripts conflict with if they contact third-party services, but only with explicit user consent? —Tacsipacsi (talk) 10:09, 7 June 2023 (UTC)

Hi @Tacsipacsi: and thanks for joining the conversation. When you talk about gadgets and scripts that "contact third-party services, but only with explicit user consent", are you referring to user scripts and gadgets that display a type of privacy notice to users? — Samuel (WMF) (talk) 14:13, 7 June 2023 (UTC)

Yes. —Tacsipacsi (talk) 21:41, 8 June 2023 (UTC)

Thanks for clarifying Tacsipacsi. What "explicit user consent" means and whether displaying a privacy notice satisfies ToU requirements is something Legal is best positioned to answer. I have escalated your specific question accordingly and will keep you up-to-date as soon as I have an answer. — Samuel (WMF) (talk)

This will likely break community-grown tools that reach externally such as w:en:Wikipedia:RedWarn/w:en:Wikipedia:Ultraviolet/Install, possibly JWB. — xaosflux ^Talk 17:50, 13 June 2023 (UTC)

This is a step backwards on our strategic goals


I'm concerned about this proposal, for several reasons. Graphs have been broken for more than a month now, and it doesn't seem that they are going to recover, so we are losing opportunities to show things in a more interesting way. Some efforts we have made to show Ourworldindata maps and graphs are also stuck at "security review". Some years ago, we launched (and invested a lot of money in) a system that would read the articles in some languages, but it never succeeded for the same reason.

And I would accept that having third-party resources is not the best option if we had a product team working on those features. But this isn't happening, and it doesn't seem it will happen in the future. Some years ago we were obsolete. Now we are paleolithic, a relic for archaeologists of the Internet. Meanwhile, other platforms are advancing by the day on new features. Every day we are further from our 2030 strategic goal. Theklan (talk) 11:18, 5 June 2023 (UTC)Reply

I don't know what the article reading thing was, but security was hardly the only reason ourworldindata didn't happen, just the first hurdle it had to face. Besides technical concerns, the elephant in the room is that it was very unclear if enough users wanted it to make it worth it. Bawolff (talk) 17:00, 5 June 2023 (UTC)Reply

Let alone who was going to do the amount of work needed to make it viable within the WMF setting (DDoS hardening, privacy isolation, security, Parsoid and VE support, cache consistency and purging). We are further away from the 2030 goals, but this is not unexpected. I and many others have been warning from the start of that whole process that we were already overstretched and that implementing the 2030 goals would require at least tripling the tech department, if not more. This was ignored for a long time, and we have had several crises with code maintenance of late that clearly showed how fragile many elements were. —TheDJ (talk • contribs) 19:41, 5 June 2023 (UTC)Reply

@Theklan: thanks for voicing your concerns through this discussion. However, OWID review and the Security team's workflow more generally are outside the scope of this policy. — Samuel (WMF) (talk) 20:14, 6 June 2023 (UTC)Reply

The problem is that, as TheDJ points out, most of the things we need are not being developed. If third-party content is not possible and, at the same time, we are not creating those things, the obsolescence will only grow. Theklan (talk) 20:42, 6 June 2023 (UTC)Reply

@Theklan: this is a valid concern. For the specific context of user scripts and gadgets, there is an exploratory reflection on if and when some WMCS-hosted resources could be treated differently than the other third-party resources. If you have some ideas in that area, please share under Should WMCS-hosted resources be considered third-party resources?. — Samuel (WMF) (talk) 20:58, 6 June 2023 (UTC)Reply

There should be a Product department just making the product better. The question is not what we consider a third-party resource; it is how we get first-party resources for our needs. Theklan (talk) 21:01, 6 June 2023 (UTC)Reply

Ok, I'm going to be frank here.... Other than Sam, most of the security team knows jack shit about Wikipedia and the people who built it. Most of the issues this proposal deals with would not be an issue if the foundation had invested properly in tools and content that people had been asking for. Instead, we now DEPEND on the external stuff and you are taking it away. This is not a technology or security issue you now have, but a social issue that you need to negotiate. Saying "sure but that's not our problem, it's another team" is NOT the appropriate response. —TheDJ (talk • contribs) 08:04, 7 June 2023 (UTC)Reply

To be clear, by "Sam" are you referring to Reedy? Nardog (talk) 09:26, 7 June 2023 (UTC)Reply

Yes. --MZMcBride (talk) 16:16, 7 June 2023 (UTC)Reply

This doesn't read like a policy


Policies are rules that people are obliged to follow or operating procedures. This just has some suggestions and best practises. If the document has no rules it is a guideline not a policy. I think this makes the intended purpose of this document confused.

Other comments:

I think the way this uses the term "third party" is confusing and misleading. The term makes people view this through the wrong lens. Often in other documents it refers to the authorship of the resource, not the hosting provider (consider also external resources that are proxied, or locally made resources hosted on GitHub). I would suggest using "externally hosted" instead.
- Additionally, this desperately needs to clarify where non-production resources (Toolforge) fall (and any other things hosted by WMF that are not covered by the main privacy policy). Normally these would be considered external, since the normal privacy policy doesn't apply.
- It should maybe clarify whether WMF can use such resources if contractual obligations are in place to ensure the privacy policy is followed.
It should clarify whether the user is allowed to consent to such services (say, when enabling a gadget) and under what circumstances consent is allowed as an escape hatch (my 2 cents is that it should be acceptable in optionally enabled gadgets/user scripts but not in anything globally enabled. But there is already an exception at Wikivoyage).
I would suggest replacing the term "end-user" with "user". "End-user" sounds a bit dehumanizing, but maybe that is just me.
This should probably emphasize not just explicit personal info but also incidental collection, e.g. when the external resource doesn't collect anything but its hosting provider logs IP addresses, etc.
Re XSS: this probably underplays the risks of XSS a bit. At the same time, it comes way too close to implying that XSS is the only privacy risk here, which I think is misleading.
I think the "consider alternatives" section should be removed. First, I don't think it fits with this document; how to make a user script belongs elsewhere. Second, it sort of implies that once you have considered alternatives you can use third-party resources if no first-party stuff does what you need. Third, realistically it is going to be pretty rare that there is a local resource doing the exact same thing, especially when it comes to proprietary APIs.

Bawolff (talk) 16:23, 5 June 2023 (UTC)Reply

I'd like to second the first point. In my reading, despite being labelled a "policy", this document does not require anyone to do anything, it only offers suggestions. It needs to be rephrased or renamed, depending on what was intended, before we can really discuss its contents. Matma Rex (talk) 18:29, 5 June 2023 (UTC)Reply

Hi @Bawolff and Matma Rex: and thanks for joining the discussion. I'd like to address some of the points shared while continuing to give the others some thought.

While I get the part about the text not reading like a policy, it is my understanding that the Required precautions section contains firm requirements such as "Gadgets and user scripts should not load third-party resources". Do you have any suggestions to tweak the language in a more assertive way perhaps?
I am not sure I get the portion about underplaying XSS. Do you think those risks should be emphasized more or explained differently?
The "third-party" term was chosen to align more with the terminology used across the industry, especially within the Third-party risk management ecosystem.
WMCS-hosted resources are currently considered third-party resources. That being said, recent data-driven observations make it worth asking if and when some of those resources should be considered differently, hence the open questions at the top about WMCS.

— Samuel (WMF) (talk) 18:55, 5 June 2023 (UTC)Reply

I suggest "must" instead of "should" and "could" if these are indeed requirements. Matma Rex (talk) 19:24, 5 June 2023 (UTC)Reply

Yeah, "avoid" is weak. It's a recommendation. It's a form of feedback from the Security team to the community, not a matter that requires feedback in return. ToBeFree (talk) 19:56, 5 June 2023 (UTC)Reply

Re "aligning with industry": I think that is the major problem here. Most of the parts of this policy that don't make sense here would make total sense as part of a corporate policy designed to mitigate the risk of third-party components and supply chain attacks. The weaker language would make total sense in a corporate context where the assumption is that everyone has the same goals, people get fired if they play too fast and loose with doing what they "should" be doing, but exceptions still happen and there is a management chain to accept risk when things deviate from their ideal form. However, that is not the context we are in: a top-down policy from WMF is more like a law. If it doesn't set out specific binding requirements, nobody will listen.
Similarly, I think "third-party" is confusing here because the context and threat model are subtly different from how most of the industry talks about such things. Not to mention the number of parties involved and their relationships.
Regarding WMCS: I don't think that data is very meaningful by itself. Clearly there is an appetite to access external resources, but just from the data we don't know in what context the external API was called or what consent was obtained. More importantly, we don't know what expectations the users have or how that varies between different groups, including vulnerable groups. Something being popular is just one part of the analysis; we should also analyze what the risks are, which are acceptable, which can be mitigated, etc.
Re XSS: to a lay person, stealing a session token sounds like stealing some obscure piece of metadata; better to focus on actual consequences (e.g. turning your account into a vandal bot). But I would also talk about how, for certain vulnerable people, leaking the wrong person's IP to the wrong people could in the extreme case result in people dying. Privacy needs are extremely person-specific.

Bawolff (talk) 20:50, 5 June 2023 (UTC)Reply

@Bawolff: I have taken good note of the points you mentioned and will keep them in mind alongside the other comments shared in this thread while preparing the next update of the policy, thanks. Re data: sure, beyond the popularity criterion, a risk tiering of these resources is important. That is the point of the questions about WMCS exemptions and the criteria attached to those exemptions. Having criteria that help establish why some resources are low risk while others are high risk can help identify mitigations for a whole set of resources, both existing and future, depending on their risk tier. — Samuel (WMF) (talk) 17:52, 8 June 2023 (UTC)Reply

I disagree that risk tiering is important or even viable for this policy (beyond treating executable content differently from non-executable content, but that is really more a security thing than a privacy thing). It might make for interesting data, but I'm not sure what we would take away from that. Risk is context-specific and cannot be evaluated outside of a specific context. Often there is an implicit context that makes sense, but it is unclear what that would be here. Bawolff (talk) 18:21, 8 June 2023 (UTC)Reply

Exemptions


I think there should be some exemptions: — xaosflux ^Talk 16:25, 5 June 2023 (UTC)Reply

Personal exemptions

There should be an exemption for user's personal script files (e.g. Special:MyPage/vector-2022.css), these are self-managed and only apply to yourself. — xaosflux ^Talk 16:25, 5 June 2023 (UTC)Reply

Perhaps you get an on/off to accept these in your prefs... — xaosflux ^Talk 18:54, 5 June 2023 (UTC)Reply

The problem is that users use each other's CSS/JS files, and only one of those users needs to have sysop, interface admin or checkuser rights to create complete chaos. —TheDJ (talk • contribs) 19:43, 5 June 2023 (UTC)Reply

which is probably a good argument to not conflate non-security privacy issues (non-xss) and security (xss) together in policies. Users can opt in to lower privacy and only affect themselves. They can't opt in to low security without affecting others. Bawolff (talk) 19:49, 5 June 2023 (UTC)Reply

Users use other people's browser extensions too, but in both cases they are choosing to do that themselves, to themselves. — xaosflux ^Talk 19:59, 5 June 2023 (UTC)Reply

Regarding users loading other user's scripts: that doesn't require a third party to cause a potential problem; if you load someone else's script you are subject to anything they put in there, the bad code can be local (as this discussion is about third party resources, not about prohibiting all local scripts as well). — xaosflux ^Talk 20:04, 5 June 2023 (UTC)Reply

Hey @Xaosflux: and happy to read your inputs here. I may be wrong but it seems to me that the personal exemption you described does not fall within the remit of this policy since the scope is gadgets and user scripts hosted outside Wikimedia production websites, not personal on-wiki scripts. So, there is no need to include any "personal exemptions" in the policy. Is there a point I am missing? — Samuel (WMF) (talk) 21:52, 5 June 2023 (UTC)Reply

@Samuel (WMF) no, what I'm saying is that personal userscripts should be exempt from this whole thing. They are self-managed, and your self-management can only impact yourself. If you choose to trust another user, that should be up to you. You can already skip this whole thing and put your userscript in your browser directly; this just makes it more annoying for anyone who works on multiple browsers. Someone above made the argument that User:X could load User:Y's script, and User:Y could be loading ExternalSite:x's script, as some sort of special worry, but User:X can just as well install SpywareBrowserExtension. If they choose to trust someone else, that's what happens. — xaosflux ^Talk 22:17, 5 June 2023 (UTC)Reply

Project exemptions

There should be a process for a project to determine whether a TPR may be acceptable (such as by publishing a gadget), on a per-user opt-in basis only. Some projects innovate with external resources, such as mapping or translation services, for which there are no sufficient internal replacements. — xaosflux ^Talk 16:25, 5 June 2023 (UTC)Reply

In a world with SUL it is very difficult to effectively isolate these sorts of things per project. Bawolff (talk) 17:01, 5 June 2023 (UTC)Reply

@Bawolff there are no global gadgets, so if these are opt-in only, you'd have to opt in at each project. Agree that a default-on gadget would affect unknowing users. — xaosflux ^Talk 18:50, 5 June 2023 (UTC)Reply

If each individual user is opting in, I think that is workable. For default-enabled gadgets, though, there is enough cross-wiki integration in Wikimedia (not to mention everything being CORS-linked) that I would worry the user's privacy could be compromised in some circumstances even if they never intentionally use the project in question. I guess it would depend a lot on specifics. Bawolff (talk) 18:58, 5 June 2023 (UTC)Reply

@Bawolff 100%! I'll clarify that above. — xaosflux ^Talk 19:02, 5 June 2023 (UTC)Reply

Proxy servers


Currently, https://fontcdn.toolforge.org/ is a proxy server for the Google fonts server. Would anonymizing proxy servers such as these fall under the category of third-party resources that could not be used? isaacl (talk) 17:33, 5 June 2023 (UTC)Reply

i don't know what this policy's take on it is, but the historical answer is that it is not allowed e.g. see discussion at https://phabricator.wikimedia.org/T209998 Bawolff (talk) 17:43, 5 June 2023 (UTC)Reply

Thanks for the pointer to the Phabricator ticket. I'm unclear then why the proxy server is hosted on toolforge, if using it for Wikimedia-hosted pages or tools isn't currently considered appropriate. isaacl (talk) 20:52, 5 June 2023 (UTC)Reply

It is considered appropriate for Toolforge tools but not the main site. The main site is covered by a much stricter privacy policy than Toolforge is. The historical view is that private data should not be transferred from the main site to Toolforge, as otherwise it would be a pretty big loophole if one could get around the privacy policy by just transferring the data to a Wikimedia site not covered by the (main) privacy policy. Bawolff (talk) 21:06, 5 June 2023 (UTC)Reply

Hi @Isaacl: and thanks for taking part in the discussion. Since WMCS is not part of Wikimedia production websites, the WMCS-hosted resources are currently considered third-party resources. However, this conversation is part of an exploratory process and thoughts on if and when WMCS-hosted resources should be considered differently are welcome. — Samuel (WMF) (talk) 22:37, 5 June 2023 (UTC)Reply

One of the many flaws of this proposal is that it lists Complete list of Wikimedia projects as production websites. Based on that definition https://fontcdn.toolforge.org/ and https://maps.wikimedia.org/ fall in the third-party category, but it's no problem to include content from http://www.wikimedia.org.ph/ and https://translatewiki.net/wiki/. Multichill (talk) 21:28, 7 June 2023 (UTC)Reply

Localhost


Please do not use CSP headers to block http://localhost and http://127.0.0.1; this "external" site is under the user's control and is useful for script development. Suffusion of Yellow (talk) 18:17, 5 June 2023 (UTC)Reply

Hello @Suffusion of Yellow:, and thanks for raising this concern here. However, CSP rules are outside the scope of this policy discussion. You can look at this Phabricator ticket to see if it is a more suitable avenue for your suggestion. — Samuel (WMF) (talk) 21:05, 5 June 2023 (UTC)Reply

Ok, so is this going to be a "social" policy, with no technical measures to prevent the fetching of external resources? Or will this be prevented by some other means? Either way, please don't define localhost as a "third-party resource" even though it's "located outside Wikimedia production websites". This would be equivalent to policing what software I'm allowed to have on my computer. Suffusion of Yellow (talk) 17:04, 6 June 2023 (UTC)Reply

@Suffusion of Yellow: by "social" do you mean that the policy will be enforceable by the community, as here for example? Then yes, although it is good to note there are ongoing plans for technical measures to prevent the fetching of external resources (see phab:T135963). That being said, ideas for enforcing this policy more specifically are welcome. I hear your other point about localhost URLs being part of your personal space. In that case it makes sense to think more broadly about those locations since other users may use custom domain names, local IP addresses, etc. In the event that exemptions are considered as part of this policy, what might be a good way to identify those local resources? — Samuel (WMF) (talk) 18:49, 6 June 2023 (UTC)Reply

Expanding on a suggestion by Bawolff in that task, a user preference, maybe? That is, I go to Special:AllowedThirdPartySites, get asked for my password, then add "127.0.0.1", "foohostname.barnetwork", and so on. Suffusion of Yellow (talk) 19:06, 6 June 2023 (UTC)Reply
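A minimal sketch of how such a per-user allowlist might be checked is shown below. Everything here is an assumption for illustration: `PRODUCTION_HOSTS` is a hypothetical stand-in and not an official list of production websites, and `is_allowed` is not an existing MediaWiki function.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for Wikimedia production hosts; not an official list.
PRODUCTION_HOSTS = {"en.wikipedia.org", "commons.wikimedia.org"}


def is_allowed(url: str, user_allowlist: set) -> bool:
    """Allow production hosts plus hosts the user explicitly added."""
    host = (urlsplit(url).hostname or "").lower()
    if not host:
        # Not a parseable absolute URL: refuse rather than guess.
        return False
    if host in PRODUCTION_HOSTS:
        return True
    # Exact host match only, so "127.0.0.1" does not allow other addresses.
    return host in {entry.lower() for entry in user_allowlist}
```

With an allowlist of `{"127.0.0.1", "foohostname.barnetwork"}`, a development script served from `http://127.0.0.1:8080/` would pass the check while any other external host would not. Matching on the exact host name rather than a suffix is a deliberate choice in this sketch, so that each entry allows only the host the user typed.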

New wiki for code incompatible with the CC-BY-SA?


Any thought of creating a new SUL-connected wiki, e.g. https://code.wikimedia.org/, not subject to the CC-BY-SA (but still requiring some sort of free license) where we can store GPL-"contaminated" user scripts? Then there would be no need to choose between reinventing the wheel and storing such code on a "third-party" site. Suffusion of Yellow (talk) 18:33, 5 June 2023 (UTC)Reply

Personally i think we should have a specific gerrit (or gitlab) repo for this, not subject to the normal CR requirements (that is then automatically deployed to the site). MediaWiki isn't a fun source code management tool. Bawolff (talk) 18:38, 5 June 2023 (UTC)Reply

@Suffusion of Yellow: would this be a place where user scripts would be "scrutinised and controlled by trusted users" as in LPfi's point? — Samuel (WMF) (talk) 20:32, 6 June 2023 (UTC)Reply

If we go with my suggestion, then it would be handled just like we do now; a few "trusted" users can manage scripts like https://code.wikimedia.org/MediaWiki:example.js, but anyone can put what they want in their userspace, e.g. https://code.wikimedia.org/User:Suffusion_of_Yellow/example.js. The only difference would be with the copyright disclaimers. Suffusion of Yellow (talk) 20:45, 6 June 2023 (UTC)Reply

I like this idea, as I think it would mostly serve as a CDN of sorts, thus consolidating "external" code and making such resources easier to find, manage and audit. I don't like the idea of using an SCM like GitLab or whatever as a CDN in this case; I feel like those interests should be kept separate when feasible. Of course we should encourage anyone developing userJS or Gadget code to use modern SCM and appropriate CR processes. SBassett (WMF) (talk) 17:10, 8 June 2023 (UTC)Reply

This is fundamentally flawed


Sorry, but it seems clear from what you've written here and your replies above that you don't really understand how things work on the wikis. You wrote "should" but claim you meant "must", you don't seem to have a clear understanding of what a "user script" is, you've made contradictory statements about CSP, and so on. IMO you should scrap this, get some advice from people who understand the ecosystem you're trying to change and the language used by people in that ecosystem, and then come back with a fresh proposal once you better understand exactly what you're proposing. Anomie (talk) 02:58, 6 June 2023 (UTC)Reply

Hi @Anomie:, the policy draft and the broader consultation process come after preliminary rounds of discussions during which "advice from people who understand the ecosystem" was gathered: interface admins, gadget and user script authors, Wikimedia developers, staff, and various long-term contributors. The aim of this discussion is to improve the draft by exposing it to a much larger audience and benefitting from a bigger pool of inputs. If you have specific areas of improvement such as the language and explanations that seem contradictory, feel free to elaborate on them. — Samuel (WMF) (talk) 10:20, 6 June 2023 (UTC)Reply

I'm not qualified to speak on the technical side, but I am sceptical that a sufficient consultation was done with those who know the ecosystem and how to write these things with the should/must split and much of the language reading as suggestions not obligations. Those are fundamental concepts of any wikimedia policy writing, not merely a matter of better language. Nosebagbear (talk) 15:24, 6 June 2023 (UTC)Reply

Judging from the removal of jquery.tipsy, I believe no one in WMF development knows what a user script is. I agree this should be scrapped. Al12si (talk) 21:36, 9 June 2023 (UTC)Reply

Day 1 thank you


Hi everyone, I want to say thank you for the initial comments! There were some good recommendations around the policy language, questions on the scope of the policy, and ideas about exemptions, especially regarding some WMCS-hosted resources. The Security team will continue to review those inputs and the new ones as they come in with the aim of updating the policy draft iteratively. — Samuel (WMF) (talk) 09:53, 6 June 2023 (UTC)Reply

This is such a confusing way to structure a talk page. If you're preemptively creating sections, please comment and sign right under each heading. Otherwise it looks like the first commenter created that section. Nardog (talk) 12:35, 6 June 2023 (UTC)Reply

Hi @Nardog: thanks for taking the time to improve the page structure and headings. This is much appreciated. — Samuel (WMF) (talk) 20:17, 6 June 2023 (UTC)Reply

Please do not take away our ability to use third-party scripts


There are many legitimate reasons to load scripts from third-party sources. While I understand the potential risks, I believe they do not warrant the creation of such a heavy burden on us MW script developers.

For understandable reasons, useful third-party JS libraries are repeatedly being removed from MW core, often without satisfactory alternatives. For example, just recently the jQuery tipsy library was removed, and the extremely useful jQuery.ui library is likely to face a similar fate. For volunteers like me who do not have enough time to learn the OOUI library in depth, taking away our ability to remotely load trusted & stable JS libraries would be disastrous.

As of 2018, the ability to edit user scripts that affect other users was heavily restricted to a very tiny group of editors, interface administrators. These editors are required to use a strong password for their accounts, as well as enable 2FA. I believe this restriction is tight enough, since interface-admins are usually experienced MW script writers. We should count on them that they do not load external scripts from unfamiliar or untrusted sources.

TL;DR: As a MediaWiki user script developer, I believe that imposing a restriction on the usage of external JS scripts and libraries would be a very big step backwards. I have written numerous user scripts that I do not see myself writing without relying on secure, trusted third-party sources. If the WMF wants to improve the security of users, I really hope a different approach will be taken, one that does not make the work of volunteers harder. Guycn2 (talk) 18:17, 7 June 2023 (UTC)Reply

08 June 2023: Summary of the discussion so far

Hi everyone, and thank you for your continued contributions to this conversation. In addition to the responses already given here, I’d like to acknowledge some of the key points you’ve raised so far, offer a few clarifications, and propose some next steps.

Concerns were raised about the policy language, some requesting that the policy be written using the conventions of RFC 2119, replacing "should" with "must", etc. Also, there were detailed suggestions about ways to make the policy content more accurate, clear, and understandable, especially around the risks facing users. The feedback on the language and content will inform the next update of the policy draft.
Confusion was expressed with respect to the scope of the policy, in particular whether the policy means the end of all non-production tools. It’s good to note that the policy scope is deliberately limited to user scripts and gadgets loading non-production resources.
It was repeatedly flagged that disallowing all external resources in gadgets and user scripts would place a significant burden on the community. Some external resources, in particular, those hosted on WMCS, are important for community autonomy and usually lack alternatives on production websites. As a result, suggestions around exemptions were shared: exempting WMCS resources based on how harmful they are, allowing anonymizing proxies, exemptions on an opt-in basis, user-level or project-level exemptions, etc.

In line with the last point, I am adding a new section labeled Exemptions criteria. The aim there is to identify exemptions that could become part of the policy. Under what conditions could external resources be embedded in gadgets and user scripts? When could the exemption be revoked? How to ensure that exempted resources do not compromise users' security and privacy? Based on this data, are some resources outright not acceptable? Those are initial questions in that direction. Let me know what your thoughts are. Also, please tell me if some points were not captured in the summary or should be moved down here. — Samuel (WMF) (talk) 00:35, 8 June 2023 (UTC)Reply

Exemptions criteria


Here are initial questions for the section discussion: Under what conditions could external resources be embedded in gadgets and user scripts? When could the exemption be revoked? How to ensure that exempted resources do not compromise users' security and privacy? Based on this data, are some resources outright not acceptable? Samuel (WMF) (talk) 00:35, 8 June 2023 (UTC)Reply

I think a rather clear case of what shouldn't be exempt would be any Default Gadget, or Site Script (e.g. skin.js, common.js) - and perhaps instead of carving out exemptions the TPR is just scope limited to those areas? — xaosflux ^Talk 17:17, 8 June 2023 (UTC)Reply

@Xaosflux: Why must external resources be banned from Default Gadgets? Is it because they impact a large number of users? If that is the case, it’s good to note that some non-default gadgets loading external resources are often widely used too. When they load external resources, default and non-default gadgets both expose users and the platform to a wide range of risks. Instead of allowing all external resources in non-default gadgets and UserJS, shouldn't we explore the route of exempting resources that meet criteria X, Y, Z? —Samuel (WMF) (talk) 20:26, 8 June 2023 (UTC)Reply

Default gadgets are opt-out, not opt-in. Taavi (talk!) 20:41, 8 June 2023 (UTC)Reply

While this is mostly true, it is not always the case: there may be parts of default gadgets and site scripts that ask for explicit user consent, explaining the privacy consequences; and there may be non-default gadgets that get loaded without explicit user opt-in (e.g. conditional loading through site scripts or the withgadget= URL parameter). Therefore, the criterion should be “anything that gets loaded without explicit user opt-in or consent”, not “any default gadget or site script”. —Tacsipacsi (talk) 21:47, 8 June 2023 (UTC)Reply

@Tacsipacsi @Samuel (WMF) - basically I'm calling out anything that can get foisted upon readers without any choice. We know that in reality, registered webui users are a minority of the connections we serve every day, and I think that scope would be a good start for TPR controls. By leaving out all the manual opt-in things and the single-user-only things, we should be able to avoid much of the friction with the userbases while helping to ensure that the privacy of our readers is increased. At that point, we could look into what may be exceptions to just that scope, and how that could be managed first. — xaosflux ^Talk 21:55, 8 June 2023 (UTC)Reply

Agree with @Xaosflux. Project-wide exemptions should simply not be possible. Each gadget can define its own third-party resources, to which each user must explicitly consent before enabling the gadget (and be able to revoke that consent at any time by going to a special page like Special:ManageThirdPartyResources or such; users should also be allowed to add any more they want on that special page). Any attempt to load a third-party resource with a `withJS=` or `withgadget=` argument must simply fail by CSP not allowing it. The whole point of CSP is to avoid users' sensitive data leaking if some JS gets compromised on wiki or PC; allowing arguments such as `withJS=` to bypass that would defeat the whole purpose of CSP.

There should be several types of CSP exemptions as well. Loading font, loading scripts, loading media, etc. That is possible via fetch-directives.
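To illustrate the point about fetch directives, here is a minimal sketch of how per-type exemptions might be turned into a `Content-Security-Policy` header value. The `build_csp` helper and the shape of the `exemptions` mapping are hypothetical. The directive names are standard CSP fetch directives, but the policy a real wiki would need is likely more involved than this.

```python
# Fetch directives covered by this sketch, one per resource type.
DIRECTIVES = ("script-src", "font-src", "img-src", "media-src", "connect-src")


def build_csp(exemptions: dict) -> str:
    """Build a header value from {directive: set of exempted origins}."""
    # Default: same-origin only, for anything not listed explicitly.
    parts = ["default-src 'self'"]
    for name in DIRECTIVES:
        origins = sorted(exemptions.get(name, ()))
        # Each type gets only its own exemptions, so an exempted font
        # origin does not become an allowed script origin.
        parts.append(" ".join([name, "'self'", *origins]))
    return "; ".join(parts)


policy = build_csp({"font-src": {"https://fontcdn.toolforge.org"}})
```

In this example a user who exempted the font proxy for fonts would still get `script-src 'self'`, so the same host could not serve executable code to the page.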

Random example: One of my tools for my home wiki is a gadget that, when someone tries to create an article, loads the English version and builds some basic info. It's done server-side on tofawiki.wmcloud.org, which provides a JSON response (an example). I want users to be able to load the data, but if my server gets compromised, I don't want it to be able to execute code. I'm not sure if that's possible though. Amir (talk) 00:52, 9 June 2023 (UTC)Reply
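On the "data but not code" question: in CSP, `connect-src` governs requests made from scripts (such as fetch or XHR) while `script-src` governs where scripts may be loaded from, so a host can in principle be allowed for one and not the other. The consuming code also has to treat the response strictly as data. The sketch below shows that second half under assumed conditions: the field names in `EXPECTED_FIELDS` are hypothetical and do not describe the actual response of the tool mentioned above.

```python
import json

# Hypothetical response shape: field name -> required type.
EXPECTED_FIELDS = {"title": str, "summary": str}


def parse_article_info(body: str) -> dict:
    """Parse a JSON response and keep only known, well-typed fields."""
    data = json.loads(body)  # parses data; never evaluates it as code
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    clean = {}
    for name, kind in EXPECTED_FIELDS.items():
        value = data.get(name)
        if not isinstance(value, kind):
            raise ValueError(f"missing or invalid field: {name}")
        clean[name] = value
    # Unknown fields are dropped rather than passed along.
    return clean
```

Values that pass this check are still untrusted text; a gadget would presumably insert them as plain text rather than as markup.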

Oh, and to emphasize: splitting third-party resources into "good" (loadable under any condition) and "bad" would simply be infeasible. The default must always be "Wikimedia production only", letting users decide what they want to explicitly allow on top (similar to adding a bot password). This means there is no point in discussing whether Toolforge should be considered third party or not, or what about this website, or that website. This is not really the point. Amir (talk) 01:02, 9 June 2023 (UTC)Reply

Hey @Ladsgroup: it is my understanding that you are making the following points:

All external resources must be disallowed in Gadgets and UserJS by default, irrespective of whether they are WMCS-hosted resources or not.
The technical measure disallowing those resources would be CSP, though it is in report-only mode for now.
Gadgets and UserJS must only load external resources if they are rewritten to be compatible with some yet-to-be-implemented solution that collects user consent and allows exemptions at the user level (e.g. CSP opt-out, interstitial, etc.; cf. phab:T208188).
Is my understanding correct? — Samuel (WMF) (talk) 16:47, 9 June 2023 (UTC)Reply

Generally yes, but I'm not sure about the rewrite part. It can simply be added to the gadget's definition, e.g. in w:MediaWiki:Gadgets-definition, where you define what rights are needed for a gadget and what dependencies it has. It's natural to think you would also add the domains you need for that gadget, and users would be prompted to accept them when enabling it. For user scripts, I guess anyone enabling one should go to a special page and add the domain, or something like that. Not a single line of code needs to change, only the definitions of gadgets. Of course there are complexities on our side, like how to handle a domain added to a gadget that is already enabled for many users, but none of the challenges are impossible to fix. Amir (talk) 21:07, 9 June 2023 (UTC)Reply
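To make the idea concrete, here is a minimal Python sketch of a gadget definition that declares its domains alongside rights and dependencies, plus the check for which domains still need a consent prompt. The `GadgetDefinition` class, its `domains` field and the gadget name are hypothetical; the Gadgets extension has no such option today as far as this discussion states.

```python
from dataclasses import dataclass, field


@dataclass
class GadgetDefinition:
    """Hypothetical in-memory form of one gadget definition entry."""
    name: str
    rights: list[str] = field(default_factory=list)
    dependencies: list[str] = field(default_factory=list)
    domains: list[str] = field(default_factory=list)  # assumed new option


def domains_needing_consent(gadget: GadgetDefinition, consented: set[str]) -> list[str]:
    """Domains the user must still accept before the gadget may use them."""
    return sorted(set(gadget.domains) - consented)


gadget = GadgetDefinition(
    name="article-prefill",  # hypothetical gadget name
    dependencies=["mediawiki.api"],
    domains=["tofawiki.wmcloud.org"],
)

# First enable: the user has not consented to anything yet.
print(domains_needing_consent(gadget, set()))

# Later the definition gains a domain; only the new one needs a prompt.
gadget.domains.append("cdnjs.toolforge.org")
print(domains_needing_consent(gadget, {"tofawiki.wmcloud.org"}))
```

The second call hints at one way to handle a domain added to an already enabled gadget: the new domain stays blocked for existing users until they accept it, while previously accepted domains keep working.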

It’s a good idea if it’s feasible using CSP, but there are some details that need to be clarified / I’d like to have improved:

Gadgets-definition is a good solution for gadgets, but what about user scripts and JS loaded from other (WMF production) wikis?
What if the gadget is loaded using mw.loader.load('ext.gadget.*')? In this case, the Gadgets extension doesn’t really have control over what happens; only the core ResourceLoader has (unless Gadgets conditionally registers those gadgets, but that would be a step back from a performance perspective, given that removing the targets system improves performance).
Having to go to a different page is cumbersome. What if a popup appeared on Special:Preferences#mw-prefsection-gadgets when one tries to enable such a gadget, asking for explicit consent but avoiding navigation to another page? When turning off the last gadget that asks for permission for a given site, another popup could appear, asking the user whether they want to revoke the permission as well. (Remember that no on-wiki styles and scripts load on Special:Gadgets, so users cannot be tricked into anything.)

—Tacsipacsi (talk) 17:56, 10 June 2023 (UTC)Reply
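The revocation popup described above needs to know which permissions become unused when a gadget is turned off. A minimal Python sketch of that bookkeeping follows; the `revocable_domains` function, the mapping it takes and the gadget names are assumptions for illustration only.

```python
def revocable_domains(enabled: dict[str, set[str]], disabled_gadget: str) -> set[str]:
    """Return the domains that only `disabled_gadget` was using.

    `enabled` maps each currently enabled gadget name to the domains it
    declares. The result is what a popup could offer to revoke.
    """
    still_needed: set[str] = set()
    for name, domains in enabled.items():
        if name != disabled_gadget:
            still_needed |= domains
    return enabled.get(disabled_gadget, set()) - still_needed


enabled = {
    "translate-helper": {"x.example", "shared.example"},  # hypothetical gadgets
    "citation-lookup": {"shared.example"},
}

# Turning off translate-helper frees x.example, but shared.example is
# still needed by citation-lookup, so it would not be offered.
print(revocable_domains(enabled, "translate-helper"))
```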

It's a misconception: if the CSP exemption entry does not show up in the startup module (which it won't), it won't impact performance. Amir (talk) 00:06, 12 June 2023 (UTC)Reply

Just noting that phab:T36958 exists to bring gadget-like features to user scripts, allowing them to list the CSP domains in a definition, like for site gadgets. SD0001 (talk) 18:14, 12 June 2023 (UTC)Reply

Google should generally be considered as evil, they make money from raping privacy. They are the very anathema of privacy. Google Analytics should be completely banned from here. Grüße vom Sänger ♫^(Reden) 20:17, 8 June 2023 (UTC)Reply

Nitpick

Latest comment: 1 year ago2 comments2 people in discussion

Please change "utilize" to "use", there's no advantage in using wordier language. Thanks! Frostly (talk) 03:03, 9 June 2023 (UTC)Reply

Hi @Frostly: and thanks for joining the conversation. Well noted. I don't think this is nitpicking if it can improve the clarity of the policy, especially as the text is translated into various languages :). Thank you for your feedback. — Samuel (WMF) (talk) 12:33, 9 June 2023 (UTC)Reply

What are we going to achieve?

Latest comment: 1 year ago1 comment1 person in discussion

I see the general point of the proposed document. However, in my opinion it's going to cause more trouble than benefit.

Asking users to type a list of URLs that they trust is, in my opinion, a very bad idea. Why should a user trust that particular domain? It's very likely that it has only an API or some static main page. Even if the tool's code is open source, will the user spot any evident pitfall in the code? No, they won't even open the repo. If the gadget or userscript description seems useful, they will turn it on.

Ensuring that a script or gadget is safe is the job of the technical people in the community (that is, mainly script authors and interface admins). Among wiki users, only they know what can be considered appropriate for use on wiki. After all, that was the reason for forming the interface-admin group in 2018. And this policy should not push the responsibility onto unaware users but should rather provide those technical people with guidelines on what to do if they want to incorporate non-first-party resources and services in their scripts and gadgets.

There are actually three kinds of resources with different properties:

first-party – these include code and data on WMF wikis and are likely to be secure and pose no threat to privacy,
second-party – code submitted by users to WMCS, which is theoretically under WMF's control but is not pre-checked for vulnerabilities and threats,
third-party – services run completely independently from WMF where there's no control over the resources served and the processing that's done.

Is splitting those between first- and second-party really the best option?

Even without the use of second- and third-party services, malicious scripts and gadgets can leak data (for example by saving a session token or user agent to a wiki page). When such cases occur on-wiki, they may be discovered relatively easily (provided that it's not on Wikidata) and taken down – but, on the other hand, they have much greater visibility and likelihood of being archived. Privacy threats from other services will likely be less disruptive; however, they may go undetected for a longer time.

Gadgets and WMCS tools have much in common. Both are run by community members, using WMF infrastructure, where Foundation staff can perform administrative tasks. The main difference is that WMCS tools don't have to be open source, meaning that the community may not have control over the actual tool logic and the resources it uses. But WMCS offers several advantages that could improve gadgets and their users' experience in a way that wouldn't be possible with only the on-wiki code. For tools that are intended to serve from-wiki API requests, I'd set up a network policy on Kubernetes/VM that would restrict or block outgoing traffic to third-party services and/or anonymize the incoming traffic (like stripping User-Agent and other sensitive data) – this could let them be marked as "safe for gadgets", which could serve as a recommended way of using WMCS in gadgets. A similar scenario might be possible for third-party resources: a proxy on WMCS that'd ensure there are no sensitive fields in the request.
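The anonymizing step of such a proxy could look roughly like the Python sketch below, which drops identifying headers before a request is passed on. Only User-Agent is named in the comment above; the rest of the `SENSITIVE_HEADERS` set and the `sanitize_headers` function are assumptions for illustration, and a real deployment would also have to deal with the client IP address at the network level.

```python
# Hypothetical header scrubbing for a WMCS-hosted proxy.

# Assumed deny-list; header names are compared case-insensitively.
SENSITIVE_HEADERS = {
    "user-agent",
    "cookie",
    "authorization",
    "referer",
    "x-forwarded-for",
}


def sanitize_headers(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of `headers` without the sensitive entries."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in SENSITIVE_HEADERS
    }


incoming = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64)",
    "Cookie": "session=abc123",
    "Accept": "application/json",
}
print(sanitize_headers(incoming))  # {'Accept': 'application/json'}
```

A deny-list is the simpler choice for a sketch; an allow-list of headers known to be harmless would be the more conservative design.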

The general guidelines I'd consider are:

Make the tool code open source
If it's a gadget: Make it clear in the gadget description (visible in preferences) that the gadget uses external services and link to an on-wiki rationale why the code isn't solely hosted on wiki
Provide the same info at the top of the source code in the wiki language with a link to the said rationale
Provide a basic and up-to-date documentation on what data you process with the tool and a contact in case of a privacy threat (either on-wiki or on the tool's webpage)
List the tool on some global list (e.g. on metawiki) – for use of people that might inspect these tools
If restrictions in WMCS in- and outbound traffic are available, enable them or explain why you keep them disabled

Scripts that serve solely their original author (like Special:MyPage/common.js) might be exempt from those criteria because it's hard to protect users from themselves. Msz2001 (talk) 08:52, 12 June 2023 (UTC)Reply

CDN under WMF control

Latest comment: 1 year ago3 comments2 people in discussion

Within Required precautions at footnote level https://cdnjs.toolforge.org/ should be mentioned. This has been established exactly for that purpose.

Ceterum censeo: no offered software may connect to external websites.
If at all, this requires fully open source code and explicit opt-in agreement.

PerfektesChaos (talk) 16:18, 12 June 2023 (UTC)Reply

Hi PerfektesChaos and thanks for joining the conversation. I am making a note of your point that by default UserJS and Gadgets must not connect to an external website unless that website's code is open source and the UserJS or Gadget provides "explicit opt-in agreement". However, what you mentioned about the footnote is a bit unclear to me. Are you suggesting that https://cdnjs.toolforge.org/ be mentioned somewhere in the "Consider alternative scripts" sub-section? — Samuel (WMF) (talk) 13:34, 13 June 2023 (UTC)Reply

Yes, I do mean it should be mentioned in the policy. However, it is not really a policy, but an example of how it is possible to avoid contacting external domains when executing some tool or gadget business. Therefore it is not mandatory, just mentioned, at footnote level. If all tools and gadgets retrieve such resources from a WMF server, privacy is increased a bit. Greetings --PerfektesChaos (talk) 19:35, 13 June 2023 (UTC)Reply

Ah, thanks for the explanation, PerfektesChaos. Since it's a quite precise example of best practices, don't you think it would be best to have it within the page Recommendations for gadget developers? Btw, that page is referenced directly in the policy under the required precautions. — Samuel (WMF) (talk)

Toolforge

Latest comment: 1 year ago5 comments4 people in discussion

Are toolforge.org and wmcloud.org considered third-party resources for the purposes of this policy proposal? If they are, then this policy proposal may be too broad. We should think very carefully before we try to start forbidding users from loading scripts and resources that are essentially hosted within our ecosystem. –Novem Linguae (talk) 16:57, 12 June 2023 (UTC)Reply

They should be permitted, if all code is open and the presented sources are those in effect.

If code is hidden, I can write a logfile without the broader community noticing.
I can forward a message to an external site on every request, including details. They can record the events outside WMF control.
By comparing timestamps and details with public on-wiki activities, I can connect a user account with technical details at CU level, or in even more detail.

I can present a clean source at github etc. but the executed code is something different.

If all code of a task is public and under WMF control, community stuff at toolforge, wmcloud etc. is fine.

If hidden things may happen inside a toolforge resource this is as unsafe as a third party implementation and should be refused.

Greetings --PerfektesChaos (talk) 17:32, 12 June 2023 (UTC)Reply

As written, yes, they would be. This is a problem, as a number of useful scripts load data from Toolforge APIs. AntiCompositeNumber (talk) 02:32, 13 June 2023 (UTC)Reply

Yeap. Breaking a bunch of tools that use toolforge APIs is a huge problem, and makes me disinclined to support this change. –Novem Linguae (talk) 15:28, 13 June 2023 (UTC)Reply

@Novem Linguae: that's a valid concern. As mentioned here, I want to stress that exploring exemptions, including for WMCS-hosted tools, is one of the goals of the current discussion. So I appreciate what you shared, especially about the tool's code being public as a potential requirement for exemption. For WMCS-hosted tools, you raised an interesting point regarding code being public yet different from what is deployed at, say, an xyz.toolforge.org URL. To address that, the idea of keeping a list of tools and their repos on a central meta-wiki page to increase scrutiny was shared earlier. Do you have any opinion about this idea and the others shared by Msz2001 under this section? — Samuel (WMF) (talk) 19:57, 13 June 2023 (UTC)Reply

Ratification process

Latest comment: 1 year ago1 comment1 person in discussion

What will the ratification process for this proposed policy be? Is there some manager at WMF that makes the final decision, will there be a community vote, etc? –Novem Linguae (talk) 16:58, 12 June 2023 (UTC)Reply

Any evidence of a problem so far?

Latest comment: 1 year ago11 comments5 people in discussion

Have there been any specific privacy incidents that led to the creation of this proposal, such as a malicious tool that was discovered to be harvesting user data? In other words, is it trying to fix an actual, demonstrable problem? –Novem Linguae (talk) 05:52, 13 June 2023 (UTC)Reply

@Novem Linguae Security/privacy should be preemptive and not reactive imo. We probably shouldn't wait for something (anything) to happen. As it stands, we have no CSP, which means that a compromised interface administrator or a malicious script developer could deanonymize anyone with a few targeted lines of JS/CSS (which unfortunately are fairly easy to write). The third-party scripts policy (and the associated CSP changes, which I assume will be implemented) should help in making it harder for such an event to ever happen (at all).

Note: I do not agree with the changes in their entirety especially the exclusion of toolforge/wmcloud domains and localhost etc, but opposing/questioning the validity of this on the grounds that "nothing has happened yet" is a somewhat circular and dangerous argument. Sohom (talk) 15:08, 13 June 2023 (UTC)Reply

The whole user script system seems like it would be a security nightmare on the surface, but the number of actual incidents (counter-intuitively) seems very small (I've never heard of anything). These things are tradeoffs: usefulness of tools vs risk of problems. Since I love tools and they make me so much more efficient, I'd really like to see demonstrable problems before we go proactively interfering with them. –Novem Linguae (talk) 15:34, 13 June 2023 (UTC)Reply

@Novem Linguae there have been some instances; the last I recall was with a compromised admin account, before the int-admin split and the 2FA requirement for int-admins. — xaosflux ^Talk 15:16, 13 June 2023 (UTC)Reply

Thanks for that background. I wonder if there were any incidents involving third party tools breaking our privacy policy, such as a toolforge tool getting caught with code to log IPs. While admin account compromises are serious, I speculate this policy would not be able to fix those. –Novem Linguae (talk) 15:26, 13 June 2023 (UTC)Reply

@Novem Linguae I'm not sure about malicious ones, but external sites simply have different policies; for example on the English Wikipedia there is a gadget that sends information to Google Translate. It is not a default gadget, and it is labeled as (this gadget uses) code from external (third party) systems not subject to the WMF Privacy Policy in the opt-in section. With external links, even if they had a good privacy policy when added, they could change their policies or practices whenever they wanted to. — xaosflux ^Talk 16:03, 13 June 2023 (UTC)Reply

@Novem Linguae: thanks for joining the discussion. Yes, on multiple occasions, UserJS and Gadgets loading external resources have resulted in security incidents (accounts of privileged users being compromised) and privacy issues (UserJS and gadgets sharing user information with external parties). While security incidents such as the one mentioned here are not detailed publicly for obvious reasons, you can get an idea of the external parties collecting user information through UserJS and Gadgets here. Also, I'll second what Xaosflux mentioned earlier, especially about the importance of preventing security and privacy risks, rather than trying to address them once they have already caused damage. More broadly, while it can seem far-fetched, it's good to keep in mind that privacy violations often mean real-life consequences for some individuals — harassment, identity theft, physical harm. So yes, the policy is trying to address an existing problem. While I agree with you that the policy alone may not fix that problem completely, by formalizing how TPR must be used in UserJS and Gadgets and promoting best-practices, we will end up in a much safer place than where we're right now — Samuel (WMF) (talk) 19:28, 13 June 2023 (UTC)Reply

@Samuel (WMF) while we have some different ideas where to start on scope, I think that Site scripts should absolutely be in scope for TPR. — xaosflux ^Talk 19:31, 13 June 2023 (UTC)Reply

Hey Xaosflux, I am sure you can tell that one of the challenges with this policy is avoiding scope creep. I mean, just look at the range of complexities that come with regulating just the relationship of UserJS and Gadgets with TPR. More seriously, TPR use in the Wikimedia ecosystem is a wide domain, so it's crucial to tackle some specific areas rather than going for a one-size-fits-all approach. So, I hear your point, but I am curious to know your rationale for expanding the scope to include Site Scripts. Are there particular reasons for that expansion? Thanks — Samuel (WMF) (talk) 23:02, 13 June 2023 (UTC)Reply

@Samuel (WMF) so to be clear, the intent is to have this apply to every personal user script and to every gadget, but it can just be ignored project-wide via MediaWiki:Common.js, MediaWiki:Vector.js or the like? That seems like a much more important place to enforce something like this, as it is force-loaded for every user. — xaosflux ^Talk 23:20, 13 June 2023 (UTC)Reply

There is a hot war in Europe, a Wikipedia editor was put in jail for Wikipedia editing, people may be sentenced to death in some countries when their opinion is not shared by government.

Not only commercial data collectors, but some countries with direct connection of telecommunication providers and the police are active. If they know the IP address used for an edit a few minutes later they can tell you your bank account and the location of your flat.

We should not wait until people in Wiki environment are looking into gun muzzles to start improving security of our tools.

Apparently close to river Moskva some hundred excellent technicians are developing tools for various purposes.

Greetings --PerfektesChaos (talk) 19:54, 13 June 2023 (UTC)Reply

@@ Line 86: / Line 86: @@
 ::::::::::I don't have access to T296855, but the publicly available data is: There was a user (3 actually) who were banned for something related to T296855. One of those users created a user-space page, they then sent a message to an admin asking for advice on that page. 40 minutes later the admin's account is taken over and edits common.js. Over the next several hours, the user's subpage is deleted for having "Potentially dangerous code", the user is blocked for "Attempting to hijack other people's accounts", and one of the admin's local user scripts that they use, which previously hasn't been touched in months, is rewritten out of the blue, which happens to fix an XSS vulnerability in that script. I don't have all the details, but this seems to paint a pretty clear picture of a situation that would not be fixed by banning third party resources. [[User:Bawolff|Bawolff]] ([[User talk:Bawolff|talk]]) 21:54, 13 June 2023 (UTC)
 ::::::::::Regarding T296855, step 2 within the task description very clearly indicates how a third-party resource (hosted at github.io) was used as part of that attack.  That really isn't up for debate IMO.  One could potentially argue that it wasn't the _primary_ method used in the attack, but to argue that it wasn't a key means of obfuscating a payload incidental in the attack is incorrect IMO.  And I personally believe it is dubious to suggest that simply because an attack _could possibly_ be carried out via other means, that attempting to better contain certain avenues for abuse - which we've literally witnessed as methods within real-world attacks against the projects - isn't worthwhile.  @[[User:Bawolff|Bawolff]] - I would imagine the events of the flea incident from several years ago are still somewhat fresh in your mind, and many of those attacks involved the usage of external resources to obfuscate certain attack vectors, so there are indeed other real-world examples. [[User:SBassett (WMF)|SBassett (WMF)]] ([[User talk:SBassett (WMF)|talk]]) 01:35, 14 June 2023 (UTC)
-:::::::::::I agree that previous attackers have utilized externally hosted resources for their second stage command & control infrastructure. However i'm not aware of any that have used that as their way in. I'm also mindful that the industry in general has seen an uptick in supply chain attacks (solarwinds, the npm ecosystem issues come to mind). However, i don't think as written this policy does a good job of preventing those previous attacks wikimedia has experienced. They weren't the way in but things people did after they took control. By definition, malicious people are malicious, and don't follow policies. I don't see why they would follow this one. Now, perhaps the plan for this policy is to be a lead in to restrictive CSP. Which if implemented properly would reduce risk of external command & control. If so, then i think that should be talked about more in the policy. Ultimately i don't think these are the most compelling motivating examples - once the attacker is in, making their life a tad bit more difficult, well nice if free, doesn't quite seem worth all this. At the very least it seems like a much better first thing to focus on for gadget security would "unsafe-inline" removal. As far as third-party resources go, I personally find arguments related to privacy or even supply chain risk much more compelling as it directly impacts those risks. p.s. i should clarify that i'm not opposed to this policy in general, i just think the phrasing of it is so ambigious that there is no shared understanding of what it all means. [[User:Bawolff|Bawolff]] ([[User talk:Bawolff|talk]]) 04:02, 14 June 2023 (UTC)
+:::::::::::I agree that previous attackers have utilized externally hosted resources for their second stage command & control infrastructure. However i'm not aware of any that have used that as their way in. I'm also mindful that the industry in general has seen an uptick in supply chain attacks (solarwinds, the npm ecosystem issues come to mind). However, i don't think as written this policy does a good job of preventing those previous attacks wikimedia has experienced. They weren't the way in but things people did after they took control. By definition, malicious people are malicious, and don't follow policies. I don't see why they would follow this one. Now, perhaps the plan for this policy is to be a lead in to restrictive CSP. Which if implemented properly would reduce risk of external command & control. If so, then i think that should be talked about more in the policy (Just look at how many words have been spent over whether this policy should apply to user scripts, site js, MW extensions, WMCS etc. If the real goal here is to enforce technical measures to prevent external communication, then that settles all of those issues). Ultimately i don't think the aforementioned attacks are the most compelling motivating examples - once the attacker is in, making their life a tad bit more difficult, well nice if free, doesn't quite seem worth all this. At the very least it seems like a much better first thing to focus on for gadget security would "unsafe-inline" removal. As far as third-party resources go, I personally find arguments related to privacy or even supply chain risk much more compelling as it directly impacts those risks. p.s. i should clarify that I'm not opposed to this policy in general, i just think the phrasing of it is so ambigious that there is no shared understanding of what it all means. [[User:Bawolff|Bawolff]] ([[User talk:Bawolff|talk]]) 04:02, 14 June 2023 (UTC)
 :::::::{{ping|MisterSynergy}} Yes, the [[Third-party resources policy#Scope|current scope]] of the policy covers all UserJS and gadgets that connect to non-production websites, including WMCS-hosted tools since those have traditionally not been considered part of “production”. However, I [[Special:Diff/25124490|acknowledge the many concerns raised]] about forbidding all external resources and the burden it would place on the community. This is why I am calling for ideas about exemptions. For WMCS specifically, the point is specifying under what conditions we could allow gadgets and user scripts to connect to WMCS-hosted resources. What would those conditions (aka, exemptions criteria) be? Do those criteria provide users of Gadgets and UserJS with adequate security and privacy safeguards? As said earlier, the aim with those questions is to identify exemptions that could become part of the policy. — [[User:Samuel (WMF)|Samuel (WMF)]] ([[User talk:Samuel (WMF)|talk]]) 18:44, 13 June 2023 (UTC)
 ::::::::IMO the special situation about WMCS is that things over there ''are'' pretty transparent. It is a hosting platform where we (WMF staff as well as more than 2000 volunteer tool developers) can see which data is collected and how it is being processed. The ToU of WMCS that restricts data collection is also under our control. This is a fundamental difference to actual third-party resources outside Wikimedia servers where we really have no clue what is going on.<br>What matters most in my opinion is that the tool sources at WMCS are readable for other tool developers. This enables a large group of users to access tool sources for a review. By default this is the case (at Toolforge), but tool developers can restrict read access to themselves using chmod if they want to. However, except from secret files, logfiles, and possibly a few more cases, this should really be discouraged and possibly forbidden if the tool is accessed by an onwiki script/gadget. Source control is nice to have, but since there is no guarantee that the actual tool is using the exact sources from the repo, I am not sure whether this is an actual criterium here. —[[User:MisterSynergy|MisterSynergy]] ([[User talk:MisterSynergy|talk]]) 19:20, 13 June 2023 (UTC)