Talk:Third-party resources policy: Difference between revisions

From Meta, a Wikimedia project coordination wiki
re bawolff: risk tiering of WMCS resources

Revision as of 18:21, 8 June 2023

05 June 2023: Start of the policy conversation

Hello, feedback regarding the Third-party resources policy is welcome below this message. Feel free to use the initial questions below as a starting point for the conversation or bring your own. Thank you!

On behalf of the Wikimedia Foundation’s Security team — Samuel (WMF) (talk) 00:50, 5 June 2023 (UTC)

Are risks sufficiently explained and relevant?

This section heading was created by Samuel (WMF) at 00:50, 5 June 2023 (UTC).

A security-wise less educated user can probably not understand the scope of the risks from these explanations alone (it is e.g. not obvious what you can do through harvested cookies and session tokens). I suggest adding links to somewhere where the concepts can be discussed a bit more in depth (perhaps a tailored set of security pages). It could also be worth pointing out that anything the owner of the script can do can be done by an attacker who succeeds in bypassing their defences – a benevolent owner may still use naïvely coded building blocks for their own scripts or otherwise neglect security.

The section on user privacy and safety does not differentiate between what a tool does and what it could do when interacting with a third-party resource. I think the section should concentrate on information that leaks regardless on how the tool is coded, as that's the (implicit) rationale for the proposal. Other leakage can be mentioned, but as this may include anything that the third party could do on a direct connection, it doesn't need to be thoroughly covered here. The latter theme could be covered on a page on WWW safety.

LPfi (talk) 14:25, 5 June 2023 (UTC)

Hi @LPfi: and thanks for these suggestions. I am taking good note of your point about including explanatory resources for less security-savvy users and illustrating how code used "naïvely" can lead to security issues. On the other hand, the portion about leakage is not that clear to me. When you suggest that the section should emphasize "information that leaks regardless on how the tool is coded", do you mean that the content should describe what personal information could be leaked or are you suggesting something else? — Samuel (WMF) (talk) 17:41, 5 June 2023 (UTC)
My point is that the gadget forwarding session tokens, user names and the like is likely pure bad programming, while you cannot avoid sharing your IP address, operating system etc. if the gadget makes your browser connect to a third-party resource (previously stored third-party cookies are also an issue, I don't know how those typically are or could be handled). The claim that "scripts connecting to third-party resources may also share [...] the device they are using, their browser information, and location" seems wrong. Scripts run in one's browser usually cannot connect to the third-party resource other than directly, which means these titbits are shared automatically, regardless of how the script is coded (I assume also one's location is automatically shared through the IP in most cases).
   Now, those facts cleared, what are the consequences? Is the device fingerprint enough to pair you with a user account on Google or Facebook? Does Google get that fingerprint through common designs of such resource interfaces? Does that mean that Google can (and does?) get to know your Wikimedia username by comparing your interaction with third-party resources to actions at Wikimedia sites? More generally: what should the user be afraid of when using such third-party resources (knowingly or unknowingly)? (Few sites have enough info on their own that leaked IP and device fingerprint would be an issue.)
LPfi (talk) 08:41, 6 June 2023 (UTC)
@LPfi: many thanks for taking the time to illustrate your previous comment. I think those are good ideas for making that section of the policy much easier to understand and more informative. — Samuel (WMF) (talk) 20:44, 6 June 2023 (UTC)

Do the definitions and required precautions make sense?

This section heading was created by Samuel (WMF) at 00:50, 5 June 2023 (UTC).

It seems the proposed policy would require all interaction with third-party resources to be done through interfaces offered by WMF production servers. For it not to cause disruption, these need to be offered. My understanding is that OSM tiles are nowadays fetched and cached by these. Are there other resources needing similar treatment?

For the restrictions to make sense, WMF needs to guarantee privacy when using production servers or possible external resources needed e.g. for participating in elections and surveys. I am afraid WMF shouldn't trust third parties in such matters, but instead guarantee that no sensitive information is sent to third parties.

LPfi (talk) 14:25, 5 June 2023 (UTC)

@LPfi: I hear your point and I agree that, generally speaking, safeguards should be in place to ensure the privacy of users. Regarding OSM, I'd like to note that the scope of this policy is purposely limited to User Scripts and Gadgets that load third-party scripts. Unless I am wrong, OpenStreetMap does not fall within those categories. — Samuel (WMF) (talk) 19:09, 5 June 2023 (UTC)
Yes. On OSM and the like: if there is a valuable resource out there, which is not available through official means, there is a big temptation to use it through gadgets and user scripts, which may be used by many and become a de facto standard (such as many of the Toolserver tools). If those tools are disallowed, there needs to be a mechanism to include access to the resources or a substitute through official means. I don't know what third-party resources are in such use, but this proposal suggests there is a non-trivial amount of such scripts and gadgets – unless the third-party resources are used just out of sloppiness.
Loading third-party scripts, fonts and stylesheets is a different issue. I detest that the practice is common on the web, and I have often argued that using free equivalents hosted on the web site itself should be the norm. If there is a real need for such resources, effort should be put in replacing them, probably in cooperation with the free software/open source movement in any non-trivial cases.
LPfi (talk) 09:32, 6 June 2023 (UTC)
@LPfi: echoing some observations made on Phabricator, external resources that are most frequently loaded in User Scripts and Gadgets are translation-related resources (including Google Translate and Yandex), fonts, and a variety of WMCS-hosted applications. —Samuel (WMF) (talk) 17:15, 6 June 2023 (UTC)

How do you think the policy should be enforced?

This section heading was created by Samuel (WMF) at 00:50, 5 June 2023 (UTC).

I have no idea how to enforce it, because it's not clear what this proposal tries to achieve. Judging from the discussions below, most people are confused about it as well, so this proposal won't be very effective if adopted as is. Since forever, stewards and global interface admins have been removing on sight any occurrence of code which would transfer our users' IP address to non-WMF servers in violation of their privacy.

By the way, if the Wikimedia Foundation is suddenly very interested in doing something to defend Wikimedia users' privacy from a growing influence of third-party software, it could start by stopping its reliance on proprietary software and SaaSS, such as GAFAM's and Interpol's mass surveillance/upload filtering software. Nemo 17:44, 7 June 2023 (UTC)

When would it make sense to start enforcing the policy?

This section heading was created by Samuel (WMF) at 00:50, 5 June 2023 (UTC).
  • What would that even mean? If you delete the fluff, all this policy says is "Gadgets and user scripts should not load third-party resources". There are no requirements. Maybe this is just a language misunderstanding and you don't mean "should" in the RFC 2119 sense, but regardless it doesn't seem like there is anything to enforce here. Bawolff (talk) 16:39, 5 June 2023 (UTC)
  • @Bawolff: "should" here is meant as a requirement rather than a suggestion. So, I understand how this may be confusing. As asked a bit below, I'm definitely open to considering language tweaks to make it sound more forceful. With respect to enforcement, there are at least a few things worth considering. How do we make sure all the gadgets and user scripts currently involved in privacy issues comply with the policy requirements? What would enforcement look like (e.g. page blanking, CSP automatic block, etc.)? Should there be a grace period before enforcement? Should enforcement be done in a phased way, with the most critical gadgets and scripts having a longer grace period, etc.? Those are some questions that could be explored. As a side note, I don't believe that referring to most of the draft policy as "fluff" brings any improvement to the content. Instead, it is just dismissive of the efforts put into proposing something to be improved :). — Samuel (WMF) (talk) 19:37, 5 June 2023 (UTC)
    I agree that explaining risks and suggesting best practices is worthwhile. As the actual policy is already determined through the terms of use, one could indeed see all this as fluff, but I don't think we should nitpick on the status of this document.
     For the timing, I think the first thing to do is to check what scripts and gadgets there are in widespread use. Is there any sense in them not being MediaWiki features instead (Commons has a big problem in essential features whose maintainers would like to retire)? Does the WMF have the resources to take responsibility for their maintenance, and in what timeframe can the code be reviewed and any offending parts be fixed? Truly personal ones are probably less of a problem, even when they might compromise the privacy of their author – it is difficult to protect people from themselves.
     The scripts and gadgets that this policy should target are the ones that spread like DOS/Windows viruses: users sharing a sloppily written piece of code that nobody cares (or knows how) to check. In Windows this is a huge social problem rather than a technological one – best practices aren't socially acceptable. Here I hope the best practices are or can be made part of the culture.
     –LPfi (talk) 09:58, 6 June 2023 (UTC)
    @LPfi: I am following up here on my earlier Special:Diff/25119150 where I shared that fonts and translation utilities are among the external resources that are loaded the most in User Scripts and Gadgets. There are also smaller categories like Facebook Connect, Google Analytics, etc. and some WMCS-hosted tools (see tables). For most of those resources, there are reasons why some of them are not currently part of MediaWiki core (if this is what you meant by MediaWiki features). For example, fonts have been associated with performance issues in the past (Phab:T166138#7223384), while Facebook Connect and Google Analytics are associated with user tracking and would be incompatible with Wikimedia policies. Of course, not all external resources fall within those top categories, in particular WMCS-hosted resources. In that sense, what you mentioned about enforcing the policy by focusing first on the most harmful external resources sounds quite interesting. In line with that, would it make sense to have some criteria for establishing what makes an external resource more harmful than another? — Samuel (WMF) (talk) 17:58, 6 June 2023 (UTC)
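As an illustration of the "CSP automatic block" and "grace period" options raised in this thread, a phased rollout could be sketched as below. This is only a sketch under stated assumptions: the function name `csp_header` and the allowed host `*.wikimedia.org` are hypothetical and do not describe the configuration actually used on Wikimedia sites. It relies only on the two standard header names, `Content-Security-Policy-Report-Only` (violations are reported but nothing is blocked) and `Content-Security-Policy` (violations are blocked).

```python
def csp_header(allowed_hosts: list[str], enforce: bool) -> tuple[str, str]:
    """Build a (header name, header value) pair for a simple CSP.

    enforce=False corresponds to a grace period: browsers only report
    violations. enforce=True corresponds to automatic blocking.
    """
    name = (
        "Content-Security-Policy"
        if enforce
        else "Content-Security-Policy-Report-Only"
    )
    # 'self' allows the page's own origin; allowed_hosts is a
    # hypothetical allow-list chosen for illustration only.
    sources = " ".join(["'self'", *allowed_hosts])
    return name, f"default-src {sources}"


# Grace period: report only.
print(csp_header(["*.wikimedia.org"], enforce=False))
# ('Content-Security-Policy-Report-Only', "default-src 'self' *.wikimedia.org")

# Enforcement: block anything outside the allow-list.
print(csp_header(["*.wikimedia.org"], enforce=True))
# ('Content-Security-Policy', "default-src 'self' *.wikimedia.org")
```

Switching from the first header to the second is what would turn observed violations into blocked requests, which is why the length of the grace period matters for widely used gadgets.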

Should WMCS-hosted resources be considered third-party resources?

This section heading was created by Samuel (WMF) at 00:50, 5 June 2023 (UTC).

I assume that some of the resources are essential tools for part of the community, while coding tools isn't restricted to highly trusted users. For the proposed policy to make sense, tools that aren't scrutinised and controlled by trusted users must be regarded as third-party ones, while regarding all the tools as untrusted would cause severe disruption. –LPfi (talk) 14:25, 5 June 2023 (UTC)

I think you need to draw a distinction between tools that are on a separate website (that you choose to go to) and tools that get embedded into some sort of user script. The former is 99% of tools and the privacy analysis for the two situations is quite different. Bawolff (talk) 17:19, 5 June 2023 (UTC)
True. There is a big difference. The problem is where there is no real choice, as you need to use the tools to get things done, which means that the editors who use them aren't protected by the policy. That exposure should be minimised, and the true extent of exposure you get by using a tool should be made clear. If I make a Google search, I know Alphabet will use the info. When I search for a book on the library website, I know I tell the library, but I don't like them having me load scripts from Google just because they couldn't be bothered to use a free platform. –LPfi (talk) 10:08, 6 June 2023 (UTC)
@Bawolff and LPfi: I agree that the distinction is important here. Tools that are on a separate website but not embedded in gadgets or user scripts are outside the scope of this policy. For external resources that are embedded into gadgets and user scripts, the "scrutinised and controlled by trusted users" criterion seems like an interesting point (others will be needed as well) to identify resources that are not too risky and hence potentially policed differently. That being said, I am not aware of any process for actively monitoring user scripts and gadgets. Maybe Special:Gadgets? — Samuel (WMF) (talk) 19:49, 6 June 2023 (UTC)
Disallowing all WMCS tools would kill innovation. There are things that cannot be coded directly in gadgets (e.g. those that require access to the Wiki Replicas or can be released only under a free license incompatible with CC BY-SA), or are much harder to do in gadgets (e.g. those that need libraries that are written in a language different from JavaScript). Getting these to be extensions is probably prohibitively hard in most cases. Maybe there could be a site where only “trusted users” (e.g. all local interface admins and global interface editors, as well as users considered trusted who don’t happen to have one of these two rights) can put code, but in a quick self-service manner, similarly to how gadgets and Toolforge tools work. Or instead of relying on people being trusted, it could be behind a WMF-controlled proxy that strips all data that could breach users’ privacy (e.g. don’t include the X-Forwarded-For header, don’t forward the User-Agent header and cookies). (Being a new site, maybe it could be required that all code MUST be on gitlab.wikimedia.org to provide an edit history similar to wiki pages, but with code review being only an OPTIONAL part of the process.) —Tacsipacsi (talk) 10:05, 7 June 2023 (UTC)
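The proxy idea in the comment above can be sketched in a few lines. This is a minimal illustration only, assuming a hypothetical helper `strip_identifying_headers` and a hypothetical deny-list made of the three headers named in the comment; it does not describe any existing WMF proxy.

```python
# Hypothetical deny-list: the request headers named in the comment above.
# HTTP header names are case-insensitive, so they are compared in lowercase.
IDENTIFYING_HEADERS = {"x-forwarded-for", "user-agent", "cookie"}


def strip_identifying_headers(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of the request headers without the identifying ones."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in IDENTIFYING_HEADERS
    }


incoming = {
    "Accept": "application/json",
    "User-Agent": "ExampleBrowser/1.0",
    "Cookie": "session=abc123",
    "X-Forwarded-For": "203.0.113.7",
}
forwarded = strip_identifying_headers(incoming)
print(forwarded)  # {'Accept': 'application/json'}
```

Because the tool would then see the proxy's address instead of the user's, the IP address is hidden as a side effect. A real proxy would likely need to review further headers (for example Referer) and the request body as well; the deny-list here is deliberately limited to what the comment mentions.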

There is wikitech:Wikitech:Cloud Services Terms of use which imposes restrictions on what tool developers are allowed to do on WMCS. Has this been taken into account? —MisterSynergy (talk) 23:31, 7 June 2023 (UTC)

Hi @MisterSynergy: and thanks for joining this conversation. Although wikitech:Wikitech:Cloud Services Terms of use is a distinct policy with its own scope, its context was taken into account as part of the work on this TPR policy. — Samuel (WMF) (talk) 23:50, 7 June 2023 (UTC)
Okay thanks.
I am still trying to figure out why WMCS is considered to be a third-party resource in this context. If WMCS is used properly, minimal to no PII is being exposed to tools hosted on WMCS; furthermore, WMF clearly has control over that platform and can, at least theoretically, monitor what tool developers are doing over there.
It seems difficult to draw a line that separates harmful from acceptable usage of WMCS via onwiki JS scripts, particularly as long as it is not clear what the actual problem with WMCS is. Given how important this platform is for editing Wikimedia projects, there should definitely be a path to make use of it. —MisterSynergy (talk) 08:14, 8 June 2023 (UTC)

Examples of what this will end

It would be good to clarify what features this policy will end. For example, we have the option for relief maps on Wikivoyage, which notifies users that activating it will share details with an external service.[1] I imagine this policy would end that? These relief maps are excellent, especially for hiking.

Additionally, we have a great deal of functionality on the wmcloud which could then become less usable, which IMO would be unfortunate. Doc James (talk · contribs · email) 10:57, 5 June 2023 (UTC)

Yes, I'm confused about how OpenStreetMap falls into the "Risks" category. SHB2000 (talk | contribs) 11:48, 5 June 2023 (UTC)
Hello @Doc James and SHB2000: and thanks for joining the conversation. The scope of the policy is purposely limited to User Scripts and Gadgets that load third-party scripts. It is my understanding that the Wikivoyage and OpenStreetMap examples you mentioned do not fall within those categories but I am glad to be proven wrong. That being said, it's worth noting that gadgets and scripts sharing user information with third parties has always been a practice in conflict with both the Terms of Use and Privacy Policy (see Purpose section of the draft policy).
With respect to Wikimedia Cloud Services-hosted resources, you are raising a valid concern. Recent data-driven observations have surfaced that many of those resources conflict with the Wikimedia Privacy policy and Terms of use but are also helpful to various editing activities. This is why this policy conversation also explores the question of whether some WMCS-hosted resources should still be considered "third-parties" (see WMCS section above). The aim there is to evaluate avenues to enable some of those important editing-related resources while providing decent security and privacy guarantees to end-users. Any thoughts you may have in that direction are obviously welcome.
Samuel (WMF) (talk) 15:20, 5 June 2023 (UTC)
I think it is a mistake to only include user scripts and gadgets here (presumably you are also including common.js, which is neither a user script nor a gadget). "Rules for thee but not for me" type policies tend to trigger resentment, and the privacy concerns don't change depending on who is writing the code. Bawolff (talk) 16:27, 5 June 2023 (UTC)
@Samuel (WMF) I read the tickets and the collected statistics do not seem to differentiate between site resources, gadgets and user scripts. Notably MediaWiki:Kartographer.js used on Wikivoyage exactly as described, seems to be included in this collected dataset. —TheDJ (talk • contribs) 09:02, 6 June 2023 (UTC)
Hi @TheDJ: as part of the methodology used to gather that data, gadgets were looked up within the User and MediaWiki namespaces. Due to the multilingual nature of Wikimedia projects, filtering by namespace and looking up the page content were the most functional approaches found. This can explain why some noise may show up in the data. Also, the point was to highlight the external resources involved in top recurrent CSP violations that involve gadgets and user scripts. So, a certain level of noise was acceptable. Would you have any tips for differentiating between those types of scripts? — Samuel (WMF) (talk) 16:33, 6 June 2023 (UTC)
Exactly what passage of the Privacy policy or the Terms of use do gadgets and scripts conflict with if they contact third-party services, but only with explicit user consent? —Tacsipacsi (talk) 10:09, 7 June 2023 (UTC)
Hi @Tacsipacsi: and thanks for joining the conversation. When you talk about gadgets and scripts that "contact third-party services, but only with explicit user consent", are you referring to user scripts and gadgets that display a type of privacy notice to users? — Samuel (WMF) (talk) 14:13, 7 June 2023 (UTC)
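On the question raised earlier in this thread about telling site resources, gadgets and user scripts apart, one possible heuristic is to classify pages by title. The sketch below is an assumption-laden illustration, not the methodology used for the collected data: the function name `classify_script` and the category labels are hypothetical, and the patterns only cover the canonical English namespace names, so localized namespace aliases would need extra handling.

```python
import re

# Assumption: canonical English namespace names only.
# Gadget code pages follow the "MediaWiki:Gadget-<name>.js" convention.
GADGET = re.compile(r"^MediaWiki:Gadget-.+\.js$")
# Any other .js page in the MediaWiki namespace is treated as a site script.
SITE_SCRIPT = re.compile(r"^MediaWiki:(?!Gadget-).+\.js$")
# User scripts live on subpages of a user page, e.g. "User:Example/common.js".
USER_SCRIPT = re.compile(r"^User:[^/]+/.+\.js$")


def classify_script(title: str) -> str:
    """Classify a page title as 'gadget', 'site script', 'user script' or 'other'."""
    if GADGET.match(title):
        return "gadget"
    if SITE_SCRIPT.match(title):
        return "site script"
    if USER_SCRIPT.match(title):
        return "user script"
    return "other"


for title in (
    "MediaWiki:Kartographer.js",
    "MediaWiki:Gadget-Example.js",
    "User:Example/common.js",
    "Talk:Third-party resources policy",
):
    print(title, "->", classify_script(title))
```

Under these patterns, MediaWiki:Kartographer.js would be counted as a site script rather than a gadget. The heuristic is approximate: a title says nothing about whether a page is actually loaded for all users, and scripts can load other pages at run time.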

This is a step backwards on our strategic goals

I'm concerned about this proposal, and I have some reasons. Graphs have been broken for more than a month now, and it doesn't seem that they are going to recover, so we are losing opportunities to show things in a more interesting way. Some efforts we have made to show Ourworldindata maps and graphs are also stopped at "security review". Some years ago, we launched (and invested a lot of money in) a system that would read the articles in some languages, but it never succeeded for the same reason.

And I would accept that having third-party resources is not the best option if we had a product team working on those features. But this isn't happening and it doesn't seem it will happen in the future. Some years ago we were obsolete. Now we are paleolithic, a relic for archaeologists of the Internet. Meanwhile, other platforms are advancing by the day on new features. Every day we are further from our 2030 strategic goal. Theklan (talk) 11:18, 5 June 2023 (UTC)

I don't know what the article reading thing was, but security was hardly the only reason ourworldindata didn't happen, just the first hurdle it had to face. Besides technical concerns, the elephant in the room is that it was very unclear if enough users wanted it to make it worth it. Bawolff (talk) 17:00, 5 June 2023 (UTC)
Let alone who was going to do the amount of work to make it viable within the WMF setting (DDoS strengthening, privacy isolation, security, Parsoid and VE support, cache consistency and purging). We are further away from 2030 goals, but this is not something unexpected. I and many others have been warning from the start of that whole process that we were already overstretched and that implementing the 2030 goals would require at least tripling the tech department, if not more. This was ignored for a long time and we have had several crises with code maintenance as of late that clearly surfaced how fragile many elements were. —TheDJ (talk • contribs) 19:41, 5 June 2023 (UTC)
@Theklan: thanks for voicing your concerns through this discussion. However, OWID review and the Security team's workflow more generally are outside the scope of this policy. — Samuel (WMF) (talk) 20:14, 6 June 2023 (UTC)
The problem is that, as TheDJ points out, most of the things we need are not developed. If third-party content is not possible and, at the same time, we are not creating those things, the obsolescence will only grow. Theklan (talk) 20:42, 6 June 2023 (UTC)
@Theklan: this is a valid concern. For the specific context of user scripts and gadgets, there is an exploratory reflection on if and when some WMCS-hosted resources could be treated differently than the other third-party resources. If you have some ideas in that area, please share under Should WMCS-hosted resources be considered third-party resources?. — Samuel (WMF) (talk) 20:58, 6 June 2023 (UTC)
There should be a Product department just making the product better. The question is not what we consider a third-party resource; it is how we get first-hand resources for our needs. Theklan (talk) 21:01, 6 June 2023 (UTC)
Ok, I'm going to be frank here.... Other than Sam, most of the security team knows jack shit about Wikipedia and the people who built it. Most of the issues this proposal deals with would not be an issue if the foundation had invested properly in tools and content that people had been asking for. Instead now we DEPEND on the external stuff and you are taking it away. This is not a technology or security issue you now have, but a social issue that you need to negotiate. Saying "sure but that's not our problem, it's another team" is NOT the appropriate response. —TheDJ (talk • contribs) 08:04, 7 June 2023 (UTC)
To be clear, by "Sam" are you referring to Reedy? Nardog (talk) 09:26, 7 June 2023 (UTC)
Yes. --MZMcBride (talk) 16:16, 7 June 2023 (UTC)

This doesn't read like a policy

Policies are rules that people are obliged to follow or operating procedures. This just has some suggestions and best practises. If the document has no rules it is a guideline not a policy. I think this makes the intended purpose of this document confused.

Other comments:

  • I think the way this is using the term third party is confusing and misleading. The word third party makes people view this through the wrong lens. Often in other documents that refers to the authorship of the resource, not the hosting provider (consider also external resources that are proxied or locally made resources hosted on GitHub). I would suggest using the word "externally hosted" instead.
    • Additionally, this desperately needs to clarify where non-production resources (Toolforge) fall (and any other things hosted by WMF that are not covered by the main privacy policy). Normally these would be considered external since the normal privacy policy doesn't apply.
    • It should maybe clarify if WMF can use such resources if contractual obligations are in place to ensure the privacy policy is followed.
  • It should clarify if the user is allowed to consent to such services (say if enabling a gadget) and under what circumstances consent is allowed as an escape hatch (my 2 cents is it should be acceptable in optionally enabled gadgets/user scripts but not in anything globally enabled. But there is already an exception at Wikivoyage).
  • I would suggest replacing the term end-user with user. End-user sounds a bit dehumanizing, but maybe that is just me.
  • This should probably emphasize not just explicit personal info but incidental collection, e.g. when the external resource doesn't collect anything but their hosting provider logs IP addresses, etc.
  • Re XSS: this probably underplays the risks of XSS a bit. However, at the same time it is way too close to implying that XSS is the only privacy risk here, which I think is misleading.
  • I think the "consider alternatives" section should be removed. First, I don't think it fits with this document; how to make a user script belongs elsewhere. Second, it sort of implies that once you have considered alternatives you can use third-party resources if no first-party stuff does what you need. Third, realistically it is going to be pretty rare that there is a local resource doing the exact same thing, especially when it comes to proprietary APIs.

Bawolff (talk) 16:23, 5 June 2023 (UTC)

I'd like to second the first point. In my reading, despite being labelled a "policy", this document does not require anyone to do anything, it only offers suggestions. It needs to be rephrased or renamed, depending on what was intended, before we can really discuss its contents. Matma Rex (talk) 18:29, 5 June 2023 (UTC)
Hi @Bawolff and Matma Rex: and thanks for joining the discussion. I'd like to address some of the points shared while continuing to give the others some thought.
  • While I get the part about the text not reading like a policy, it is my understanding that the Required precautions section contains firm requirements such as "Gadgets and user scripts should not load third-party resources". Do you have any suggestions to tweak the language in a more assertive way perhaps?
  • I am not sure I get the portion about underplaying XSS. Do you think those risks should be emphasized, explained differently?
  • The "third-party" term was chosen to align more with the terminology used across the industry, especially within the Third-party risk management ecosystem.
  • WMCS-hosted resources are currently considered third-party resources. That being said, recent data-driven observations make it worth asking if and when some of those resources should be considered differently, hence the open questions at the top about WMCS.
Samuel (WMF) (talk) 18:55, 5 June 2023 (UTC)Reply
I suggest "must" instead of "should" and "could" if these are indeed requirements. Matma Rex (talk) 19:24, 5 June 2023 (UTC)Reply
Yeah, "avoid" is weak. It's a recommendation. It's a form of feedback from the Security team to the community, not a matter that requires feedback in return. ToBeFree (talk) 19:56, 5 June 2023 (UTC)Reply
  • re "aligning with industry" - i think that is the major problem here. Most of the parts of this policy that don't make sense here would make total sense as a part of a corporate policy designed to mitigate risk of third party components and supply chain attacks. The weaker language would make total sense in a corporate context where the assumption is that everyone has the same goals, people get fired if they play too fast and loose with doing what they "should" be doing, but exceptions still happen and there is a management chain to accept risk when things deviate from their ideal form. However that is not the context we are in, a top down policy from wmf is more like a law. If it doesn't set out specific binding requirements nobody will listen.
  • Similarly i think "third-party" is confusing here because the context and threat model is subtley different from how most of the industry talks about such things. Not to mention the number of parties involved and their relationships.
  • regarding wmcs - i dont think that data is very meaningful by itself. Clearly there is an apetite to access external resources, but just from the data we don't know in what context the external api was called, what consent was obtained. More importantly we don't know what expectations the users have or how that varries between different groups, including vulnerable groups. Something being popular is just one part of the analysis, we should also analyze what the risks are, which are acceptable, which can be mitigated, etc.
  • re xss. To a lay person stealing session token sounds like some obscure piece of metadata, better to focus on actual consequences (e.g. turn your account into a vandal bot). But i would also talk about how for certain vulnerable people leaking the wrong person's ip to the wrong people could in the extreme case result in people dying. Privacy needs are extremely person specific.
Bawolff (talk) 20:50, 5 June 2023 (UTC)
@Bawolff: I have taken good note of the points you mentioned and will keep them in mind alongside the other comments shared in this thread while preparing the next update of the policy, thanks. Re data: sure, beyond the popularity criterion, a risk tiering of these resources is important. This is the point of these questions about WMCS exemptions and the criteria attached to those exemptions. Having criteria that help establish why some resources are Low risk while others are High can help identify mitigations for a whole set of resources, both existing and future, depending on their risk tier — Samuel (WMF) (talk) 17:52, 8 June 2023 (UTC)
i disagree that risk tiering is important or even viable (beyond treating executable content different from non-executable content, but that is really more a security thing than a privacy thing). Risk is context specific and cannot be evaluated outside of a specific context. Often there is an implicit context that makes sense, but it is unclear what that would be here. Bawolff (talk) 18:21, 8 June 2023 (UTC)

Exemptions

I think there should be some exemptions: — xaosflux Talk 16:25, 5 June 2023 (UTC)

Personal exemptions

There should be an exemption for users' personal script files (e.g. Special:MyPage/vector-2022.css); these are self-managed and only apply to yourself. — xaosflux Talk 16:25, 5 June 2023 (UTC)

Perhaps you get an on/off to accept these in your prefs... — xaosflux Talk 18:54, 5 June 2023 (UTC)
The problem is that users use each other's css/js files, and only one of those users needs to have sysop, interface admin or checkuser rights to create complete chaos. —TheDJ (talk • contribs) 19:43, 5 June 2023 (UTC)
which is probably a good argument to not conflate non-security privacy issues (non-xss) and security (xss) together in policies. Users can opt in to lower privacy and only affect themselves. They can't opt in to low security without affecting others. Bawolff (talk) 19:49, 5 June 2023 (UTC)
Users use other people's browser extensions too, but in both cases they are choosing to do that themselves, to themselves. — xaosflux Talk 19:59, 5 June 2023 (UTC)
Regarding users loading other users' scripts: that doesn't require a third party to cause a potential problem; if you load someone else's script you are subject to anything they put in there, the bad code can be local (as this discussion is about third party resources, not about prohibiting all local scripts as well). — xaosflux Talk 20:04, 5 June 2023 (UTC)
Hey @Xaosflux: and happy to read your inputs here. I may be wrong but it seems to me that the personal exemption you described does not fall within the remit of this policy since the scope is gadgets and user scripts hosted outside Wikimedia production websites, not personal on-wiki scripts. So, there is no need to include any "personal exemptions" in the policy. Is there a point I am missing? — Samuel (WMF) (talk) 21:52, 5 June 2023 (UTC)
@Samuel (WMF) no, what I'm saying is that personal userscripts should be exempt from this whole thing. They are self-managed, and your self-management can only impact yourself. If you choose to trust another that should be up to you - just like you can skip this whole thing and just put your userscript in your browser directly -- this just makes it more annoying for anyone that works on multiple browsers. Someone above made the argument that User:X could load User:Y's script, and User:Y could be loading EternalSite:x's script as some sort of special worry-- but user:x can just install SpywareBrowserExtension- if they choose to trust someone else that's what happens. — xaosflux Talk 22:17, 5 June 2023 (UTC)

Project exemptions

There should be a process for a project to determine if a TPR may be acceptable (such as by publishing a gadget), as per-user opt-in only. Some projects innovate with external resources, such as mapping or translation services, for which there are no sufficient internal replacements. — xaosflux Talk 16:25, 5 June 2023 (UTC)

In a world with SUL it is very difficult to effectively isolate these sorts of things per-project Bawolff (talk) 17:01, 5 June 2023 (UTC)
@Bawolff there are no global gadgets, so if these are opt-in only, you'd have to opt in at each project. Agree that a default-on gadget would affect unknowing users. — xaosflux Talk 18:50, 5 June 2023 (UTC)
if each individual user is opting in, i think that is workable. For default enabled though, there is enough cross-wiki integration in wikimedia (not to mention everything being cors linked) i would worry that the user's privacy could potentially be compromised in some circumstances even if they never intentionally use the project in question. I guess it would depend a lot on specifics. Bawolff (talk) 18:58, 5 June 2023 (UTC)
@Bawolff 100%! I'll clarify that above. — xaosflux Talk 19:02, 5 June 2023 (UTC)

Proxy servers

Currently, https://fontcdn.toolforge.org/ is a proxy server for the Google fonts server. Would anonymizing proxy servers such as these fall under the category of third-party resources that could not be used? isaacl (talk) 17:33, 5 June 2023 (UTC)

i don't know what this policy's take on it is, but the historical answer is that it is not allowed e.g. see discussion at https://phabricator.wikimedia.org/T209998 Bawolff (talk) 17:43, 5 June 2023 (UTC)
Thanks for the pointer to the Phabricator ticket. I'm unclear then why the proxy server is hosted on toolforge, if using it for Wikimedia-hosted pages or tools isn't currently considered appropriate. isaacl (talk) 20:52, 5 June 2023 (UTC)
it is considered appropriate for toolforge tools but not the main site. The main site is covered by a much stricter privacy policy than toolforge is. The historical view is that private data should not be transferred from main site to toolforge, as otherwise it would be a pretty big loophole if one could get around the privacy policy by just transferring the data to a wikimedia site not covered by the (main) privacy policy. Bawolff (talk) 21:06, 5 June 2023 (UTC)
Hi @Isaacl: and thanks for taking part in the discussion. Since WMCS is not part of Wikimedia production websites, the WMCS-hosted resources are currently considered third-party resources. However, this conversation is part of an exploratory process and thoughts on if and when WMCS-hosted resources should be considered differently are welcome. — Samuel (WMF) (talk) 22:37, 5 June 2023 (UTC)
One of the many flaws of this proposal is that it lists Complete list of Wikimedia projects as production websites. Based on that definition https://fontcdn.toolforge.org/ and https://maps.wikimedia.org/ fall in the third-party category, but it's no problem to include content from http://www.wikimedia.org.ph/ and https://translatewiki.net/wiki/. Multichill (talk) 21:28, 7 June 2023 (UTC)
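
The classification problem described in the comment above can be sketched in a few lines. This is only an illustration: `LISTED_HOSTS` and `classify` are hypothetical names, and the host set contains just the two hosts that the comment says appear on the list, not the real list of Wikimedia projects.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for the "production websites" list. It holds only the
# two hosts that the comment above reports as listed; the real list is longer.
LISTED_HOSTS = {"www.wikimedia.org.ph", "translatewiki.net"}


def classify(url):
    """Label a URL 'production' if its host is on the list, else 'third-party'."""
    host = urlsplit(url).hostname
    return "production" if host in LISTED_HOSTS else "third-party"


for url in (
    "https://fontcdn.toolforge.org/",
    "https://maps.wikimedia.org/",
    "http://www.wikimedia.org.ph/",
    "https://translatewiki.net/wiki/",
):
    print(url, "->", classify(url))
```

Under a purely list-based definition, the outcome depends only on list membership and not on who operates the host, which is the inconsistency the comment points out.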

Localhost

Please do not use CSP headers to block http://localhost and http://127.0.0.1; this "external" site is under the user's control and is useful for script development. Suffusion of Yellow (talk) 18:17, 5 June 2023 (UTC)

Hello @Suffusion of Yellow:, and thanks for raising this concern here. However, CSP rules are outside the scope of this policy discussion. You can look at this Phabricator ticket to see if it is a more suitable avenue for your suggestion. — Samuel (WMF) (talk) 21:05, 5 June 2023 (UTC)
Ok, so is this going to be a "social" policy, with no technical measures to prevent the fetching of external resources? Or will this be prevented by some other means? Either way, please don't define localhost as a "third-party resource" even though it's "located outside Wikimedia production websites". This would be equivalent to policing what software I'm allowed to have on my computer. Suffusion of Yellow (talk) 17:04, 6 June 2023 (UTC)
@Suffusion of Yellow: by "social" do you mean that the policy will be enforceable by the community, as here for example? Then yes, although it is good to note there are ongoing plans for technical measures to prevent the fetching of external resources (see phab:T135963). That being said, ideas for enforcing this policy more specifically are welcome. I hear your other point about localhost URLs being part of your personal space. In that case it makes sense to think more broadly about those locations since other users may use custom domain names, local IP addresses, etc. In the event that exemptions are considered as part of this policy, what might be a good way to identify those local resources? — Samuel (WMF) (talk) 18:49, 6 June 2023 (UTC)
Expanding on a suggestion by Bawolff in that task, a user preference, maybe? That is, I go to Special:AllowedThirdPartySites, get asked for my password, then add "127.0.0.1", "foohostname.barnetwork", and so on. Suffusion of Yellow (talk) 19:06, 6 June 2023 (UTC)
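
One way to picture the per-user allowlist idea from this thread is the sketch below. It is a minimal illustration under stated assumptions, not a description of any existing MediaWiki feature: `is_loopback_host` and `is_user_allowed` are hypothetical names, and the allowlist stands in for whatever a page like the suggested Special:AllowedThirdPartySites might store.

```python
import ipaddress
from typing import Iterable
from urllib.parse import urlsplit


def is_loopback_host(host: str) -> bool:
    """Return True for 'localhost' or a loopback IP literal (127.0.0.0/8, ::1)."""
    if host.lower() == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, so treat it as an ordinary hostname.
        return False


def is_user_allowed(url: str, allowed_hosts: Iterable[str]) -> bool:
    """Allow loopback hosts plus any host the user added to their own list."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    allowed = {h.lower() for h in allowed_hosts}
    return is_loopback_host(host) or host in allowed


# Hypothetical allowlist, using the example hostname from the comment above.
my_hosts = ["foohostname.barnetwork"]
print(is_user_allowed("http://127.0.0.1:8000/dev.js", my_hosts))
print(is_user_allowed("http://foohostname.barnetwork/dev.js", my_hosts))
print(is_user_allowed("https://example.org/lib.js", my_hosts))
```

Whether loopback addresses should be allowed by default, as this sketch assumes, or only after being added explicitly is exactly the kind of question the thread leaves open.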

New wiki for code incompatible with the CC-BY-SA?

Any thought of creating a new SUL-connected wiki, e.g. https://code.wikimedia.org/, not subject to the CC-BY-SA (but still requiring some sort of free license) where we can store GPL-"contaminated" user scripts? Then there would be no need to choose between reinventing the wheel, and storing such code on a "third-party" site. Suffusion of Yellow (talk) 18:33, 5 June 2023 (UTC)

Personally i think we should have a specific gerrit (or gitlab) repo for this, not subject to the normal CR requirements (that is then automatically deployed to the site). MediaWiki isn't a fun source code management tool. Bawolff (talk) 18:38, 5 June 2023 (UTC)
@Suffusion of Yellow: would this be a place where user scripts would be "scrutinised and controlled by trusted users" as in LPfi's point? — Samuel (WMF) (talk) 20:32, 6 June 2023 (UTC)
If we go with my suggestion, then it would be handled just like we do now; a few "trusted" users can manage scripts like https://code.wikimedia.org/MediaWiki:example.js, but anyone can put what they want in their userspace, e.g. https://code.wikimedia.org/User:Suffusion_of_Yellow/example.js. The only difference would be with the copyright disclaimers. Suffusion of Yellow (talk) 20:45, 6 June 2023 (UTC)
I like this idea, as I think it would mostly serve as a CDN of sorts, thus consolidating "external" code and making such resources easier to find, manage and audit. I don't like the idea of using SCM like Gitlab or whatever as a CDN in this case; I feel like those interests should be kept separate when feasible. Of course we should encourage anyone developing userJS or Gadget code to use modern SCM and appropriate CR processes. SBassett (WMF) (talk) 17:10, 8 June 2023 (UTC)

This is fundamentally flawed

Sorry, but it seems clear from what you've written here and your replies above that you don't really understand how things work on the wikis. You wrote "should" but claim you meant "must", you don't seem to have a clear understanding of what a "user script" is, you've made contradictory statements about CSP, and so on. IMO you should scrap this, get some advice from people who understand the ecosystem you're trying to change and the language used by people in that ecosystem, and then come back with a fresh proposal once you better understand exactly what you're proposing. Anomie (talk) 02:58, 6 June 2023 (UTC)

Hi @Anomie:, the policy draft and the broader consultation process come after preliminary rounds of discussions during which "advice from people who understand the ecosystem" was gathered: interface admins, gadget and user script authors, Wikimedia developers, staff, and various long-term contributors. The aim of this discussion is to improve the draft by exposing it to a much larger audience and benefitting from a bigger pool of inputs. If you have specific areas of improvement such as the language and explanations that seem contradictory, feel free to elaborate on them. — Samuel (WMF) (talk) 10:20, 6 June 2023 (UTC)
I'm not qualified to speak on the technical side, but I am sceptical that a sufficient consultation was done with those who know the ecosystem and how to write these things with the should/must split and much of the language reading as suggestions not obligations. Those are fundamental concepts of any wikimedia policy writing, not merely a matter of better language. Nosebagbear (talk) 15:24, 6 June 2023 (UTC)

Day 1 thank you

Hi everyone, I want to say thank you for the initial comments! There were some good recommendations around the policy language, questions on the scope of the policy, and ideas about exemptions, especially regarding some WMCS-hosted resources. The Security team will continue to review those inputs and the new ones as they come in with the aim of updating the policy draft iteratively. — Samuel (WMF) (talk) 09:53, 6 June 2023 (UTC)

This is such a confusing way to structure a talk page. If you're preemptively creating sections, please comment and sign right under each heading. Otherwise it looks like the first commenter created that section. Nardog (talk) 12:35, 6 June 2023 (UTC)
Hi @Nardog: thanks for taking the time to improve the page structure and headings. This is much appreciated. — Samuel (WMF) (talk) 20:17, 6 June 2023 (UTC)

Please do not take away our ability to use third-party scripts

There are many legitimate reasons to load scripts from third-party sources. While I understand the potential risks, I believe they do not warrant the creation of such a heavy burden on us MW script developers.

For understandable reasons, useful third-party JS libraries are repeatedly being removed from MW core, often without satisfactory alternatives. For example, just recently the jQuery tipsy library was removed, and the extremely useful jQuery.ui library is likely to face a similar fate. For volunteers like me who do not have enough time to learn the OOUI library in depth, taking away our ability to remotely load trusted & stable JS libraries would be disastrous.

As of 2018, the ability to edit user scripts that affect other users was heavily restricted to a very tiny group of editors, interface administrators. These editors are required to use a strong password for their accounts, as well as enable 2FA. I believe this restriction is tight enough, since interface-admins are usually experienced MW script writers. We should count on them that they do not load external scripts from unfamiliar or untrusted sources.

TL;DR: As a MediaWiki user script developer, I believe that imposing a restriction on the usage of external JS scripts and libraries would be a very big step backwards. I have written numerous user scripts that I do not see myself writing without relying on secure, trusted third-party sources. If the WMF wants to improve the security of users, I really hope a different approach will be taken, one that does not make the work of volunteers harder. Guycn2 (talk) 18:17, 7 June 2023 (UTC)

08 June 2023: Summary of the discussion so far

Hi everyone, and thank you for your continued contributions to this conversation. In addition to the responses already given here, I’d like to acknowledge some of the key points you’ve raised so far, offer a few clarifications, and propose some next steps.

  • Concerns were raised about the policy language, some requesting that the policy be written using the conventions of RFC 2119, replacing "should" with "must", etc. Also, there were detailed suggestions about ways to make the policy content more accurate, clear, and understandable, especially around the risks facing users. The feedback on the language and content will inform the next update of the policy draft.
  • Confusion was expressed with respect to the scope of the policy, in particular whether the policy means the end of all non-production tools. It’s good to note that the policy scope is deliberately limited to user scripts and gadgets loading non-production resources.
  • It was repeatedly flagged that disallowing all external resources in gadgets and user scripts would place a significant burden on the community. Some external resources, in particular, those hosted on WMCS, are important for community autonomy and usually lack alternatives on production websites. As a result, suggestions around exemptions were shared: exempting WMCS resources based on how harmful they are, allowing anonymizing proxies, exemptions on an opt-in basis, user-level or project-level exemptions, etc.

In line with the last point, I am adding a new section labeled Exemptions criteria. The aim there is to identify exemptions that could become part of the policy. Under what conditions could external resources be embedded in gadgets and user scripts? When could the exemption be revoked? How to ensure that exempted resources do not compromise users' security and privacy? Based on this data, are some resources outright not acceptable? Those are initial questions in that direction. Let me know what your thoughts are. Also, please tell me if some points were not captured in the summary or should be moved down here. — Samuel (WMF) (talk) 00:35, 8 June 2023 (UTC)

Exemptions criteria

Here are initial questions for the section discussion: Under what conditions could external resources be embedded in gadgets and user scripts? When could the exemption be revoked? How to ensure that exempted resources do not compromise users' security and privacy? Based on this data, are some resources outright not acceptable? Samuel (WMF) (talk) 00:35, 8 June 2023 (UTC)

I think a rather clear case of what shouldn't be exempt would be any Default Gadget, or Site Script (e.g. skin.js, common.js) - and perhaps instead of carving out exemptions the TPR is just scope limited to those areas? — xaosflux Talk 17:17, 8 June 2023 (UTC)