Artificial intelligence/Bellagio 2024

On February 19–23, 2024, a group of 21 Wikimedians, academics, and practitioners met at the Rockefeller Foundation’s Bellagio Center to draft an initial research agenda on the implications of artificial intelligence (AI) for the knowledge commons. We aimed to focus attention (and therefore resources) on the vital questions volunteer contributors have raised, including the promise of AI systems for the open Internet as well as their risks and negative impacts.

We are optimistic that the use of machine learning and other AI approaches can improve the efficiency of operations on knowledge commons platforms and the work of volunteers, and can support efforts to reach a new generation of readers and contributors. At the same time, we are concerned about the potential negative impact of the use of AI on the motivations of contributors, as well as the misuse of these technologies to discourage volunteers and disrupt their work in the peer-produced knowledge commons ecosystem.

Below, we publish our initial thinking on potential research directions that may eventually become a shared research agenda. Our hope is that many researchers across industry, government, and nonprofit organizations will adopt the final research agenda to help support and guide their own research. By focusing research efforts on topics that benefit the knowledge commons and help volunteers, our goal is to help inform and guide product development, public policy, and public opinion about the future direction of AI-related technology.

A note on AI ethics

The development, evaluation, deployment, and measurement of AI tools raise many ethical concerns, both in general and in the context of the knowledge commons. Articulating these risks and developing principles and guidelines to shape research in this area is both a significant effort in its own right and a critically important aspect of every part of the research agenda outlined here. Work on these principles and guidelines should proceed in parallel with the research itself. Researchers engaged in any aspect of the work described here have a responsibility to consider the harms and impacts of their research. As ethical principles and guidelines are developed, they should be used to critically assess and shape all the work outlined below; in turn, we hope that the results of that work will deepen our understanding of ethical research practice.

Research areas

This section is currently a draft.

This is a summary of potential research areas that the research agenda may eventually pursue. It represents some initial brainstorming and work that we are sharing here to gather early feedback and direction from Wikimedians, other knowledge commons communities, and researchers, with the aim of publishing a more stable agenda in March or April 2024.

The four potential research areas are:

  • Characterizing and monitoring use of AI in the knowledge commons over time
  • Developing AI tools for the knowledge commons
  • Evaluating the effect and impact of deploying AI tools
  • Empowering knowledge commons communities in the AI era

Characterizing and monitoring use of AI in the knowledge commons over time

Knowledge commons platforms are among the greatest success stories of the Internet. As the latest wave of automation sweeps through people’s digital work and lives, there are concerns about the disruption this may cause for knowledge equity around the world, for the communities of volunteers engaged in these initiatives, and for the integrity of the knowledge they help create.

Robust, current research on the extent of these changes is lacking. This lack of data makes it difficult for these communities (and their broader ecosystem of partners, supporters, and collaborators) to address current and potential harms or make the most of the new capabilities of foundation and frontier models. Used wisely, these models hold the promise of addressing ongoing knowledge commons challenges such as community growth, contributors’ experience, and content quality.

Proposed research directions

Current and future uses of AI

AI did not start with the launch of ChatGPT in November 2022. Many AI tools are already deployed in knowledge commons communities, and popular knowledge commons platforms like Wikipedia have employed machine learning tools for more than a decade. However, our understanding of how actively these tools are used and how they can be improved is limited, especially when it comes to newer generative capabilities. We also lack understanding of whether contributors find such capabilities helpful, and of what measures are needed to empower all contributors to use them. We need to explore how AI could lead to new ways for people to contribute, including those who are, for a variety of reasons, not currently part of these communities. Our “State of AI in the knowledge commons” research agenda could include:

  • A review of currently deployed systems, including (where available) quantitative and qualitative evidence of use and impact (a sketch of how one such system can be queried follows this list).
  • A survey of contributors’ experience and opinions of AI assistants, as well as broader issues such as their perceptions of how knowledge commons are used towards the development of AI models and applications.
  • A hub documenting the AI assistants in use, how they work, and what they are for, including datasets, related resources, and ways for the community to provide feedback and contribute to their further development.
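
To make the first item in this list concrete, here is a minimal sketch (in Python, using the requests library) of how a review of deployed systems might programmatically query one of them: Wikimedia’s Lift Wing inference service, which hosts production machine-learning models such as edit-quality classifiers. The endpoint, model name ("enwiki-damaging"), and request payload shown here are assumptions based on public Wikimedia documentation and may need adjusting in practice.

import requests

# Lift Wing endpoint for the (assumed) English Wikipedia "damaging" edit
# classifier; other wikis and models use different model names.
LIFTWING_URL = (
    "https://api.wikimedia.org/service/lw/inference/v1/models/"
    "enwiki-damaging:predict"
)

def score_revision(rev_id: int) -> dict:
    """Ask the edit-quality model to score one revision; returns raw JSON."""
    response = requests.post(LIFTWING_URL, json={"rev_id": rev_id}, timeout=30)
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    # Placeholder revision ID; substitute a real revision ID to try it.
    print(score_revision(1100000000))

A survey of such endpoints, together with their usage statistics where available, would already form a first snapshot of the state of AI on these platforms.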

Contributors’ motivations

To attract new contributors and help make knowledge commons communities sustainable in the face of ongoing challenges, it is essential to deepen our understanding of the reasons people do or do not contribute. To have real impact, this research will need to be mindful of the diversity of existing and prospective contributors across the world, including countries and demographics that are currently underrepresented. Our assumption, supported by some evidence from platforms such as GitHub and Stack Overflow, is that the mainstream availability of tools such as ChatGPT could fundamentally change both levels of participation and contribution practices, with mixed effects. This means we will first need to revisit and refresh the existing frameworks used to study community motivations so that they account for the impact of AI assistants, informed by ongoing research in responsible AI, along with an up-to-date account of contribution profiles. This work would then inform discussions through workshops and other established community engagement channels.

Community values and preferences around AI

AI has been, especially since the launch of ChatGPT, the subject of public debate and controversy over its capabilities, harms, and benefits. Experts have argued that AI models and tools must be fair and equitable, and must represent the values of those who use or are affected by them, in order to build trust and adoption. This includes being mindful of underrepresented voices. While our theoretical and practical understanding of AI ethics is evolving, there are already examples of knowledge commons communities that have reacted to contact with AI in positive and less positive ways (e.g., classifying edits on Wikipedia, the Reddit moderator rebellion, DeviantArt’s AI policies, and the ArtStation “No AI Art” protest). This research theme would study, learn from, and build on these examples to design a survey collecting recent preferences from a range of knowledge commons communities, summarize common values, and identify areas where opinions diverge over time. This could help communities manage their own expectations, make better choices about how to engage with the technology, inform policies on terms of use, and decide if and how to use their position to influence positive change.

In a field changing as rapidly as AI, these activities would be carried out regularly; analyzing the results over time would allow us to understand changes and anticipate future trends, for instance in the form of an observatory. This observatory could be designed in an open, collaborative way that allows new knowledge commons communities and the wider ecosystem to suggest new areas of inquiry or contribute new data points and research methods.
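
As one illustration of what an observatory’s building blocks could look like, the hypothetical Python schema below sketches a single “data point”: one measurement of AI use in one knowledge commons community at one point in time. All field names here are invented for illustration, not taken from any existing system.

from dataclasses import dataclass
from datetime import date

# Hypothetical record schema for one observatory data point: a single
# measurement about AI use in one community at one point in time.
@dataclass
class Observation:
    community: str     # e.g. "en.wikipedia.org" or "commons.wikimedia.org"
    observed_on: date  # when the measurement was taken
    metric: str        # e.g. "share_of_edits_flagged_ai_assisted"
    value: float       # the measured quantity
    method: str        # how the number was produced, for reproducibility
    source_url: str    # pointer to the underlying data, survey, or script

Recording the method and source alongside each value is what would allow different communities to contribute comparable data points over time.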

Developing AI tools for the knowledge commons

There are many places where AI can be used to improve knowledge commons processes or outputs, and research can aid in the development of new techniques and tools to do so. These tools can broadly be classified into two groups:

  1. Tools focused on content contribution, which make contributing easier or more effective. This is important because maintaining the commons is simply too much work for too few people; AI can help boost productivity and content quality.
  2. Tools focused on content consumption, which can improve the user experience, for example by making content more discoverable.

The proposed research areas below are focused on the Wikimedia ecosystem, but can hopefully serve as an inspiration for other knowledge commons projects too.

Participants in the 2024 Bellagio symposium

(Listed in alphabetical order)

Get involved

Questions and comments on the proposed research agenda are encouraged on the talk page.