Human-Content-to-Machine-Data_Final - Flipbook - Page 31
The CC signals are intended to operate in a way that is separate from, and complementary to,
the CC licenses. When a CC signal is applied to CC-licensed content by the copyright holder,
the signal would grant copyright permission for the selected category of machine reuse
under different terms than the CC license. An AI developer could technically rely on either
the CC license or the CC signal if copyright permission is needed in that context or
jurisdiction. However, since the signal is designed speciûcally for machine reuse, it is likely to
be a more accurate reüection of the wishes of the rightsholder(s) in that context.
As this work continues, and before we move ahead with an implementable version of CC
signals, we will undertake an in-depth analysis of the potential interactions with the licenses,
and produce guidance on how they can be used together effectively.
Incentivizing Adherence by AI Developers
We recognize that CC signals will rely on willing participation by AI developers to adhere to it.
There are many reasons to be cynical about adherence, particularly when it is not legally
required, and there are and will always be bad actors. However, we see many reasons to
believe that uptake is likely.
For one thing, there is precedent. Although adherence hasn9t always been perfect, robots.txt
functioned for many years as a way to encode normative expectations about—and help
maintain the social contract for—machine reuse of content on the web. We also see the
success of CC licensing as evidence that voluntary buy-in is possible. While CC licenses are
built atop copyright law and therefore carry the weight of copyright infringement risk, in
reality they work because people have chosen to adhere to them. Litigation involving
enforcement of CC licenses is rare, and much of it involves litigants who are not operating in
good faith.103 Instead, there are now tens of billions of CC-licensed works available in the
commons because they are grounded in intuitive notions about what is fair and prosocial
when it comes to sharing and reuse of knowledge.
There are also clear reasons why rational actors should respect and adhere to preference
signals. As we9ve written earlier in this report, data from across the public web is a key
component in developing large AI models. If those developing AI do not respect the wishes of
creators, they risk eliminating incentives for people to share and widely distribute their works.
Over time, this will compromise the accuracy, safety and currency of the models and
services they build. This will be particularly acute for small ûrms, startups, nonproûts, and
103
Creative Commons. (n.d.). License Enforcement. Creative Commons.
https://creativecommons.org/license-enforcement/
31