OpenAI has spent about a year developing a system for watermarking text generated by ChatGPT, along with a tool to detect those watermarks. According to The Wall Street Journal, the company is internally divided over whether to release the feature. On one hand, watermarking is seen as a responsible measure, particularly for helping educators identify AI-generated assignments; on the other, there are concerns about its potential impact on the company's profitability and user satisfaction.
The watermarking method works by subtly adjusting the model's word-prediction process so that the output contains detectable patterns. OpenAI asserts that this does not degrade the quality of the generated text. In survey data commissioned by the company, people worldwide supported the idea of an AI detection tool by a margin of four to one.
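OpenAI has not published how its watermark works, but the general idea of biasing word prediction toward detectable patterns is well documented in the academic literature. The sketch below illustrates one such scheme (a "green list" watermark in the style of Kirchenbauer et al.); the toy vocabulary, bias strength, and SHA-256-based seeding are all illustrative assumptions, not OpenAI's actual parameters.

```python
import hashlib
import math
import random

# Toy sketch of a "green list" statistical watermark. At each step, the
# previous token seeds a pseudo-random partition of the vocabulary, and
# "green" tokens get a logit boost; a detector just counts green tokens.
VOCAB = ["the", "a", "cat", "dog", "sat", "ran", "on", "under", "mat", "rug"]
GREEN_FRACTION = 0.5   # fraction of the vocabulary marked green each step
BIAS = 2.0             # logit boost for green tokens (assumed value)

def green_list(prev_token: str) -> set:
    """Pseudo-randomly partition the vocabulary, seeded by the previous token."""
    seed = int.from_bytes(hashlib.sha256(prev_token.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    shuffled = VOCAB[:]
    rng.shuffle(shuffled)
    return set(shuffled[: int(len(VOCAB) * GREEN_FRACTION)])

def sample_watermarked(prev_token: str, logits: dict, rng: random.Random) -> str:
    """Boost green-token logits, then sample from the adjusted distribution."""
    greens = green_list(prev_token)
    boosted = {t: l + (BIAS if t in greens else 0.0) for t, l in logits.items()}
    total = sum(math.exp(l) for l in boosted.values())
    r, acc = rng.random(), 0.0
    for token, logit in boosted.items():
        acc += math.exp(logit) / total
        if r <= acc:
            return token
    return token  # fallback for floating-point rounding

def detect(tokens: list) -> float:
    """z-score: how far the observed green-token count sits above chance."""
    hits = sum(t in green_list(prev) for prev, t in zip(tokens, tokens[1:]))
    n = len(tokens) - 1
    expected = n * GREEN_FRACTION
    var = n * GREEN_FRACTION * (1 - GREEN_FRACTION)
    return (hits - expected) / math.sqrt(var)

if __name__ == "__main__":
    rng = random.Random(0)
    logits = {t: 0.0 for t in VOCAB}  # uniform "model" for demonstration
    tokens = ["the"]
    for _ in range(200):
        tokens.append(sample_watermarked(tokens[-1], logits, rng))
    print(f"detection z-score: {detect(tokens):.1f}")  # far above chance
```

Detection needs only the hashing scheme, not the model, so a verifier can score any text. The trade-off the article describes is also visible here: rewording the text changes the token pairs and erases the statistical signal.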
OpenAI confirmed its watermarking efforts in a blog post, describing the method as highly accurate (99.9% effective) and resistant to localized tampering such as paraphrasing. The company acknowledges, however, that having another model reword the text could easily strip the watermark. There is also concern that watermarking could stigmatize the use of AI writing tools, particularly for non-native speakers who benefit from them.
User sentiment is another significant consideration: nearly 30% of surveyed users said they would use ChatGPT less if watermarking were implemented. Despite this, some employees still believe watermarking is effective. Alternative methods, such as embedding cryptographically signed metadata, are being explored as potentially less controversial and more reliable at preventing false positives, since a signature either verifies or it does not.
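The article does not specify how signed metadata would be attached. Purely as illustration, here is a minimal sketch that binds a provenance record to a hash of the exact output text. It uses a symmetric HMAC to stay dependency-free; a real deployment would more plausibly use asymmetric signatures so third parties could verify without holding the signing key. The key and field names are hypothetical.

```python
import hashlib
import hmac
import json

SECRET_KEY = b"demo-signing-key"  # placeholder; never hardcode real keys

def sign_output(text: str, model: str) -> dict:
    """Attach a provenance record whose signature binds the exact text."""
    record = {"model": model, "sha256": hashlib.sha256(text.encode()).hexdigest()}
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return record

def verify_output(text: str, record: dict) -> bool:
    """True only if the text is unmodified and the signature is genuine."""
    claimed = dict(record)
    sig = claimed.pop("signature", "")
    if claimed.get("sha256") != hashlib.sha256(text.encode()).hexdigest():
        return False  # text was altered: verification simply fails
    payload = json.dumps(claimed, sort_keys=True).encode()
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)

if __name__ == "__main__":
    text = "This essay was drafted with AI assistance."
    record = sign_output(text, model="example-model")
    print(verify_output(text, record))               # True
    print(verify_output(text + " Edited.", record))  # False: hash mismatch
```

This is why the metadata approach avoids false positives: verification either succeeds exactly or fails, with no statistical threshold. The corresponding weakness is fragility; change one character or strip the metadata, and the provenance claim is gone.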
In summary, OpenAI is weighing the benefits of watermarking for responsible AI use against potential downsides, including user deterrence and circumvention by bad actors. The exploration of metadata embedding is ongoing, with the company still assessing its effectiveness.