ACS Responds to an OSTP Request for Information on the Future of AI Regulation

Docket ID No. OSTP-TECH-2025-0067

 

Introduction

The American Chemical Society (ACS) is a congressionally chartered not-for-profit organization and one of the world's largest scientific societies, with more than 230,000 individuals in our membership community. ACS advances knowledge and research through scholarly publishing, scientific conferences, information resources for education and business, and professional development efforts. Our commitment is to improve all lives through the transforming power of chemistry. ACS strives to advance scientific knowledge, empower a global community, and champion scientific integrity. We thank the Office of Science and Technology Policy (OSTP) for the opportunity to comment on the regulatory frameworks that will guide the development and use of Artificial Intelligence (AI).

In the rapidly evolving field of chemistry, AI, particularly large language models, has generated both enthusiasm and caution. We support and are actively engaged in pursuing the promise AI holds to help accelerate discoveries and foster innovation in the chemical sciences for the benefit of human health, welfare, and economic growth.

The effectiveness of AI in chemistry hinges on rigorous validation and the quality of its training data. Many tools are developed rapidly to meet publication demands, offering only marginal improvements or lacking thorough benchmarking. This contributes to a literature landscape containing incremental studies that may not meaningfully advance the field. Moreover, if an AI model is trained on data that is too narrow, outdated, or irrelevant to a specific domain, such as using pharmaceutical data for materials science, it may yield misleading results. Transparency in data sources and training methodologies is therefore essential for building trust and ensuring scientific rigor. Chemists must evaluate whether an AI tool is truly applicable to their research needs, rather than relying on its novelty or popularity.

We are also aware that many experts warn against overstating AI's capabilities and against ignoring the risks associated with AI systems: a tendency to hallucinate or deceive, susceptibility to bias, questionable reliability and security, and inattention to the reasonable legal protections, standards, and norms that would be observed by human beings. An additional, significant concern is the lack of reproducibility in AI-generated results, often exacerbated by the academic and industrial pressure to publish frequently. This environment can lead to superficial advancements that prioritize novelty over substance. However, when applied thoughtfully and with a clear understanding of high-quality training data, AI tools can provide genuine value, as evidenced by increases in material discoveries and a growing number of patent filings. Chemists are thus encouraged to critically assess these tools, emphasizing practical utility and seeking guidance from experienced voices to navigate the surrounding hype responsibly.

Finally, we note that an essential ingredient in the development of AI to deliver on its promise is the availability of high-quality inputs on which to train. ACS publications, and the publications of similar organizations, are among the most cited and trusted sources in the world for those inputs. We recognize and embrace the potential of this technology to transform innovation and, with that, the economy of the United States and the welfare of the world. ACS supports the advancement of high-quality AI solutions and advocates for a measured, balanced approach to reviewing and creating laws, regulations, agency rules, technology standards, and other guidance applicable to the development of AI in science. ACS appreciates the opportunity to offer its comments to help ensure that any framework requires the use of high-quality inputs, fostering the creation of productive, trustworthy AI applications in the advancement of science and the economy.

Comment

A vital component of AI's ability to realize the benefits it promises and avoid the pitfalls it risks is the availability of high-quality data for the training and validation of AI models. Simply put, high-quality data is needed to train high-quality models that generate high-quality, trustworthy outputs. In the academic publishing field, such data can take different forms:

  • Research data, either raw or structured into datasets or databases, which underpin research findings. ACS has been very proactive in this space, creating data policies that facilitate and encourage making such data available and linking them to the articles that report on them.
  • Academic articles, which explain and contextualize experiments, data collected, and findings. An article exists in different versions: from the author's original manuscript, to the version submitted to a journal for publication, to the accepted manuscript after peer review, to the Version of Record, the final published version.

Organizations like ACS are already strategically positioned to provide high-quality, curated content validated by peer review. Investments in editorial work and peer review ensure that published content is reliable and valuable and, when used as training data, improves the quality and accuracy of AI tools. ACS invests significant resources in maintaining an infrastructure of services and products that tag scholarly articles for discovery, enrich them with machine-accessible metadata, and structure content into standardized formats essential for machine use.

A market for the licensing and provision of this data already exists. This content licensing market for AI has developed because the necessary legal framework, in the form of copyright law and contract law, was already in place, allowing willing parties to negotiate mutually acceptable terms for content licensing deals, either directly or via intermediaries.

An essential part of the government's strategy to foster AI should be an expectation that the copyright and IP frameworks envisioned by the Founders will continue to incentivize and reward continued investments to structure, annotate, and enrich content as required for machine use and to turn information into high-quality, reliable knowledge. A strategy that prioritizes the free circulation of overwhelming amounts of unverified, unreliable information undercuts that very investment; the government should instead invest in building quality, curated, contextualized, reproducible knowledge.

A solid copyright framework underpins a strong research sector in the US, drives American competitiveness, and enables researchers and organizations like ACS to establish provenance and attribution. It therefore must remain a core component of the plan to foster AI. Federal support of access to scientific data for AI should focus on strengthening voluntary licensing models and clarifying that use of copyrighted content in AI training must occur under license. Encouraging high-quality data curation and responsible AI development benefits AI developers, scientific innovators, and content creators alike.

In addition to the availability of licensed, copyrighted materials to foster the ongoing development of AI and scientific progress, the US should build a robust, secure, and interoperable data ecosystem that connects researchers across disciplines, institutions, and sectors. This ecosystem should leverage existing discipline-specific databases, many of which are already well-established and trusted within their communities. The federal government can amplify impact by supporting efforts to standardize data formats, enable machine-readability, and promote data sharing through incentives, infrastructure, and training. In addition, government should convene regular cross-sectoral roundtables to identify gaps, foster alignment on data-sharing standards, and coordinate investment priorities.

Overall, when evaluating the legal and regulatory framework needed to support the development of AI and minimize its risks, we recommend the government consider the following principles:

  • Leverage and build upon existing efforts, including those of publishing organizations like ACS, to promote high-quality, accurate information and prevent the introduction and spread of “hallucinations” and misinformation. Those carry high risks for healthy, democratic public discourse and ultimately for the safety of American citizens.
  • Respect and reinforce existing intellectual property protections, including but not limited to copyright law, to ensure the ongoing development and availability of high-quality, vetted information and content. The collection, handling, and storage of works in the training of a commercial AI model is a reproduction that must be authorized by the rights holder; anything less is an actionable breach of copyright law. In this sense, the current legislative framework already requires the licensing of copyrighted content.
  • Leverage and build upon an already existing and well-functioning licensing market. The market for licensing content to AI developers has developed because the framework for it already exists, both in terms of copyright law and contract law, which allows willing parties to agree upon mutually acceptable terms for direct licensing or via intermediaries. Licensing also fosters accountability, transparency, and provenance with regard to key information such as training materials used by AI models and systems. This in turn fosters a virtuous circle of trust that further spurs the growth and adoption of AI.
  • Require public transparency and provenance of training data and content by AI developers, so that users can understand and trace results back to their sources and rightsholders can understand what sources were used in training the system. Transparency obligations are essential for researchers to evaluate whether an AI tool is truly applicable to their needs, rather than relying on its novelty or popularity. Transparency is also vital in enabling rightsholders to understand the scale and scope of copyrighted works used in AI training, to manage their rights effectively, and to engage in licensing, fostering the growth and development of AI in a legal, ethical, and sustainable manner.
  • Clarify that accountability lies with all providers of and contributors to AI solutions, including generative AI models, throughout the whole life cycle of the AI. Providers should retain all records, documentation, and associated metadata for the lifetime of the AI system to support accountability throughout the value chain.
  • Work to support and fund efforts to protect the integrity of education and communications about research advances, both of which may be particularly vulnerable to misinformation or misinterpretation of AI outputs. In particular, any US regulation, law, or policy should support the development and implementation of protective mechanisms to identify products created by AI – whether quantitative, visual, or textual. In the scholarly sector, these would include tools to detect fake (synthetic) or manipulated images and data, publications generated by paper mills, and the use of inaccurate or illegally sourced content.

Conclusion

As the government pursues the further development of AI in America, we ask it to preserve the foundational role of the laws, regulations, agency rules, and guidance that foster the ongoing creation of high-quality data and publications, which are essential to trustworthy, high-quality AI. AI systems that rely on high-quality data, particularly vast inputs of rigorously developed data and properly licensed content, learn and generate insights that are inherently more reliable. For AI to be trusted with vital scientific advancement, it is imperative that the input materials reflect the rigor and integrity of peer-reviewed research findings. High-quality, vetted scientific content not only ensures that AI outputs are accurate and reliable, but also safeguards the contributions of researchers whose work underpins these technologies. Upholding copyright and other intellectual property protections not only reinforces the value of scholarly work but also promotes a responsible AI ecosystem that advances science without compromising ethical or legal standards, ultimately helping to secure the legal protection and economic value of the innovation gains generated using AI.

As a uniquely situated scientific society, the American Chemical Society again thanks OSTP for the opportunity to provide comments on the regulatory frameworks that will guide AI development and usage. We welcome the opportunity to discuss our views further – either individually or in collaboration with other key stakeholders.