
AI tools as science policy advisers? The potential and the pitfalls

Synthesizing evidence and drafting briefing papers: two tasks for which generative AI tools hold promise in policy guidance.

Questions about how to use these tools responsibly need urgent answers. Powerful language models are already widely used in research and technological development, and are expected to reach many more users soon. Policymakers, too, are using publicly available generative AI tools: legislative staff members in the United States are testing various AI tools, including chatbots that are not always reliable. In June, this prompted administrators in the US House of Representatives to impose limits on chatbot use (see go.nature.com/3rrhm67).

Building AI tools for science advice responsibly will require collaboration. Only governments can meet the demands for robust governance, transparency and accountability, whereas much of the technical know-how will come from academia and technology companies. The US National Artificial Intelligence Research Resource Task Force is one example of such a partnership between academia, industry and government.

Here we explore two tasks for which generative AI tools hold promise for policy guidance — synthesizing evidence and drafting briefing papers — and highlight areas needing closer attention.

Evidence searches are currently time-consuming and involve judgement; science advisers must often make do with whatever studies they can find in the time available. But what if the searches could be made more algorithmic?

To find the best answer to a question, systematic reviews can be used to locate and analyse all the relevant studies. For example, one recent review examined whether healthy-eating initiatives are successful in young children, finding that they can be, although uncertainties remain2.

An alternative approach, subject-wide evidence synthesis, involves reading the literature at scale3. For example, as part of a biodiversity project called Conservation Evidence, some 70 people spent the equivalent of around 50 person-years reading more than 1.5 million conservation papers in 17 languages, summarizing all 3,689 tested interventions. The summaries were then read by an expert panel, which assessed the effectiveness of each intervention. Synopses on topics ranging from bat conservation to sustainable agriculture were published online. A parallel tool, Metadataset, allows users to tailor meta-analyses to their own needs4.
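Metadataset's actual methods are more sophisticated, but a minimal sketch in Python illustrates what tailoring a meta-analysis involves (the effect sizes, variances and filter below are invented for illustration):

```python
import numpy as np

# Hypothetical effect sizes and variances for one intervention, each from a
# different study; the numbers are invented for illustration.
effects = np.array([0.42, 0.10, 0.25, -0.05])
variances = np.array([0.04, 0.09, 0.02, 0.12])

# A user "tailors" the synthesis by keeping only the studies relevant to
# their own context (say, their climate zone); a boolean mask stands in
# for that filtering step here.
relevant = np.array([True, True, False, True])
e, v = effects[relevant], variances[relevant]

# Fixed-effect meta-analysis: weight each study by the inverse of its variance.
weights = 1.0 / v
pooled = np.sum(weights * e) / np.sum(weights)
pooled_se = np.sqrt(1.0 / np.sum(weights))

print(f"Pooled effect: {pooled:.3f} ± {1.96 * pooled_se:.3f} (95% CI)")
```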

The number of research fields touched by artificial intelligence (AI) is rising all the time, from weather forecasting and medical diagnostics to science communication. An analysis by Nature shows that the proportion of papers in the Scopus database that mention AI in their titles has risen from around 2% a decade ago.

Automated processes could also help decision-making. A solution-scanning exercise can use AI to create a list of options. Take, for example, policies for reducing shoplifting: asked to list potential policy options, an LLM can identify topics such as employee training and store design, and advisers can then collate and synthesize the relevant evidence in these areas. Such rapid assessments will inevitably miss some options, although they might also find others that conventional approaches would not. And what counts as credible evidence can depend on the policy question, as well as on the context.
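As a minimal sketch of such a solution scan, using the OpenAI Python client (the model name and prompts are illustrative, and any real deployment would need advisers to vet the output):

```python
from openai import OpenAI

client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set

# Ask the model for a broad list of options; advisers would then gather and
# synthesize the evidence behind each one rather than trusting the list as-is.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[
        {
            "role": "system",
            "content": "You are assisting a science adviser. List distinct "
                       "policy options, one per line, with no commentary.",
        },
        {
            "role": "user",
            "content": "List potential policy options for reducing shoplifting.",
        },
    ],
)

print(response.choices[0].message.content)
```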

Science-advice professionals will need to be trained in how best to prompt LLMs to produce the required outputs. Even minor shifts in the tone and context of a prompt can alter the probabilities that the LLM uses to generate a response. Advisers will also need training to avoid inappropriate over-reliance on AI systems, such as when drafting advice on emerging topics for which information is needed rapidly; these might be precisely the areas in which LLMs perform poorly, because of a lack of relevant training data. Science advisers will require a nuanced understanding of such risks.
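That sensitivity to phrasing can be made visible. Some LLM APIs expose token log-probabilities; as a rough sketch (again using the OpenAI client, with an illustrative model and invented prompts), one can compare how two framings of the same question shift the probabilities of the first answer token:

```python
from openai import OpenAI

client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set

# Two prompts that differ only in framing.
prompts = [
    "In one word: do healthy-eating initiatives work in young children?",
    "As a sceptical reviewer, in one word: do healthy-eating initiatives "
    "work in young children?",
]

for prompt in prompts:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
        logprobs=True,        # return token log-probabilities
        top_logprobs=3,
        max_tokens=1,
    )
    first_token = response.choices[0].logprobs.content[0]
    print(prompt)
    for alt in first_token.top_logprobs:
        print(f"  {alt.token!r}: logprob {alt.logprob:.2f}")
```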

Automating the drafting of briefing papers

Many academic journals use standardized formats for reporting study results, but there is great variation across disciplines, and information from international agencies, non-governmental organizations and industry is even more disparate. Such diversity in presentation makes it difficult to develop fully automated methods for identifying specific findings and study criteria. Crucial details, such as the period over which an effect was measured or how large the sample was, are often buried in the text. Presenting research methodology and results more consistently could help. For instance, in medical and life-sciences research, journals published by Cell Press use a structured reporting format called STAR Methods (see go.nature.com/3ptjqcf).
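A hypothetical sketch of what a machine-readable study record could look like; STAR Methods is far richer, but even a few standardized fields would make key details extractable by software:

```python
from dataclasses import dataclass

# Invented fields for illustration; a real reporting standard would specify
# many more, with controlled vocabularies and units.
@dataclass
class StudyRecord:
    intervention: str
    outcome: str
    effect_size: float          # standardized effect estimate
    sample_size: int
    measurement_period_days: int

record = StudyRecord(
    intervention="free fruit at school",
    outcome="daily fruit intake",
    effect_size=0.31,
    sample_size=412,
    measurement_period_days=180,
)
print(record)
```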

Systematic reviews currently have to search across proprietary databases to identify relevant scientific literature, and the choice of database can have a substantial impact on the outcome. Government requirements for funded research to be published as open access could ease access to results. For research topics that governments deem funding priorities, eliminating paywalls would enable evidence databases to be created in line with copyright laws.

In one experiment by the publisher Elsevier, an LLM system was constructed that referenced only published, peer-reviewed research. Although the system managed to produce a policy paper on lithium batteries, challenges remain. The text was dull and pitched at a level of understanding far from that needed for policy briefs, mirroring the language of the papers it drew on rather than offering an original synthesis. Even so, the experiment demonstrated important design principles. For instance, forcing the system to generate only text that refers to scientific sources ensured that the resulting advice credited the scientists who were cited.
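Elsevier has not published its implementation, but the "cite only approved sources" principle can be sketched as a post-hoc filter on generated text (the corpus and draft below are invented):

```python
import re

# Toy corpus of peer-reviewed sources that the system is allowed to cite.
sources = {
    "smith2021": "Smith et al. (2021) on cathode degradation at high temperature.",
    "li2022": "Li et al. (2022) comparing solid-state and liquid electrolytes.",
}

draft = (
    "Lithium batteries degrade faster at high temperatures [smith2021]. "
    "Solid-state designs may improve safety [li2022]. "
    "They will dominate the market by 2030."
)

# Enforce the design principle: reject any sentence that cites nothing,
# or that cites a source outside the approved corpus.
for sentence in re.split(r"(?<=\.)\s+", draft):
    cited = re.findall(r"\[(\w+)\]", sentence)
    if not cited:
        print(f"REJECT (no source): {sentence}")
    elif not all(key in sources for key in cited):
        print(f"REJECT (unknown source): {sentence}")
    else:
        print(f"OK: {sentence}")
```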

Say the UK Parliament commissioned a POSTnote summarizing the latest research on COVID-19 vaccines. Instead of a single publication, POST (the Parliamentary Office of Science and Technology) could produce a multilayered document that automatically tailored itself to different politicians. For example, a politician might receive a version highlighting how people in their constituency contributed to the science of COVID-19 or to vaccine manufacturing, or one presenting infection rates in their own constituency.

Another dimension might be the level of scientific explanation of how vaccines work. Politicians with a science background could receive specialized detail, while those without one could get a lay version, and readers themselves might dial the level of technical detail up or down.
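A minimal sketch of such a layered document, with a reader profile selecting which layer of each section to render (all names and content invented):

```python
# Hypothetical layered briefing: each section stores several levels of
# technical detail, and a reader profile selects which layer to render.
briefing = {
    "how_vaccines_work": {
        "lay": "Vaccines train the immune system to recognize the virus.",
        "technical": "mRNA vaccines deliver transcripts encoding the spike "
                     "protein, eliciting neutralizing antibodies.",
    },
    "local_relevance": {
        "lay": "Local researchers and manufacturers contributed to the effort.",
        "technical": "Constituency trial sites and fill-finish capacity "
                     "supported the national vaccination programme.",
    },
}

def render(detail: str) -> str:
    """Assemble the version of the briefing matching one reader's profile."""
    return "\n".join(section[detail] for section in briefing.values())

print(render("lay"))        # version for a non-specialist reader
print(render("technical"))  # version for a science-trained reader
```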


Bias in language models: where does it come from, and how can institutions and governments work together to promote open science?

Researchers have found that different language models have different political leanings. Some of these biases are picked up from the data on which the models are trained, and they can have implications for how the models perform on certain tasks. Other forms of bias involve race, religion, gender and more.

Such processes would be best conducted by institutions that have clear mechanisms in place to ensure robust governance, broad participation, public accountability and transparency. National governments could build on existing bodies such as the UK What Works Network and the US What Works Clearinghouse. Alternatively, international bodies, such as the United Nations scientific and cultural organization UNESCO, could develop these tools in alignment with open-science goals. International collaboration will be needed: it is key to ensure not just that these tools and the underlying scientific information are available to low-income countries, but also that rigorous, unbiased systems for evidence synthesis are developed consistently, in alignment with national and international policies and priorities.

Policy briefings often contain restricted information, such as the details of a defence acquisition or a public-health study, which needs to remain private until cleared for public dissemination. Advisers who use publicly available tools risk revealing such information to the tool's provider, which is one reason it is difficult to deploy these models in government and the private sector. Institutions will need to establish guidelines for what documents and information can be fed into external LLMs; the most cautious approach is for them to create their own internal models.
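One such guideline can be sketched as a pre-submission check in code (the restriction markings and the draft are invented; real classification regimes are far more involved):

```python
import re

# Hypothetical markings that flag a document as unsuitable for external tools.
RESTRICTED = re.compile(
    r"\b(OFFICIAL-SENSITIVE|SECRET|EMBARGOED|NOT FOR RELEASE)\b",
    re.IGNORECASE,
)

def safe_for_external_llm(text: str) -> bool:
    """Crude check: block any text that carries a restriction marking."""
    return RESTRICTED.search(text) is None

draft = "EMBARGOED: interim results of the vaccine cohort study..."
if not safe_for_external_llm(draft):
    print("Blocked: route this draft to the internal model instead.")
```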

Survey respondents identified corporations as the dominant force in AI development. Companies are valuable contributors to science, technology and innovation, but the scale of their ownership of AI, in terms of both the technology and the human data needed to power it, is greater than in the past. Researchers need access to data, code and metadata, and producers of black-box systems must recognize the necessity of making these available for research if claims about AI are to pass verification and reproducibility tests. Regulators, meanwhile, are still playing catch-up with AI's rapid development.

AI is also being used widely in science education around the world. Students' use of LLM tools in schools is one reason that curricula and teaching methods will need to change.

Meanwhile, AI itself has been changing. Whereas the 2010s saw a boom in machine-learning algorithms that can help to discern patterns in huge, complex scientific data sets, the 2020s have ushered in a new age of generative AI tools pre-trained on vast data sets, with much more transformative potential.