The development of artificial intelligence (AI) has transformed many industries by enabling machines to perform tasks that traditionally require human intelligence. The research community is just one of the groups exploring the benefits of AI in analysing content, organising data and more. However, as with any new technology, there are ethical considerations to address when using these tools.
AI is influencing the way we conduct and share research. Firstly, AI is excellent at analysing complex datasets and identifying patterns that may be difficult to detect when reviewing the information manually. By introducing machine learning algorithms, AI software can learn from this data and be trained to make predictions based on these patterns. For example, researchers could use AI to predict the best potential drug molecules from previous data.
Meanwhile, natural language processing software enables researchers to quickly scan volumes of papers to understand a new topic, and other tools exist to both generate and review written and visual content.
As we explore these innovative technologies, we must also consider how and when to use them to maintain integrity in scholarly publishing. To remain ethical and continue building trust in research, the community must demonstrate transparency, accountability and credibility.
Two months after its launch in 2022, artificial intelligence chatbot ChatGPT reached 100 million users. When exploring its capabilities, some used the tool for fun, writing poems or asking for advice, while others used it to provide insights on various topics and generate content. In the scientific community, for example, some have used the tool to help generate parts of their research paper.
While innovative, AI has its limitations: the systems are only as good as the data they are trained on, and any biases or inaccuracies in that data will surface in the generated content. For example, the first AI-generated article in Men’s Health was found to contain numerous inaccuracies and falsehoods, despite the content carrying academic-looking citations. In response, many publishers, such as Nature, updated their editorial policies to restrict the use of large language models (LLMs), such as ChatGPT, in generating content for scientific manuscripts.
The limitations of AI’s performance and transparency mean that humans cannot rely solely on this technology. Researchers, editors and publishers share a responsibility to verify the facts. The Committee on Publication Ethics (COPE) has issued a position statement, and its guidance, along with publisher guidelines, will need to be updated as AI capabilities develop, to cover when it is appropriate and desirable to use AI technology and when it is not.
The resurgence of paper mills — organisations that produce fabricated content — is a common example of AI misuse. While the exact percentage of paper mill articles in circulation is not known, there are significant concerns among publishers that this difficult-to-detect phenomenon undermines the credibility of research publications.
Over the years, various characteristics have been identified that may distinguish paper mill articles, both in their text and in the scientific images they display. These characteristics can help identify suspected paper mill articles. However, reviewing scientific papers for suspect images is time-consuming and not always accurate. In an average manuscript, the number of subimages adds up quickly. A subimage is any single image within a figure, for example one microscopy picture or one line of western blot bands. There may be hundreds of subimages in a paper.
Considering the growing number of papers submitted each day and the number of subimages they frequently contain, manually reviewing scientific papers for image issues is all but impossible. For instance, a manuscript with 350 subimages would require approximately 100,000 separate comparisons if every subimage is checked against every other. A person making 42 comparisons every minute, full-time, would still need a whole week to check the images in just one manuscript.
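The back-of-envelope arithmetic above can be sketched in a few lines of code. The subimage count and reviewer rate are the illustrative figures from the text; treating each directed check (A against B and B against A) as a separate comparison is an assumption made here to match the order of magnitude quoted.

```python
# Back-of-envelope cost model for manually cross-checking subimages.
# Assumptions: 350 subimages, 42 comparisons per minute, 8-hour days,
# and each directed check counted separately (an assumption, not a
# figure from the article).

def directed_comparisons(n):
    """Each of n subimages checked against every other, both directions."""
    return n * (n - 1)

n_subimages = 350
rate_per_minute = 42              # comparisons a diligent reviewer might manage
minutes_per_working_day = 8 * 60

comparisons = directed_comparisons(n_subimages)
working_days = comparisons / rate_per_minute / minutes_per_working_day

print(comparisons)                # 122150 -- on the order of 100,000
print(round(working_days, 1))     # 6.1 -- roughly a full working week
```

The point of the model is the quadratic growth: doubling the number of subimages roughly quadruples the manual workload, which is why checking at scale is infeasible by hand.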
But it’s not just the increasing time investment that puts pressure on publishers. The rapid proliferation of new AI tools, especially those capable of creating or modifying images, has raised significant concerns for publishers and integrity teams about the feasibility of detecting fake content within manuscripts. As AI continues to develop, it will become increasingly difficult to detect these fake images with the naked eye. Even comparing these images against a database of millions of previously published pictures might prove futile, as AI-created images could appear authentic and unique, despite the lack of legitimate data behind them. Integrity experts can no longer rely purely on manual checks and must consider how to develop countermeasures to AI misuse. Identifying these sophisticated fakes requires aid in the form of developments in computer vision technologies and adversarial AI systems.
AI can automate this process to detect instances of misuse or unintentional duplications before publication. Image integrity proofing software Proofig AI, for example, uses computer vision and AI to scan a manuscript and compare images in minutes, flagging any potential issues. Forensic editors can then investigate further, using the tool to find instances of cut and paste, deletions, or other forms of manipulation.
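To illustrate the general idea behind automated duplicate detection (a generic perceptual-hashing technique, not necessarily the method Proofig AI uses), each image can be reduced to a short bit string, and pairs whose hashes differ in only a few bits are surfaced for human review. The tiny 3×3 "images" below are made up for the example.

```python
# Toy sketch of duplicate detection via a perceptual "average hash":
# near-identical images hash to near-identical bit strings, so pairs
# with a small Hamming distance are flagged as suspected duplicates.

def average_hash(pixels):
    """Hash a grayscale image (2D list of 0-255 values) to a bit string."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return ''.join('1' if p > mean else '0' for p in flat)

def hamming(h1, h2):
    """Count the differing bits between two equal-length hashes."""
    return sum(a != b for a, b in zip(h1, h2))

# Hypothetical subimages: img_b is img_a with slight noise, as when a
# panel is reused under a different label; img_c is an unrelated image.
img_a = [[10, 200, 30], [220, 15, 40], [35, 210, 25]]
img_b = [[12, 198, 28], [223, 14, 42], [33, 212, 27]]
img_c = [[200, 10, 220], [30, 240, 20], [210, 15, 230]]

ha, hb, hc = (average_hash(i) for i in (img_a, img_b, img_c))
print(hamming(ha, hb))  # 0 -> flag (a, b) as a suspected duplicate
print(hamming(ha, hc))  # 8 -> clearly different images
```

A production system works on full-resolution images and must also cope with crops, rotations and contrast changes, which is one reason flagged pairs go to a forensic editor rather than being judged automatically.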
For example, in Figure 1 at the top of this post, the frame highlighted in red (tumor – 2µm) is an overlap duplicate of the image highlighted in green (normal – control). It has been incorrectly labelled as coming from the tumour group and the magnification is also incorrect. Once the software points this out, someone can then investigate and review the issues.
The requirement for forensic editors (the ‘human in the loop’) is a crucial part of this process. One of the most common criticisms of AI and algorithms in general is that they are used to automate decision-making, with unforeseen and sometimes catastrophic consequences. Image checking software is not yet 100 per cent accurate, nor is it designed to judge whether image integrity issues are accidental or deliberate; rather, it highlights to the user the images that should be reviewed. It is then the responsibility of the forensic editor to examine and investigate the issues.
AI has many capabilities and will continue to improve, but we also cannot rely on the technology to act ethically of its own accord. As the research community increases its understanding of AI and its applications, integrity experts should collaborate to establish clear guidelines and standards for its use in content generation.
In spite of all these efforts, however, paper mills will persist. Publishers need to consider adopting the most suitable technological solutions available at the time for reviewing manuscripts prior to publication. This, of course, should be complemented by a broader endeavour to develop additional methods to prevent the flourishing of paper mills.
Author Bio: Dr Dror Kolodkin-Gal is a life sciences researcher who specialises in new ex-vivo explant models to help understand disease progression and treatments.