Do androids dream of big data domination?


The incredible exponential growth in the volume of research data shows no signs of slowing down.

Take the databases used by the European Bioinformatics Institute, headquartered in Cambridge, which continue to double in size every one to two years. And, thanks to powerful digital processors, advances in materials science and innovations around data transfer, the rate of increase is still accelerating.

Most researchers seem adept at managing these progressively more complex and varied datasets, with some already using algorithms to detect otherwise imperceptible patterns and connections. But many organisations have failed to respond – most notably, by not giving staff more resources to wade through this data deluge. As a result, the pressure on staff to keep up will likely result in cut corners, unintentional errors and possibly even wrongdoing.

Researchers must have more “thinking time” to analyse data. Asking them to work longer hours is not practical, however, and would lead to a marked decline in publication quality and personal well-being. One solution could simply be longer project timelines. But there is also a case for more team science, incorporating greater specialisation among the individuals involved.

Recent research from Gallup on workplace productivity shows that professional talent varies, and there may be merit in playing to one’s strengths. Professorial tracks in “thinkery” (hypothesis development and data analysis), “tinkery” (experimental practice and generating data), “linkery” (collaboration relations and facilitating idea exchange) and “inkery” (communication) might lead to more efficient project management, with complementary skills provided by different scholars.

The single all-powerful project leader is already being replaced by dual or multiple PIs with varying disciplinary expertise, so it’s not a stretch to consider experts who specialise by cognitive strength as well. The power of this approach is demonstrated by the internet of things, where connected specialist technologies provide greater benefit than the sum of the parts.

The hive mind may also help to manage the increasing data load. Two heads are better than one, and 10 would offer a step change in data analysis efforts. However, these in-depth exchanges will require time, while research contracts rarely allow for assisting colleagues with their research questions. Furthermore, competition for resources and reward tends to work against such a sharing culture. But Twitter is already home to a burgeoning online sharing culture, with researchers tapping into collective knowledge. This public sharing of ideas, questions and suggestions will increase in academia as trust grows in the security and protection technologies that will ensure provenance, authorial credit and time-stamping.

Technology can offer us far more, however. It could be routinely used for non-cognitive tasks, freeing up time for things only humans can do.

Generic project management software can be adapted for planning and tracking milestones and generating budgets. Virtual assistants are available to schedule appointments, and dictation apps that convert speech to text can reduce the time spent on tedious yet necessary tasks such as record-keeping.

For research, several artificial intelligence-powered “speed-readers” now perform literature searches and compile reading lists, even suggesting potential areas of investigation. This presents an opportunity to review how researchers could be more efficient at cognitive tasks and how they can use technology to expand and augment their research capacity. Computer systems already have greater processing power than the human brain. They cannot yet perform the nuanced contextual thinking required for data analysis, drawing conclusions and devising hypotheses, but that time is coming.

IBM’s Watson arguably represents the state of the art in cognitive computing, which aims to develop systems that adapt, iterate, interact with and understand contextual elements. Life sciences research is a major application target for Watson Health. The “first machine-generated” book was published in April 2019, with the AI named Beta Writer authoring Lithium-Ion Batteries: A Machine-Generated Summary of Current Research. Hypothesis generation has been demonstrated by the Adam and Eve robot scientists and commercial AI enterprises such as Euretos.

However, clarity on principles and ethical considerations must be a priority. Now is the time to begin thinking seriously about what research tasks we want machines to perform, and why. Is it desirable for a machine to cognitively outperform a researcher? And where would responsibility reside if a machine determines research directions, particularly when outcomes are irreversible and have significant impact on humans and other sentient beings?

We need to determine the sweet spot between overstretched humans and overpowerful machines. And that is very much a research project on which humans should take the lead.

Author Bio: Kristen Sadler is an independent adviser, combining multinational expertise in university research, teaching and leadership with interests in emerging tech and “Society 5.0”.