December 2020

Do you pronounce “data” like “data” or “data”?
In my high school history class, my teacher asked us what piece of modern technology had influenced and shaped the lives of everyone as much as the advent of the automobile. This was prior to 2010, and we offered “TV!” “the internet!” or “Computers!” He wouldn’t accept any as the answer because automobiles have had such an extensive impact on our daily lives, such as where and how we live, how our cities are built, where we work, where we travel, and so on and so forth. At that time, it didn’t seem like anything else could have that kind of impact.

But it’s 2020. The connection brought by the internet has been creeping into our lives for years, but it still seemed like a choice to engage with it or not. You could still choose paper, you could still pay with cash. From what I can see, this year is the beginning of some hefty and life-altering changes in our daily lives—how we interact and learn and shop and, not insignificantly, how we conduct science.

With all that it can offer, technology is giving us an opportunity to question the way things have been done. Jobs, businesses, and conferences have all been based on people’s need to be present in person for the job to get done. Phone calls and video chats are well established but have never been used the way they are now. Now, people from age 4 to 94 are logging into Zoom to work, to chat with friends, to learn at school, etc.
 
Being forced into this, as any elementary school teacher would tell you, is not really working. For lack of a better analogy, we are putting new wine into old wineskins, and it is high time we reinvent the existing framework to accommodate something entirely new, particularly for collaborating on and sharing scientific data.

The National Institutes of Health (NIH), ever at the forefront of research, is planning to confront these problems. One of the five basic tenets of science is reporting! What good is all the well-conducted science in the world if it’s never published and no one can see the results? No good. That’s one reason the phrase “publish or perish” exists: reporting findings is essential to the health of a robust scientific community. Furthermore, in light of the current race to a COVID-19 vaccine, it is essential that data be not only up to date but also easily accessible.

In recent years, NIH has been doing the work to update their Policy for Data Management and Sharing so that it is current and effective. On October 29, they released the updated policy, which will take effect in early 2023. Of course, their data management policy only affects those who are funded by the NIH, but as the premier biomedical science-funding organization, they fund approximately $42 billion worth of research annually. There are a few points of their policy that I’d like to highlight:
  • NIH requires researchers to prospectively plan for how scientific data will be preserved and shared through submission of a Data Management and Sharing Plan.
  • NIH strongly encourages the use of established repositories to the extent possible for preserving and sharing scientific data.
  • Shared scientific data should be made accessible as soon as possible, and no later than the time of an associated publication, or the end of performance period, whichever comes first.
  • Data sharing enables researchers to rigorously test the validity of research findings, strengthen analyses through combined datasets, reuse hard-to-generate data, and explore new frontiers of discovery. 
Previous data management policies have insisted that projects funded by the NIH lead to published articles. Like any investor, the NIH wants to see a return on its investments, and publications are scientific currency. These articles are peer-reviewed and often tell a significant and interesting story. It’s typically somewhat of a challenge to get published (you’ll always have that one scientist who gets accepted as is... *eye roll*).
 
Because of the high standards set forth by most publications, it’s not uncommon for researchers to withhold data that is not viewed as impactful, is statistically insignificant, or could become an integral part of a different, future study. Peer-reviewed publication is a tried-and-true practice, but one that takes time, as it is a product of an earlier, much slower-paced era. Publications have made changes over the years, shifting to a heavy online presence, and there are increasingly accessible ways to publish data with or without the peer review process. But it has not been enough to meet the rigors and demands of modern science.

The new Policy for Data Management and Sharing defines data as “the recorded factual material commonly accepted in the scientific community as of sufficient quality to validate and replicate research findings, regardless of whether the data are used to support scholarly publications.” I believe the inclusion of data that is not intended for publication is new since the previous policy (circa 2008). In my novice opinion, this could have some interesting ramifications. In light of medical and COVID-related advancements, as well as the connected “at-your-fingertips” way we live our lives, it is reasonable to expect that accurate information be available as soon as it can be. I think the policy has the capacity to encourage greater collaboration and synergy to reach new heights, scientifically and medically. I think there needs to be a way to access even “unpublishable” data, because I believe non-significance tells just as great a story as significance.

In addition, this new policy is going to be a massive driving force to create avenues for making these data available to the public. Sure, there are sites like PubMed to make journal articles accessible, but it seems that what the NIH is asking for will require a great deal of innovation and creativity to initiate and implement. As the saying goes, “necessity is the mother of invention,” and I think the new standards set forth will instigate some inventive problem solving. I also hope that these changes will contribute to bringing scientific publishing into a format that will not only fit the current way we live our lives, but be sustainable for a future of continued advancement.

With the hope of a paradigm shift in the way scientific data is published and made available, there are also some components that we, as scientists, should be vigilant to preserve. Peer review is not just a fiery hoop to jump through, invented because researchers worldwide felt they had too much free time and wanted to fill it with reading and critiquing other scientists’ papers. It is an essential rigor of the field, holding scientists and researchers accountable and pushing them to produce work that is meaningful, reproducible, and novel. As scientists, we need to make sure that the NIH’s requirement to make all data available does not result in a loss of rigor and accuracy. Additionally, published research includes careful, critical analysis and conclusive results. Making all data available to the public could provide many more opportunities for wild speculation, so it is essential that NIH policymakers bear this in mind.

Overall, I feel that the scientific community is on the brink of transformation in data sharing. While publishing will continue to evolve and adapt with the times, I do believe it is time to leave behind the current framework and develop something new. I don’t think that my high school teacher can ignore the effects of the internet on our lives any longer!

To read the NIH Policy for Data Management and Sharing yourself, click here.

-Kalen Johnson

Kalen is a doctoral student in the Department of Educational Psychology.
