Topics Toward Standardization of PHRs From the Viewpoint of Quality Data Collection
Masami Morita, Senior Researcher, Pharmaceuticals and Industrial Policy Research Institute (PIIPRI)
Takayuki Sasaki, Senior Researcher, Pharmaceuticals and Industrial Policy Research Institute (PIIPRI)
Yasuhiko Nakatsuka, Senior Researcher, Pharmaceuticals and Industrial Policy Research Institute (PIIPRI)
Introduction
With the evolution of digital devices, including IoT, and the development of the communication environment, an environment for accumulating personal health-related data (PHR1)) is being built. In anticipation of future paradigm shifts in healthcare, i.e., from "groups" to "individuals," from "treatment" to "prevention," and from "things" to "services," a wide variety of stakeholders in healthcare are expected to utilize PHRs to provide solutions. In such "data-driven healthcare," data will play a leading role as the "second oil," so to speak. Because the data will be used by a wide variety of stakeholders, the quality and security of the data for secondary use, as well as the "accessibility" of the data, will be important.
As mentioned in Policy Research Institute News No. 57, the "quality" of data in data-driven healthcare, which assumes the use of AI, is not only quality from the perspective of conventional data science (the "closed data" perspective). An "open data" perspective is also necessary, covering "appropriate quality" that meets the needs of data users, "interoperability," which is a precondition for utilization across industries and countries, and "Fit for AI," which takes into account the characteristics of AI. In particular, standardization is a major key to improving interoperability, as seen, for example, in the worldwide introduction of the HL7 FHIR2) standard for medical information.
This paper focuses on the standardization of PHRs, which are expected to be utilized more widely in the future, and examines what measures will be necessary, especially in terms of data "quality."
Standardization of PHR
What is standardization?
According to the Japanese Standards Association (JSA), standardization is "the process of reducing, simplifying, and ordering 'things' and 'matters' that would otherwise become diversified, complicated, and disorganized if left unregulated." The roles (benefits) of standardization include ensuring compatibility and quality, improving production efficiency, promoting mutual understanding, and disseminating technology. In today's world of business globalization and cross-industry collaboration, the role of standardization is becoming increasingly important.
Scope of PHR
Although there is a consensus that PHR broadly refers to "personal health and medical information," a clear definition has not yet been established. As an example, the Ministry of Health, Labour and Welfare (MHLW) has proposed, in its "Study Group on Promotion of PHR for National Health Promotion," a draft definition of PHR as "a system for individuals and their families to accurately grasp health information such as the results of personal health examinations and medication history as electronic records." In addition, as a related matter, the Future Investment Strategy 2018 aims to start providing data on specific health checkups and infant health checkups, in addition to vaccination history, as part of the construction of PHRs5).
However, from the perspective of data distribution for effective health promotion and behavioral change, as well as for the promotion of healthcare-related industries as a whole, PHRs should be viewed as "life course data" of individuals accumulated throughout their lives. Such data include not only health checkup and medication information, but also data acquired at medical institutions, "deep" biological data such as genomic and omics data, and sensor and lifestyle data obtained outside medical institutions through digital tools such as mobile applications and wearable devices (Fig. 1).
Current status of PHR standardization
If PHRs are to be treated as data that includes genomic and behavioral data, the scope will be extremely broad, and it will be difficult to standardize all PHRs simultaneously.
On the other hand, medical examinations and health checkups are already widely conducted by the national government, local governments, companies, etc., and the infrastructure for collecting lifelong data is in place. However, the results of individual medical examinations and health checkups are stored separately and dispersed, and some of the information is not digitized, making it difficult even for the individual concerned to collect and utilize the data, even when a disease occurs (Figure 2). Therefore, efforts are being made to standardize this existing data collection infrastructure. For example, the Japan Medical Health Care Evaluation Council, which comprises 10 organizations involved in health checkups, has distributed a standard format for health checkups, which it claims will enable centralized management of health checkup data from childhood to old age6). The features of this standard format are that it can be used for all medical checkups, that it is expressed as an independent record for each examinee, and that it aims to standardize terminology for imaging findings and other data. A conversion tool, which converts CSV files created by medical checkup providers to the standard format, is distributed free of charge by NIHS. However, since items are set independently by each medical checkup organization and a correspondence table must be created for each one, the cost of setting up the correspondence table is charged (50,000 yen before tax).
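The role of a correspondence table in such a conversion can be illustrated with a minimal sketch. This is not the actual conversion tool or the actual standard format: the provider column names, the standard item names, and the function `convert_to_standard` are all hypothetical and chosen only for illustration.

```python
import csv
import io

# Hypothetical correspondence table: provider-specific CSV column -> standard item name.
# Each checkup organization would need its own table, because items are set independently.
CORRESPONDENCE = {
    "shincho": "height_cm",
    "taiju": "weight_kg",
    "sbp": "systolic_bp_mmHg",
}

def convert_to_standard(csv_text, table, id_column="id"):
    """Return one independent record per examinee, keyed by standard item names."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = {"examinee_id": row[id_column], "items": {}, "unmapped_columns": []}
        for column, value in row.items():
            if column == id_column:
                continue
            if column in table:
                record["items"][table[column]] = float(value)
            else:
                # Columns missing from the table are reported rather than silently dropped.
                record["unmapped_columns"].append(column)
        records.append(record)
    return records

# Hypothetical provider file with one column ("memo") that the table does not cover.
sample = "id,shincho,taiju,sbp,memo\nA001,170.5,65.2,128,none\n"
converted = convert_to_standard(sample, CORRESPONDENCE)
print(converted[0])
```

The sketch suggests why the setup cost falls on the table rather than on the tool: the conversion logic is generic, while the mapping has to be written separately for every provider.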
Separately, the National Institute of Health Sciences provides free of charge standardization software for specific health checkups and health guidance7). When utilizing this standardization software, it is possible to standardize not only basic elements such as physical measurements, medical examinations, and blood pressure data, but also data not included in medical examinations and health checkups such as cancer screening, biometric tests, and immunological tests (Figure 3). In addition, this format will be output in XML format in accordance with international standards for medical data, which will enable collaboration among insurers and utilization in medical care.
In its report "Report on Standardization of Health Care Information Linkage," the PHR Association of Japan cites medical checkup information as the most voluminous and nationally accumulated personal health record, and is working to standardize medical checkup information, taking into account the finite retention obligations of businesses, etc. and the need for permanent retention8). The association has provided specifications for recording the details of medical checkup results and data format conversion tools.
In the area of medical care, last year a recommended setting was put together by four academic societies that have established guidelines for the treatment of lifestyle-related diseases (diabetes, hypertension, dyslipidemia, and chronic kidney disease), namely the Japan Diabetes Society, the Japan Society of Hypertension, the Japan Arteriosclerosis Society, and the Japan Society of Nephrology, together with the Japan Society of Clinical Laboratory Medicine, which is concerned with the standardization of measurement methods and data for specimen tests, and the Japan Medical Informatics Society, which promotes the standardization and utilization of medical information in general. This recommended setting is characterized by the development of a set of lifestyle-related disease self-management items for both those who have not yet developed lifestyle-related diseases and those who have (Figure 4).
As described above, standardization of PHRs has begun in medical examinations and health checkups widely conducted by the national government, local governments, and companies, such as infant and other health checkups and specified health checkups, as well as in some medical fields such as lifestyle-related diseases. However, these efforts focus mainly on standardizing formats and items for linking data, in other words, the "packages" for storing data; in terms of the roles described by the Japanese Standards Association, they can be said to center on "ensuring compatibility." When it comes to integrating and analyzing data, the "content" of the data is also crucial. Therefore, from the perspective of whether the accuracy of the data can be maintained, the viewpoint of "standardization to ensure data quality" is also necessary.
Necessity of Standardization of Measurement Methods
Several years have passed since calls to utilize big data began, and it has been jokingly pointed out that big data of poor quality causes big mistakes, or that deep learning can be deeply wrong10).
This perspective is equally true of healthcare data. For example, many of you have probably had your lung capacity measured during a physical examination. The accuracy of respiratory function tests is known to vary depending on the patient and the technologist, and it has been shown that some indices, such as EV (Extrapolated Volume) and PEF-time (Peak-Flow-Time), vary depending on the technologist's years of experience11). From the viewpoint of data management, this is an inter-rater difference. However, there are many other factors that must be considered in terms of data quality in order to conduct meaningful analysis, such as differences between instruments, devices, facilities, and consumable lots, day-to-day variation, and the accuracy of records.
In another example, when data from the U.S. National Cancer Institute's cohort (TCIA; The Cancer Imaging Archive) and data from Japan's National Cancer Institute were clustered together for two types of cancer, there were cases where the data were grouped by facility rather than by cancer type, and cases where, when AI learned from data spanning multiple facilities, "facility" was extracted as a feature12). In another case, when three types of wristband-type wearable devices were worn on the wrist at the same time to measure vital data, all three showed different values7). In other words, such differences may be manageable when data is collected and fed back within a controllable range under limited conditions, such as in a specific facility or with a specific device, but the "quality of measurement" must still be kept within an acceptable range when data is collected widely from the public, analyzed, and used for individual intervention and outcome evaluation. To that end, it is necessary to standardize procedures and equipment to the extent possible, and to have data users indicate the acceptable quality and discuss it with the data collectors.
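As a rough illustration of how a data user might state an acceptable quality and check simultaneous device readings against it, consider the following sketch. The readings, device names, and tolerance value are invented for illustration and are not taken from the measurements cited above.

```python
# Hypothetical heart-rate readings (beats per minute) taken at the same four
# time points by three wrist-worn devices. The values are illustrative only.
readings = {
    "device_a": [62, 64, 70, 75],
    "device_b": [65, 66, 74, 80],
    "device_c": [60, 63, 69, 73],
}

def spread_per_timepoint(data):
    """Largest minus smallest reading across devices at each time point."""
    return [max(values) - min(values) for values in zip(*data.values())]

def within_tolerance(data, tolerance):
    """True if the devices agree within the tolerance at every time point."""
    return all(spread <= tolerance for spread in spread_per_timepoint(data))

spreads = spread_per_timepoint(readings)
print(spreads)                                   # [5, 3, 5, 7]
print(within_tolerance(readings, tolerance=5))   # False: the last time point differs by 7
```

The tolerance here stands in for the "acceptable quality" that the data user would indicate; the appropriate value depends on the purpose and would need to be agreed with the data collectors.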
Current Status of Standardization of Measurement Methods
Measurement methods for PHRs obtained using digital tools have been established mainly by the companies that provide wearable devices, sensors, etc., due to the nature of the technology. Their usefulness as biomarkers has been confirmed by ensuring the robustness of each measurement individually and by verifying consistency with existing non-digital measurement methods. However, many wearable devices for obtaining vital data, for example, have not been reviewed by the FDA, and it has been pointed out that consumers may not be able to determine whether the vital data they collect to manage their daily health is reliable in the eyes of their physicians13).
On the other hand, biomarkers, which are relatively "deep" information, are information from inside a living body and have a strong biotechnological component. Therefore, "wet" factors such as the collection of biological samples and the assurance of robust analytical methods are important. Standardization of methods for handling biological samples and of analytical methods, including pretreatment, is being promoted by first "describing the conditions under which the analysis was conducted." International efforts to standardize biobank sample metadata include SPREC (a hierarchical, common code for information such as time from sample collection to frozen storage, centrifugation conditions, and processing temperature), BRISQ (a framework for improving the quality of research reports), and MIABIS (minimum data items to standardize biobank information), and the introduction of such international standards is under discussion in Japan14).
In Japan, on the other hand, quality assurance for biospecimen information is still considered insufficient. For example, according to a survey conducted by the Japan Agency for Medical Research and Development (AMED), the lack of quality control information and its disclosure ranks second among the bottlenecks in the use of biological specimens15). In the Clinical Innovation Network Promotion Project promoted by the MHLW and AMED, the quality of disease registries is an important perspective for promoting their utilization by companies conducting research and development of pharmaceuticals, etc., and where biological specimens are banked, it is desirable to enhance accompanying information such as collection methods. The MHLW also supports Biobank Japan (BBJ), the National Center Biobank Network (NCBN), and Tohoku Medical Megabank16), and the Biobank Liaison Committee and others are jointly discussing initiatives to obtain high-quality biospecimens as described above. The "methods for obtaining high-quality biological data" accumulated in these efforts should be applied to individual disease registries, for example, through the establishment of registry management standards, which will be a necessary element for high-quality research in the future.
Enhancement of metadata
In the above section, we introduced the current status of the handling of biological samples, especially from the viewpoint of standardization of measurement methods, including preprocessing. On the other hand, considering PHRs in general, the range of users is much wider, and the quality required by data users differs significantly depending on their purposes. Therefore, in practical terms, it is important to always associate ancillary information such as sampling, preprocessing, device type, measurement method, and measurement environment, in other words, "data about data (metadata)," with the main data, and the first step is to standardize the structure of the metadata and the way it is described (a common language: ontology). In other words, rather than standardizing everything, what is needed is a system that allows data users to refer to the metadata and decide whether the data is of usable quality for their own purposes.
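The idea of letting each data user judge usability from metadata can be sketched as follows. The metadata keys, their values, and the `usable` function are hypothetical and do not correspond to any existing standard or ontology.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Measurement:
    """A main data value that always travels with its metadata."""
    item: str
    value: float
    unit: str
    metadata: dict

def usable(measurement, required):
    """True if every metadata key the user requires is present with an accepted value."""
    return all(
        measurement.metadata.get(key) in accepted
        for key, accepted in required.items()
    )

# Hypothetical home blood-pressure reading with illustrative metadata keys.
bp = Measurement(
    item="systolic_bp",
    value=128.0,
    unit="mmHg",
    metadata={"device_type": "upper_arm_cuff", "setting": "home", "posture": "seated"},
)

# Two hypothetical data users with different quality requirements.
home_monitoring_study = {"device_type": {"upper_arm_cuff"}, "posture": {"seated"}}
clinic_only_study = {"device_type": {"upper_arm_cuff"}, "setting": {"clinic"}}

print(usable(bp, home_monitoring_study))  # True
print(usable(bp, clinic_only_study))      # False
```

The same record is accepted by one user and rejected by another, which is the point of the approach: the data itself is not forced to a single standard, but the metadata vocabulary must be shared so that such checks are possible.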
A typical example of an effort to make this possible is the use of RDF (Resource Description Framework) in the field of genome science. RDF is a Semantic Web17) technology, originally proposed by the World Wide Web Consortium (W3C) as an international standard format to make information on the Internet easier for computers to process, and it has been increasingly used in the bioscience field in recent years. In Japan, the Database Center for Life Science (DBCLS) and the National Bioscience Database Center (NBDC) are promoting data integration using RDF, which has also been incorporated into TogoVar18), the Japanese Genome Diversity Integration Database that began operation in 2018. In order to share life science knowledge and clinical information, it is necessary to prepare and analyze data in machine-readable form, including the metadata attached to the data, and the way metadata is described may become an important factor in ensuring quality data in the future.
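To give a sense of what machine-readable description in RDF looks like, the sketch below writes a measurement and its metadata as subject-predicate-object triples in the N-Triples syntax, using only the Python standard library. The `http://example.org/phr/` namespace and all property names are hypothetical; real data would use terms from an agreed ontology.

```python
# Hypothetical namespace for illustration; not an existing vocabulary.
EX = "http://example.org/phr/"
XSD = "http://www.w3.org/2001/XMLSchema#"

def iri(name):
    """Wrap a name from the hypothetical namespace as an N-Triples IRI."""
    return f"<{EX}{name}>"

def literal(value, datatype=None):
    """Format a literal, escaping backslashes and quotes, with an optional XSD datatype."""
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{text}"^^<{XSD}{datatype}>' if datatype else f'"{text}"'

# Each statement is one (subject, predicate, object) triple.
triples = [
    (iri("measurement/1"), iri("item"), literal("systolic_bp")),
    (iri("measurement/1"), iri("value"), literal(128, "decimal")),
    (iri("measurement/1"), iri("measuredWith"), iri("device/upper_arm_cuff")),
]

def to_ntriples(rows):
    """Serialize triples one per line, each terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in rows)

ntriples = to_ntriples(triples)
print(ntriples)
```

Because the main value and its metadata are expressed in the same triple form, a program can follow the link from a measurement to the device that produced it without knowing the layout of any particular file.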
Summary
As the paradigm shift in healthcare progresses, the quality of PHRs as personal health-related data will also become important.
When looking at PHRs from the viewpoint of quality, there are two perspectives: the "package," which includes format, structure, and terminology (i.e., ensuring compatibility from an informatics perspective), and the "content," which is the data itself. To reconcile these with implementation aspects such as cost, time, and complexity, it is important first of all to make data collection more convenient for data collectors (administrators) and to enhance "data accompanying the data (metadata)" so that data users can decide whether the data can be used.
In building such a data collection platform, it is necessary to construct a platform with interoperability in order to spread the concept of quality data and prevent the proliferation of different standards, formats, etc. In particular, unlike medical care and medical checkups, PHRs have few existing platforms and are expected to be used in a wide range of other industries.
In addition, as the potential for development of the healthcare industry expands worldwide, it will be necessary to address cyber security and architecture (design of systems, laws, and regulations, etc.) while promoting harmonization with international standards with a view to international expansion and international collaboration19). We hope that such discussions will progress in Japan as well.
-
As of December 2023
The National Institute of Biomedical Innovation Policy established the "Big Data Utilization and Study Group in the Medical and Health Fields" within the institute in July 2015 to study issues related to big data for the pharmaceutical industry. This report is based on the research and study of the "Study Group," including a lecture by Dr. Hiroshi Mizushima of the Research Information Support and Research Center, National Institute of Health Sciences.
-
1)Personal Health Record
-
2)Fast Healthcare Interoperability Resources
-
3)"What is Standardization?" (viewed on 2019.9.18)
-
4)
-
5)'Future Investment Strategy 2018.' (viewed on 2019.9.18)
-
6)'On the Standard Format for Health Examinations' (viewed on 2019.9.18)
-
7)From materials provided by Dr. Hiroshi Mizushima, Research Information Support and Research Center, National Institute of Health Sciences
-
8)
-
9)HP of the Japan Society for Medical Sciences (viewed on 2019.9.18)
-
10)From the lectures by Sadahide Ezaki, Ministry of Economy, Trade and Industry, at "The 3rd Digital Health Symposium" and "The 1st Symposium of the Open Innovation Promotion Council in the Age of IoT/AI."
-
11)"Validity index of effort lung capacity measurement and its relationship with patients and laboratory technicians," Medical Laboratory Vol. 63, No. 6, 2014
-
12)From the lecture by Ryuji Hamamoto and Jun Zeze at the HiDEP Special Symposium (Sept. 13, 2019)
-
13)
-
14)The 4th Biobank Liaison Meeting (held on June 9, 2018)
-
15)From the 7th Biobank Liaison Meeting (held on 2019.4.20) Japan Agency for Medical Research and Development materials. Note that the No. 1 issue was "difficulty in ethics application procedures".
-
16)From the Ministry of Health, Labour and Welfare, 9th Health Science Council (Clinical Research Subcommittee) materials (held on 2019.1.23).
-
17)A system that enables computers to collect and process information autonomously by adding metadata to web pages according to certain rules.
-
18)Japan Science and Technology Agency (JST) Press Release (viewed on 2019.9.18)
-
19)Policy Research Institute News No. 58 "Overview of "Security and Privacy" Regulations for Health Data Platforms in Europe and the United States"
