[DGIST Series] The Growing Data Demands in Healthcare

Aging populations around the world are placing a strain on countries’ healthcare systems. In South Korea, figures from Statistics Korea show that life expectancy increased steadily from 62.3 years in 1970 to 83.6 years in 2021. This has contributed to the nation officially becoming an aged society: 17.5% of the population was over 65 as of 2022, and this figure is projected to rise to 20.6% by 2025. Although South Korea is one of the fastest-aging societies in the world, many other countries are facing a similar trend.

Consequently, there is heightened interest both in personal health management and in medical technologies that can ease the demands on healthcare networks. These technologies include personalized health monitoring systems, such as point-of-care ultrasound devices that can continuously observe an individual’s health status, and AI systems that can analyze data such as 3D ultrasound images to predict risk factors so that people can receive treatment quickly. Such advancements have increased the amount of data that needs to be processed, raising demand for memory, including the semiconductor memories used in data centers.

In this latest article in our series from DGIST faculty, Professor Jaesok Yu from the Department of Robotics and Mechatronics Engineering explains the need for solutions such as semiconductor technologies that can manage the data bandwidth of advanced medical systems, including 3D ultrasound imaging and AI diagnostic tools.

Advancement of Ultrasound Imaging Technology

Figure 1. The evolution of ultrasound imaging systems

Ultrasound is the most suitable medical imaging technology for point-of-care testing¹. Its affordability, safety, and ability to be miniaturized enable people to regularly use ultrasound equipment at home. Other medical imaging technologies commonly used in clinical practice, such as magnetic resonance imaging (MRI), computed tomography (CT), and positron emission tomography (PET), are difficult to miniaturize. Moreover, CT and PET use ionizing radiation, which makes them a safety risk for home use. In contrast, ultrasound imaging can not only be miniaturized but is also comparatively safe because it does not rely on radiation. In addition, because ultrasound provides a real-time view inside a patient’s body, it is particularly suitable for monitoring vascular conditions such as strokes that require rapid diagnosis.

¹ Point-of-care testing: A medical test carried out near or in the presence of the patient and without the need to send a sample to a laboratory.

Recent research and development in ultrasound imaging technology has been divided between premium, high-performance systems used in hospitals and point-of-care systems which are portable and used in a variety of settings. In particular, the market for miniaturized devices has grown significantly in recent years and has gained even more attention since the start of the COVID-19 pandemic. For instance, an American startup developed a miniaturized point-of-care ultrasound imaging system called Butterfly IQ which can connect to a smartphone.

Supporting the Data Bandwidth of State-of-the-art Ultrasound Imaging

Cutting-edge ultrasound imaging technologies such as volumetric imaging and ultrafast imaging generate increasingly large amounts of data. This growth is driven by the three-dimensional imaging required for enhanced personalized care.

Today, the most significant limitations of ultrasound systems are that they can only display 2D cross-sectional images and that the results can vary greatly with the operator’s proficiency, a problem known as operator dependency. Consequently, there has been research on developing 3D imaging technology that would reduce operator dependency as much as possible. Although a limited form of 3D ultrasound imaging can now be implemented, a number of technical challenges make it difficult to achieve high-quality 3D images in real time.

One of the main obstacles is data bandwidth. For portable ultrasound systems to produce 3D images, the ultrasound array transducer² must grow from a one-dimensional (1D) linear array of n elements to a 2D planar array of n^2 elements. Given that n, the number of elements in a 1D linear array, is typically between 128 and 256, it is clear that 3D imaging requires significantly more data to be processed. Typical 2D ultrasound imaging systems, which receive data by connecting an analog-to-digital converter (ADC, typically sampling at 40 to 60 MHz) to each element, have a maximum data bandwidth of just several gigabytes per second. Because the element count grows by a factor of n, it can be roughly estimated that 3D ultrasound systems would require bandwidths that reach hundreds of gigabytes per second.

² Transducer: A device that uses piezoelectric materials to convert sound waves into electrical signals and vice versa. The same device both transmits and receives ultrasound waves.
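
The bandwidth estimate above can be reproduced as a back-of-the-envelope calculation. The sketch below assumes n = 128 elements, a 40 MHz sampling rate, and 12-bit samples; the bit depth is an assumption that does not come from the article, and the result is a peak raw rate that ignores any duty cycle or on-probe data reduction.

```python
def raw_data_rate(num_elements: int, sample_rate_hz: int, bits_per_sample: int) -> int:
    """Peak raw ADC output in bytes per second, with one ADC per element."""
    return num_elements * sample_rate_hz * bits_per_sample // 8


N = 128                       # elements in a 1D linear array (article: 128 to 256)
SAMPLE_RATE_HZ = 40_000_000   # 40 MHz, lower end of the range in the article
BITS_PER_SAMPLE = 12          # assumption: typical ADC resolution, not from the article

rate_1d_array = raw_data_rate(N, SAMPLE_RATE_HZ, BITS_PER_SAMPLE)      # 2D imaging
rate_2d_array = raw_data_rate(N * N, SAMPLE_RATE_HZ, BITS_PER_SAMPLE)  # 3D imaging

print(f"1D array, {N} elements:     {rate_1d_array / 1e9:.2f} GB/s")
print(f"2D array, {N * N} elements: {rate_2d_array / 1e9:.2f} GB/s")
print(f"Scale factor: {rate_2d_array // rate_1d_array}x")
```

Under these assumptions the 1D array produces several gigabytes per second and the 2D array roughly n times more, which is consistent with the orders of magnitude quoted above. A different bit depth or sampling rate shifts the absolute numbers but not the factor of n.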

Figure 2. A comparison of 2D and 3D ultrasound imaging systems

For this reason, numerous techniques that provide 3D imaging while reducing the amount of data are being studied, including sparse array imaging, compressive sensing, and deep learning-based image reconstruction. There are also studies on techniques aimed at fundamentally improving bandwidth. Overcoming the limitations of 3D ultrasound imaging will depend on developing technologies that can efficiently transmit and process this rapidly growing volume of data.
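
To illustrate why a sparse array reduces the data load, the sketch below activates only a subset of the elements in a 2D planar array and reports the resulting channel reduction. The random selection, the function name `sparse_element_mask`, and the choice of 1,024 active elements are all hypothetical; practical sparse array designs choose element positions deliberately to preserve image quality.

```python
import random


def sparse_element_mask(side: int, keep: int, seed: int = 0) -> list[list[bool]]:
    """Return a side x side grid marking which elements stay active (hypothetical helper)."""
    rng = random.Random(seed)  # seeded so the layout is reproducible
    active = set(rng.sample(range(side * side), keep))
    return [[(row * side + col) in active for col in range(side)] for row in range(side)]


SIDE = 128    # elements along each edge of the 2D planar array
KEEP = 1024   # assumption: number of elements that are actually read out

mask = sparse_element_mask(SIDE, KEEP)
active_count = sum(cell for row in mask for cell in row)
reduction = (SIDE * SIDE) / active_count

print(f"Active elements: {active_count} of {SIDE * SIDE}")
print(f"Raw data rate reduced by a factor of {reduction:.0f}")
```

Because the raw data rate is proportional to the number of channels being sampled, reading out one element in sixteen cuts the rate by the same factor, at the cost of image quality that the reconstruction method has to recover.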

The Potential and Challenges Facing AI in the Medical Field

Once normalized 3D image data is obtained, it needs to be interpreted in conjunction with relevant monitoring data and be ready to be used immediately for early diagnosis and prediction. Currently, medical professionals take on these duties, which are time-consuming and costly. In light of this, it is anticipated that AI will partially replace or assist these doctors as the technology evolves. However, this integration of AI into the medical field is unlikely to be smooth. Although AI has produced remarkable results in many areas, it faces significant challenges to become widely used in the medical field.

Among the various reasons for the shortcomings of AI medical diagnoses, the main factors are the complexity of data related to certain diseases and the subjectivity involved in interpreting this data. In most cases, the quality of the training datasets determines how well AI performs in the healthcare sector. However, in general only highly trained doctors are able to select, label, and create these datasets, and this is the biggest challenge in developing AI for healthcare. In addition to the limited amount of medical data that exists, refining the data takes an enormous amount of time and money. This, in turn, makes it difficult to obtain high-quality datasets. Even if such datasets are obtained, the data may be interpreted differently, as there is a degree of subjectivity when a person generates the training datasets.

An additional challenge is synchronizing the massive amount of patient background data, such as race, nationality, and culture, that is collected to train diagnostic medical AI. This appears to be the main reason why the diagnostic agreement rates of medical AI systems vary depending on the country of use. There is also the burden of manually entering all of the electronic medical record (EMR) information into the system, which is inconvenient and makes it difficult to learn new information. The difficulty of securing data in the medical field has led to the development of technologies such as unsupervised learning, which do not require labeled reference data and need only large amounts of data to learn. Recent studies have shown that these technologies are attracting much attention by offering performance comparable to supervised learning³ systems.

³ Supervised learning: A subcategory of machine learning and AI which uses labeled datasets to train algorithms to classify data or predict outcomes accurately.
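
To make the contrast with supervised learning concrete, the sketch below groups a handful of unlabeled measurements into two clusters using k-means, a standard unsupervised method. The values are made up for illustration and the function name `kmeans_1d` is hypothetical; this is not the method used by any system described in the article.

```python
def kmeans_1d(values: list[float], iterations: int = 10):
    """Two-cluster k-means on scalar values; no labels are supplied."""
    centers = [min(values), max(values)]  # deterministic starting points
    groups: list[list[float]] = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        # Assignment step: each value joins its nearest center.
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Update step: each center moves to the mean of its group.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Hypothetical, unlabeled measurements that happen to form two groups.
measurements = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, groups = kmeans_1d(measurements)

print("Cluster centers:", [round(c, 2) for c in centers])
print("Cluster members:", groups)
```

No one told the algorithm which measurement belongs to which group; the structure is recovered from the data alone, which is why approaches of this kind are attractive when expert labeling is scarce.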

The Essence of Data in Medical AI

Because the interpretation and synchronization of data present obstacles to AI’s integration into healthcare, improving the collection and utilization of data will be key to advancing AI in the medical field. In the U.S., the importance of healthcare data has been recognized, and multi-institutional collaborations are underway to collect medical data. A prime example of such a collaboration between industry, academia, and hospitals is the Pittsburgh Health Data Alliance. Consisting of Carnegie Mellon University, the University of Pittsburgh, and the University of Pittsburgh Medical Center, the alliance collects healthcare data in the region to support research related to AI in the medical field.

Figure 3. The process of collecting and processing data for AI medical systems

As data grows exponentially in both volume and importance, the most critical aspects of an AI system are efficient computation and the bandwidth needed to move and process large amounts of data. In fact, the main reason AI is in the spotlight today is that it is backed by huge amounts of data and by hardware that can handle this data. However, this was not always the case, as AI faced major challenges on its path to mainstream adoption.

Although AI’s fundamental principles were first proposed in the 1940s, the technology did not perform as expected due to limited hardware resources and minimal data available, causing the development of neural networks to stall. However, around 70 years later a convolutional neural network (CNN) called AlexNet demonstrated the incredible potential of AI. Backed by deep neural networks and big data, AlexNet showcased its capabilities in the 2012 ImageNet Large Scale Visual Recognition Challenge (ILSVRC), which evaluates algorithms for object detection and image classification. The system’s most significant feature was its ability to learn from training data.

In the past, machine learning with large volumes of data was challenging due to difficulties in data collection and limited computing power and bandwidth. However, the development of parallel computing and high-bandwidth hardware made it possible to process huge amounts of data and train complex neural networks. Not only has the amount of data collected increased enormously, but techniques such as data augmentation have also greatly increased the amount of training data by generating modified copies of existing data to improve performance. Indeed, AlexNet has 60 million network parameters and uses two GPUs to perform its massive operations efficiently. Networks continue to grow, as is evident in ultra-large artificial neural networks such as GPT-3, which has as many as 175 billion parameters, nearly a 3,000-fold increase in size over the past decade. As the size of networks and training data increases, so does the need for hardware that can handle them efficiently.
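
The growth figure above, and what it implies for memory, can be checked with simple arithmetic. The sketch below uses the two parameter counts from the text and assumes each parameter is stored as a 16-bit value; that storage format is an assumption for illustration, and the estimate covers the weights only, not activations or optimizer state.

```python
ALEXNET_PARAMS = 60_000_000           # 60 million parameters (from the text)
LARGE_MODEL_PARAMS = 175_000_000_000  # 175 billion parameters (from the text)
BYTES_PER_PARAM = 2                   # assumption: 16-bit storage per parameter


def weight_bytes(num_params: int, bytes_per_param: int) -> int:
    """Memory needed just to hold the network weights."""
    return num_params * bytes_per_param


growth = LARGE_MODEL_PARAMS / ALEXNET_PARAMS
alexnet_gb = weight_bytes(ALEXNET_PARAMS, BYTES_PER_PARAM) / 1e9
large_model_gb = weight_bytes(LARGE_MODEL_PARAMS, BYTES_PER_PARAM) / 1e9

print(f"Parameter growth: about {growth:,.0f}x")
print(f"AlexNet weights:     {alexnet_gb:.2f} GB")
print(f"175B-model weights:  {large_model_gb:.0f} GB")
```

Under this assumption the weights alone grow from a fraction of a gigabyte to hundreds of gigabytes, which is why memory capacity and bandwidth become limiting factors as networks scale.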

Semiconductor Technologies’ Role in Next-Generation Healthcare

In the future, healthcare technology is expected to become personalized, allowing for constant monitoring of an individual’s health and early diagnosis of diseases. This will lead to an exponential increase in the amount of data while the importance of data will also grow, especially when combined with AI. As a result, the future of healthcare technology depends on the ability to collect and efficiently process well-refined data. Hardware, especially innovative semiconductor technology, is essential in this process. However, as current semiconductor memory technologies have reached the stage where nanometer (nm) processes are possible, performance improvements are beginning to reach their limits.

Consequently, global semiconductor companies are researching and investing in innovative next-generation semiconductors to meet this challenge. Different types of semiconductor memory are used depending on the data requirements of the respective systems. Most systems, such as point-of-care devices, typically use Double Data Rate 2 (DDR2) through DDR4 DRAM, while systems that process heavy computational loads with GPUs use Graphics Double Data Rate 5 (GDDR5). Additionally, systems that handle extremely large amounts of data, such as data centers for cloud computing, use High Bandwidth Memory 3 (HBM3) or memory with built-in processing-in-memory (PIM) accelerators. The former has rapid operating speeds and high bandwidth, enabling faster data processing for AI applications, and the latter allows computation to be performed directly in memory. These features make both technologies a good fit for training diagnostic medical AI down the road.
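
The gap between these memory classes can be seen from the usual peak-bandwidth formula: per-pin data rate multiplied by interface width, divided by eight to convert bits to bytes. The speed grades and bus widths below are illustrative configurations chosen for this sketch, not figures from the article, and real systems achieve less than the theoretical peak.

```python
def peak_bandwidth_mb_s(data_rate_mbps_per_pin: int, bus_width_bits: int) -> int:
    """Theoretical peak bandwidth in MB/s for one memory interface."""
    return data_rate_mbps_per_pin * bus_width_bits // 8


# Assumed example configurations (per-pin rate in Mb/s, interface width in bits).
CONFIGS = {
    "DDR4-3200, one 64-bit channel": (3200, 64),
    "GDDR5 at 8 Gb/s, 256-bit bus": (8000, 256),
    "HBM3 at 6.4 Gb/s, one 1024-bit stack": (6400, 1024),
}

results = {name: peak_bandwidth_mb_s(rate, width) for name, (rate, width) in CONFIGS.items()}

for name, mb_s in results.items():
    print(f"{name}: {mb_s / 1000:.1f} GB/s")
```

The wide interface is what sets HBM apart: at a per-pin rate comparable to graphics memory, a single stack already delivers several hundred gigabytes per second, the range that raw 3D ultrasound data and large-scale AI training call for.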

Need for a Paradigm Shift in Healthcare

To manage the growing data in the healthcare field, there needs to be further research into semiconductor technologies beyond transistors, and it will become critical to develop innovative solutions that can fundamentally change the current paradigm. Such game-changing advancements could include accelerators based on silicon photonics that use light for computing and data transmission. As the need for a new age of personalized healthcare grows with the aging of societies around the world, advancements in AI and data collection will prove to be key to realizing the next generation of healthcare.
