Deep learning, the leading machine learning tool: how will it change the field of medical imaging?

Deep learning is a rapidly growing trend in data analysis and was named one of the 10 breakthrough technologies of 2013 [1]. It extends the classical neural network by adding more layers of computation, which enables higher levels of abstraction and better predictions from data [2]. It is already becoming the leading machine learning tool in general imaging and computer vision.

In particular, convolutional neural networks (CNNs) have proven to be powerful tools for many computer vision tasks. Deep CNNs automatically learn mid-level and high-level abstractions from raw data (e.g., images). Recent results indicate that generic descriptors extracted from CNNs are extremely effective for object recognition and localization in natural images. Medical image analysis groups around the world are rapidly entering the field, applying CNNs and other deep learning methods to a wide range of applications, and many promising results are emerging.

In medical imaging, accurate diagnosis and assessment of disease depend on both image acquisition and image interpretation. In recent years, devices have been able to acquire data at faster rates and higher resolution, greatly improving image acquisition. Improvements in image interpretation through computer technology, however, are only beginning. Today most medical image interpretation is performed by physicians, yet human interpretation is limited by subjectivity, large variation across interpreters, and fatigue. Many diagnostic tasks require an initial search process to detect abnormalities and to quantify measurements and changes over time. Computerized tools, in particular image analysis and machine learning, are key enablers for improving diagnosis: they support the expert workflow by helping to identify findings that require treatment. Among these tools, deep learning is quickly proving to be the foundation for the best-performing and most accurate methods; it has opened new doors in data analysis and is advancing at an unprecedented pace.

A. Neural networks: a historical perspective

The basic ideas behind neural networks and deep learning have existed for decades [3]. Early networks usually had only a few layers. The emergence of the backpropagation algorithm led to a significant increase in neural network performance, but performance was still insufficient. Other classifiers evolved, including decision trees, boosting, and support vector machines. Each of these has been applied to medical image analysis, especially for detecting abnormalities, and also to related tasks such as segmentation. Despite this progress, relatively high false positive rates remained common.

CNNs (convolutional neural networks) were applied to medical image processing as early as 1996, in the work of Sahiner et al. [4]. In that work, regions of interest (ROIs) containing biopsy-confirmed masses or normal tissue were extracted from mammograms. The CNN had an input layer, two hidden layers, and an output layer, and was trained with backpropagation. In this pre-GPU era, training was described as "computationally intensive," although no specific training time was reported. Even earlier, in 1993, a CNN was used for lung nodule detection [5], and in 1995 a CNN was used to detect microcalcifications in mammograms [6].

A typical CNN for image processing consists of a series of layers of convolution filters interspersed with data reduction or pooling layers. A convolution filter processes a small patch of the input image. Analogous to low-level pixel processing in the human visual system, convolution filters detect highly relevant image features, such as lines that may represent sharp edges (useful, for example, for organ detection) or circles that may represent round objects (such as colon polyps), followed by higher-order features such as local and global shape and texture. The output of the CNN is typically one or more probabilities or class labels for the image. The convolution filters are learned directly from the training data. This is exactly what is needed, because it reduces the need for time-consuming hand-crafted features. Without learned convolution filters, application-specific filters and the features to be computed from them have to be designed by hand in an image pre-processing stage.
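As a concrete illustration of this layer structure, the following is a minimal sketch in PyTorch of a small patch classifier: convolution filters interleaved with pooling layers, followed by a layer that outputs class probabilities. The patch size (32 × 32), channel counts, and the two output classes are illustrative assumptions, not values taken from any paper discussed here.

```python
# Minimal sketch of the layer structure described above: convolution filters
# interleaved with pooling, followed by a classifier that outputs class
# probabilities. All sizes are illustrative assumptions.
import torch
import torch.nn as nn

class SmallPatchCNN(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),   # low-level filters (edges, blobs)
            nn.ReLU(),
            nn.MaxPool2d(2),                               # data reduction / pooling
            nn.Conv2d(16, 32, kernel_size=3, padding=1),   # higher-order features (shape, texture)
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)
        # Softmax gives per-class probabilities at inference time; for training
        # with nn.CrossEntropyLoss one would return the raw scores instead.
        return torch.softmax(self.classifier(x), dim=1)

# Example: classify a batch of four 32x32 grayscale patches.
probs = SmallPatchCNN()(torch.randn(4, 1, 32, 32))
print(probs.shape)  # torch.Size([4, 2])
```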

CNNs are highly parallelizable algorithms. A large part of their practical utility comes from the large speed-up (roughly 40 times) obtained by running them on graphics processing units (GPUs) rather than on CPUs alone. An early paper describing the value of GPUs for training CNNs and other machine learning techniques was published in 2006 [8]. In medical image processing, GPUs were first introduced for segmentation, reconstruction, and registration, and only later for machine learning [9], [10]. Interestingly, although Eklund et al. [10] discussed convolution extensively in their 2013 paper, convolutional neural networks and deep learning were not mentioned at all. This highlights how rapidly the deep learning revolution has reshaped medical image processing research.

B. Networks today

Deep neural networks have recently gained considerable commercial interest, driven by new CNN variants and by efficient parallel solvers optimized for modern GPUs. The power of a CNN comes from its deep architecture, which allows it to extract a set of discriminating features at multiple levels of abstraction. Training a deep CNN from scratch, however, is a major challenge. First, CNNs require large amounts of labeled data, which is hard to obtain in the medical domain because expert annotation is expensive and disease samples (e.g., lesions) are rare. Second, training a deep CNN requires extensive computational and memory resources, without which training becomes extremely time-consuming. Third, training a deep CNN is often complicated by overfitting and convergence problems, which typically require repeated adjustment of the network's architecture or learning parameters to ensure that all layers learn at comparable speeds. In view of these difficulties, new learning schemes known as "transfer learning" and "fine-tuning" have been proposed and are being adopted by a growing number of groups. They are discussed further in Section II-C.

C. Networks in the medical imaging field

Deep learning methods are most effective when applied to large training sets, but such large data sets are not always available in the medical domain. We therefore face a number of key questions, including: (a) Can deep neural networks be used effectively for medical tasks? (b) Is transfer learning from general imagery to the medical domain relevant? (c) Can we rely on learned features alone, or should we combine them with hand-crafted features? This IEEE Transactions on Medical Imaging (IEEE-TMI) special issue on deep learning in medical imaging focuses on the progress of this new era of machine learning and its role in medical image processing. The issue presents recent achievements of CNNs and other deep learning applications in medical tasks. It contains 18 articles selected from 50 submissions from investigators around the world, a high number for an IEEE special issue given the short period between the call for papers and the submission deadline. The papers address a large set of traditional tasks, from detection to classification (e.g., lesion detection, image segmentation, shape modeling, image registration), as well as open and novel applications. The issue also includes work focused on exploring the networks themselves, offering perspectives on how tasks, parameters, and training sets should be chosen.

Overview of the articles and topics in the special issue


A. Disease detection

Computer-aided detection (CAD) is a well-established area of medical image analysis and one that is highly suited to deep learning. In the standard CAD paradigm [11], candidate lesions are detected by supervised methods or by classical image processing techniques such as filtering and mathematical morphology. The candidates are usually segmented and described by a large set of hand-designed features, and a classifier maps each feature vector to the probability that the candidate is an actual lesion. A straightforward way to use deep learning instead of hand-designed features is to train a CNN that operates directly on image patches centered on the candidate lesions. This approach is used in several articles in this issue. Setio et al. [12] combined three previously developed candidate detectors to obtain lung nodule candidates from 3D chest CT scans and extracted 2D patches in nine different orientations centered on each candidate. Each candidate was then classified by combining the outputs of different CNNs. This approach is reported to give a modest improvement over previously published classical CAD results for the same task.
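The following hypothetical sketch shows the generic shape of such a pipeline: a candidate detector supplies positions, a patch is extracted around each candidate, and a trained CNN (such as the SmallPatchCNN sketched earlier) assigns a lesion probability. The candidate list, patch size, and model are stand-ins for illustration, not any specific system from the papers discussed here.

```python
# Hedged sketch of the standard CAD pipeline with a CNN classifier: a candidate
# detector supplies (row, col) positions, a patch centered on each candidate is
# extracted, and a trained CNN maps each patch to a lesion probability.
import numpy as np
import torch

def extract_patch(image: np.ndarray, center: tuple, size: int = 32) -> np.ndarray:
    """Extract a size x size patch centered on `center`, zero-padding at borders."""
    half = size // 2
    padded = np.pad(image, half, mode="constant")
    r, c = center[0] + half, center[1] + half
    return padded[r - half:r + half, c - half:c + half]

def score_candidates(image, candidates, cnn):
    """Return one lesion probability per candidate position."""
    patches = np.stack([extract_patch(image, c) for c in candidates])
    batch = torch.from_numpy(patches).float().unsqueeze(1)   # (N, 1, 32, 32)
    with torch.no_grad():
        probs = cnn(batch)                                    # (N, num_classes)
    return probs[:, 1].numpy()                                # probability of the "lesion" class

# Usage with the SmallPatchCNN sketched earlier and dummy data:
# scores = score_candidates(np.zeros((512, 512)), [(100, 200), (300, 40)], SmallPatchCNN())
```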

Roth et al. [13] used CNNs to improve three existing CAD systems: detection of colonic polyps on CT colonography, and detection of sclerotic spinal metastases and enlarged lymph nodes on body CT. They likewise used previously developed candidate detectors, together with three orthogonal 2D patches and up to 100 randomly rotated views; these randomly rotated "2.5D" views are a way of decomposing the original 3D data. Additional accuracy was then obtained by aggregating the CNN predictions over the 2.5D views. For all three CAD systems, the sensitivity of lesion detection improved by 13–34%, indicating that the approach generalizes across applications. Such a level of improvement would be very hard to achieve with non-deep-learning classifiers such as support vector machines.
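A minimal sketch of this kind of prediction aggregation is shown below: each view of a candidate is scored independently and the per-view probabilities are averaged into a single candidate score. The number of views, the rotation scheme, and the use of mean aggregation are illustrative assumptions, not details of the Roth et al. system.

```python
# Sketch of aggregating CNN predictions over multiple views of one candidate,
# as a simple stand-in for "2.5D random view" integration.
import numpy as np
import torch

def aggregate_view_predictions(views: np.ndarray, cnn) -> float:
    """views: (num_views, H, W) array of patches extracted from one candidate."""
    batch = torch.from_numpy(views).float().unsqueeze(1)   # (V, 1, H, W)
    with torch.no_grad():
        per_view = cnn(batch)[:, 1]                         # lesion probability per view
    return float(per_view.mean())                           # integrated candidate score

# e.g. 100 randomly rotated 32x32 views of one candidate:
# score = aggregate_view_predictions(np.random.rand(100, 32, 32), SmallPatchCNN())
```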

Dou et al. [14] detected cerebral microbleeds in susceptibility-weighted magnetic resonance imaging scans. They used a 3D CNN, replacing the original candidate detection stage with a CNN as well, resulting in a two-stage approach. They report that their 3D CNN outperforms various classical methods and 2D CNN approaches from the literature, re-implemented, trained, and tested on the same data set.

Sirinukunwattana et al. [15] detected and classified nuclei in histopathology images. They use a CNN that takes a small patch as input; instead of simply predicting whether the central pixel of the patch is a nucleus, they model the output so that it peaks at the center of each nucleus and is flatter elsewhere. This spatially constrained CNN, combined with fusion of overlapping patches at test time, produced better results than previously proposed CNN-based techniques and classical feature-based methods.

Anthimopoulos et al. [16] focused on detecting patterns of interstitial lung disease in 2D chest CT images. They are one of three groups in this issue studying this problem (the other two being Shin et al. [17] and van Tulder et al. [18]) on a public data set from [19]. They trained a CNN to classify 32 × 32 pixel patches into one of seven classes, and report higher accuracy than three previous methods based on hand-crafted features.

Lesion detection is also a topic in several other articles in this issue, but those articles have a broader scope or focus on specific methodological questions; they are discussed briefly below.

B. Segmentation and shape modeling

For a large data set of 2,891 cardiac ultrasound examinations, Ghesu et al. combined deep learning with marginal space learning for object detection and segmentation. The combination of efficient exploration of a large parameter space with a method that enforces sparsity in the deep network improves computational efficiency, and the method reduces the mean segmentation error by 13.5% compared with a reference method previously published by the same group.

Three groups addressed the segmentation of brain structures or brain lesions. The segmentation of multiple sclerosis brain lesions in magnetic resonance imaging (MRI) is addressed by Brosch et al. They developed a 3D deep convolutional encoder network that combines interconnected convolutional and deconvolutional pathways: the convolutional pathway learns higher-level features, while the deconvolutional pathway predicts the voxel-level segmentation. They applied the network to two public data sets and one clinical trial data set and compared their method with five widely used methods, reporting performance "comparable to the most advanced methods available".
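To make the convolution/deconvolution idea concrete, here is an illustrative 2D sketch of a tiny encoder-decoder for dense segmentation. Brosch et al. used a 3D network with interconnected pathways; the layer counts and channel sizes below are simplified assumptions, not their architecture.

```python
# Illustrative 2D sketch of a convolutional encoder-decoder for dense
# (pixel/voxel-level) segmentation: the convolutional pathway compresses the
# input into higher-level features and the deconvolutional (transposed
# convolution) pathway expands them back to a full-resolution label map.
import torch
import torch.nn as nn

class TinyEncoderDecoder(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.encoder = nn.Sequential(                  # convolutional pathway: features
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(                  # deconvolutional pathway: dense prediction
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(16, num_classes, 2, stride=2),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))           # per-pixel class scores

# One 128x128 slice in, one 2-channel (lesion / background) score map out:
out = TinyEncoderDecoder()(torch.randn(1, 1, 128, 128))
print(out.shape)  # torch.Size([1, 2, 128, 128])
```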

Pereira et al. [22] studied brain tumor segmentation in MRI. They use small kernels, a deeper architecture, intensity normalization, and data augmentation. Different CNN architectures are used for low-grade and high-grade tumors, and the method segments the enhancing and core portions of the tumor separately. They ranked first on the 2013 public challenge data set and second in the 2015 on-site challenge.

For brain segmentation, the study by Moeskops et al. showed that convolutional neural networks perform well on data sets covering five different age groups, from preterm infants to older adults. A multi-scale approach is used to achieve robustness. The method achieved good results for eight tissue classes, with average Dice similarity coefficients across the five data sets ranging from 0.82 to 0.91.

C. Network exploration

1) Data dimensions - 2D vs. 3D: most of the work we have seen is based on 2D analysis, and whether the transition from 2D to 3D will be a key contributor to large performance gains remains an open question. There are also intermediate variants in how the data are presented to the network, including 2.5D representations. For example, in the study by Roth et al., axial, coronal, and sagittal patches centered on a candidate colon polyp or lymph node voxel were fed into the cuda-convnet CNN as the red, green, and blue channels normally used to represent a natural color image. Three-dimensional CNNs were used explicitly by Brosch et al. and Dou et al.
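A small sketch of this 2.5D representation follows: the axial, coronal, and sagittal slices through a voxel of interest are stacked as three channels so that an ordinary RGB-style 2D CNN can consume them. The crop size and border handling are assumptions for illustration only.

```python
# Sketch of a 2.5D representation: three orthogonal slices around a voxel of
# interest are stacked like the channels of an RGB image.
import numpy as np

def orthogonal_patch(volume: np.ndarray, center: tuple, size: int = 32) -> np.ndarray:
    """Return a (3, size, size) array: axial, coronal, sagittal crops around `center`."""
    half = size // 2
    padded = np.pad(volume, half, mode="constant")
    z, y, x = (c + half for c in center)
    axial    = padded[z, y - half:y + half, x - half:x + half]
    coronal  = padded[z - half:z + half, y, x - half:x + half]
    sagittal = padded[z - half:z + half, y - half:y + half, x]
    return np.stack([axial, coronal, sagittal])        # acts like a 3-channel (RGB) image

# patch = orthogonal_patch(ct_volume, center=(40, 120, 130))  # ct_volume: (Z, Y, X) array
```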

2) Learning methodology - unsupervised vs. supervised: a look at the literature makes it clear that most studies focus on supervised CNNs in order to perform classification. Such networks are important for many applications, including detection, segmentation, and labeling. Nevertheless, some studies focus on unsupervised schemes, which have proven useful for image encoding, for efficient image representation, and as a pre-processing step for supervised deep networks. Unsupervised representation learning methods such as restricted Boltzmann machines (RBMs) may outperform standard filter banks because they learn their representation directly from the training data. An RBM is trained with a generative learning objective; this enables the network to learn representations from unlabeled data, but does not necessarily produce features that are optimal for classification. Van Tulder et al. investigated combining the advantages of generative and discriminative learning objectives in a convolutional classification RBM, and showed that the combined learning objective outperforms purely discriminative or purely generative learning.
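For reference, the following is a minimal sketch of a plain binary RBM trained with one step of contrastive divergence (CD-1), illustrating the generative objective of learning a representation from unlabeled vectors alone. It is not the convolutional classification RBM studied by van Tulder et al., and all sizes and the learning rate are arbitrary.

```python
# Minimal binary RBM trained with CD-1: representations are learned from
# unlabeled binary vectors only (generative objective).
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden=64, lr=0.05, epochs=10):
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_v = np.zeros(n_visible)
    b_h = np.zeros(n_hidden)
    for _ in range(epochs):
        v0 = data
        p_h0 = sigmoid(v0 @ W + b_h)                     # hidden activations given the data
        h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
        p_v1 = sigmoid(h0 @ W.T + b_v)                   # reconstruction of the visible units
        p_h1 = sigmoid(p_v1 @ W + b_h)
        # CD-1 gradient estimate: data statistics minus reconstruction statistics
        W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / len(data)
        b_v += lr * (v0 - p_v1).mean(axis=0)
        b_h += lr * (p_h0 - p_h1).mean(axis=0)
    return W, b_v, b_h

# The learned hidden probabilities can then serve as features for a classifier:
W, b_v, b_h = train_rbm((rng.random((200, 100)) > 0.5).astype(float))
features = sigmoid(((rng.random((5, 100)) > 0.5).astype(float)) @ W + b_h)
```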

3) Training data considerations: convolutional neural networks learn data-driven, highly representative, hierarchical image features, and in many applications (as the papers in this issue show) these features have proven to be a very strong and reliable representation. Providing such a rich representation and successful classification requires sufficient training data, and the amount of data required is a key issue to explore. Related questions include: how do we use the training data we have most effectively? What can we do when we do not have enough data? And is there an alternative way to obtain data and medical annotations?

Some of these questions are addressed by papers in this issue. Van Grinsven et al. attempted to improve and speed up CNN training for medical image analysis tasks by dynamically selecting negative samples that were misclassified during training. CNN training is an iterative process that requires many iterations (epochs) to optimize the network parameters. In each epoch, a subset of samples is randomly selected from the training data and presented to the network, which updates its parameters by backpropagation, minimizing a cost function. Classification tasks in the medical domain are often normal-versus-pathological discrimination tasks, in which the normal class is heavily over-represented; moreover, most normal training samples are highly correlated because of the repetitive patterns of normal tissue in each image, and only a small fraction of them carry useful information. Treating all of these samples equally during learning wastes many training iterations on non-informative normal samples and makes CNN training take unnecessarily long. A method that identifies the informative normal samples, as shown in their study, improves the efficiency of the CNN learning process and reduces training time.
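The general idea can be sketched as follows: at each epoch, the normal (negative) samples that the current network misclassifies most strongly are preferentially kept for training, instead of sampling all negatives uniformly. This is only an illustration of the principle, not the exact weighting scheme of van Grinsven et al.; the model, data tensors, and budget are placeholders.

```python
# Hedged sketch of informative-negative selection ("hard negative mining"):
# keep only the negatives that the current network scores most like lesions.
import torch

def select_hard_negatives(model, negatives, budget):
    """negatives: tensor (N, C, H, W); returns the `budget` highest-scoring ones."""
    model.eval()
    with torch.no_grad():
        scores = model(negatives)[:, 1]          # predicted probability of the "lesion" class
    hard_idx = torch.argsort(scores, descending=True)[:budget]
    return negatives[hard_idx]                   # the most-misclassified normal samples

# Each epoch: train on all positives plus only the selected hard negatives, e.g.
# hard = select_hard_negatives(cnn, all_negative_patches, budget=len(positive_patches))
```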

4) Transfer learning and fine-tuning: obtaining data sets in medical imaging that are as comprehensively annotated as ImageNet remains a challenge. Several schemes can help when not enough data are available. 1) Transfer learning: a CNN model pre-trained (in a supervised way) on a natural image data set or on a different medical domain is used for a new medical task at hand. In one variant, a pre-trained CNN is applied to an input image and outputs are extracted from one of its layers; the extracted outputs are treated as features and used to train a separate pattern classifier. For example, in the study by Bar et al., a pre-trained CNN was used as a feature generator for identifying chest pathologies, and in the study by van Ginneken et al., CNN-based features were combined with hand-crafted features to improve a nodule detection system. 2) Fine-tuning: when a medium-sized data set is available for the task at hand, a preferred approach is to use a pre-trained CNN as the initialization of the network and then carry out further supervised training of several (or all) of the network layers, using the new data for the task at hand.
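A hedged sketch of both schemes is shown below, using torchvision's ImageNet-pre-trained ResNet-18 purely as a stand-in for "a CNN pre-trained on natural images" (the papers discussed here used other architectures such as GoogLeNet). Scheme 1 freezes the network and uses it as a fixed feature extractor for a separate classifier; scheme 2 replaces the classification head and fine-tunes only the deeper layers on the new medical data.

```python
# Sketch of transfer learning (feature extraction) and fine-tuning with a
# pre-trained network; ResNet-18 is an illustrative stand-in.
import torch.nn as nn
from torchvision import models

# --- Scheme 1: transfer learning as fixed feature extraction -----------------
# (older torchvision versions use `pretrained=True` instead of `weights=...`)
backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
backbone.fc = nn.Identity()            # drop the ImageNet classification head
for p in backbone.parameters():
    p.requires_grad = False            # features only; train a separate classifier
# 512-dim features from `backbone(x)` can now feed an SVM or a small linear layer.

# --- Scheme 2: fine-tuning on a medium-sized medical data set ----------------
finetune_net = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
finetune_net.fc = nn.Linear(512, 2)    # new head for the task at hand (e.g., 2 classes)
for name, p in finetune_net.named_parameters():
    # "deep" fine-tuning: update the last residual block and the new head;
    # earlier layers keep their pre-trained weights.
    p.requires_grad = name.startswith("layer4") or name.startswith("fc")
```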

Transfer learning and fine-tuning are key components of the use of deep CNNs in medical imaging applications, and they are examined in the works of Shin et al. and Tajbakhsh et al. The experimental results in both studies consistently show that the best performance is achieved with a pre-trained network followed by fine-tuning, whether across application areas (Tajbakhsh et al.) or across network architectures (Shin et al.). Further analysis by Tajbakhsh et al. shows that deep fine-tuning outperforms shallow fine-tuning, and that the importance of fine-tuning grows as the size of the training set shrinks. In the Shin et al. study, the GoogLeNet architecture achieved state-of-the-art mediastinal lymph node detection compared with shallower architectures.

5) Ground truth - from experts and non-experts: the lack of publicly available labeled data, and the difficulty, cost, and time of collecting such data for each medical task, are prohibitive factors in the medical domain. Although crowdsourcing has enabled the annotation of large databases of real-world images, its use for biomedical purposes requires deeper domain understanding and a more precise definition of the annotation task (Nguyen and McKenna et al.). Outsourcing expert tasks to non-expert users can lead to noisy annotations and disagreement among users. Combining input from medical experts and non-professionals raises many questions, such as how to merge the information sources and how to weight contributors according to their demonstrated performance and other factors. These questions are addressed by Albarqouni et al., who propose an aggregation layer that is integrated into a CNN so that learning from crowd annotations becomes part of the network's training process. Their results give valuable insight into how deep CNNs learn from such annotations. The most striking outcome of crowdsourcing research in the medical domain is the conclusion that a group of non-professional, inexperienced users can perform as well as medical experts; Nguyen and McKenna et al. also observed this in their study of radiographic images.

D. Innovative applications and novel use cases

The work of Kallenberg et al. [32] takes mammographic (breast X-ray) images as input and uses unsupervised feature learning for breast risk scoring. They show how hierarchical features can be learned from unlabeled data and then fed directly into a simple classifier, which handles two different tasks: 1) segmentation of breast density, and 2) scoring of mammographic texture. The classifier performs very well on both. To control the capacity of the model, a sparsity regularizer is applied. In the unsupervised stage, the convolutional layers are treated as autoencoders; in the supervised stage, the (pre-trained) weights and biases are further fine-tuned using a softmax classifier.
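A simplified sketch of this two-stage scheme follows: convolutional layers are first trained as an autoencoder on unlabeled patches using a reconstruction loss, and the pre-trained encoder is then reused with a softmax classifier for supervised fine-tuning. The sparsity regularizer used by Kallenberg et al. is omitted here, and all sizes are illustrative.

```python
# Sketch of unsupervised convolutional-autoencoder pre-training followed by
# supervised softmax fine-tuning that reuses the pre-trained encoder.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU())
decoder = nn.Sequential(nn.ConvTranspose2d(16, 1, 2, stride=2))

# Stage 1: unsupervised pre-training (minimize reconstruction error on unlabeled patches).
recon_loss = nn.MSELoss()
unlabeled = torch.randn(8, 1, 32, 32)          # placeholder for unlabeled image patches
loss1 = recon_loss(decoder(encoder(unlabeled)), unlabeled)

# Stage 2: supervised fine-tuning (softmax / cross-entropy on labeled patches),
# reusing the pre-trained encoder weights.
classifier = nn.Sequential(encoder, nn.Flatten(), nn.Linear(16 * 16 * 16, 2))
labeled, labels = torch.randn(8, 1, 32, 32), torch.randint(0, 2, (8,))
loss2 = nn.CrossEntropyLoss()(classifier(labeled), labels)   # softmax is implicit here
```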

Yan et al. [33] designed a multi-stage deep learning framework for image classification and applied it to body-part recognition. In the pre-training stage, a CNN is trained with multi-instance learning to discover the most discriminative local patches and the non-informative local patches in the training slices. In the second stage, the pre-trained CNN is further trained on the selected local patches to sharpen its classification ability. The strength of this multi-instance deep learning method is that the discriminative and non-informative local patches are identified automatically, so no prior manual annotation is required.

Regression networks are not yet common in medical imaging. Miao et al. propose a CNN-based regression approach for real-time 2D/3D registration. They introduce three strategies to simplify the underlying mapping to be regressed, with the CNN regression model capturing the required strongly nonlinear mapping. Their results show that the deep-learning-based approach is more accurate and more robust than the previous state of the art, considerably improving intensity-based 2D/3D registration.

We are still exploring in which areas deep networks can be applied and for which applications and task dimensions they will have a lasting impact. In a proof-of-concept study, Golkov et al. [35] argue that deep learning can reduce the processing of diffusion MRI data to a single optimized step. They show that this approach allows scalar measures from advanced diffusion models to be obtained with a twelve-fold reduction in scan time, and that diffusion models can be bypassed altogether when detecting abnormalities. The relationship between the diffusion-weighted signal and microstructural tissue properties is only partially understood, and Golkov et al. suggest that deep networks may help reveal it: the diffusion-weighted images (DWIs) themselves are used directly as input, rather than scalar measures obtained by model fitting. The study demonstrates voxel-wise prediction of microstructural properties and automatic, model-free segmentation based directly on DWI values, trained on healthy tissue and multiple sclerosis lesions. Diffusion kurtosis measures were estimated from only 12 data points, and neurite orientation dispersion and density measures from only 8 data points. This promises a fast and robust route to clinical use, and it also shows that standard data processing pipelines can be simplified by deep learning.

Discussion: Key issues and perspectives

Much of the work presented here shows that deep networks improve on the previous state of the art, and that these improvements are consistent across many domains. Often the way deep learning is inserted into a solution is relatively straightforward, yet it yields significant advances for medical image computing. In this overview of deep learning in medical imaging, the question naturally arises: the 2012 large-scale visual recognition breakthrough brought an improvement of roughly 10%; have medical imaging applications achieved a leap of comparable magnitude? Are we asking the right questions, and exploring in the right directions? Are the image representations in use (e.g., 2D vs. 3D) sufficient? Do we need more data per medical case, or is it more effective to invest further in deep learning itself? More related questions are raised in Section II of this article; most of them remain open.

From this issue it is apparent that although both supervised and unsupervised learning are possible with deep networks, most of the work relies on supervised learning. What does this imply for the medical field? The amount of labeled data is a key factor; since large annotated data sets are difficult to obtain in medicine, the field will likely need to combine supervised and unsupervised schemes and make greater use of semi-supervised and unsupervised learning.

The articles in this issue use a wide variety of network architectures, and the variability across the published work is large: some groups adopt a well-known architecture, others design architectures tailored to a specific task, and still others combine several frameworks. An interesting question in this regard is whether a very deep residual network of 152 layers, which performed best on the ILSVRC 2015 classification task, will also prove well suited to medical applications.

An important aspect of deep learning is that it benefits from very large amounts of training data. The major breakthrough in computer vision came with the ILSVRC competitions based on the ImageNet data set. The data set behind that challenge is far larger than the training and test sets used in the papers in this special issue (on the order of a million images versus roughly a hundred to a thousand). If comparably large public medical image data sets could be built, our community would benefit greatly.

Why is this so challenging? First, it is difficult to obtain funding for building such data sets. Second, high-quality annotation of medical images requires medical expertise, which is scarce and expensive. Third, privacy issues make medical data harder to share than natural images. Fourth, the breadth of medical imaging applications means that many diverse data sets would need to be collected. Despite these obstacles, rapid progress is being made in collecting and sharing data, and many public data sets have been released and are in active use. Examples include VISCERAL and The Cancer Imaging Archive; the enlarged lymph node CT data sets analyzed by Roth et al. [13] and Shin et al. [17] have been published in The Cancer Imaging Archive, and the same group has also released a pancreas data set online.

Since 2007, it has become customary to hold challenge workshops at medical imaging conferences such as MICCAI, ISBI, and SPIE Medical Imaging, and a large number of data sets and ongoing challenges are listed on a dedicated website. Using these common benchmark data sets has a distinct advantage over merely using public data: a challenge provides a precise definition of the task to be solved and one or more predefined evaluation metrics, giving a fair basis for comparing algorithms. Without such criteria, it is difficult to compare different methods for the same problem even when they use the same data set. For example, three of the studies in this issue (Anthimopoulos et al. [16], Shin et al. [17], and van Tulder et al. [18]) used the same annotated CT data set of interstitial lung disease, yet the results they report cannot be directly compared.

In this regard, one study in this issue (Setio et al. [12]) already reports initial results on a lung nodule detection challenge. This challenge, organized in conjunction with the IEEE ISBI conference and based on the public LIDC/IDRI data set, allows the system proposed in that article to be compared directly with alternative approaches.

In the past year there have also been competitions for machine learning applications in medical image analysis. Kaggle organized a competition on identifying diabetic retinopathy in color fundus images, with a prize of $100,000; 661 teams submitted results, and contestants were provided with some 8,000 images. These data were also used in one of the studies in this issue (van Grinsven et al. [24]). Recently, a second competition, on measuring cardiac volumes and ejection fraction from MRI images, was completed; 192 teams competed for $200,000 in prize money. In both competitions the top-performing entries used convolutional neural networks, and contestants who combined large data sets with deep learning showed a clear advantage, a trend we expect to continue. In that light, a forthcoming series of worldwide competitions on improving the accuracy of cancer screening in various imaging modalities is likely to attract considerable attention.

The study by Albarqouni et al. shows that online platforms, such as those used for these competitions, can serve several purposes: they foster new collaborations, help shape solutions, and provide access to large amounts of data through crowdsourcing.
