The Past, Present, and Future of Data Science Organization at SK hynix

September 29, 2020 / October 8, 2020

In 2017, SK hynix became the first Korean manufacturing company to establish a Data Science (DS) organization. It began with 40 people, bringing together data analysts scattered across the company, and is now in charge of company-wide DS and artificial intelligence (AI). The organization performs various tasks such as defect detection and prediction, cause analysis, and yield analysis by applying statistics, machine learning[1], and deep learning[2] algorithms[3].

Semiconductor Industry, Full of New Opportunities for Data Science

The main job of data analysts in the past was to assist decision-making through statistical consulting; however, they are now focusing on developing algorithms that can estimate and make decisions by themselves through AI and big data technologies.

Global IT companies including Google, Facebook, and Amazon have already established their own DS organizations, which are now developing AI algorithms to optimize search, recommendation, advertising, and more. IT companies have their own online platforms that collect and process data. Manufacturing companies built around production facilities, by contrast, must first secure the required data before developing algorithms, and must carry out a “digital transformation” that makes their IT systems more sophisticated.

In general, semiconductor companies turned their attention to data analysis earlier than other manufacturers, so they are in a better position. They have been collecting vast but well-refined data for many years, and field engineers make full use of this data when analyzing the causes of failures. For example, sensor signal data on equipment status and measurement data from processed wafers are transmitted to computer servers. Based on the data collected this way, engineers check the equipment status and process results and take appropriate action.

Therefore, the semiconductor sector is full of opportunities to apply the latest AI technologies and to discover new value from equipment, process, and engineer activity data, rather than from customer activity data.

Footsteps of SK hynix’s Data Science Organization: “Rooting AI Technologies in the Field”

The DS organization has been performing various tasks in the field to build semiconductor domain knowledge alongside experience in applying AI technology. When it was first launched, it carried out quick-win tasks by applying the latest information and communication technology (ICT), such as AI and big data, while broadening the base of analysis across the company. After that, the organization selected technological innovation tasks that could contribute to yield, productivity, and quality, and collaborated with the field by defining goals together.

Since 2019, the organization has pursued a product-oriented development method that incorporates necessary functions based on feedback from the field. It has also focused on developing and deploying algorithms so that analysis functions can be used on production lines, and then on building systems that resolve operational issues. Product-oriented development tasks include Intelligent Visual Inspection Analytics[4] (IVIA), an image-based defect detection and classification task, and Sherlock[5], a task for predicting chip quality scores based on wafer test results.

In addition, the organization is working to eliminate duplicated effort, where different projects develop tasks for the same purpose and build overlapping IT infrastructure. As part of this effort, it is now developing “Design Analytics Yourself (DAY)”[6], an analytics platform providing a common data analysis service, and “AI Service Platform (AIP)” for operating AI models. These platforms provide data analysis experts with an integrated development environment where they can focus solely on analysis, without worrying about the operation of AI models. This not only maximizes the productivity of AI tasks but also enables efficient management of resources.

Data Science Organization Operated with Internal and External Talents

The DS organization consists of a headquarters team, the Company-wide DS Team, and a field team, the Domain DS Team. The Domain DS Team responds to analysis requests from the field. The Company-wide DS Team is in charge of solving field problems that the Domain DS Team cannot solve, as well as analyzing products and establishing platforms.

The Domain DS Team also plays an important role in spreading a field-oriented, data-based decision-making culture by collaborating with experienced data analysis experts in the field, known as Citizen Data Scientists (CDSs). Based on the view that “tasks can be well defined in the field”, SK hynix has been providing data science training for field engineers since 2019 to nurture them into versatile talents with full DS capabilities. A CDS is an expert with analysis capabilities who understands the value and operation of AI algorithms, and SK hynix expects to have about 300 CDSs this year.

Thousands of papers on AI are published every year[7], and companies, universities, and research institutes are continuously developing new AI technologies. To keep pace with this rapid innovation, the DS organization has established an “AI Collaboration Center (AICC)” with universities for AI research and technology application. The main goals of AICC are as follows: exploring the latest AI technologies; understanding the technology roadmap and development trends; solving field problems from a new perspective; and securing a pool of AI researchers who become familiar with semiconductor data through task execution.

This year, the organization established an AICC with KAIST and conducted six AI collaboration studies[8] on the theme of securing technology for AI model operation. Next year, it plans to expand the collaboration to Seoul National University and Pohang University of Science and Technology (POSTECH) and carry out 22 tasks.

What DS Organization is Doing Now: “Performing Field Analysis and Technological Innovation Tasks”

SK hynix possesses data that is significant in terms of volume, velocity, and variety. The company loads tens of petabytes[9] of big data collected from the semiconductor process and processes it in real time for defect detection and notification. It also collects and analyzes various data from different processes and equipment, including the following: wafer measurement data for each process; time-series data from sensors mounted on equipment; wafer test data; wafer height data from photo equipment; and defect image data from inspection equipment.

Based on this data, the DS organization is conducting various analysis tasks such as △detection and control of equipment abnormalities, △classification of wafer and memory defects, △optimization of semiconductor design, and △equipment advancement. In addition, it is carrying out other tasks including △“facility data analysis” to optimize facility operation and reduce energy costs, △“news text analysis” to predict demand in the memory market, △search for defect reports and recommendation of related documents, △chatbot-based analysis query and visualization, and △“cohort analysis”[10] for preemptive management of major problems in the working environment.
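
As a rough illustration of what detecting equipment abnormalities from sensor time-series data can look like, the sketch below flags readings that deviate sharply from the recent past using a rolling z-score. This is a generic textbook technique, not the method used at SK hynix; the function name `detect_anomalies`, the window size, the threshold, and the synthetic trace are all assumptions made for the example.

```python
from collections import deque
from statistics import mean, stdev


def detect_anomalies(readings, window=20, threshold=3.0):
    """Return indices of readings that deviate strongly from the recent past.

    A reading is flagged when it lies more than `threshold` standard
    deviations from the mean of the previous `window` readings.
    (Hypothetical helper; parameter values are illustrative only.)
    """
    history = deque(maxlen=window)  # keeps only the latest `window` values
    flagged = []
    for i, value in enumerate(readings):
        if len(history) == window:
            mu = mean(history)
            sigma = stdev(history)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                flagged.append(i)
        # The reading joins the history even if it was flagged.
        history.append(value)
    return flagged


# Synthetic, made-up sensor trace: a stable signal with one spike at index 30.
trace = [100.0 + 0.1 * (i % 5) for i in range(40)]
trace[30] = 105.0
print(detect_anomalies(trace))  # prints [30]
```

A production system would likely need per-sensor tuning and a policy for whether flagged readings should be excluded from the history; the sketch keeps them for simplicity.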

More than 300 projects have been underway this year in the Company-wide DS Team and the seven Domain DS Teams, including technological innovation tasks and the development of internal products and platforms. These projects not only respond to requests from fields such as design, element, process, yield, and equipment, but also cover marketing across the entire supply chain, procurement, human resources (HR) management, strategy, facilities, and safety, health, and environment (SHE) management.

SK hynix’s Dream of Becoming an AI-based “Intelligent Company”

The ultimate goal of the DS organization is to grow SK hynix into an AI-based “intelligent company”. To realize this, it is necessary to apply AI to production lines in the field and establish a system that field engineers can operate on their own. To this end, the DS organization has been accumulating experience in applying AI to field tasks.

IVIA, one of the most representative cases, is a task for automating the visual inspection of defects, and it has been in progress for two years since 2018. Through this task, an initial AI model for defect detection and classification was developed, and a minimum viable product (MVP)[11] was achieved. An AI-led inspection system in the production process was also completed. The deep learning-based defect detection and classification algorithm reduced repetitive manual work by more than 90% and made it possible to find defects with higher accuracy.

Many cases of AI use can now be found in the manufacturing industry, but few have established an AI-based operating system. To switch to an AI-led inspection system, an AI performance monitoring system is needed that quickly analyzes the cause of any performance degradation and recovers on its own. For this, the AI model needs to be able to debug[12] itself and automatically learn new defects when they occur. A deployment system for the AI model should be established as well.
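
One common building block for this kind of performance monitoring is a check for data drift, that is, whether the inputs or scores a model sees in production still resemble those it saw during development. The sketch below computes the Population Stability Index (PSI) between a reference sample and a current sample. PSI is a generic statistic chosen here purely for illustration; the article does not say which drift measure SK hynix uses, and the function name `psi`, the bin count, and the sample data are assumptions.

```python
import math


def psi(reference, current, bins=10):
    """Population Stability Index between two samples of a numeric score.

    Bins are equal-width over the range of the reference sample.
    (Hypothetical helper for illustration only.)
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def proportions(sample):
        counts = [0] * bins
        for x in sample:
            idx = int((x - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp values outside the reference range
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(sample), 1e-6) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


# Made-up example: scores seen at development time vs. scores shifted upward.
reference = [i / 100 for i in range(100)]
shifted = [x + 0.3 for x in reference]

print(psi(reference, reference))   # identical samples give 0.0
print(psi(reference, shifted) > 0.25)  # a large shift exceeds a common rule-of-thumb alert level
```

A value near zero suggests the two distributions are similar, while values above roughly 0.25 are often treated as a sign of significant drift; the right threshold depends on the application and would need to be validated on real data.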

The Future Direction of the Data Science Organization

Andrew Ng, an AI scientist and professor at Stanford University, said, “If a typical person can do a mental task with less than one second of thought (such as defect detection and classification), we can probably automate it using AI either now or in the near future.” As he suggests, AI will be applied to all industries and bring innovation, as electricity did in the past. Bringing this innovation to SK hynix and applying it is the core task that the DS organization has been doing and will continue to do.

Currently, the DS organization is working to establish a data-based decision-making culture at SK hynix by △discovering and performing field analysis tasks, △strengthening field analysis capabilities, and △supporting the development of analysis automation tools. Furthermore, to help SK hynix develop into an intelligent company, the organization is establishing a system in which AI makes decisions and people manage the AI.

The organization is also researching and developing AI technology for field application. It is now introducing and utilizing mature AI technologies, such as vision[13] and natural language processing (NLP)[14]. It has also built an analysis development environment and an AI model operating platform, improving the efficiency of AI model development.

As the scope of AI adoption expands in the future, the roles and responsibilities of field workers will change. Therefore, to apply AI technology to the field and continue to use it, it is necessary to consider not only integration with existing systems but also innovation in the way of working. It is also necessary to think about training for AI model operation and about model management methods, so that the field can resolve operational issues on its own when there is a problem with AI.


[1] A technology or a system (program) which allows AI to learn data and make predictions on its own
[2] A field of machine learning; a technology or a system (program) which expresses big data in a form that can be processed by a computer, such as a vector or a graph, and builds an abstraction model which learns this data
[3] A set of rules and procedures defined to resolve a problem
[4] A task of judging good/defective images from measurement equipment and determining the defect type
[5] A task of predicting the quality when combined as a memory module, based on the wafer chip unit test results
[6] A platform which provides the development environment necessary for analysis algorithm developers in a container method; it includes the following functions: an analysis code editor such as Jupyter Notebook; an in-house analysis algorithm library; workflow design and execution for periodic implementation; a result visualization tool
[7] SPRi AI Brief, Software Policy & Research Institute (SPRi) (2020)
[8] Data drift detection after applying AI; new defect detection and reclassification (Open-Set Recognition, Multi Task Learning); learning with a limited amount of data (Few Shot Learning)
[9] A unit expressing the size of data; 1PB = 1024TB (terabyte)
[10] An analysis technique that compares and analyzes the data of the cohort classified by each criterion
[11] A product which implements minimal features by receiving customer feedback; it also refers to a progressive method of development which improves the level of completion by receiving requests for additional functions
[12] Finding and correcting errors in computer programs
[13] A technology which realizes the human visual system through a specific algorithm
[14] A technology to analyze and process human natural language through a specific algorithm


Fellow, Head of Data Science at SK hynix Inc.