This is a retrospective study, and one of the most important steps is preparing the relevant dataset.

Hi everyone! I am Umay Kiraz, ESR11. In this periodic post, I would like to explain how we prepare our dataset.

At the beginning of the research, one of the most important steps is preparing the relevant dataset. Patients came to the Stavanger hospital with different symptoms, and a mass was found in their breasts. A biopsy was then taken from each mass. The pathologist evaluated the biopsies based on tumor characteristics and special immunohistochemical staining. After assessing the biopsy, the pathologist determined the subtype of breast cancer and recorded the tumor features in the pathology report.

In my project, I am working on Triple Negative Breast Cancer (TNBC), which means the tumor shows no staining for ER, PR and HER2 by immunohistochemistry. This is a retrospective study, and we included approximately 300 patients with TNBC identified from the database of the pathology department. We aimed to use clinical and histopathological factors such as tumor size, tumor type, tumor grade, lymph node status, hormone receptor status, HER2 status, proliferation markers, tumor-infiltrating lymphocytes (TILs) status, lymphovascular invasion and metastasis status. We retrieved some of this information from the pathology reports. Then, I completed the missing information and updated the dates related to survival, such as the date of the last control, the date of metastasis (if any) and the date of death (if the patient died).
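To make the survival dates more concrete, here is a minimal Python sketch of how one patient's dates could be stored and turned into a follow-up time. It is only an illustration, not the code we actually use: the class `PatientRecord`, all field names and the two helper functions are hypothetical, and using the date of diagnosis as the starting point of follow-up is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class PatientRecord:
    # All field names are hypothetical and chosen for illustration only.
    patient_id: str
    diagnosis_date: date                    # assumed start of follow-up
    last_control_date: date                 # date of the last control
    metastasis_date: Optional[date] = None  # None if no metastasis recorded
    death_date: Optional[date] = None       # None if the patient is alive


def overall_survival(rec: PatientRecord) -> Tuple[int, bool]:
    """Return (follow-up in days, whether death was observed)."""
    if rec.death_date is not None:
        return (rec.death_date - rec.diagnosis_date).days, True
    # No death recorded: follow-up ends at the last control (censored).
    return (rec.last_control_date - rec.diagnosis_date).days, False


def date_problems(rec: PatientRecord) -> List[str]:
    """List every recorded date that falls before the diagnosis date."""
    problems = []
    checks = {
        "last_control_date": rec.last_control_date,
        "metastasis_date": rec.metastasis_date,
        "death_date": rec.death_date,
    }
    for name, value in checks.items():
        if value is not None and value < rec.diagnosis_date:
            problems.append(name)
    return problems


if __name__ == "__main__":
    # Made-up example patient, not real data.
    example = PatientRecord("P001", date(2015, 1, 10), date(2020, 1, 10))
    print(overall_survival(example))
    print(date_problems(example))
```

A check like `date_problems` is useful when dates are completed by hand, because a mistyped year is easy to miss in a table of about 300 patients.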

Apart from these, the other important features that I will evaluate are prognostic factors. Proliferation markers (mitotic index, Ki67 and PPH3) and TILs are leading prognostic factors for breast cancer. They are counted from slides under the microscope, and many factors during tissue processing can impair the quality of the slides (Figure 1). As a result, we see thickly sectioned slides, staining problems and artifacts such as blurry areas or folds in the tissue, all of which make it difficult to evaluate proliferation. Therefore, it is important to check the quality of the slides before this evaluation and before scanning.

Currently, I am working on quality control of the slides and images (Figure 2), as I mentioned above (Figure 3). I am trying to replace any missing or poor-quality slides or images by recutting the corresponding blocks. Once everything is in order, we will scan the slides and obtain the whole slide images. This is a brief summary of how we prepare a dataset. My next step is counting mitoses and TILs from the slides under the microscope, and I would be happy to explain more about it in my next blog.

Figure 1: TNBC slides with H&E staining

Figure 2: Whole slide image of TNBC

Figure 3: Checking the quality of WSIs


Umay Kiraz – ESR11.