MMSegmentation - Applying a Custom Evaluation Metric Based on IoUMetric

Category

MMSegmentation

Index

Cloud Computing

MMSegmentation

Diceloss

Date

2024/07/05

MMSegmentation ships several evaluators; the two most representative are IoUMetric and CityscapesMetric.

The evaluator I currently use during validation is IoUMetric, which is built on MMEngine's BaseMetric.

Depending on the iou_metrics setting, IoUMetric's compute_metrics function reports the aAcc, mAcc, mIoU, mDice, mFscore, mPrecision, and mRecall scores during validation.
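To make the relationship between these scores concrete, here is a minimal sketch of how per-class IoU and Dice follow from pixel counts (intersection, predicted area, ground-truth area). The function name iou_and_dice and all pixel counts are made up for illustration; this is not MMSegmentation's own code, only the standard formulas.

```python
def iou_and_dice(intersect, pred_area, label_area):
    """Return (IoU, Dice) for one class from pixel counts.

    IoU  = intersect / (pred_area + label_area - intersect)
    Dice = 2 * intersect / (pred_area + label_area)
    """
    union = pred_area + label_area - intersect
    iou = intersect / union
    dice = 2 * intersect / (pred_area + label_area)
    return iou, dice


# Made-up pixel counts for two classes (illustrative only).
counts = {
    'background': dict(intersect=900, pred_area=950, label_area=920),
    'foreground': dict(intersect=60, pred_area=80, label_area=100),
}

per_class = {name: iou_and_dice(**c) for name, c in counts.items()}

# mDice is the unweighted mean of the per-class Dice values.
mean_dice = sum(dice for _, dice in per_class.values()) / len(per_class)

for name, (iou, dice) in per_class.items():
    print(f'{name}: IoU={iou * 100:.2f}, Dice={dice * 100:.2f}')
print(f'mDice={mean_dice * 100:.2f}')
```

Because the mean is unweighted, a large and easy class such as the background can keep mDice high even when the class of interest scores much lower, which is why a per-class value is more informative for monitoring.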

Swin-Segmenter fine-tuning log using MMSegmentation's IoUMetric evaluator

In the log of my current experiment, which fine-tunes a Segmenter model with a Swin-T backbone on the KiTS21 dataset, every validation prints the per-class IoU and Acc values along with the overall aAcc, mIoU, and mAcc.

However, the stock IoUMetric only prints the per-class IoU and Dice values and does not return them as metrics, so they cannot be used as an EarlyStopping criterion. I therefore decided to create a custom evaluation metric and apply EarlyStopping only on the Dice score of the kidney class.

Applying a Custom Evaluation Metric in MMSegmentation

1. Creating the CustomDiceMetric class

First, create a CustomDiceMetric class that inherits from mmseg.evaluation.IoUMetric and register it in the METRICS registry.

# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp
from collections import OrderedDict
from typing import Dict, List, Optional

import numpy as np
from mmengine.logging import MMLogger, print_log
from prettytable import PrettyTable

from mmseg.registry import METRICS
from .iou_metric import IoUMetric

@METRICS.register_module()
class CustomDiceMetric(IoUMetric):
    """Custom Dice evaluation metric for a specific class.

    Args:
        target_class_index (int): Index of the class to be monitored.
        ignore_index (int): Index that will be ignored in evaluation.
            Default: 255.
        iou_metrics (list[str] | str): Metrics to be calculated, the options
            include 'mIoU', 'mDice' and 'mFscore'.
        nan_to_num (int, optional): If specified, NaN values will be replaced
            by the numbers defined by the user. Default: None.
        beta (int): Determines the weight of recall in the combined score.
            Default: 1.
        collect_device (str): Device name used for collecting results from
            different ranks during distributed training. Must be 'cpu' or
            'gpu'. Defaults to 'cpu'.
        output_dir (str): The directory for output prediction. Defaults to
            None.
        format_only (bool): Only format result for results commit without
            perform evaluation. It is useful when you want to save the result
            to a specific format and submit it to the test server.
            Defaults to False.
        prefix (str, optional): The prefix that will be added in the metric
            names to disambiguate homonymous metrics of different evaluators.
            If prefix is not provided in the argument, self.default_prefix
            will be used instead. Defaults to None.
    """

    def __init__(self,
                 target_class_index: int,
                 ignore_index: int = 255,
                 iou_metrics: List[str] = ['mIoU', 'mDice'],
                 nan_to_num: Optional[int] = None,
                 beta: int = 1,
                 collect_device: str = 'cpu',
                 output_dir: Optional[str] = None,
                 format_only: bool = False,
                 prefix: Optional[str] = None,
                 **kwargs) -> None:
        super().__init__(
            ignore_index=ignore_index,
            iou_metrics=iou_metrics,
            nan_to_num=nan_to_num,
            beta=beta,
            collect_device=collect_device,
            output_dir=output_dir,
            format_only=format_only,
            prefix=prefix,
            **kwargs)
        self.target_class_index = target_class_index

    def compute_metrics(self, results: list) -> Dict[str, float]:
        """Compute the metrics from processed results.

        Args:
            results (list): The processed results of each batch.

        Returns:
            Dict[str, float]: The computed metrics. The keys are the names of
                the metrics, and the values are corresponding results. The key
                mainly includes aAcc, mIoU, mAcc, mDice, mFscore, mPrecision,
                mRecall.
        """
        logger: MMLogger = MMLogger.get_current_instance()
        if self.format_only:
            logger.info(f'results are saved to {osp.dirname(self.output_dir)}')
            return OrderedDict()
        
        results = tuple(zip(*results))
        assert len(results) == 4

        total_area_intersect = sum(results[0])
        total_area_union = sum(results[1])
        total_area_pred_label = sum(results[2])
        total_area_label = sum(results[3])
        ret_metrics = self.total_area_to_metrics(
            total_area_intersect, total_area_union, total_area_pred_label,
            total_area_label, self.metrics, self.nan_to_num, self.beta)

        class_names = self.dataset_meta['classes']

        ret_metrics_summary = OrderedDict({
            ret_metric: np.round(np.nanmean(ret_metric_value) * 100, 2)
            for ret_metric, ret_metric_value in ret_metrics.items()
        })
        metrics = dict()
        for key, val in ret_metrics_summary.items():
            if key == 'aAcc':
                metrics[key] = val
            else:
                metrics['m' + key] = val

        ret_metrics.pop('aAcc', None)
        ret_metrics_class = OrderedDict({
            ret_metric: np.round(ret_metric_value * 100, 2)
            for ret_metric, ret_metric_value in ret_metrics.items()
        })
        ret_metrics_class.update({'Class': class_names})
        ret_metrics_class.move_to_end('Class', last=False)
        class_table_data = PrettyTable()
        for key, val in ret_metrics_class.items():
            class_table_data.add_column(key, val)

        print_log('per class results:', logger)
        print_log('\n' + class_table_data.get_string(), logger=logger)

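        # Additionally return the Dice score of the target class only.
        # Per-class Dice exists only when 'mDice' is listed in iou_metrics.
        if 'Dice' not in ret_metrics:
            raise KeyError("CustomDiceMetric requires 'mDice' in iou_metrics")
        target_dice = ret_metrics['Dice'][self.target_class_index]
        # Scale to a percentage rounded to two decimals, like the other metrics
        metrics['target_class_dice'] = float(np.round(target_dice * 100, 2))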

        return metrics
VCMI/mmseg/evaluation/metrics/custom_dice.py

→ On top of the existing IoUMetric class, target_class_index is accepted as a parameter so that the Dice value of the class to be tracked is returned as an additional metric.

→ The Dice score of the target class is printed in the log as target_class_dice.
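The per-class lookup at the end of compute_metrics only works when the per-class Dice array is present and the index is in range. The following standalone sketch shows a defensive version of that lookup. The helper name pick_target_score and the sample values are hypothetical and are not part of MMSegmentation.

```python
def pick_target_score(per_class_scores, metric_name, class_index):
    """Return one class's score as a percentage rounded to two decimals.

    per_class_scores maps a metric name (e.g. 'Dice') to a sequence of
    per-class values in the range [0, 1].
    """
    if metric_name not in per_class_scores:
        raise KeyError(
            f'{metric_name!r} was not computed; available metrics: '
            f'{sorted(per_class_scores)}')
    scores = per_class_scores[metric_name]
    if not 0 <= class_index < len(scores):
        raise IndexError(
            f'class_index {class_index} is out of range for '
            f'{len(scores)} classes')
    return round(float(scores[class_index]) * 100, 2)


# Made-up per-class values for two classes (illustrative only).
example = {'IoU': [0.93, 0.50], 'Dice': [0.96, 0.67]}
print(pick_target_score(example, 'Dice', 1))
```

Failing with an explicit message is friendlier than a bare KeyError raised in the middle of a validation loop, for example when 'mDice' was left out of iou_metrics.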

Then register the newly created CustomDiceMetric in the __init__.py located in the same evaluation/metrics directory.

# Copyright (c) OpenMMLab. All rights reserved.
from .citys_metric import CityscapesMetric
from .depth_metric import DepthMetric
from .iou_metric import IoUMetric
from .custom_dice import CustomDiceMetric
__all__ = ['IoUMetric', 'CityscapesMetric', 'DepthMetric', 'CustomDiceMetric']
VCMI/mmseg/evaluation/metrics/__init__.py
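As an alternative to editing the package's __init__.py, MMEngine configs also support a custom_imports field that imports a module when the config is loaded, which triggers the @METRICS.register_module() decorator. The fragment below is a sketch that assumes custom_dice.py sits at the module path shown above; I have not verified it in this project.

```python
# Config fragment (assumption: the module path matches the file location above).
custom_imports = dict(
    imports=['mmseg.evaluation.metrics.custom_dice'],
    allow_failed_imports=False)
```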

2. Applying the CustomDiceMetric evaluator in the config

Changing val_evaluator and test_evaluator in the project config file

First, in the current project's config file, change both val_evaluator and test_evaluator to the CustomDiceMetric evaluator.

val_evaluator = dict(
    type='CustomDiceMetric',
    target_class_index=1,
    iou_metrics=['mIoU', 'mDice'])
test_evaluator = dict(
    type='CustomDiceMetric',
    target_class_index=1,
    iou_metrics=['mIoU', 'mDice'],
    format_only=True,
    keep_results=True,
    output_dir='./work_dirs/Swin-Seg/batch24lr0.01/format_results')
VCMI/segmenter_swin-t_mask_8xb1-160k_ade20k-512x512.py 

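→ Change the type of both val_evaluator and test_evaluator to 'CustomDiceMetric'.

→ Since the kidney class currently has class index 1, set target_class_index to 1 and specify iou_metrics=['mIoU', 'mDice'] as the metrics to compute. 'mDice' must be included because target_class_dice is read from the per-class Dice values.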
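Hard-coding the index is easy to get wrong if the class order ever changes, so it can help to derive it from the class names once and compare. This is a standalone sanity check rather than something to paste into the config; the class tuple and the helper find_class_index are hypothetical, and the real class names come from the dataset's metainfo.

```python
def find_class_index(classes, name):
    """Return the position of `name` in the dataset's class tuple."""
    if name not in classes:
        raise ValueError(f'{name!r} is not one of {classes}')
    return classes.index(name)


# Hypothetical class tuple; replace with the dataset's actual class names.
classes = ('background', 'kidney')

val_evaluator = dict(
    type='CustomDiceMetric',
    target_class_index=find_class_index(classes, 'kidney'),
    iou_metrics=['mIoU', 'mDice'])

print(val_evaluator['target_class_index'])
```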

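Changing the EarlyStopping condition in the scheduler config

Next, the EarlyStopping settings in the scheduler config need to be updated.

Previously, the EarlyStopping condition was based on mIoU: training stopped early if mIoU did not change at the second decimal place for 30 epochs.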

# default hooks including early stopping
default_hooks = dict(
    timer=dict(type='IterTimerHook'),  # keep IterTimerHook
    logger=dict(type='LoggerHook', log_metric_by_epoch=True, interval=1),
    param_scheduler=dict(type='ParamSchedulerHook'),
    checkpoint=dict(type='CheckpointHook', by_epoch=True, interval=10),  # save checkpoint every 10 epochs
    sampler_seed=dict(type='DistSamplerSeedHook'),
    visualization=dict(type='SegVisualizationHook'),
    early_stopping=dict(
        type='EarlyStoppingHook',
        monitor='target_class_dice',  # Metric to monitor
        patience=30,  # Number of epochs to wait for improvement
        min_delta=0.01,  # Minimum change to qualify as an improvement
        rule='greater'  # a higher Dice score is better
    )
)
VCMI/configs/base/schedules/schedule.py

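To stop training early when the kidney class's Dice score fails to improve at the second decimal place for 30 epochs, set monitor to 'target_class_dice', patience to 30, min_delta to 0.01, and rule to 'greater'. Because target_class_dice is reported on a 0-100 scale, a min_delta of 0.01 corresponds to the second decimal place.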
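The interaction of patience, min_delta, and rule='greater' can be illustrated with a small simulation. This is a simplified sketch of the general early-stopping idea, not MMEngine's EarlyStoppingHook implementation; the function stopping_round and the score history are made up for illustration.

```python
def stopping_round(scores, patience, min_delta):
    """Return the 1-based validation round at which training would stop.

    A score counts as an improvement only if it exceeds the best score so
    far by more than min_delta ('greater' rule). Returns None if training
    is never stopped.
    """
    best = float('-inf')
    wait = 0  # consecutive validations without improvement
    for round_idx, score in enumerate(scores, start=1):
        if score > best + min_delta:
            best = score
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return round_idx
    return None


# Made-up target_class_dice history on a 0-100 scale (illustrative only).
history = [80.00, 80.50, 80.505, 80.50, 80.49]
print(stopping_round(history, patience=3, min_delta=0.01))
```

In this history the third value is higher than the second, but by less than min_delta, so it does not reset the counter. That is the behavior described above as "no change at the second decimal place".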

3. Results of applying CustomDiceMetric

When training with CustomDiceMetric applied, the following log was produced at every validation.

A Dice score is now additionally measured for each class, and the final metric results include both mDice, the mean of the two classes' Dice scores, and target_class_dice, the Dice score of the kidney class alone.