AI Robotics Lab

Fine-tuning example for object detection and hands-on practice with custom data

Author: 정다훈 | Posted: 24.01.26

# Example site

https://github.com/greght/Workshop-Torchvision-Object-Detection/tree/main

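# Difference between fine-tuning and transfer learning

[Figure: the difference between the two]

# Defining the dataset

The reference scripts for training object detection, instance segmentation, and person keypoint detection make it easy to add a new custom dataset. The dataset must inherit from the standard torch.utils.data.Dataset class and implement the __len__ and __getitem__ methods.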

#Data_set.py
import os
import numpy as np
import utils
import transforms as T
import torch
import torch.utils.data
from PIL import Image
 
 
class PennFudanDataset(torch.utils.data.Dataset):
    def __init__(self, root, transforms=None):
        self.root = root
        self.transforms = transforms
        # load all image files, sorting them to
        # ensure that they are aligned
        self.imgs = list(sorted(os.listdir(os.path.join(root, "PNGImages"))))
        self.annot = list(sorted(os.listdir(os.path.join(root, "Annotation"))))
 
    def __getitem__(self, idx):
        # load images and masks
        img_path = os.path.join(self.root, "PNGImages", self.imgs[idx])
        annot_path = os.path.join(self.root, "Annotation", self.annot[idx])
        img = Image.open(img_path).convert("RGB")
 
        # get bounding box coordinates for each mask
        boxes = []
        with open(annot_path) as fin:
          for line in fin:
            if 'Xmin' in line:
              bounds = line.replace('(','').replace(')','').replace(',','').replace('-','').split()[11:]
              bounds = [int(x) for x in bounds]
              boxes.append(bounds)
 
        boxes = torch.as_tensor(boxes, dtype=torch.float32)
 
        # area of each bounding box
        area = (boxes[:, 3] - boxes[:, 1]) * (boxes[:, 2] - boxes[:, 0])
 
        # there is only one class (besides background)
        labels = torch.ones((len(boxes),), dtype=torch.int64)
 
        # define id for this image
        image_id = torch.tensor([idx])
 
        # suppose all instances are not crowd
        iscrowd = torch.zeros((len(boxes),), dtype=torch.int64)
 
        # put it into the dict
        target = {}
        target["boxes"] = boxes
        target["labels"] = labels
        target["image_id"] = image_id
        target["area"] = area
        target["iscrowd"] = iscrowd
        #target["img_size"]=img_size
 
        if self.transforms is not None:
            img,target= self.transforms(img,target)
 
        return img, target
 
    def __len__(self):
        return len(self.imgs)
    
def get_transform(train):
    transforms = []
    # converts the image, a PIL image, into a PyTorch Tensor
    transforms.append(T.ToTensor())
    if train:
        # during training, randomly flip the training images
        # and ground-truth for data augmentation
        transforms.append(T.RandomHorizontalFlip(0.5))
    return T.Compose(transforms)
    
 
# use our dataset and defined transformations
path='./Object Detection/PennFudanPed'
dataset =PennFudanDataset(path, get_transform(train=True))
dataset_test = PennFudanDataset(path, get_transform(train=False))
 
# split the dataset in train and test set
torch.manual_seed(1)
indices = torch.randperm(len(dataset)).tolist()
dataset = torch.utils.data.Subset(dataset, indices[:-50])
dataset_test = torch.utils.data.Subset(dataset_test, indices[-50:])
 
# # define training and validation data loaders
data_loader = torch.utils.data.DataLoader(
    dataset, batch_size=2, shuffle=True, num_workers=0,
    collate_fn=utils.collate_fn)
 
data_loader_test = torch.utils.data.DataLoader(
    dataset_test, batch_size=1, shuffle=False, num_workers=0,
    collate_fn=utils.collate_fn)
 
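When an image is flipped horizontally for augmentation, its ground-truth boxes have to be mirrored as well. The sketch below shows only that arithmetic on plain Python lists, assuming boxes in (xmin, ymin, xmax, ymax) pixel form. `flip_boxes_horizontally` is a hypothetical helper name used for illustration, not a function of the reference `transforms` module.

```python
def flip_boxes_horizontally(boxes, image_width):
    """Mirror (xmin, ymin, xmax, ymax) boxes around the vertical axis of the image."""
    flipped = []
    for xmin, ymin, xmax, ymax in boxes:
        # the old right edge becomes the new left edge, and vice versa
        flipped.append([image_width - xmax, ymin, image_width - xmin, ymax])
    return flipped


# hypothetical box in a 100-pixel-wide image
boxes = [[10, 20, 50, 80]]
print(flip_boxes_horizontally(boxes, image_width=100))  # [[50, 20, 90, 80]]
```

Flipping twice returns the original boxes, which is a quick sanity check for this kind of transform.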

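# Defining the model

The model used to train on the pedestrian data is fasterrcnn_resnet50_fpn, which is made up of three components.

The first component is Faster R-CNN, the overall object detection architecture, which consists of an RPN (Region Proposal Network) and RoI pooling.

The RPN generates candidate bounding boxes, proposing regions that are likely to contain an object.

RoI pooling extracts the features that correspond to each candidate bounding box, and these features are used to predict the object's class and bounding box.

The second component is ResNet-50. ResNet is an architecture that makes it possible to train deep neural networks effectively.

ResNet-50 is one variant of ResNet, made up of 50 layers. Here it serves as the backbone that extracts image features.

The third component is the FPN (Feature Pyramid Network). The FPN is a network structure that extracts features at multiple image resolutions, combining feature maps of several resolutions in order to detect objects of different sizes.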

import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
 
      
def get_object_detection_model(num_classes):
    # load an object detection model pre-trained on COCO
    # (torchvision >= 0.13 replaces pretrained=True with weights="DEFAULT")
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
 
    # get the number of input features for the classifier
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    # replace the pre-trained head with a new one
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
 
    return model
 

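The fasterrcnn_resnet50_fpn model was pre-trained on the COCO dataset, which is built for object detection and segmentation.

COCO has 80 object classes, and its annotations store each bounding box as [x, y, width, height] in absolute pixel coordinates (the values are not normalized to the range 0 to 1).

My own training data has to be put into a matching format before training.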
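COCO annotation files store a box as [x, y, width, height], while the target dict used in this post stores (xmin, ymin, xmax, ymax). A minimal sketch of the conversion in both directions follows; the function names are hypothetical and the numbers are made up for illustration.

```python
def coco_to_xyxy(box):
    """Convert a COCO-style [x, y, width, height] box to [xmin, ymin, xmax, ymax]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


def xyxy_to_coco(box):
    """Convert an [xmin, ymin, xmax, ymax] box to COCO-style [x, y, width, height]."""
    xmin, ymin, xmax, ymax = box
    return [xmin, ymin, xmax - xmin, ymax - ymin]


print(coco_to_xyxy([10, 20, 30, 40]))  # [10, 20, 40, 60]
print(xyxy_to_coco([10, 20, 40, 60]))  # [10, 20, 30, 40]
```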

# Training

from engine import train_one_epoch, evaluate
import os
import torch
import Data_set as ds
import model as md

device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')

# our dataset has two classes only - background and person
num_classes = 2

# get the model using our helper function
model = md.get_object_detection_model(num_classes)
# move model to the right device (cpu or gpu)
model.to(device)

# construct an optimizer
params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(params, lr=0.005,
                            momentum=0.9, weight_decay=0.0005)

# and a learning rate scheduler which decreases the learning rate by
# 10x every 3 epochs
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer,
                                               step_size=3,
                                               gamma=0.1)
# let's train it for 10 epochs
num_epochs = 10

for epoch in range(num_epochs):
    # train for one epoch, printing every 10 iterations
    train_one_epoch(model, optimizer, ds.data_loader, device, epoch, print_freq=10)
    # update the learning rate
    lr_scheduler.step()
    # evaluate on the test dataset
    evaluate(model, ds.data_loader_test, device=device)

# save the whole model object (not just the state_dict), creating the folder if needed
path = 'models/fasterrcnn_resnet50_fpn_Penn.pth'
os.makedirs(os.path.dirname(path), exist_ok=True)
torch.save(model, path)
 
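To see what "decreases the learning rate by 10x every 3 epochs" means for the 10 epochs above, the step schedule can be written out by hand. This is a plain-Python sketch of the step-decay formula only, with a hypothetical `step_lr` helper; it ignores any warm-up that the training helper may apply during the first epoch.

```python
def step_lr(base_lr, epoch, step_size=3, gamma=0.1):
    """Learning rate used in a given epoch under step decay."""
    return base_lr * gamma ** (epoch // step_size)


# same settings as the training script: lr=0.005, step_size=3, gamma=0.1
for epoch in range(10):
    print(epoch, f"{step_lr(0.005, epoch):.7f}")
```

Epochs 0 to 2 run at 0.005, epochs 3 to 5 at 0.0005, and so on, so by the last epoch the rate is a thousand times smaller than at the start.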

# Testing

Test using the weights obtained from training.

A bounding box is drawn only when its predicted score is above 0.8.

from PIL import ImageDraw
from PIL import Image
import Data_set as ds
import torch
import matplotlib.pyplot as plt

device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
# load the whole pickled model saved by the training script
# (recent PyTorch versions may additionally need weights_only=False here)
model = torch.load('./models/fasterrcnn_resnet50_fpn_Penn.pth', map_location=device)
model.to(device)
# pick one image from the test set
img, _ = ds.dataset_test[10]
# put the model in evaluation mode
model.eval()
with torch.no_grad():
    prediction = model([img.to(device)])

# convert the CHW float tensor in [0, 1] back to an HWC uint8 PIL image
img = Image.fromarray(img.mul(255).permute(1, 2, 0).byte().numpy())
draw = ImageDraw.Draw(img)

boxes = prediction[0]['boxes']
scores = prediction[0]['scores']

# draw only the boxes whose confidence score exceeds 0.8
for box, score in zip(boxes, scores):
    if score > 0.8:
        draw.rectangle(box.tolist(), outline="red", width=2)

plt.imshow(img)
plt.show()

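In this tutorial, the transform tool, train tool, and eval tool are provided as modules and imported.

I copied them from GitHub into my current working directory, but the version used in the tutorial differed from the version I downloaded from GitHub, which caused errors.

Attached file: version v0.3.0 with certain modules modified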

detection.zip

10.57KB

# Checking the model's input data

[Figure: input]

# Checking the model's output data

[Figure: output]

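# Printing the training results and explaining every element

[Figure: output during training]

* eta: estimated time remaining until training finishes (for the current epoch).

* lr (learning rate): the step-size factor that scales each weight update.

* loss: the total training loss, i.e. the sum of the four loss terms listed next.

* loss_classifier: loss for object classification.

* loss_box_reg: loss for bounding box location prediction.

* loss_objectness: loss for whether an object is present.

* loss_rpn_box_reg: loss for the RPN's bounding box prediction.

* The RPN generates the candidate bounding boxes.

* time: time taken per iteration; a log line is printed every 10 iterations (print_freq=10).

* data: time spent loading the data.

* max mem: the peak GPU memory usage.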
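The relation between `loss` and the four component losses can be checked by hand. The sketch below assumes the logged total is simply the sum of the component losses; the numbers are made-up values for illustration, not results from my training run.

```python
def total_loss(loss_dict):
    """Sum the individual loss terms into the single value logged as `loss`."""
    return sum(loss_dict.values())


# hypothetical values, in the same shape as one training log line
loss_dict = {
    "loss_classifier": 0.21,
    "loss_box_reg": 0.18,
    "loss_objectness": 0.03,
    "loss_rpn_box_reg": 0.01,
}
print(f"loss: {total_loss(loss_dict):.2f}")
```

If the printed `loss` in a log line does not roughly match the sum of the other four numbers on the same line, the line is probably showing a smoothed average rather than the raw per-iteration value.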

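# Printing the input data elements and explaining them

The input consists of the img data and the target (label) data.

img data: a tensor whose values are normalized to the range 0 to 1.

* label data

1. 'boxes': the bounding box coordinates of the objects, stored as a tensor (image pixel coordinates in (xmin, ymin, xmax, ymax) form).

2. 'labels': the class label of each object. Here every object is a person, so the label is set to 1.

3. 'image_id': the index of the sample. The first sample loaded has image_id 0.

4. 'area': the area of each bounding box.

5. 'iscrowd': a per-object flag marking crowd instances. All objects are assumed not to be crowd, so it is a tensor of zeros with one entry per bounding box.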
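The five target fields above can be illustrated without any image at all. The following sketch builds the same dict from plain Python lists instead of tensors; `make_target` is a hypothetical helper and the box values are made up.

```python
def make_target(boxes, idx):
    """Build a target dict from (xmin, ymin, xmax, ymax) boxes for one image."""
    return {
        "boxes": boxes,
        "labels": [1] * len(boxes),   # single foreground class: person
        "image_id": idx,
        "area": [(xmax - xmin) * (ymax - ymin) for xmin, ymin, xmax, ymax in boxes],
        "iscrowd": [0] * len(boxes),  # no instance is treated as a crowd
    }


target = make_target([[0, 0, 10, 20], [5, 5, 15, 10]], idx=0)
print(target["area"])     # [200, 50]
print(target["iscrowd"])  # [0, 0]
```

Every per-object field has exactly one entry per box, which is why the number of zeros in 'iscrowd' grows with the number of boxes.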

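# Printing the output data and explaining it

[Figure: output data]

* Explanation of the output data

1. 'boxes': the bounding boxes of the objects the model found in the image, stored as a tensor.

2. 'labels': the predicted label of each object.

3. 'scores': the model's confidence score for each bounding box.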
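Since the three output fields are aligned by index, keeping only the confident detections is a matter of filtering all three with the same mask. A minimal sketch on plain lists, with a hypothetical `filter_by_score` helper and made-up prediction values:

```python
def filter_by_score(prediction, threshold=0.8):
    """Keep only the detections whose score is above the threshold."""
    keep = [i for i, score in enumerate(prediction["scores"]) if score > threshold]
    return {key: [values[i] for i in keep] for key, values in prediction.items()}


# hypothetical prediction for one image
prediction = {
    "boxes": [[10, 10, 50, 90], [60, 20, 100, 95], [5, 5, 8, 9]],
    "labels": [1, 1, 1],
    "scores": [0.98, 0.85, 0.12],
}
print(filter_by_score(prediction)["boxes"])  # [[10, 10, 50, 90], [60, 20, 100, 95]]
```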

# Labeling the custom data (robot)

The data was labeled with a dedicated labeling program.

windows_v1.8.1.zip

12.77MB

[Figure: folder structure]

Under the data folder there is a text file; write the names of the classes you want to distinguish in that file.

# Code that parses the XML files

import os
import numpy as np
import utils
import transforms as T
import torch
import torch.utils.data
from PIL import Image
import xml.etree.ElementTree as ET
import matplotlib.pyplot as plt
 
 
class RobotSizeDataset(torch.utils.data.Dataset):
    def __init__(self, root, transforms=None):
        self.root = root
        self.transforms = transforms
        # load all image files, sorting them to
        # ensure that they are aligned
        self.imgs = list(sorted(os.listdir(os.path.join(root, "JPGImages"))))
        self.annot = list(sorted(os.listdir(os.path.join(root, "Annotation"))))
    def __getitem__(self, idx):
        # load images and masks
        img_path = os.path.join(self.root, "JPGImages", self.imgs[idx])
        annot_path = os.path.join(self.root, "Annotation", self.annot[idx])
        img = Image.open(img_path).convert("RGB")
 
        # parse the XML annotation file
        tree = ET.parse(annot_path)
        root = tree.getroot()
 
        #bounding box save
        bounding=[]
        for bbox in root.iter('bndbox'):
            xmin = int(bbox.find('xmin').text)
            ymin = int(bbox.find('ymin').text)
            xmax = int(bbox.find('xmax').text)
            ymax = int(bbox.find('ymax').text)
            bounding.append([xmin,ymin,xmax,ymax])
 
        boxes = torch.as_tensor(bounding, dtype=torch.float32)
 
        # area of each bounding box
        area = (boxes[:, 3] - boxes[:, 1]) * (boxes[:, 2] - boxes[:, 0])
 
        # small robot(label)=0, big robot(label)=1
        # (note: Faster R-CNN reserves label 0 for the background; see the caveats section later in this post)
        labels = []
        for obj in root.iter('object'):
            name = obj.find('name').text
            if name == "big robot":
                labels.append(1)
            elif name == "small robot":
                labels.append(0)

        labels = torch.as_tensor(labels, dtype=torch.int64)
 
        # define id for this image
        image_id = torch.tensor([idx])
 
        # suppose all instances are not crowd
        iscrowd = torch.zeros((len(boxes),), dtype=torch.int64)
 
        # put it into the dict
        target = {}
        target["boxes"] = boxes
        target["labels"] = labels
        target["image_id"] = image_id
        target["area"] = area
        target["iscrowd"] = iscrowd
        #target["img_size"]=img_size
 
        if self.transforms is not None:
            img,target= self.transforms(img,target)
 
        return img, target
 
    def __len__(self):
        return len(self.imgs)
    
def get_transform(train):
    transforms = []
    # converts the image, a PIL image, into a PyTorch Tensor
    transforms.append(T.ToTensor())
    if train:
        # during training, randomly flip the training images
        # and ground-truth for data augmentation
        transforms.append(T.RandomHorizontalFlip(0.5))
    return T.Compose(transforms)
 
path='./Custom_Object_Detection/Robot'
dataset =RobotSizeDataset(path, get_transform(train=False))
data_loader = torch.utils.data.DataLoader(
    dataset, batch_size=1, shuffle=False, num_workers=0,
    collate_fn=utils.collate_fn)
img_batch,label_batch=dataset[0]
print(label_batch)
fig = plt.figure(figsize=(15, 6))
num_epochs = 1
for j in range(num_epochs):
    img_batch, label_batch = next(iter(data_loader))
    img = img_batch[0]
    ax = fig.add_subplot(1, 1, j + 1)
    ax.set_xticks([])
    ax.set_yticks([])
    ax.set_title(f'Epoch {j}:', size=15)
    ax.imshow(img.permute(1, 2, 0))
#plt.savefig('figures/14_16.png', dpi=300)
plt.show()

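[Figure: result of my own labeling, saved as .xml]

Only the parts I need are parsed and used (the bndbox and name elements).

[Figure: the part I added]

The four bndbox values are stored as a tensor.

[Figure: the part I added, 2]

big robot is labeled 1 and small robot is labeled 0 to tell them apart.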
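The parsing step can be tried on its own with a small in-memory XML string. This is a minimal sketch using only the standard library; the XML content and the `parse_annotation` name are made up for illustration, and only the bndbox and name elements are read, as described above.

```python
import xml.etree.ElementTree as ET

# hypothetical annotation with the same element names as my .xml files
SAMPLE_XML = """
<annotation>
  <object>
    <name>big robot</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
  <object>
    <name>small robot</name>
    <bndbox><xmin>300</xmin><ymin>40</ymin><xmax>360</xmax><ymax>120</ymax></bndbox>
  </object>
</annotation>
"""


def parse_annotation(xml_text):
    """Return (boxes, names), reading each object's name and bndbox together."""
    root = ET.fromstring(xml_text.strip())
    boxes, names = [], []
    for obj in root.iter("object"):
        names.append(obj.find("name").text)
        bbox = obj.find("bndbox")
        boxes.append([int(bbox.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")])
    return boxes, names


boxes, names = parse_annotation(SAMPLE_XML)
print(boxes)  # [[10, 20, 110, 220], [300, 40, 360, 120]]
print(names)  # ['big robot', 'small robot']
```

Reading the name and the box inside the same loop keeps them paired even if an object with an unexpected class name appears in the file.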

[Figure: printed target (label information)]

Check with the cv2 library that the bounding box coordinates and class information were actually stored correctly.

import sys
import cv2

# label_batch is the target dict returned by dataset[0]
img = cv2.imread('IMG_7597.JPG')
if img is None:
    print('Image load failed')
    sys.exit()

# map the numeric labels back to class names
names = {0: "small robot", 1: "big robot"}
color = (0, 255, 255)
font = cv2.FONT_HERSHEY_PLAIN
color_font = (0, 0, 255)

# draw every bounding box together with its class name
for box, label in zip(label_batch['boxes'], label_batch['labels']):
    xmin, ymin, xmax, ymax = [int(v) for v in box]
    cv2.rectangle(img, (xmin, ymin), (xmax, ymax), color, thickness=4)
    cv2.putText(img, names[int(label)], (xmin, ymin), font, 8, color_font, 6, cv2.LINE_AA)

# shrink the image so it fits on screen
resized_img_1 = cv2.resize(img, dsize=(880, 495), interpolation=cv2.INTER_LINEAR)
cv2.imshow("img", resized_img_1)
cv2.waitKey(0)
cv2.destroyAllWindows()

[Figure: class names and bounding box information drawn on the image]

# Training and validating on the custom data with the latest version of the PyTorch GitHub code

v0.17.0

https://github.com/pytorch/vision/tree/v0.17.0/references/detection

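# Problem

- Training works, but the eval library provided on GitHub keeps raising errors.

<Reason>

The errors all come from the evaluation functions, and specifically only from the places where the dataset is passed as an argument.

The errors seem to occur because my custom data (robot) does not match the COCO dataset format.

Looking at this evaluation library, everything is built around the COCO dataset (most notably the file format).

I downloaded the COCO dataset myself to check.

[Figure: validation data used by the PyTorch detection repository]

First, the images are .jpg files,

[Figure: annotation used by the PyTorch detection repository]

and the annotation file has the .json extension.

Compared with my data:

the images have the .jpg extension

the annotation files have the .xml extension

Conclusion: to match the COCO dataset, the .xml files have to be converted into a .json file.

[Figure: the format the COCO dataset expects]

[Figure: my dataset format]

Comparing the COCO dataset format with mine, the keys in target are different, and COCO even requires keys that my data does not have. That appears to be why my dataset cannot be passed to that function.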
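To make the target of the conversion concrete, here is a rough sketch of building a COCO-style dict from per-image boxes in (xmin, ymin, xmax, ymax) form and dumping it as JSON. The top-level keys (images, annotations, categories) follow the COCO layout; the `to_coco` helper, the file name, and the image size are hypothetical, and this is not the conversion tool I actually used.

```python
import json


def to_coco(images, categories):
    """images: list of (file_name, width, height, [(class_name, [xmin, ymin, xmax, ymax]), ...])."""
    cat_ids = {name: i + 1 for i, name in enumerate(categories)}  # category ids start at 1
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": cid, "name": name} for name, cid in cat_ids.items()],
    }
    ann_id = 1
    for image_id, (file_name, width, height, objects) in enumerate(images):
        coco["images"].append(
            {"id": image_id, "file_name": file_name, "width": width, "height": height}
        )
        for class_name, (xmin, ymin, xmax, ymax) in objects:
            w, h = xmax - xmin, ymax - ymin
            coco["annotations"].append({
                "id": ann_id,
                "image_id": image_id,
                "category_id": cat_ids[class_name],
                "bbox": [xmin, ymin, w, h],  # COCO stores [x, y, width, height]
                "area": w * h,
                "iscrowd": 0,
            })
            ann_id += 1
    return coco


# hypothetical single-image dataset
data = [("IMG_0001.JPG", 4032, 3024, [("big robot", [10, 20, 110, 220])])]
coco = to_coco(data, categories=["small robot", "big robot"])
print(json.dumps(coco["annotations"][0]))
```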

https://m.blog.naver.com/yh_park02/222315567498

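The labelme program can do the labeling and, reportedly, can also convert the labeled data into the COCO dataset format afterwards. The dataset has to be converted following the method in the blog post above.

After doing it that way, the evaluate function no longer raises errors.

Conclusion: to use the eval library of the latest PyTorch version, the validation data must be put into the COCO dataset format.

[Figure: checking the IoU output (one printout per epoch)]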
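For reference, the IoU (intersection over union) that the evaluation is built on can be computed for a single pair of boxes as follows. This is a plain-Python sketch for boxes in (xmin, ymin, xmax, ymax) form with a hypothetical `box_iou` helper; the evaluation library itself reports summary metrics over many boxes and IoU thresholds, which this example does not reproduce.

```python
def box_iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    # corners of the overlapping rectangle
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


print(box_iou([0, 0, 10, 10], [0, 0, 10, 10]))    # 1.0 (identical boxes)
print(box_iou([0, 0, 10, 10], [20, 20, 30, 30]))  # 0.0 (no overlap)
```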

<Pytorch Object detection porting code>

https://github.com/downy25/Pytorch-detection

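# Caveats when training the fasterrcnn_resnet50_fpn model on custom data

The Faster R-CNN model treats the background as a class of its own. It first separates background from objects and then classifies the objects. The background gets label 0, and object labels start from 1. The model's number of classes must be set to the number of classes I want to classify + 1.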
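Applied to the two robot classes, the rule above gives the following mapping. `build_label_map` is a hypothetical helper written only to illustrate the numbering.

```python
def build_label_map(class_names):
    """Assign labels starting at 1 (0 is reserved for background) and count the classes."""
    label_map = {name: i + 1 for i, name in enumerate(class_names)}
    num_classes = len(class_names) + 1  # +1 for the background class
    return label_map, num_classes


label_map, num_classes = build_label_map(["small robot", "big robot"])
print(label_map)    # {'small robot': 1, 'big robot': 2}
print(num_classes)  # 3
```

With this numbering, no object is ever given label 0, so none of them can be confused with the background.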

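# Transfer learning strategy

The most important decision in transfer learning is which layers of the network to retrain on my own training data.

[Figure: code for inspecting the network layer by layer]

[Figure: which parts are retrained when training the fasterrcnn model (1)]

This shows which parts are retrained when I train my transfer-learning model: only layer1 is set to False, and the remaining layers are set to True. False means the weights are frozen during training, and True means they are retrained on my training data. I wondered why the source code from PyTorch freezes only layer1, and found that the early layers usually extract general features, while the layers closer to the output extract more task-specific features. So the early layers can be reused when learning a different dataset, whereas the later layers need to be trained again for each new problem.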
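The True/False flags can be thought of as a choice, per parameter, based on which backbone stage its name belongs to. The sketch below illustrates that selection on plain strings, assuming stages are unfrozen starting from the one closest to the output; the parameter names are hypothetical examples, `trainable_flags` is my own helper, and in this sketch the stem conv1 is frozen together with layer1. In a real model the resulting flag would be assigned to each parameter's requires_grad.

```python
# backbone stages ordered from the output side back to the input side
ALL_STAGES = ["layer4", "layer3", "layer2", "layer1", "conv1"]


def trainable_flags(param_names, trainable_layers):
    """Return {parameter name: True if it should be retrained, False if frozen}."""
    layers_to_train = ALL_STAGES[:trainable_layers]
    return {
        name: any(name.startswith(layer) for layer in layers_to_train)
        for name in param_names
    }


# hypothetical parameter names, one per stage
names = [
    "conv1.weight",
    "layer1.0.conv1.weight",
    "layer2.0.conv1.weight",
    "layer3.0.conv1.weight",
    "layer4.0.conv1.weight",
]
for name, flag in trainable_flags(names, trainable_layers=3).items():
    print(name, flag)
```

With `trainable_layers=3` only layer2 to layer4 are retrained, and lowering it to 2 also freezes layer2, which corresponds to the two settings compared in the experiments that follow.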

https://jeinalog.tistory.com/13

# First training run (100 epochs, augmentation = "ssd", layer1 frozen)

> torchrun --nproc_per_node=1 train_porting.py --dataset coco --model fasterrcnn_resnet50_fpn --epochs 100 --lr-steps 16 22 --aspect-ratio-group-factor -1 --weights-backbone ResNet50_Weights.IMAGENET1K_V1 --data-augmentation "ssd"

The augmentation provided by the library was used.

The IoU accuracy is too low.

[Figure: bounding box]

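# Second training run (100 epochs, augmentation = "ssd", layer1 and layer2 frozen)

Because the dataset is small and overfitting is severe, I will increase the number of frozen layers and compare.

[Figure: layer1 and layer2 frozen]

[Figure: code that freezes specific layers of the model]

This is the IoU with layer1 and layer2 frozen; it does not seem to have much effect.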

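# Testing the weights with only layer1 frozen against the weights with layer1 and layer2 frozen

I ran the test on a single test image. To address the recurring problem of the small robot being recognized as the big robot, I used a test image that contains only the small robot.

* Test result with only layer1 frozen

[Figure: result with only layer1 frozen (red: small robot, blue: big robot)]

[Figure: printed output]

* Test result with layer1 and layer2 frozen

[Figure: result with layer1 and layer2 frozen]

[Figure: printed output]

* Experiment result

With the weights trained with only layer1 frozen, the small robot and big robot scores come out very close to each other (that is, the model does not separate the two well). With the weights trained with layer1 and layer2 frozen, however, the two scores differ widely, and the small robot reaches 86.2%. This suggests that performance can indeed improve depending on how many layers are frozen, and that the number of layers to freeze has to be chosen experimentally.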

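# Training with layer1 and layer2 frozen after adding about 100 small robot images

[Figure: printed output]

Why about 100 small robot images were added: when I tested on a photo containing both the big robot and the small robot, the big robot was recognized with over 90% confidence, but the small robot scored only 60%. I judged that the training set lacked small robot data, so I labeled about 100 more small robot images and added them to the training data.

[Figure: loss]

# Testing on the previously tested image

[Figure: result showing only detections with a score of 90% or higher]

--> It finds the robots very well.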


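Comments

Sungryul Lee | 24.01.27: Cause of the errors when running the example: the PyTorch version and the object detection library version must match the ones the example was written for, but different versions were used for all of them, hence the errors -> it is best to use an object detection example written for the latest versions (PyTorch, object detection API).
Sungryul Lee | 24.01.27: 1. Add the addresses of the sites you referenced.
2. Print the training results and explain every term in them; also plot them as a graph in real time.
3. Explain the meaning of each element of the training data and the output data.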
