seob2

[TensorRT] 2. Serializing & Deserializing tensorrt engine (tensorRT 7.2.3)

seob2 2021. 8. 24. 22:24

These are personal study notes that I write for my own reference.

There may be mistakes, so please read with that in mind. Feedback is welcome in the comments.

[TensorRT] 1. Build tensorrt engine (tensorRT 7.2.3)

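Serializing means converting the engine into a format that can be stored for later reuse. To use it for inference, you simply deserialize it and run. Building an engine usually takes a long time, so this step is done to avoid rebuilding every time.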

// code for serializing ICudaEngine
IHostMemory *serializedModel = engine->serialize();
// store model to disk
// <…>
serializedModel->destroy();

Deserialization code: the final argument is a plugin layer factory for applications using custom layers. For more information, see [Extending TensorRT With Custom Layers].
It is nothing special. The trailing nullptr is the plugin factory for custom layers, so if you have none, just pass nullptr.

// code for deserializing ICudaEngine
IRuntime* runtime = createInferRuntime(gLogger);
ICudaEngine* engine = runtime->deserializeCudaEngine(modelData, modelSize, nullptr);

One caveat: always check the TensorRT version, the GPU, and the platform.

Serialized engines are not portable across platforms or TensorRT versions.
Engines are specific to the exact GPU model they were built on.
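
Because of these restrictions, it can help to record what an engine file was built with and to compare that against the current environment before deserializing. The following is a minimal Python sketch of that idea and is not part of the TensorRT API. The sidecar file name, the helpers write_engine_meta and engine_is_compatible, and the version, GPU, and platform strings are all hypothetical; the caller is assumed to obtain the real values from its own environment.

```python
import json
import os
import tempfile


def write_engine_meta(engine_path, trt_version, gpu_name, platform_name):
    """Write a JSON sidecar next to the engine file (hypothetical convention)."""
    meta = {"trt_version": trt_version, "gpu": gpu_name, "platform": platform_name}
    with open(engine_path + ".meta.json", "w", encoding="utf-8") as f:
        json.dump(meta, f)


def engine_is_compatible(engine_path, trt_version, gpu_name, platform_name):
    """Return True only if the sidecar exists and matches the given environment exactly."""
    meta_path = engine_path + ".meta.json"
    if not os.path.exists(meta_path):
        return False
    with open(meta_path, encoding="utf-8") as f:
        meta = json.load(f)
    current = {"trt_version": trt_version, "gpu": gpu_name, "platform": platform_name}
    return meta == current


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "model.engine")  # hypothetical file name
        write_engine_meta(path, "7.2.3", "GPU-A", "linux-x86_64")
        print(engine_is_compatible(path, "7.2.3", "GPU-A", "linux-x86_64"))  # True
        print(engine_is_compatible(path, "8.0.0", "GPU-A", "linux-x86_64"))  # False
```

If the check fails, the safe fallback is to rebuild the engine rather than try to load the stale file.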

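The code above comes from the TensorRT reference. When I first saw it, I was stuck because I did not know what else I needed to add.
Below is the serialize-and-save code that I actually use.

Here, engine_ is an ICudaEngine that has been built successfully.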

bool saveEngine( const std::string &fileName ) const
{
    // Check the engine first so that a missing engine does not leave an empty file behind.
    if ( engine_ == nullptr ) {
        gLogError << "Engine is not defined" << std::endl;
        return false;
    }

    std::ofstream engineFile( fileName, std::ios::binary );
    if ( !engineFile ) {
        gLogFatal << "Cannot open engine file : " << fileName << std::endl;
        return false;
    }

    nvinfer1::IHostMemory *serializedEngine{ engine_->serialize() };
    if ( serializedEngine == nullptr ) {
        gLogError << "Engine serialization failed" << std::endl;
        return false;
    }

    engineFile.write( static_cast<const char *>( serializedEngine->data() ),
                      serializedEngine->size() );
    // Release the serialized buffer once it has been written (it was leaked before).
    serializedEngine->destroy();
    if ( engineFile.fail() ) {
        gLogError << "Failed to save Engine." << std::endl;
        return false;
    }
    std::cout << "Successfully saved to : " << fileName << std::endl;
    return true;
}

Next is the code that loads the saved ICudaEngine from disk and deserializes it again.

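bool Load( const std::string &fileName ) {
    std::ifstream engineFile( fileName, std::ios::binary );
    if ( !engineFile ) {
        std::cout << "can not open file : " << fileName << std::endl;
        return false;
    }
    // Find the file size by seeking to the end, then rewind.
    engineFile.seekg( 0, engineFile.end );
    const std::streamoff fsize = engineFile.tellg();
    engineFile.seekg( 0, engineFile.beg );
    if ( fsize <= 0 ) {
        std::cout << "empty or unreadable file : " << fileName << std::endl;
        return false;
    }

    std::vector<char> engineData( static_cast<size_t>( fsize ) );
    engineFile.read( engineData.data(), fsize );
    if ( !engineFile ) {
        std::cout << "failed to read file : " << fileName << std::endl;
        return false;
    }

    // Propagate the deserialization result instead of always returning true.
    return Load( engineData.data(), static_cast<long int>( fsize ) );
}

bool Load( const void *engineData, const long int fsize ) {
    nvinfer1::IRuntime *runtime = nvinfer1::createInferRuntime( gLogger.getTRTLogger() );
    if ( runtime == nullptr ) {
        return false;
    }
    // If you need a DLA core, set it on the runtime here, before deserializing.
    engine_ = runtime->deserializeCudaEngine( engineData, fsize, nullptr );
    runtime->destroy();
    return engine_ != nullptr;
}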

The engine can be loaded in two ways, from a file path or from a buffer already in memory, so I overloaded Load.

End.

Attribution, NonCommercial, NoDerivatives

coco label items

seob2 2021. 8. 11. 18:04

Getting COCO labels

# coco is a pycocotools COCO object; idx is the dataset index of the image
ids = list(sorted(coco.imgs.keys()))
coco_idx = [ids[idx]]  # from dataset index to COCO image id
anns = coco.loadAnns(coco.getAnnIds(imgIds=coco_idx))

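Each object has 7 keys, as listed below.

segmentation comes in two annotation types.
segmentation1 : list[float], the polygon's xy coordinates, flattened.
segmentation2 : a run-length-encoded (RLE) bit mask
'counts' : the run lengths of the mask
'size' : [height, width] of the mask

area : number of pixels that belong to the object

iscrowd : 0 or 1. If 1, the object is excluded when measuring accuracy.

image_id : id of the image the object belongs to, so objects in the same image share the same value. (coco2014train has 82783 images)

bbox : bounding box coordinates in the form top-left x, y, then width, height. They are not normalized.

category_id : the object's class index. There are 80 classes in total, with values between 1 and 90.

id : an id for each object. I do not know how these are assigned.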
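
As an illustration of the bbox and polygon layout described above, here is a small Python sketch. The annotation values are made up for the example, and the helper names xywh_to_xyxy, normalize_xyxy, and polygon_points are hypothetical, not part of pycocotools.

```python
def xywh_to_xyxy(bbox):
    """Convert a COCO [x, y, width, height] box to [x1, y1, x2, y2] in pixels."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


def normalize_xyxy(box, img_w, img_h):
    """Scale pixel corner coordinates into the 0..1 range."""
    x1, y1, x2, y2 = box
    return [x1 / img_w, y1 / img_h, x2 / img_w, y2 / img_h]


def polygon_points(flat):
    """Turn a flattened [x0, y0, x1, y1, ...] polygon into a list of (x, y) pairs."""
    return list(zip(flat[0::2], flat[1::2]))


# Made-up annotation carrying the 7 keys (values are illustrative only).
ann = {
    "segmentation": [[10.0, 20.0, 60.0, 20.0, 60.0, 100.0, 10.0, 100.0]],
    "area": 4000.0,
    "iscrowd": 0,
    "image_id": 1,
    "bbox": [10.0, 20.0, 50.0, 80.0],
    "category_id": 18,
    "id": 1,
}

box = xywh_to_xyxy(ann["bbox"])
print(box)                            # [10.0, 20.0, 60.0, 100.0]
print(normalize_xyxy(box, 200, 200))  # [0.05, 0.1, 0.3, 0.5]
print(polygon_points(ann["segmentation"][0])[:2])  # [(10.0, 20.0), (60.0, 20.0)]
```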

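Note that category_id may not be what you expect: instead of 0~79 for the 80 classes, it runs from 1 to 90, because some class ids were removed. You can remap it with the tuple below, indexed by category_id.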
(255, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 255, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 255, 24, 25, 255, 255, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 255, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 255, 60, 255, 255, 61, 255, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 255, 73, 74, 75, 76, 77, 78, 79)
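
A minimal Python sketch of how the tuple can be used: the index is the COCO category_id and the value is the contiguous class index, with 255 marking ids that do not exist. The names COCO_ID_TO_CLASS and to_contiguous are hypothetical.

```python
# Tuple copied from the post: index = COCO category_id (0..90),
# value = contiguous class index (0..79), 255 = unused id.
COCO_ID_TO_CLASS = (
    255, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 255, 11, 12, 13, 14, 15, 16, 17,
    18, 19, 20, 21, 22, 23, 255, 24, 25, 255, 255, 26, 27, 28, 29, 30, 31,
    32, 33, 34, 35, 36, 37, 38, 39, 255, 40, 41, 42, 43, 44, 45, 46, 47, 48,
    49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 255, 60, 255, 255, 61, 255,
    62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 255, 73, 74, 75, 76, 77, 78,
    79,
)


def to_contiguous(category_id):
    """Map a COCO category_id (1..90) to a class index in 0..79."""
    if not 0 <= category_id < len(COCO_ID_TO_CLASS):
        raise ValueError(f"category_id out of range: {category_id}")
    idx = COCO_ID_TO_CLASS[category_id]
    if idx == 255:
        raise ValueError(f"unused COCO category_id: {category_id}")
    return idx


print(to_contiguous(1))   # 0
print(to_contiguous(13))  # 11
print(to_contiguous(90))  # 79
```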

https://github.com/cocodataset/cocoapi/issues/184

https://github.com/facebookresearch/Detectron/issues/100

https://www.immersivelimit.com/tutorials/create-coco-annotations-from-scratch

https://www.youtube.com/watch?v=h6s61a_pqfM

Whether Darknet training included the validation set

seob2 2021. 7. 29. 14:41

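Link: darknet pretrained weights with MS-COCO detection dataset

The YOLO-to-PyTorch port I have been working on in my spare time is nearly finished, so I wanted to verify it by reproducing the accuracy of the weights at the link above. In doing so, I found that the results on the validation set were strangely high.

For yolov4, for example, the page lists width=608 height=608 in cfg: 65.7% mAP@0.5 (43.5% AP@0.5:0.95) (measured on the test set).

When I ran the evaluation myself on the COCO 2017 validation set (5k), I got: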

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.500
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.741

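The published figures are presumably for the test set (40k), so I can understand them being lower, but my numbers still seemed unusually high, so I looked into it.

Why did they seem unusually high? The follow-up paper, Scaled-YOLOv4, reports results measured on the validation set. For yolov4-P5, which uses 896x896 images, the figures are 51.7 / 70.3 (see the table below). My yolov4 run at 608x608 scored 50.0 / 74.1, which is higher AP@0.5 than the larger model at a higher resolution, so the result above looks odd.

My first thought was: was the validation set included when yolov4 was trained? That would explain the high result.

However, according to the darknet wiki, training clearly excluded the validation 5k. That was really strange, so I kept searching and found an issue that covered exactly this.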

The custom splits (trainvalno5k.txt and 5k.txt) for COCO 2014 are supposed to be the same as the default splits for COCO 2017 but they are not. This could explain why YOLOv4 does not produce the same validation results on both but a significantly better mAP on val2017. Also, this implies YOLO may be trained on a different training set compared with other object detectors. Then a direct comparison might not be fair. Any clarifications?

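In other words, the split (train, val, test) used by darknet differs from the COCO 2017 split. That means the training set is different, so the question is whether the comparison is unfair.

The poster also kindly copied the difference between COCO 2014 and 2017 from the official source.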

The only difference is the splits which COCO 2017 adopts as the long-time convention from COCO 2014 in early object detection work. The detectron repo explicitly describes the COCO Minival Annotations (5k) as follows:

Our custom minival and valminusminival annotations are available for download here. Please note that minival is exactly equivalent to the recently defined 2017 val set. Similarly, the union of valminusminival and the 2014 train is exactly equivalent to the 2017 train set.

So between 2014 and 2017 the images did not change at all; only the split changed, as follows. (This was the first time I had heard of valminusminival.)

COCO_2017_train = COCO_2014_train + valminusminival
COCO_2017_val = minival
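
The two identities above can be checked with simple arithmetic. The sketch below uses the published image counts for each split as I know them (82783 for 2014 train, 40504 for 2014 val, 5000 for minival); the toy id sets at the end are made up purely to show the set relations.

```python
# Published image counts (stated here as an assumption, not read from the dataset).
TRAIN2014 = 82783
VAL2014 = 40504
MINIVAL = 5000

valminusminival = VAL2014 - MINIVAL      # 2014 val images that are not in minival
train2017 = TRAIN2014 + valminusminival  # COCO_2017_train = COCO_2014_train + valminusminival
val2017 = MINIVAL                        # COCO_2017_val = minival

print(valminusminival)  # 35504
print(train2017)        # 118287
print(val2017)          # 5000

# The same relations on toy image-id sets (ids are made up).
train2014_ids = {1, 2, 3}
val2014_ids = {4, 5, 6, 7}
minival_ids = {6, 7}

valminusminival_ids = val2014_ids - minival_ids
train2017_ids = train2014_ids | valminusminival_ids
print(train2017_ids.isdisjoint(minival_ids))  # True
```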

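The answer to that was as follows.

- The test set is the same.
- The validation sets differ, but in the end the same number of images was held out for validation and the test set is the same, so the results are fine.
On top of that, the final weights were trained with the 5k held out, which could actually count against them. That was the reply.

I do not know why they use a different split either. Measuring performance on the test set means submitting results to the evaluation server, which is a hassle... Anyway, this issue exists.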
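
To check for yourself whether a darknet training list overlaps with val2017, one option is to compare image ids extracted from the file names. The sketch below assumes the usual COCO naming (COCO_val2014_000000000164.jpg for 2014, 000000000139.jpg for 2017) and that an image keeps the same numeric id in both releases. The helper names and the example paths are hypothetical.

```python
import os
import re


def coco_image_id(path):
    """Extract the numeric image id from a COCO file name (2014 or 2017 naming)."""
    stem = os.path.splitext(os.path.basename(path))[0]
    match = re.search(r"(\d+)$", stem)  # trailing run of digits
    if match is None:
        raise ValueError(f"no image id in file name: {path}")
    return int(match.group(1))


def shared_image_ids(list_a, list_b):
    """Return the set of image ids that appear in both path lists."""
    ids_a = {coco_image_id(p) for p in list_a}
    ids_b = {coco_image_id(p) for p in list_b}
    return ids_a & ids_b


# Made-up excerpts of a darknet-style training list and of val2017.
train_list = [
    "/data/coco/images/train2014/COCO_train2014_000000000009.jpg",
    "/data/coco/images/val2014/COCO_val2014_000000000139.jpg",
]
val2017_list = [
    "val2017/000000000139.jpg",
    "val2017/000000000285.jpg",
]

print(sorted(shared_image_ids(train_list, val2017_list)))  # [139]
```

A non-empty result means some val2017 images were seen during training, which would inflate validation scores measured on val2017.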
