Available pretrained models

Available pretrained PyTorch models are stored in bonseyes_openpifpaf_wholebody/models/pytorch; a hedged loading sketch follows the tree below. File names encode the release version, backbone, variant, training input size (641x641) and precision (fp32). Models with shufflenetv2k16 and shufflenetv2k30 backbones were trained on the COCO WholeBody dataset (133 keypoints); the remaining models were trained on the COCO keypoints dataset (17 keypoints).

models
`-- pytorch
    |-- mobilenetv2
    |   `-- v3.0_mobilenetv2_default_641x641_fp32.pkl
    |-- mobilenetv3large
    |   `-- v3.0_mobilenetv3large_default_641x641_fp32.pkl
    |-- mobilenetv3small
    |   `-- v3.0_mobilenetv3small_default_641x641_fp32.pkl
    |-- resnet50
    |   `-- v3.0_resnet50_default_641x641_fp32.pkl
    |-- shufflenetv2k16
    |   `-- v3.0_shufflenetv2k16_default_641x641_fp32.pkl
    |-- shufflenetv2k30
    |   `-- v3.0_shufflenetv2k30_default_641x641_fp32.pkl
    |-- swin-b
    |   `-- v3.0_swin-b_default_641x641_fp32.pkl
    |-- swin-s
    |   `-- v3.0_swin-s_default_641x641_fp32.pkl
    `-- swin-t
        `-- v3.0_swin-t_default_641x641_fp32.pkl
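
A minimal loading sketch, assuming the .pkl files unpickle directly to a torch.nn.Module (the classes that define the network, e.g. from openpifpaf, must be importable); this is an assumption on our part, not something the repository guarantees:

  import torch

  # Path taken from the tree above; adjust to your checkout.
  CKPT = ("bonseyes_openpifpaf_wholebody/models/pytorch/"
          "shufflenetv2k16/v3.0_shufflenetv2k16_default_641x641_fp32.pkl")

  # Assumption: the pickle resolves to a full nn.Module rather than a
  # bare state_dict; adjust if the repository ships its own loader.
  model = torch.load(CKPT, map_location="cpu")
  model.eval()

  # Dummy forward pass at the training resolution (RGB, 641x641).
  with torch.no_grad():
      fields = model(torch.zeros(1, 3, 641, 641))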

Pretrained model summaries

For each backbone, these summaries contain detailed architecture information (CSV tables available for download) and tables with per-input-size estimates of the following quantities (a sketch of how such estimates can be reproduced follows the list):

  • Total number of network parameters

  • Theoretical amount of floating-point operations (FLOPs)

  • Theoretical amount of multiply-adds (MAdd)

  • Memory usage
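
These five metric names match the summary printed by the torchstat package (Total params, Total memory, Total MAdd, Total Flops, Total MemR+W), so the tables can plausibly be reproduced along the following lines. This is a sketch under that assumption, not the repository's documented tooling; it also assumes the checkpoint unpickles to a torch.nn.Module and that torchstat is installed (pip install torchstat):

  import torch
  from torchstat import stat

  model = torch.load(
      "bonseyes_openpifpaf_wholebody/models/pytorch/"
      "shufflenetv2k30/v3.0_shufflenetv2k30_default_641x641_fp32.pkl",
      map_location="cpu",
  )
  model.eval()

  # torchstat expects the input size as (channels, height, width);
  # "384x216" in the tables below is assumed to mean width x height.
  stat(model, (3, 216, 384))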

ShuffleNet v2k30

Download layer-by-layer information for the shufflenetv2k30 PyTorch model at input size 384x216: csv

Stats for the pretrained shufflenetv2k30 PyTorch model at different input sizes:

INPUT_SIZE    #PARAMS [M]    GFLOPs    Memory [MB]    MAdd [M]    MemR+W [MB]
128x72        45.3             3.6        97.3           7100        364.6
128x96        45.3             4.6       128.2           9100        426.1
128x128       45.3             6.1       170.9          12130        510.5
256x144       45.3            13.7       384.4          27290        932.7
256x192       45.3            18.2       512.6          36380       1187.8
256x256       45.3            24.3       683.5          48510       1525.8
384x216       45.3            31.2       868.6          62220       1884.2
384x288       45.3            41.0      1153.3          81860       2457.6
384x384       45.3            54.7      1537.8         109150       3215.4
512x288       45.3            54.7      1537.8         109150       3215.4
512x384       45.3            72.9      2050.3         145540       4229.1
512x512       45.3            97.2      2733.8         194050       5580.8
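
Two regularities in this table recur for every backbone below: #PARAMS does not depend on input size, and the compute- and memory-related columns grow roughly linearly with the number of input pixels. That is also why the 384x384 and 512x288 rows are identical: both inputs contain 384*384 = 512*288 = 147,456 pixels. A back-of-the-envelope check of the linear scaling, using the shufflenetv2k30 numbers above:

  # Rough linear-in-pixels extrapolation of GFLOPs from a single row.
  # An approximation only; it ignores resolution-dependent overheads.
  BASE_PIXELS, BASE_GFLOPS = 128 * 128, 6.1  # 128x128 row above

  def estimate_gflops(width: int, height: int) -> float:
      return BASE_GFLOPS * width * height / BASE_PIXELS

  print(estimate_gflops(512, 512))  # ~97.6 vs. 97.2 in the table
  print(estimate_gflops(384, 216))  # ~30.9 vs. 31.2 in the table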

ShuffleNet v2k16

Download layer-by-layer information for the shufflenetv2k16 PyTorch model at input size 384x216: csv

Stats for the pretrained shufflenetv2k16 PyTorch model at different input sizes:

INPUT_SIZE    #PARAMS [M]    GFLOPs    Memory [MB]    MAdd [M]    MemR+W [MB]
128x72        20.5             1.3        40.9           2570        157.0
128x96        20.5             1.6        53.5           3240        181.9
128x128       20.5             2.2        71.3           4330        216.4
256x144       20.5             4.9       160.6           9730        389.1
256x192       20.5             6.5       214.1          12980        492.7
256x256       20.5             8.7       285.4          17310        630.8
384x216       20.5            11.2       363.4          22330        780.6
384x288       20.5            14.6       481.6          29200       1010.7
384x384       20.5            19.5       642.2          38940       1321.0
512x288       20.5            19.5       642.2          38940       1321.0
512x384       20.5            26.0       856.2          51920       1740.8
512x512       20.5            34.7      1141.7          69220       2283.5

ResNet50

Download layer-by-layer information for the resnet50 PyTorch model at input size 384x216: csv

Stats for the pretrained resnet50 PyTorch model at different input sizes:

INPUT_SIZE    #PARAMS [M]    GFLOPs    Memory [MB]    MAdd [M]    MemR+W [MB]
128x72        25.5             3.1        76.3           6170        248.7
128x96        25.5             4.0       101.0           8060        298.0
128x128       25.5             5.4       134.7          10740        365.0
256x144       25.5            12.1       303.1          24170        699.9
256x192       25.5            16.1       404.1          32230        900.8
256x256       25.5            21.5       538.9          42970       1167.4
384x216       25.5            27.4       683.5          54780       1454.1
384x288       25.5            36.3       909.3          72510       1904.6
384x384       25.5            48.4      1212.4          96680       2508.8
512x288       25.5            48.4      1212.4          96680       2508.8
512x384       25.5            64.6      1616.5         128910       3307.5
512x512       25.5            86.1      2155.4         171880       4382.7

MobileNet v2

Download layer-by-layer information for the mobilenetv2 PyTorch model at input size 384x216: csv

Stats for the pretrained mobilenetv2 PyTorch model at different input sizes:

INPUT_SIZE    #PARAMS [M]    GFLOPs    Memory [MB]    MAdd [M]    MemR+W [MB]
128x72        12.1             0.2        15.1          365.3         75.1
128x96        12.1             0.2        19.1          389.5         83.1
128x128       12.1             0.3        25.4          519.4         95.4
256x144       12.1             0.6        57.9         1260.0        157.9
256x192       12.1             0.8        76.2         1560.0        194.0
256x256       12.1             1.1       101.7         2080.0        243.3
384x216       12.1             1.4       130.1         2710.0        298.1
384x288       12.1             1.8       171.6         3510.0        378.9
384x384       12.1             2.4       228.7         4670.0        489.8
512x288       12.1             2.4       228.7         4670.0        489.8
512x384       12.1             3.1       305.0         6230.0        637.7
512x512       12.1             4.2       406.7         8310.0        834.9

MobileNet v3 small

Download layer-by-layer information for the mobilenetv3small PyTorch model at input size 384x216: csv

Stats for the pretrained mobilenetv3small PyTorch model at different input sizes:

INPUT_SIZE    #PARAMS [M]    GFLOPs    Memory [MB]    MAdd [M]    MemR+W [MB]
128x72         1.5             0.1        12.5          130.8         24.7
128x96         1.5             0.1        16.4          164.6         30.7
128x128        1.5             0.1        21.9          219.2         39.0
256x144        1.5             0.2        49.2          492.0         80.8
256x192        1.5             0.3        65.6          655.7        105.9
256x256        1.5             0.4        87.5          873.9        139.3
384x216        1.5             0.6       111.3         1130.0        175.5
384x288        1.5             0.7       147.6         1470.0        231.1
384x384        1.5             1.0       196.8         1970.0        306.3
512x288        1.5             1.0       196.8         1970.0        306.3
512x384        1.5             1.3       262.4         2620.0        406.5
512x512        1.5             1.8       349.8         3490.0        540.1

MobileNet v3 large

Download layer-by-layer information for the mobilenetv3large PyTorch model at input size 384x216: csv

Stats for the pretrained mobilenetv3large PyTorch model at different input sizes:

INPUT_SIZE    #PARAMS [M]    GFLOPs    Memory [MB]    MAdd [M]    MemR+W [MB]
128x72         3.9             0.2        37.9          407.2         77.9
128x96         3.9             0.3        50.0          522.0         98.4
128x128        3.9             0.4        66.7          695.1        126.3
256x144        3.9             0.8       150.1         1560.0        265.5
256x192        3.9             1.1       200.1         2080.0        349.1
256x256        3.9             1.4       266.8         2770.0        460.5
384x216        3.9             1.8       338.7         3550.0        580.0
384x288        3.9             2.4       450.2         4670.0        766.8
384x384        3.9             3.2       600.3         6230.0       1017.5
512x288        3.9             3.2       600.3         6230.0       1017.5
512x384        3.9             4.2       800.4         8310.0       1351.7
512x512        3.9             5.6      1067.2        11080.0       1802.2

When exporting the TensorRT fp16 shufflenetv2k30 model (input size 320x320) with DLA enabled and GPU fallback allowed, we observed the following distribution of layer execution (a sketch of the corresponding builder configuration follows the listing):

Layers running on DLA vs. GPU

Layers running on DLA:

{Conv_0, Relu_1, Conv_2, Conv_5, Conv_3, Relu_6, Relu_4}, {Conv_8, Relu_9}

Layers running on GPU:

(Unnamed Layer* 1479) [Constant], (Unnamed Layer* 1675) [Constant], (Unnamed Layer* 1677) [Constant], Conv_7, 600 copy, 608 copy, Reshape_32 + Transpose_33, Reshape_38, Split_39, Split_39_1, Conv_40 + Relu_41, Conv_42, Conv_43 + Relu_44, Reshape_67 + Transpose_68, Reshape_73, Split_74, Split_74_1, Conv_75 + Relu_76, Conv_77, Conv_78 + Relu_79, Reshape_102 + Transpose_103, Reshape_108, Split_109, Split_109_1, Conv_110 + Relu_111, Conv_112, Conv_113 + Relu_114, Reshape_137 + Transpose_138, Reshape_143, Split_144, Split_144_1, Conv_145 + Relu_146, Conv_147, Conv_148 + Relu_149, Reshape_172 + Transpose_173, Reshape_178, Split_179, Split_179_1, Conv_180 + Relu_181, Conv_182, Conv_183 + Relu_184, Reshape_207 + Transpose_208, Reshape_213, Split_214, Split_214_1, Conv_215 + Relu_216, Conv_217, Conv_218 + Relu_219, Reshape_242 + Transpose_243, Reshape_248, Split_249, Split_249_1, Conv_250 + Relu_251, Conv_252, Conv_253 + Relu_254, Reshape_277 + Transpose_278, Reshape_283, Conv_284, Conv_287 + Relu_288, Conv_285 + Relu_286, Conv_289, Conv_290 + Relu_291, Reshape_314 + Transpose_315, Reshape_320, Split_321, Split_321_1, Conv_322 + Relu_323, Conv_324, Conv_325 + Relu_326, Reshape_349 + Transpose_350, Reshape_355, Split_356, Split_356_1, Conv_357 + Relu_358, Conv_359, Conv_360 + Relu_361, Reshape_384 + Transpose_385, Reshape_390, Split_391, Split_391_1, Conv_392 + Relu_393, Conv_394, Conv_395 + Relu_396, Reshape_419 + Transpose_420, Reshape_425, Split_426, Split_426_1, Conv_427 + Relu_428, Conv_429, Conv_430 + Relu_431, Reshape_454 + Transpose_455, Reshape_460, Split_461, Split_461_1, Conv_462 + Relu_463, Conv_464, Conv_465 + Relu_466, Reshape_489 + Transpose_490, Reshape_495, Split_496, Split_496_1, Conv_497 + Relu_498, Conv_499, Conv_500 + Relu_501, Reshape_524 + Transpose_525, Reshape_530, Split_531, Split_531_1, Conv_532 + Relu_533, Conv_534, Conv_535 + Relu_536, Reshape_559 + Transpose_560, Reshape_565, Split_566, Split_566_1, Conv_567 + Relu_568, Conv_569, Conv_570 + Relu_571, Reshape_594 + Transpose_595, Reshape_600, Split_601, Split_601_1, Conv_602 + Relu_603, Conv_604, Conv_605 + Relu_606, Reshape_629 + Transpose_630, Reshape_635, Split_636, Split_636_1, Conv_637 + Relu_638, Conv_639, Conv_640 + Relu_641, Reshape_664 + Transpose_665, Reshape_670, Split_671, Split_671_1, Conv_672 + Relu_673, Conv_674, Conv_675 + Relu_676, Reshape_699 + Transpose_700, Reshape_705, Split_706, Split_706_1, Conv_707 + Relu_708, Conv_709, Conv_710 + Relu_711, Reshape_734 + Transpose_735, Reshape_740, Split_741, Split_741_1, Conv_742 + Relu_743, Conv_744, Conv_745 + Relu_746, Reshape_769 + Transpose_770, Reshape_775, Split_776, Split_776_1, Conv_777 + Relu_778, Conv_779, Conv_780 + Relu_781, Reshape_804 + Transpose_805, Reshape_810, Split_811, Split_811_1, Conv_812 + Relu_813, Conv_814, Conv_815 + Relu_816, Reshape_839 + Transpose_840, Reshape_845, Conv_846, Conv_849 + Relu_850, Conv_847 + Relu_848, Conv_851, Conv_852 + Relu_853, Reshape_876 + Transpose_877, Reshape_882, Split_883, Split_883_1, Conv_884 + Relu_885, Conv_886, Conv_887 + Relu_888, Reshape_911 + Transpose_912, Reshape_917, Split_918, Split_918_1, Conv_919 + Relu_920, Conv_921, Conv_922 + Relu_923, Reshape_946 + Transpose_947, Reshape_952, Split_953, Split_953_1, Conv_954 + Relu_955, Conv_956, Conv_957 + Relu_958, Reshape_981 + Transpose_982, Reshape_987, Split_988, Split_988_1, Conv_989 + Relu_990, Conv_991, Conv_992 + Relu_993, Reshape_1016 + Transpose_1017, Reshape_1022, Split_1023, Split_1023_1, Conv_1024 + Relu_1025, Conv_1026, Conv_1027 + 
Relu_1028, Reshape_1051 + Transpose_1052, Reshape_1057, Conv_1058 + Relu_1059, Conv_1085 || Conv_1060, Reshape_1062 + Transpose_1063, Reshape_1087 + Transpose_1088, Reshape_1065, Reshape_1090, Slice_1066, Slice_1091, Slice_1067, Slice_1092, Reshape_1073 + Transpose_1074, Reshape_1098 + Transpose_1099, Slice_1075, Slice_1077, Slice_1080, Slice_1081, Slice_1100, Slice_1102, Slice_1103, Slice_1108, Slice_1109, Sigmoid_1076, Add_1079, Softplus_1082, Sigmoid_1101, Add_1105, Add_1107, 1922 copy, 1925 copy, 1926 copy, 1928 copy, Transpose_1084, Softplus_1110, 1955 copy, 1959 copy, 1961 copy, 1962 copy, 1964 copy, Transpose_1112,
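
The ONNX-style layer names above suggest the engine was built from an ONNX export. For reference, a hedged sketch of building such an fp16 DLA engine with the TensorRT Python API; the ONNX and engine file names are hypothetical, and the GPU_FALLBACK flag is what allows the layers listed above to run on the GPU (with trtexec, the equivalent flags are --fp16 --useDLACore=0 --allowGPUFallback):

  import tensorrt as trt

  logger = trt.Logger(trt.Logger.INFO)
  builder = trt.Builder(logger)
  network = builder.create_network(
      1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
  parser = trt.OnnxParser(network, logger)

  # Hypothetical ONNX export of shufflenetv2k30 at 320x320.
  with open("shufflenetv2k30_320x320.onnx", "rb") as f:
      if not parser.parse(f.read()):
          raise RuntimeError(parser.get_error(0))

  config = builder.create_builder_config()
  config.set_flag(trt.BuilderFlag.FP16)
  # Prefer DLA core 0; let unsupported layers fall back to the GPU.
  config.default_device_type = trt.DeviceType.DLA
  config.DLA_core = 0
  config.set_flag(trt.BuilderFlag.GPU_FALLBACK)

  engine = builder.build_serialized_network(network, config)
  with open("shufflenetv2k30_320x320_dla_fp16.engine", "wb") as f:
      f.write(engine)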