Wie schnell kann ein STT-System hergestellt werden?







( β€” ) , . , , , state-of-the-art bleeding edge , , , " ". , GPU, , , ( 2 β€” 2.5 GHz AMD 4+ GHz 64 ). , , !







3 :







  • ;
  • , ;
  • SLA ;
  • " " ;


:







  • ?
  • ?
  • ?




. STT , Facebook AI Research , , . , " ".







, 3 , "":







  • Throughput ( ) β€” . "";
  • Real Time Factor (RTF) ( ) β€” , . Real Time Speed (RTS) 1 / RTF, , 1 ;
  • Latency () β€” - ;


FAIR, :







  • FAIR Intel Skylake c 18 ( 2 , - );
  • end-to-end C++ "" ;
  • wav2letter++;
  • , - ;








:







  • "" (RTS * ) . , ;
  • FAIR , .. , TDS-;
  • - ;
  • , , - state-of-the-art ( , );
  • FAIR , "" 40 , 0.26 RTF ( ). , , ;
  • 40 * 1 / 0.26 18 . , 1 1 - 8-9 ;


:







  • : 750 ( );
  • : 8-9 , 40 18 ;
  • : 500 1000 , 750 β€” ;
  • C++ -;




β€” ? , .. . . , - . β€” , , .









end-to-end , FAIR, , . , , end-2-end ( ).







FAIR, .. , C++ . , .







. , , . , " " "" ( ). ( ):







FP32 FP32 + Fused FP32 + INT8 FP32 Fused + INT8 Full INT8 + Fused New Best
1 7.7 8.8 8.8 9.1 11.0 22.6
5 11.8 13.6 13.6 15.6 17.5 29.8
10 12.8 14.6 14.6 16.7 18.0 29.3
25 12.9 14.9 14.9 17.9 18.7 29.8


, , . , CPU . . FAIR , , :







  • -;
  • , , , ;
  • FAIR , ;








, - .









- .







:







  • 1 β€” 7 , "" 4 β€” 8 β€” 16 ;
  • ;
  • -, . 8 latency 500 , , -, , 8 ;
  • , . . -;


GPU



SSD, 256+ GB NVME, 256+ GB
RAM 32 GB 32 GB
8+ 12+
3 GHz+ 3.5 GHz+
2
AVX2
GPU 1 1
8 "" 16 ""
, 280 320
95- , 430 476
99- , 520 592
1000 25.0 43.4
500 12.5 21.7
(1 / RTF) 85.6 145.0
10.7 12.1


3 GPU:







  • GPU Nvidia 1070 8+GB RAM ;
  • GPU Nvidia Quadro 8+GB RAM (TDP 100 β€” 150W) ;
  • Nvidia Tesla T4, , TDP 75W;


, GPU ( " "). , :







  • " (300 ) ". . TDP . TDP 75 150 , - 50-75% ;
  • " ". , 2 (+ );
  • " ". "" Tesla SLA Nvidia. SLA , .. Tesla 2-3 . "" β€” ;
  • " 2+ ". . Quadro T4 1 ;
  • " ". Tesla A100 US$12,500 . Quadro T4 ( ) ;
  • " ". "" β€” 3-4 , . β€” . , Nvidia 24/7 "" , . 100 Nvidia AMD 3 ;


β€” . β€” GPU 2-3 .







CPU



SSD, 256+ GB SSD, 256+ GB
RAM 32 GB 32 GB
8+ 12+
3.5 GHz+ 3.5 GHz+
2
AVX2
4 "" 8 ""
, 320 470
95- , 580 760
99- , 720 890
1000 11.1 15.9
500 5.6 8.0
(1 / RTF) 37.0 53.0
4.6 4.4


C



  • GPU 10 β€” 15 RTS ( RTS GPU - 500 β€” 1,000). CPU 1 GPU ( ), , . - ;
  • CPU- 5 RTS, , latency throughput;
  • . β€” , - ;
  • 50 , 2 GPU (+ ) ;
  • GPU - 2-3 , CPU;




FAIR , 50% . β€” . 20-30 RTS , - 40-50% . :







  • GPU , ;
  • FAIR;
  • , GPU , ;
  • - , ;


? MIT.








All Articles