TorchBench CI has detected a performance signal or runtime regression.
Base PyTorch commit: 0200b1106c4fe80ea0884181dc8d649ef6078ea3
Affected PyTorch commit: 806d1a871ddfd2d38e1791489892009feaec8425
Affected Tests:
- resnet50, ASGD, cuda, default: +124.02993%
- resnet50, ASGD, cuda, maximize: +91.23220%
- resnet50, ASGD, cuda, no_foreach: +123.98829%
- resnet50, ASGD, cuda, differentiable: +119.85504%
- resnet50, ASGD, cuda, foreach: +105.76765%
- mnasnet1_0, ASGD, cuda, default: +115.15906%
- mnasnet1_0, ASGD, cuda, maximize: +93.84647%
- mnasnet1_0, ASGD, cuda, no_foreach: +138.73016%
- mnasnet1_0, ASGD, cuda, differentiable: +121.80861%
- mnasnet1_0, ASGD, cuda, foreach: +104.30969%
- squeezenet1_1, ASGD, cuda, default: +94.39419%
- squeezenet1_1, ASGD, cuda, maximize: +86.72449%
- squeezenet1_1, ASGD, cuda, no_foreach: +119.51013%
- squeezenet1_1, ASGD, cuda, differentiable: +113.43829%
- squeezenet1_1, ASGD, cuda, foreach: +102.96500%
- sam, ASGD, cuda, default: +195.67645%
- sam, ASGD, cuda, maximize: +181.55680%
- sam, ASGD, cuda, no_foreach: +140.95377%
- sam, ASGD, cuda, differentiable: +147.18748%
- sam, ASGD, cuda, foreach: +193.14937%
- vgg16, ASGD, cuda, default: +128.22068%
- vgg16, ASGD, cuda, maximize: +95.99951%
- vgg16, ASGD, cuda, no_foreach: +129.91319%
- vgg16, ASGD, cuda, differentiable: +129.00090%
- vgg16, ASGD, cuda, foreach: +126.44894%
- timm_vision_transformer, Adadelta, cuda, (pt2) foreach: -99.98788%
- timm_vision_transformer, Adam, cuda, (pt2) fused: +82.68205%
- timm_vision_transformer, AdamW, cuda, (pt2) no_foreach: -99.98358%
- timm_vision_transformer, AdamW, cuda, (pt2) foreach: -67.46933%
- timm_vision_transformer, AdamW, cuda, (pt2) fused: +71.99635%
- timm_vision_transformer, ASGD, cuda, default: +106.71975%
- timm_vision_transformer, ASGD, cuda, maximize: +96.59117%
- timm_vision_transformer, ASGD, cuda, (pt2) no_foreach: +143.52541%
- timm_vision_transformer, ASGD, cuda, no_foreach: +91.65669%
- timm_vision_transformer, ASGD, cuda, differentiable: +116.06366%
- timm_vision_transformer, ASGD, cuda, foreach: +115.47567%
- compile_time, timm_vision_transformer, Adadelta, cuda, (pt2) foreach: +37807.53782%
- compile_time, timm_vision_transformer, AdamW, cuda, (pt2) no_foreach: +7077.47102%
- compile_time, timm_vision_transformer, AdamW, cuda, (pt2) fused: +361.28020%
- compile_time, timm_vision_transformer, Adamax, cuda, (pt2) foreach: +217.33371%
- compile_time, timm_vision_transformer, SGD, cuda, (pt2) foreach: +813.99024%
- timm_vovnet, ASGD, cuda, default: +79.70207%
- timm_vovnet, ASGD, cuda, maximize: +81.93381%
- timm_vovnet, ASGD, cuda, no_foreach: +92.76432%
- timm_vovnet, ASGD, cuda, differentiable: +100.97167%
- timm_vovnet, ASGD, cuda, foreach: +79.48599%
- speech_transformer, ASGD, cuda, default: +98.57801%
- speech_transformer, ASGD, cuda, maximize: +92.68815%
- speech_transformer, ASGD, cuda, no_foreach: +113.97402%
- speech_transformer, ASGD, cuda, differentiable: +109.36097%
- speech_transformer, ASGD, cuda, foreach: +99.81050%
- basic_gnn_sage, ASGD, cuda, default: +86.73222%
- basic_gnn_sage, ASGD, cuda, maximize: +75.81285%
- basic_gnn_sage, ASGD, cuda, no_foreach: +116.76823%
- basic_gnn_sage, ASGD, cuda, differentiable: +106.38985%
- basic_gnn_sage, ASGD, cuda, foreach: +96.91066%
- hf_T5, ASGD, cuda, default: +127.93302%
- hf_T5, ASGD, cuda, maximize: +114.74867%
- hf_T5, ASGD, cuda, no_foreach: +131.01003%
- hf_T5, ASGD, cuda, differentiable: +127.07553%
- hf_T5, ASGD, cuda, foreach: +155.85461%
- mobilenet_v2_quantized_qat, ASGD, cuda, default: +95.82244%
- mobilenet_v2_quantized_qat, ASGD, cuda, maximize: +86.96235%
- mobilenet_v2_quantized_qat, ASGD, cuda, no_foreach: +107.97934%
- mobilenet_v2_quantized_qat, ASGD, cuda, differentiable: +116.05228%
- mobilenet_v2_quantized_qat, ASGD, cuda, foreach: +106.87671%
- maml_omniglot, ASGD, cuda, default: +87.10180%
- maml_omniglot, ASGD, cuda, maximize: +75.81007%
- maml_omniglot, ASGD, cuda, no_foreach: +109.66096%
- maml_omniglot, ASGD, cuda, differentiable: +116.56157%
- maml_omniglot, ASGD, cuda, foreach: +90.88092%
- alexnet, ASGD, cuda, default: +142.00344%
- alexnet, ASGD, cuda, maximize: +117.11555%
- alexnet, ASGD, cuda, no_foreach: +148.61658%
- alexnet, ASGD, cuda, differentiable: +147.32024%
- alexnet, ASGD, cuda, foreach: +140.08364%
- opacus_cifar10, ASGD, cuda, default: +88.46431%
- opacus_cifar10, ASGD, cuda, maximize: +77.82195%
- opacus_cifar10, ASGD, cuda, no_foreach: +114.27061%
- opacus_cifar10, ASGD, cuda, differentiable: +109.31896%
- opacus_cifar10, ASGD, cuda, foreach: +84.54530%
- hf_Bert, ASGD, cuda, default: +142.73106%
- hf_Bert, ASGD, cuda, maximize: +117.33846%
- hf_Bert, ASGD, cuda, no_foreach: +117.59002%
- hf_Bert, ASGD, cuda, differentiable: +109.75356%
- hf_Bert, ASGD, cuda, foreach: +137.88056%
- timm_vision_transformer_large, Adadelta, cuda, (pt2) no_foreach: -99.98482%
- timm_vision_transformer_large, Adagrad, cuda, (pt2) default: +31.17705%
- timm_vision_transformer_large, Adam, cuda, (pt2) default: +33.26611%
- timm_vision_transformer_large, Adam, cuda, (pt2) no_foreach: +41.87344%
- timm_vision_transformer_large, Adam, cuda, (pt2) foreach: +64.34802%
- timm_vision_transformer_large, AdamW, cuda, (pt2) default: +37.09363%
- timm_vision_transformer_large, AdamW, cuda, (pt2) foreach: +41.07935%
- timm_vision_transformer_large, ASGD, cuda, default: +123.28451%
- timm_vision_transformer_large, ASGD, cuda, maximize: +140.95934%
- timm_vision_transformer_large, ASGD, cuda, no_foreach: +149.50390%
- timm_vision_transformer_large, ASGD, cuda, differentiable: +218.22416%
- timm_vision_transformer_large, ASGD, cuda, foreach: +124.79011%
- timm_vision_transformer_large, Rprop, cuda, (pt2) default: +31.40632%
- compile_time, timm_vision_transformer_large, Adadelta, cuda, (pt2) no_foreach: -7716.53938%
- hf_T5_large, ASGD, cuda, default: +211.69880%
- hf_T5_large, ASGD, cuda, maximize: +169.96072%
- hf_T5_large, ASGD, cuda, no_foreach: +141.02743%
- hf_T5_large, ASGD, cuda, differentiable: +143.63304%
- hf_T5_large, ASGD, cuda, foreach: +208.19100%
- vision_maskrcnn, ASGD, cuda, default: +100.31787%
- vision_maskrcnn, ASGD, cuda, maximize: +89.12360%
- vision_maskrcnn, ASGD, cuda, no_foreach: +98.99280%
- vision_maskrcnn, ASGD, cuda, differentiable: +99.44593%
- vision_maskrcnn, ASGD, cuda, foreach: +101.92607%
- hf_GPT2_large, ASGD, cuda, default: +193.84464%
- hf_GPT2_large, ASGD, cuda, maximize: +153.89762%
- hf_GPT2_large, ASGD, cuda, no_foreach: +158.69141%
- hf_GPT2_large, ASGD, cuda, differentiable: +163.62273%
- hf_GPT2_large, ASGD, cuda, foreach: +192.48542%
- BERT_pytorch, ASGD, cuda, default: +135.65602%
- BERT_pytorch, ASGD, cuda, maximize: +119.74721%
- BERT_pytorch, ASGD, cuda, no_foreach: +125.09949%
- BERT_pytorch, ASGD, cuda, differentiable: +109.46087%
- BERT_pytorch, ASGD, cuda, foreach: +133.43466%
- detectron2_fasterrcnn_r_50_fpn, ASGD, cuda, default: +122.49460%
- detectron2_fasterrcnn_r_50_fpn, ASGD, cuda, maximize: +103.09817%
- detectron2_fasterrcnn_r_50_fpn, ASGD, cuda, no_foreach: +118.01190%
- detectron2_fasterrcnn_r_50_fpn, ASGD, cuda, differentiable: +120.14865%
- detectron2_fasterrcnn_r_50_fpn, ASGD, cuda, foreach: +121.01315%
- dlrm, ASGD, cuda, default: +78.09405%
- dlrm, ASGD, cuda, maximize: +59.68144%
- dlrm, ASGD, cuda, foreach: +78.35686%
- fastNLP_Bert, ASGD, cuda, default: +124.20659%
- fastNLP_Bert, ASGD, cuda, maximize: +129.74865%
- fastNLP_Bert, ASGD, cuda, no_foreach: +131.86236%
- fastNLP_Bert, ASGD, cuda, differentiable: +124.50400%
- fastNLP_Bert, ASGD, cuda, foreach: +129.15546%
- basic_gnn_gcn, ASGD, cuda, default: +95.32429%
- basic_gnn_gcn, ASGD, cuda, maximize: +87.14749%
- basic_gnn_gcn, ASGD, cuda, no_foreach: +110.78651%
- basic_gnn_gcn, ASGD, cuda, differentiable: +111.55591%
- basic_gnn_gcn, ASGD, cuda, foreach: +107.76394%
- phlippe_resnet, ASGD, cuda, default: +93.23522%
- phlippe_resnet, ASGD, cuda, maximize: +86.87177%
- phlippe_resnet, ASGD, cuda, no_foreach: +130.84100%
- phlippe_resnet, ASGD, cuda, differentiable: +115.97482%
- phlippe_resnet, ASGD, cuda, foreach: +98.55166%
- timm_resnest, ASGD, cuda, default: +103.57621%
- timm_resnest, ASGD, cuda, maximize: +107.16194%
- timm_resnest, ASGD, cuda, no_foreach: +119.73967%
- timm_resnest, ASGD, cuda, differentiable: +129.29690%
- timm_resnest, ASGD, cuda, foreach: +118.03992%
- basic_gnn_gin, ASGD, cuda, default: +115.13085%
- basic_gnn_gin, ASGD, cuda, maximize: +83.92329%
- basic_gnn_gin, ASGD, cuda, no_foreach: +107.37685%
- basic_gnn_gin, ASGD, cuda, differentiable: +111.10815%
- basic_gnn_gin, ASGD, cuda, foreach: +101.07616%
- resnet50_quantized_qat, ASGD, cuda, default: +81.96953%
- resnet50_quantized_qat, ASGD, cuda, maximize: +73.56566%
- resnet50_quantized_qat, ASGD, cuda, no_foreach: +126.73629%
- resnet50_quantized_qat, ASGD, cuda, differentiable: +108.65208%
- resnet50_quantized_qat, ASGD, cuda, foreach: +90.47366%
- Background_Matting, ASGD, cuda, default: +85.40617%
- Background_Matting, ASGD, cuda, maximize: +80.75659%
- Background_Matting, ASGD, cuda, no_foreach: +104.39196%
- Background_Matting, ASGD, cuda, differentiable: +131.27337%
- Background_Matting, ASGD, cuda, foreach: +95.64608%
- tacotron2, ASGD, cuda, default: +119.66504%
- tacotron2, ASGD, cuda, maximize: +107.00849%
- tacotron2, ASGD, cuda, no_foreach: +140.78569%
- tacotron2, ASGD, cuda, differentiable: +129.29204%
- tacotron2, ASGD, cuda, foreach: +121.62964%
- llama, ASGD, cuda, default: -37.89294%
- llama, ASGD, cuda, maximize: -37.69863%
- llama, ASGD, cuda, foreach: -39.76405%
- demucs, ASGD, cuda, default: +133.49940%
- demucs, ASGD, cuda, maximize: +100.55682%
- demucs, ASGD, cuda, no_foreach: +145.58503%
- demucs, ASGD, cuda, differentiable: +136.17159%
- demucs, ASGD, cuda, foreach: +129.51777%
- pytorch_unet, ASGD, cuda, default: +89.85632%
- pytorch_unet, ASGD, cuda, maximize: +88.41714%
- pytorch_unet, ASGD, cuda, no_foreach: +133.38601%
- pytorch_unet, ASGD, cuda, differentiable: +124.86886%
- pytorch_unet, ASGD, cuda, foreach: +95.00974%
- hf_Albert, ASGD, cuda, default: +105.88921%
- hf_Albert, ASGD, cuda, maximize: +95.25225%
- hf_Albert, ASGD, cuda, no_foreach: +113.10521%
- hf_Albert, ASGD, cuda, differentiable: +112.50182%
- hf_Albert, ASGD, cuda, foreach: +121.51764%
- tts_angular, ASGD, cuda, default: +111.68752%
- tts_angular, ASGD, cuda, maximize: +91.98089%
- tts_angular, ASGD, cuda, no_foreach: +106.44684%
- tts_angular, ASGD, cuda, differentiable: +109.35268%
- tts_angular, ASGD, cuda, foreach: +124.90487%
- timm_nfnet, ASGD, cuda, default: +109.12394%
- timm_nfnet, ASGD, cuda, maximize: +99.66265%
- timm_nfnet, ASGD, cuda, no_foreach: +112.02449%
- timm_nfnet, ASGD, cuda, differentiable: +120.72541%
- timm_nfnet, ASGD, cuda, foreach: +97.20528%
- dcgan, ASGD, cuda, default: +80.42478%
- dcgan, ASGD, cuda, maximize: +74.82730%
- dcgan, ASGD, cuda, no_foreach: +93.25956%
- dcgan, ASGD, cuda, differentiable: +106.39970%
- dcgan, ASGD, cuda, foreach: +84.90491%
- moco, ASGD, cuda, default: +85.82898%
- moco, ASGD, cuda, maximize: +79.96736%
- moco, ASGD, cuda, no_foreach: +122.59127%
- moco, ASGD, cuda, differentiable: +116.06429%
- moco, ASGD, cuda, foreach: +100.20179%
- detectron2_maskrcnn_r_101_fpn, ASGD, cuda, default: +121.34472%
- detectron2_maskrcnn_r_101_fpn, ASGD, cuda, maximize: +102.83182%
- detectron2_maskrcnn_r_101_fpn, ASGD, cuda, no_foreach: +119.60952%
- detectron2_maskrcnn_r_101_fpn, ASGD, cuda, differentiable: +106.19279%
- detectron2_maskrcnn_r_101_fpn, ASGD, cuda, foreach: +107.71527%
- detectron2_maskrcnn, ASGD, cuda, default: +117.57517%
- detectron2_maskrcnn, ASGD, cuda, maximize: +103.30191%
- detectron2_maskrcnn, ASGD, cuda, no_foreach: +116.90257%
- detectron2_maskrcnn, ASGD, cuda, differentiable: +117.56160%
- detectron2_maskrcnn, ASGD, cuda, foreach: +110.69690%
- mobilenet_v2, ASGD, cuda, default: +96.95437%
- mobilenet_v2, ASGD, cuda, maximize: +91.66412%
- mobilenet_v2, ASGD, cuda, no_foreach: +105.93441%
- mobilenet_v2, ASGD, cuda, differentiable: +106.24384%
- mobilenet_v2, ASGD, cuda, foreach: +101.98825%
- phlippe_densenet, ASGD, cuda, default: +90.63482%
- phlippe_densenet, ASGD, cuda, maximize: +82.98210%
- phlippe_densenet, ASGD, cuda, no_foreach: +109.94099%
- phlippe_densenet, ASGD, cuda, differentiable: +99.62891%
- phlippe_densenet, ASGD, cuda, foreach: +102.07940%
- stable_diffusion, ASGD, cuda, default: +173.16434%
- stable_diffusion, ASGD, cuda, maximize: +172.79215%
- stable_diffusion, ASGD, cuda, no_foreach: +133.62548%
- stable_diffusion, ASGD, cuda, differentiable: +150.74043%
- stable_diffusion, ASGD, cuda, foreach: +193.07518%
- detectron2_fasterrcnn_r_101_dc5, ASGD, cuda, default: +212.36047%
- detectron2_fasterrcnn_r_101_dc5, ASGD, cuda, maximize: +183.13374%
- detectron2_fasterrcnn_r_101_dc5, ASGD, cuda, no_foreach: +157.73100%
- detectron2_fasterrcnn_r_101_dc5, ASGD, cuda, differentiable: +158.22935%
- detectron2_fasterrcnn_r_101_dc5, ASGD, cuda, foreach: +214.21778%
- Super_SloMo, ASGD, cuda, default: +106.84278%
- Super_SloMo, ASGD, cuda, maximize: +91.77756%
- Super_SloMo, ASGD, cuda, no_foreach: +120.37297%
- Super_SloMo, ASGD, cuda, differentiable: +107.75837%
- Super_SloMo, ASGD, cuda, foreach: +105.20420%
- timm_efficientnet, ASGD, cuda, default: +100.11609%
- timm_efficientnet, ASGD, cuda, maximize: +96.02220%
- timm_efficientnet, ASGD, cuda, no_foreach: +117.61484%
- timm_efficientnet, ASGD, cuda, differentiable: +126.42353%
- timm_efficientnet, ASGD, cuda, foreach: +104.98403%
- shufflenet_v2_x1_0, ASGD, cuda, default: +113.97946%
- shufflenet_v2_x1_0, ASGD, cuda, maximize: +94.74763%
- shufflenet_v2_x1_0, ASGD, cuda, no_foreach: +126.35174%
- shufflenet_v2_x1_0, ASGD, cuda, differentiable: +110.48815%
- shufflenet_v2_x1_0, ASGD, cuda, foreach: +104.07755%
- yolov3, ASGD, cuda, default: +78.25093%
- yolov3, ASGD, cuda, maximize: +77.13986%
- yolov3, ASGD, cuda, no_foreach: +93.41386%
- yolov3, ASGD, cuda, differentiable: +90.89548%
- yolov3, ASGD, cuda, foreach: +80.57876%
- basic_gnn_edgecnn, ASGD, cuda, default: +99.06302%
- basic_gnn_edgecnn, ASGD, cuda, maximize: +83.71059%
- basic_gnn_edgecnn, ASGD, cuda, no_foreach: +113.92414%
- basic_gnn_edgecnn, ASGD, cuda, differentiable: +110.96076%
- basic_gnn_edgecnn, ASGD, cuda, foreach: +98.60346%
- hf_Reformer, ASGD, cuda, default: +91.67268%
- hf_Reformer, ASGD, cuda, maximize: +82.64084%
- hf_Reformer, ASGD, cuda, no_foreach: +125.85424%
- hf_Reformer, ASGD, cuda, differentiable: +105.94022%
- hf_Reformer, ASGD, cuda, foreach: +90.47308%
- fambench_xlmr, ASGD, cuda, default: +201.16636%
- fambench_xlmr, ASGD, cuda, maximize: +188.32381%
- fambench_xlmr, ASGD, cuda, no_foreach: +140.31486%
- fambench_xlmr, ASGD, cuda, differentiable: +148.74302%
- fambench_xlmr, ASGD, cuda, foreach: +203.57251%
- hf_Bert_large, ASGD, cuda, default: +165.08672%
- hf_Bert_large, ASGD, cuda, maximize: +151.30732%
- hf_Bert_large, ASGD, cuda, no_foreach: +128.21581%
- hf_Bert_large, ASGD, cuda, differentiable: +126.48303%
- hf_Bert_large, ASGD, cuda, foreach: +158.99745%
- hf_GPT2, ASGD, cuda, default: +160.72487%
- hf_GPT2, ASGD, cuda, maximize: +160.21525%
- hf_GPT2, ASGD, cuda, no_foreach: +138.23310%
- hf_GPT2, ASGD, cuda, differentiable: +133.31532%
- hf_GPT2, ASGD, cuda, foreach: +173.83932%
- pytorch_stargan, ASGD, cuda, default: +87.11727%
- pytorch_stargan, ASGD, cuda, maximize: +79.06760%
- pytorch_stargan, ASGD, cuda, no_foreach: +114.98173%
- pytorch_stargan, ASGD, cuda, differentiable: +109.09947%
- pytorch_stargan, ASGD, cuda, foreach: +95.94788%
- nanogpt_generate, ASGD, cuda, default: +159.68400%
- nanogpt_generate, ASGD, cuda, maximize: +150.49595%
- nanogpt_generate, ASGD, cuda, no_foreach: +138.45369%
- nanogpt_generate, ASGD, cuda, differentiable: +136.21444%
- nanogpt_generate, ASGD, cuda, foreach: +172.49098%
- resnet152, ASGD, cuda, default: +104.25649%
- resnet152, ASGD, cuda, maximize: +96.57891%
- resnet152, ASGD, cuda, no_foreach: +119.36834%
- resnet152, ASGD, cuda, differentiable: +121.75067%
- resnet152, ASGD, cuda, foreach: +107.04176%
- hf_Whisper, ASGD, cuda, default: +103.45888%
- hf_Whisper, ASGD, cuda, maximize: +89.34926%
- hf_Whisper, ASGD, cuda, no_foreach: +125.70960%
- hf_Whisper, ASGD, cuda, differentiable: +123.69277%
- hf_Whisper, ASGD, cuda, foreach: +109.05051%
- maml, ASGD, cuda, default: +92.78801%
- maml, ASGD, cuda, maximize: +82.35410%
- maml, ASGD, cuda, no_foreach: +100.27183%
- maml, ASGD, cuda, differentiable: +109.62979%
- maml, ASGD, cuda, foreach: +98.58460%
- detectron2_fasterrcnn_r_50_dc5, ASGD, cuda, default: +171.64929%
- detectron2_fasterrcnn_r_50_dc5, ASGD, cuda, maximize: +138.27427%
- detectron2_fasterrcnn_r_50_dc5, ASGD, cuda, no_foreach: +179.65960%
- detectron2_fasterrcnn_r_50_dc5, ASGD, cuda, differentiable: +187.33681%
- detectron2_fasterrcnn_r_50_dc5, ASGD, cuda, foreach: +167.31098%
- hf_Bart, ASGD, cuda, default: +138.82232%
- hf_Bart, ASGD, cuda, maximize: +130.23887%
- hf_Bart, ASGD, cuda, no_foreach: +123.55852%
- hf_Bart, ASGD, cuda, differentiable: +130.70510%
- hf_Bart, ASGD, cuda, foreach: +134.35986%
- cm3leon_generate, ASGD, cuda, default: +198.06865%
- cm3leon_generate, ASGD, cuda, maximize: +151.59751%
- cm3leon_generate, ASGD, cuda, no_foreach: +154.95043%
- cm3leon_generate, ASGD, cuda, differentiable: +165.66967%
- cm3leon_generate, ASGD, cuda, foreach: +197.54174%
- mobilenet_v3_large, ASGD, cuda, default: +104.23360%
- mobilenet_v3_large, ASGD, cuda, maximize: +93.58383%
- mobilenet_v3_large, ASGD, cuda, no_foreach: +119.33540%
- mobilenet_v3_large, ASGD, cuda, differentiable: +123.04459%
- mobilenet_v3_large, ASGD, cuda, foreach: +110.83179%
- hf_T5_base, ASGD, cuda, default: +173.34395%
- hf_T5_base, ASGD, cuda, maximize: +154.96389%
- hf_T5_base, ASGD, cuda, no_foreach: +131.31289%
- hf_T5_base, ASGD, cuda, differentiable: +121.65876%
- hf_T5_base, ASGD, cuda, foreach: +184.47838%
- hf_BigBird, ASGD, cuda, default: +150.36223%
- hf_BigBird, ASGD, cuda, maximize: +140.99390%
- hf_BigBird, ASGD, cuda, no_foreach: +137.36228%
- hf_BigBird, ASGD, cuda, differentiable: +130.69910%
- hf_BigBird, ASGD, cuda, foreach: +163.78429%
- nvidia_deeprecommender, ASGD, cuda, default: +84.22858%
- nvidia_deeprecommender, ASGD, cuda, maximize: +73.59427%
- nvidia_deeprecommender, ASGD, cuda, no_foreach: +38.45586%
- nvidia_deeprecommender, ASGD, cuda, differentiable: +38.19497%
- nvidia_deeprecommender, ASGD, cuda, foreach: +84.71843%
- DALLE2_pytorch, ASGD, cuda, default: +120.79583%
- DALLE2_pytorch, ASGD, cuda, maximize: +101.18644%
- DALLE2_pytorch, ASGD, cuda, no_foreach: +128.45109%
- DALLE2_pytorch, ASGD, cuda, differentiable: +119.58039%
- DALLE2_pytorch, ASGD, cuda, foreach: +104.22114%
- resnet18, Adadelta, cuda, (pt2) no_foreach: -99.98704%
- resnet18, Adagrad, cuda, (pt2) no_foreach: -99.98506%
- resnet18, Adam, cuda, (pt2) no_foreach: +646679.56820%
- resnet18, Adam, cuda, (pt2) foreach: -99.97264%
- resnet18, Adam, cuda, (pt2) fused: +86.64405%
- resnet18, AdamW, cuda, (pt2) fused: +99.35395%
- resnet18, ASGD, cuda, default: +97.33380%
- resnet18, ASGD, cuda, maximize: +96.01251%
- resnet18, ASGD, cuda, (pt2) no_foreach: +135.46526%
- resnet18, ASGD, cuda, no_foreach: +111.06213%
- resnet18, ASGD, cuda, differentiable: +123.23425%
- resnet18, ASGD, cuda, foreach: +97.16978%
- resnet18, SGD, cuda, (pt2) foreach: +32.97657%
- resnet18, Rprop, cuda, (pt2) foreach: -99.99361%
- compile_time, resnet18, Adadelta, cuda, (pt2) no_foreach: +400.19988%
- compile_time, resnet18, Adagrad, cuda, (pt2) no_foreach: +429.48024%
- compile_time, resnet18, Adam, cuda, (pt2) foreach: +66977.29421%
- compile_time, resnet18, Adam, cuda, (pt2) fused: +223.51315%
- compile_time, resnet18, Adamax, cuda, (pt2) foreach: +302.31806%
- compile_time, resnet18, Rprop, cuda, (pt2) foreach: -4850.60281%
- compile_time, resnet18, NAdam, cuda, (pt2) foreach: +317.34996%
- detectron2_maskrcnn_r_101_c4, ASGD, cuda, default: +117.24819%
- detectron2_maskrcnn_r_101_c4, ASGD, cuda, maximize: +105.55289%
- detectron2_maskrcnn_r_101_c4, ASGD, cuda, no_foreach: +124.03172%
- detectron2_maskrcnn_r_101_c4, ASGD, cuda, differentiable: +116.10379%
- detectron2_maskrcnn_r_101_c4, ASGD, cuda, foreach: +111.02508%
- hf_T5_generate, ASGD, cuda, default: +125.90060%
- hf_T5_generate, ASGD, cuda, maximize: +113.89290%
- hf_T5_generate, ASGD, cuda, no_foreach: +127.24467%
- hf_T5_generate, ASGD, cuda, differentiable: +124.64940%
- hf_T5_generate, ASGD, cuda, foreach: +124.15655%
- hf_Longformer, ASGD, cuda, default: +125.81518%
- hf_Longformer, ASGD, cuda, maximize: +113.85564%
- hf_Longformer, ASGD, cuda, no_foreach: +126.27509%
- hf_Longformer, ASGD, cuda, differentiable: +115.95017%
- hf_Longformer, ASGD, cuda, foreach: +131.63718%
- timm_regnet, ASGD, cuda, default: +125.73607%
- timm_regnet, ASGD, cuda, maximize: +100.80882%
- timm_regnet, ASGD, cuda, no_foreach: +120.90401%
- timm_regnet, ASGD, cuda, differentiable: +126.60772%
- timm_regnet, ASGD, cuda, foreach: +109.82537%
- hf_DistilBert, ASGD, cuda, default: +136.46659%
- hf_DistilBert, ASGD, cuda, maximize: +154.20666%
- hf_DistilBert, ASGD, cuda, no_foreach: +133.22829%
- hf_DistilBert, ASGD, cuda, differentiable: +132.68567%
- hf_DistilBert, ASGD, cuda, foreach: +141.10390%
- pytorch_CycleGAN_and_pix2pix, ASGD, cuda, default: +85.50919%
- pytorch_CycleGAN_and_pix2pix, ASGD, cuda, maximize: +79.91551%
- pytorch_CycleGAN_and_pix2pix, ASGD, cuda, no_foreach: +114.22815%
- pytorch_CycleGAN_and_pix2pix, ASGD, cuda, differentiable: +105.53023%
- pytorch_CycleGAN_and_pix2pix, ASGD, cuda, foreach: +99.25474%
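With this many entries, a per-optimizer summary can help with triage. The following is a minimal sketch, not part of the TorchBench tooling: it assumes each entry has the form `- [compile_time, ]model, optimizer, device, config: ±N%`, as inferred from the list above, and the helper names `parse_delta_line` and `median_delta_by_optimizer` are hypothetical.

```python
import re
from collections import defaultdict
from statistics import median

# Matches lines such as "- resnet50, ASGD, cuda, default: +124.02993%".
# The field layout is inferred from this issue, not from a documented format.
LINE_RE = re.compile(r"^- (?P<name>.+): (?P<delta>[+-]\d+(?:\.\d+)?)%$")


def parse_delta_line(line):
    """Return (is_compile_time, model, optimizer, config, delta_percent), or None."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    fields = [field.strip() for field in match.group("name").split(",")]
    is_compile_time = fields[0] == "compile_time"
    if is_compile_time:
        fields = fields[1:]  # drop the metric prefix
    model, optimizer, _device = fields[:3]
    config = ", ".join(fields[3:])  # e.g. "default" or "(pt2) fused"
    return is_compile_time, model, optimizer, config, float(match.group("delta"))


def median_delta_by_optimizer(lines):
    """Median percentage change per optimizer, skipping compile_time entries."""
    groups = defaultdict(list)
    for line in lines:
        parsed = parse_delta_line(line)
        if parsed is None or parsed[0]:
            continue
        groups[parsed[2]].append(parsed[4])
    return {optimizer: median(values) for optimizer, values in groups.items()}


# A few entries copied from the list above.
sample = [
    "- resnet50, ASGD, cuda, default: +124.02993%",
    "- resnet50, ASGD, cuda, maximize: +91.23220%",
    "- timm_vision_transformer, Adam, cuda, (pt2) fused: +82.68205%",
    "- compile_time, resnet18, Adam, cuda, (pt2) foreach: +66977.29421%",
]
summary = median_delta_by_optimizer(sample)
print(summary)
```

The median is used instead of the mean so that extreme outliers, such as the `resnet18, Adam, cuda, (pt2) no_foreach` entry, do not dominate the summary.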
Tests that were no longer run on affected commit:
- timm_vision_transformer_large, RAdam, cuda, (pt2) default: 0.07160614579916
- timm_vision_transformer_large, RAdam, cuda, default: 0.08048462790126602
- timm_vision_transformer_large, RAdam, cuda, (pt2) foreach: 0.07142815878614783
- timm_vision_transformer_large, RAdam, cuda, foreach: 0.07780114312966664
- timm_vision_transformer_large, NAdam, cuda, default: 0.07558898767456412
- compile_time, timm_vision_transformer_large, RAdam, cuda, (pt2) default: 0.07164892243842283
- compile_time, timm_vision_transformer_large, RAdam, cuda, (pt2) foreach: 0.06621869125713906
Tests that were newly added on affected commit:
- timm_vision_transformer, ASGD, cuda, (pt2) default: 0.003701934844932773
- timm_vision_transformer, ASGD, cuda, (pt2) foreach: 0.0037623849197256343
- compile_time, timm_vision_transformer, ASGD, cuda, (pt2) default: 54.06454637941594
- compile_time, timm_vision_transformer, ASGD, cuda, (pt2) foreach: 51.20224057789892
- timm_vision_transformer_large, ASGD, cuda, (pt2) default: 0.023793959265781775
- timm_vision_transformer_large, ASGD, cuda, (pt2) no_foreach: 206.14174058521166
- timm_vision_transformer_large, ASGD, cuda, (pt2) foreach: 0.0820496737336119
- compile_time, timm_vision_transformer_large, ASGD, cuda, (pt2) default: 427.85859452995163
- compile_time, timm_vision_transformer_large, ASGD, cuda, (pt2) no_foreach: 13.715244447502016
- compile_time, timm_vision_transformer_large, ASGD, cuda, (pt2) foreach: 458.7825940608357
- doctr_det_predictor, Adadelta, cuda, default: 0.0028802707511931657
- doctr_det_predictor, Adadelta, cuda, maximize: 0.0031606352096423505
- doctr_det_predictor, Adadelta, cuda, no_foreach: 0.017995787295512856
- doctr_det_predictor, Adadelta, cuda, differentiable: 0.021498963190242647
- doctr_det_predictor, Adadelta, cuda, foreach: 0.0025630442099645735
- doctr_det_predictor, Adagrad, cuda, default: 0.003945592548698187
- doctr_det_predictor, Adagrad, cuda, maximize: 0.0039761970192193985
- doctr_det_predictor, Adagrad, cuda, no_foreach: 0.009838954374815028
- doctr_det_predictor, Adagrad, cuda, differentiable: 0.010537376883439718
- doctr_det_predictor, Adagrad, cuda, foreach: 0.0033657571813091635
- doctr_det_predictor, Adam, cuda, default: 0.003894355911761522
- doctr_det_predictor, Adam, cuda, amsgrad, maximize: 0.004776262207888067
- doctr_det_predictor, Adam, cuda, no_foreach: 0.01696098819375038
- doctr_det_predictor, Adam, cuda, differentiable: 0.0349869754165411
- doctr_det_predictor, Adam, cuda, foreach: 0.0039052480412647126
- doctr_det_predictor, Adam, cuda, foreach, maximize, capturable: 0.00634846021886915
- doctr_det_predictor, Adam, cuda, foreach, maximize, capturable, amsgrad: 0.006354622212238609
- doctr_det_predictor, Adam, cuda, fused: 0.0013104215636849403
- doctr_det_predictor, Adam, cuda, fused, amsgrad, maximize: 0.0013829589146189393
- doctr_det_predictor, Adam, cuda, fused, capturable: 0.0013211383065208793
- doctr_det_predictor, Adam, cuda, fused, capturable, amsgrad: 0.0013856540946289896
- doctr_det_predictor, AdamW, cuda, default: 0.003750761039555073
- doctr_det_predictor, AdamW, cuda, amsgrad, maximize: 0.0046320953080430625
- doctr_det_predictor, AdamW, cuda, no_foreach: 0.01873129182495177
- doctr_det_predictor, AdamW, cuda, differentiable: 0.03530704751610756
- doctr_det_predictor, AdamW, cuda, foreach: 0.003599941679276526
- doctr_det_predictor, AdamW, cuda, foreach, maximize, capturable: 0.006230019740760326
- doctr_det_predictor, AdamW, cuda, foreach, maximize, capturable, amsgrad: 0.006564915589988232
- doctr_det_predictor, AdamW, cuda, fused: 0.0013546907948330045
- doctr_det_predictor, AdamW, cuda, fused, amsgrad, maximize: 0.0014247736567631363
- doctr_det_predictor, AdamW, cuda, fused, capturable: 0.001353818892966956
- doctr_det_predictor, AdamW, cuda, fused, capturable, amsgrad: 0.0014276149054057896
- doctr_det_predictor, Adamax, cuda, default: 0.018037584936246277
- doctr_det_predictor, Adamax, cuda, maximize: 0.019045850331895053
- doctr_det_predictor, Adamax, cuda, no_foreach: 0.02449008859694004
- doctr_det_predictor, Adamax, cuda, differentiable: 0.029248994728550314
- doctr_det_predictor, Adamax, cuda, foreach: 0.01793181947432458
- doctr_det_predictor, ASGD, cuda, default: 0.014260424789972602
- doctr_det_predictor, ASGD, cuda, maximize: 0.014247538428753615
- doctr_det_predictor, ASGD, cuda, no_foreach: 0.028070646012201904
- doctr_det_predictor, ASGD, cuda, differentiable: 0.027313575381413102
- doctr_det_predictor, ASGD, cuda, foreach: 0.014710384048521518
- doctr_det_predictor, SGD, cuda, default: 0.0007995309079997241
- doctr_det_predictor, SGD, cuda, maximize: 0.0014174172608181833
- doctr_det_predictor, SGD, cuda, no_foreach: 0.0017253931146115065
- doctr_det_predictor, SGD, cuda, differentiable: 0.0008048358295733729
- doctr_det_predictor, SGD, cuda, foreach: 0.000608943731058389
- doctr_det_predictor, SGD, cuda, foreach, momentum=0.9, nesterov: 0.0009989061842982968
- doctr_det_predictor, SGD, cuda, foreach, momentum=0.9: 0.0008235378485793868
- doctr_det_predictor, RAdam, cuda, default: 0.004357524802908301
- doctr_det_predictor, RAdam, cuda, no_foreach: 0.027767344983294605
- doctr_det_predictor, RAdam, cuda, differentiable: 0.028750475216656923
- doctr_det_predictor, RAdam, cuda, foreach: 0.004138795551843941
- doctr_det_predictor, Rprop, cuda, default: 0.021949287690222263
- doctr_det_predictor, Rprop, cuda, maximize: 0.022139092488214374
- doctr_det_predictor, Rprop, cuda, no_foreach: 0.0354912742972374
- doctr_det_predictor, Rprop, cuda, differentiable: 0.03846609420143068
- doctr_det_predictor, Rprop, cuda, foreach: 0.022724414803087713
- doctr_det_predictor, RMSprop, cuda, default: 0.0019387207855470477
- doctr_det_predictor, RMSprop, cuda, maximize: 0.002601792109198868
- doctr_det_predictor, RMSprop, cuda, no_foreach: 0.00912425147059063
- doctr_det_predictor, RMSprop, cuda, differentiable: 0.009757765879233679
- doctr_det_predictor, RMSprop, cuda, foreach: 0.0017368051270022988
- doctr_det_predictor, NAdam, cuda, default: 0.005110027710907161
- doctr_det_predictor, NAdam, cuda, no_foreach: 0.021710659796372055
- doctr_det_predictor, NAdam, cuda, differentiable: 0.03594747381284833
- doctr_det_predictor, NAdam, cuda, foreach: 0.004948554779402912
- doctr_reco_predictor, Adadelta, cuda, default: 0.0015303045674227179
- doctr_reco_predictor, Adadelta, cuda, maximize: 0.0017265696777030827
- doctr_reco_predictor, Adadelta, cuda, no_foreach: 0.0067323700617998835
- doctr_reco_predictor, Adadelta, cuda, differentiable: 0.00797147342003882
- doctr_reco_predictor, Adadelta, cuda, foreach: 0.0014914630353450775
- doctr_reco_predictor, Adagrad, cuda, default: 0.0014977923524565995
- doctr_reco_predictor, Adagrad, cuda, maximize: 0.0014047086122445762
- doctr_reco_predictor, Adagrad, cuda, no_foreach: 0.0037795773101970552
- doctr_reco_predictor, Adagrad, cuda, differentiable: 0.0040812154300510885
- doctr_reco_predictor, Adagrad, cuda, foreach: 0.001298849075101316
- doctr_reco_predictor, Adam, cuda, default: 0.0014606264233589172
- doctr_reco_predictor, Adam, cuda, amsgrad, maximize: 0.0017359229386784136
- doctr_reco_predictor, Adam, cuda, no_foreach: 0.006150229680351913
- doctr_reco_predictor, Adam, cuda, differentiable: 0.013960111676715313
- doctr_reco_predictor, Adam, cuda, foreach: 0.0014205886446870862
- doctr_reco_predictor, Adam, cuda, foreach, maximize, capturable: 0.002500549927353859
- doctr_reco_predictor, Adam, cuda, foreach, maximize, capturable, amsgrad: 0.0024232029216364028
- doctr_reco_predictor, Adam, cuda, fused: 0.0005264460835605859
- doctr_reco_predictor, Adam, cuda, fused, amsgrad, maximize: 0.0007339034359902144
- doctr_reco_predictor, Adam, cuda, fused, capturable: 0.000526783952023834
- doctr_reco_predictor, Adam, cuda, fused, capturable, amsgrad: 0.000733910609036684
- doctr_reco_predictor, AdamW, cuda, default: 0.0014952924964018166
- doctr_reco_predictor, AdamW, cuda, amsgrad, maximize: 0.0017838270077481866
- doctr_reco_predictor, AdamW, cuda, no_foreach: 0.006765100718475878
- doctr_reco_predictor, AdamW, cuda, differentiable: 0.012846773536875845
- doctr_reco_predictor, AdamW, cuda, foreach: 0.0015132253337651492
- doctr_reco_predictor, AdamW, cuda, foreach, maximize, capturable: 0.0023649467574432493
- doctr_reco_predictor, AdamW, cuda, foreach, maximize, capturable, amsgrad: 0.0025600428320467473
- doctr_reco_predictor, AdamW, cuda, fused: 0.0005436234101653099
- doctr_reco_predictor, AdamW, cuda, fused, amsgrad, maximize: 0.0007554181362502277
- doctr_reco_predictor, AdamW, cuda, fused, capturable: 0.0005439776740968227
- doctr_reco_predictor, AdamW, cuda, fused, capturable, amsgrad: 0.0007581162578426301
- doctr_reco_predictor, Adamax, cuda, default: 0.0063193910196423534
- doctr_reco_predictor, Adamax, cuda, maximize: 0.006546114273369312
- doctr_reco_predictor, Adamax, cuda, no_foreach: 0.008442916348576546
- doctr_reco_predictor, Adamax, cuda, differentiable: 0.010225607780739665
- doctr_reco_predictor, Adamax, cuda, foreach: 0.0063998927315697075
- doctr_reco_predictor, ASGD, cuda, default: 0.005111781670711935
- doctr_reco_predictor, ASGD, cuda, maximize: 0.005300870798528194
- doctr_reco_predictor, ASGD, cuda, no_foreach: 0.00970820013123254
- doctr_reco_predictor, ASGD, cuda, differentiable: 0.009756598388776183
- doctr_reco_predictor, ASGD, cuda, foreach: 0.005147161749191582
- doctr_reco_predictor, SGD, cuda, default: 0.00033027638401836155
- doctr_reco_predictor, SGD, cuda, maximize: 0.0005508650196716189
- doctr_reco_predictor, SGD, cuda, no_foreach: 0.000654176879208535
- doctr_reco_predictor, SGD, cuda, differentiable: 0.00033117017801851035
- doctr_reco_predictor, SGD, cuda, foreach: 0.00024379345728084444
- doctr_reco_predictor, SGD, cuda, foreach, momentum=0.9, nesterov: 0.0006105483700521291
- doctr_reco_predictor, SGD, cuda, foreach, momentum=0.9: 0.0004410154549404979
- doctr_reco_predictor, RAdam, cuda, default: 0.0016630525072105229
- doctr_reco_predictor, RAdam, cuda, no_foreach: 0.009923019562847912
- doctr_reco_predictor, RAdam, cuda, differentiable: 0.010303066140040755
- doctr_reco_predictor, RAdam, cuda, foreach: 0.0016399111156351863
- doctr_reco_predictor, Rprop, cuda, default: 0.008071534298360349
- doctr_reco_predictor, Rprop, cuda, maximize: 0.008054946968331932
- doctr_reco_predictor, Rprop, cuda, no_foreach: 0.0125361071433872
- doctr_reco_predictor, Rprop, cuda, differentiable: 0.013898213882930577
- doctr_reco_predictor, Rprop, cuda, foreach: 0.007811435428448021
- doctr_reco_predictor, RMSprop, cuda, default: 0.0007729522828012705
- doctr_reco_predictor, RMSprop, cuda, maximize: 0.0009816498712946972
- doctr_reco_predictor, RMSprop, cuda, no_foreach: 0.0032276885583996775
- doctr_reco_predictor, RMSprop, cuda, differentiable: 0.0036167620681226255
- doctr_reco_predictor, RMSprop, cuda, foreach: 0.0006942034750245512
- doctr_reco_predictor, NAdam, cuda, default: 0.0020321781514212487
- doctr_reco_predictor, NAdam, cuda, no_foreach: 0.007569201388396323
- doctr_reco_predictor, NAdam, cuda, differentiable: 0.013424197630956768
- doctr_reco_predictor, NAdam, cuda, foreach: 0.0019183275150135158
- resnet18, ASGD, cuda, (pt2) default: 0.0017022616861356516
- resnet18, ASGD, cuda, (pt2) foreach: 0.0017011328324107295
- compile_time, resnet18, ASGD, cuda, (pt2) default: 17.65067977675547
- compile_time, resnet18, ASGD, cuda, (pt2) foreach: 16.845290329307318
Runtime regressions found?
An error log was found. Please investigate the runtime errors by checking the logs of the linked workflow.
GitHub workflow that triggered this issue: https://github.com/pytorch/benchmark/actions/runs/6128788174
cc @janeyx99