Loss doesn't decrease #246
Comments
Hi, I also encountered the same problem. Have you solved it?
I encountered the same issue. Has anyone found a solution yet? Note: I am trying to train on a custom dataset and on Tiny ImageNet, and I made sure to follow the ImageNet directory structure.
Are you running the algorithm on multiple GPUs? I don't know why, but using only a single GPU resolved the issue for me.
@Hazarzi I tried both multi-GPU and single-GPU settings, with the same result. Can you share any of your training details? That might help me and others with our training as well.
@ComeBackCity I am training on a time-series dataset, so I'm not sure my settings would apply. My dataset has shape (128, 3) with around 100k samples. For example, I needed to change the momentum_teacher parameter to 0.99; otherwise the loss would not decrease. I also use : Maybe you can give those parameters a try. I have succeeded with both a very slightly modified ViT-S and a very small version of it using those parameters. In my case, momentum_teacher, lr, and the batch size had the most impact on the loss decreasing. I use the default parameters for the rest.
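The comment above singles out momentum_teacher as the setting that mattered most. As a rough illustration of why, here is a minimal sketch of an exponential-moving-average (EMA) teacher update and of how quickly the teacher follows the student for a given momentum. The helper names `ema_update` and `ema_half_life` are hypothetical, the weights are plain Python floats rather than model tensors, and 0.996 is used only as a comparison value (it is, as far as I know, the script's default base momentum). The real training script also schedules the momentum upward during training, which this sketch ignores.

```python
import math


def ema_update(teacher, student, momentum):
    """One EMA step: move each teacher weight a fraction (1 - momentum) toward the student."""
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]


def ema_half_life(momentum):
    """Number of EMA steps after which half of the teacher-student gap is closed,
    assuming the student stays fixed (an idealisation)."""
    return math.log(0.5) / math.log(momentum)


# Toy check: the teacher drifts toward a fixed student over repeated steps.
teacher, student = [0.0], [1.0]
for _ in range(100):
    teacher = ema_update(teacher, student, momentum=0.99)
print(f"teacher after 100 steps at m=0.99: {teacher[0]:.3f}")

for m in (0.99, 0.996):
    print(f"momentum={m}: half-life of about {ema_half_life(m):.0f} steps")
```

With a small dataset there are few optimizer steps per epoch, so a lower momentum lets the teacher catch up with the student in fewer epochs. Whether that is the reason it helped in the case above is not established in this thread.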
I encountered the same issue when running this script on an X-ray dataset. Did you find a solution?
Below is my training log from training on 6000 X-ray images.
The model does not seem to be training at all.
Is this because there are too few images, or because the model is not expected to work well on grayscale datasets?
{"train_loss": 10.379679679870605, "train_lr": 0.0, "train_wd": 0.03999999999999998, "epoch": 0}
{"train_loss": 10.469603419303894, "train_lr": 0.0001111111111111111, "train_wd": 0.04008881913416823, "epoch": 1}
{"train_loss": 10.63415515422821, "train_lr": 0.0002222222222222222, "train_wd": 0.04035518888291112, "epoch": 2}
{"train_loss": 10.58477234840393, "train_lr": 0.00033333333333333343, "train_wd": 0.040798846371445596, "epoch": 3}
{"train_loss": 10.629495024681091, "train_lr": 0.0004444444444444444, "train_wd": 0.04141935376339401, "epoch": 4}
{"train_loss": 10.688157081604004, "train_lr": 0.0005555555555555556, "train_wd": 0.04221609869287518, "epoch": 5}
{"train_loss": 10.751326084136963, "train_lr": 0.0006666666666666669, "train_wd": 0.04318829486883602, "epoch": 6}
{"train_loss": 10.825130939483643, "train_lr": 0.0007777777777777778, "train_wd": 0.04433498285102544, "epoch": 7}
{"train_loss": 10.892325162887573, "train_lr": 0.0008888888888888888, "train_wd": 0.04565503099684637, "epoch": 8}
{"train_loss": 10.949982523918152, "train_lr": 0.001, "train_wd": 0.04714713657815017, "epoch": 9}
{"train_loss": 10.986122488975525, "train_lr": 0.001, "train_wd": 0.04880982706687231, "epoch": 10}
{"train_loss": 11.001526951789856, "train_lr": 0.0009996957180960385, "train_wd": 0.0506414615882394, "epoch": 11}
{"train_loss": 11.009398341178894, "train_lr": 0.0009987832431047822, "train_wd": 0.05264023254011474, "epoch": 12}
{"train_loss": 11.025790929794312, "train_lr": 0.0009972636867364526, "train_wd": 0.05480416737688337, "epoch": 13}
{"train_loss": 11.034379363059998, "train_lr": 0.0009951389003364144, "train_wd": 0.057131130556116516, "epoch": 14}
{"train_loss": 11.036903142929077, "train_lr": 0.000992411472629598, "train_wd": 0.059618825646093776, "epoch": 15}
{"train_loss": 11.044702529907227, "train_lr": 0.0009890847265665358, "train_wd": 0.06226479759210457, "epoch": 16}
{"train_loss": 11.054031491279602, "train_lr": 0.0009851627152748603, "train_wd": 0.06506643513929011, "epoch": 17}
{"train_loss": 11.060300827026367, "train_lr": 0.0009806502171211904, "train_wd": 0.06802097340963725, "epoch": 18}
{"train_loss": 11.065056920051575, "train_lr": 0.0009755527298894294, "train_wd": 0.07112549663057882, "epoch": 19}
{"train_loss": 11.07074248790741, "train_lr": 0.0009698764640825613, "train_wd": 0.07437694101250947, "epoch": 20}
{"train_loss": 11.0760897397995, "train_lr": 0.0009636283353561103, "train_wd": 0.07777209777237565, "epoch": 21}
{"train_loss": 11.078771710395813, "train_lr": 0.0009568159560924792, "train_wd": 0.08130761630035788, "epoch": 22}
{"train_loss": 11.079608798027039, "train_lr": 0.0009494476261264339, "train_wd": 0.08498000746651724, "epoch": 23}
{"train_loss": 11.080602526664734, "train_lr": 0.000941532322633034, "train_wd": 0.08878564706414588, "epoch": 24}
{"train_loss": 11.08201813697815, "train_lr": 0.0009330796891903272, "train_wd": 0.09272077938642143, "epoch": 25}
{"train_loss": 11.08295476436615, "train_lr": 0.0009241000240301346, "train_wd": 0.09678152093283604, "epoch": 26}
{"train_loss": 11.083083391189575, "train_lr": 0.0009146042674912436, "train_wd": 0.10096386424174264, "epoch": 27}
{"train_loss": 11.083047866821289, "train_lr": 0.0009046039886902864, "train_wd": 0.10526368184523582, "epoch": 28}
{"train_loss": 11.083422780036926, "train_lr": 0.0008941113714265576, "train_wd": 0.1096767303424642, "epoch": 29}
{"train_loss": 11.083853363990784, "train_lr": 0.0008831391993379298, "train_wd": 0.11419865458735479, "epoch": 30}
{"train_loss": 11.08417010307312, "train_lr": 0.0008717008403259587, "train_wd": 0.11882499198661645, "epoch": 31}
{"train_loss": 11.08401370048523, "train_lr": 0.0008598102302691563, "train_wd": 0.12355117690378059, "epoch": 32}
{"train_loss": 11.083914041519165, "train_lr": 0.0008474818560442693, "train_wd": 0.12837254516493313, "epoch": 33}
{"train_loss": 11.0840425491333, "train_lr": 0.00083473073787625, "train_wd": 0.1332843386616913, "epoch": 34}
{"train_loss": 11.092216730117798, "train_lr": 0.000981793221263352, "train_wd": 0.051955523230503675, "epoch": 35}
{"train_loss": 11.137936234474182, "train_lr": 0.0009803173800139577, "train_wd": 0.05264023254011474, "epoch": 36}
{"train_loss": 11.109129071235657, "train_lr": 0.0009787852300676173, "train_wd": 0.05334329473420091, "epoch": 37}
{"train_loss": 11.103555560112, "train_lr": 0.00097719695122892, "train_wd": 0.054064632714069916, "epoch": 38}
{"train_loss": 11.098443746566772, "train_lr": 0.0009755527298894294, "train_wd": 0.05480416737688337, "epoch": 39}
{"train_loss": 11.100013732910156, "train_lr": 0.0009738527590058101, "train_wd": 0.05556181762433182, "epoch": 40}
{"train_loss": 11.099824070930481, "train_lr": 0.0009720972380771821, "train_wd": 0.05633750037152768, "epoch": 41}
{"train_loss": 11.096100926399231, "train_lr": 0.0009702863731217106, "train_wd": 0.057131130556116516, "epoch": 42}
{"train_loss": 11.093004941940308, "train_lr": 0.0009684203766524266, "train_wd": 0.05794262114760523, "epoch": 43}
{"train_loss": 11.092452049255371, "train_lr": 0.0009664994676522895, "train_wd": 0.05877188315690568, "epoch": 44}
{"train_loss": 11.092915654182434, "train_lr": 0.0009645238715484873, "train_wd": 0.059618825646093776, "epoch": 45}
{"train_loss": 11.092563390731812, "train_lr": 0.0009624938201859819, "train_wd": 0.06048335573838137, "epoch": 46}
{"train_loss": 11.091467380523682, "train_lr": 0.000960409551800302, "train_wd": 0.06136537862830144, "epoch": 47}
{"train_loss": 11.090512990951538, "train_lr": 0.0009582713109895831, "train_wd": 0.06226479759210457, "epoch": 48}
{"train_loss": 11.090077996253967, "train_lr": 0.0009560793486858651, "train_wd": 0.06318151399836591, "epoch": 49}
{"train_loss": 11.089910387992859, "train_lr": 0.0009538339221256428, "train_wd": 0.0641154273188011, "epoch": 50}
{"train_loss": 11.08969497680664, "train_lr": 0.0009515352948196797, "train_wd": 0.06506643513929011, "epoch": 51}
{"train_loss": 11.089363932609558, "train_lr": 0.0009491837365220806, "train_wd": 0.0660344331711088, "epoch": 52}
{"train_loss": 11.089028596878052, "train_lr": 0.0009467795231986388, "train_wd": 0.06701931526236449, "epoch": 53}
{"train_loss": 11.088770270347595, "train_lr": 0.0009443229369944471, "train_wd": 0.06802097340963725, "epoch": 54}
{"train_loss": 11.088574528694153, "train_lr": 0.0009418142662007886, "train_wd": 0.06903929776982365, "epoch": 55}
{"train_loss": 11.088395953178406, "train_lr": 0.0009392538052213037, "train_wd": 0.07007417667218202, "epoch": 56}
{"train_loss": 11.08823025226593, "train_lr": 0.0009366418545374411, "train_wd": 0.07112549663057882, "epoch": 57}
{"train_loss": 11.088103413581848, "train_lr": 0.0009339787206731942, "train_wd": 0.07219314235593322, "epoch": 58}
{"train_loss": 11.08801817893982, "train_lr": 0.0009312647161591298, "train_wd": 0.07327699676886018, "epoch": 59}
{"train_loss": 11.087947607040405, "train_lr": 0.0009285001594957108, "train_wd": 0.07437694101250947, "epoch": 60}
{"train_loss": 11.087868571281433, "train_lr": 0.0009256853751159192, "train_wd": 0.07549285446559933, "epoch": 61}
{"train_loss": 11.087795495986938, "train_lr": 0.000922820693347182, "train_wd": 0.07662461475564464, "epoch": 62}
{"train_loss": 11.087739109992981, "train_lr": 0.0009199064503726061, "train_wd": 0.07777209777237565, "epoch": 63}
{"train_loss": 11.087707042694092, "train_lr": 0.0009169429881915257, "train_wd": 0.07893517768134883, "epoch": 64}
{"train_loss": 11.087696194648743, "train_lr": 0.0009139306545793669, "train_wd": 0.08011372693774516, "epoch": 65}
{"train_loss": 11.087706685066223, "train_lr": 0.0009108698030468355, "train_wd": 0.08130761630035788, "epoch": 66}
{"train_loss": 11.087712168693542, "train_lr": 0.0009077607927984296, "train_wd": 0.08251671484576434, "epoch": 67}
{"train_loss": 11.087727546691895, "train_lr": 0.0009046039886902864, "train_wd": 0.08374088998268381, "epoch": 68}
{"train_loss": 11.087751865386963, "train_lr": 0.0009013997611873648, "train_wd": 0.08498000746651724, "epoch": 69}
{"train_loss": 11.087775588035583, "train_lr": 0.0008981484863199693, "train_wd": 0.08623393141406899, "epoch": 70}
{"train_loss": 11.087813973426819, "train_lr": 0.0008948505456396204, "train_wd": 0.0875025243184478, "epoch": 71}
{"train_loss": 11.087857961654663, "train_lr": 0.0008915063261742798, "train_wd": 0.08878564706414588, "epoch": 72}
{"train_loss": 11.087905645370483, "train_lr": 0.0008881162203829292, "train_wd": 0.09008315894229474, "epoch": 73}
{"train_loss": 11.087956428527832, "train_lr": 0.0008846806261095138, "train_wd": 0.09139491766609537, "epoch": 74}
{"train_loss": 11.0880126953125, "train_lr": 0.0008811999465362547, "train_wd": 0.09272077938642143, "epoch": 75}
{"train_loss": 11.0880708694458, "train_lr": 0.0008776745901363321, "train_wd": 0.09406059870759415, "epoch": 76}
{"train_loss": 11.088123679161072, "train_lr": 0.0008741049706259501, "train_wd": 0.09541422870332672, "epoch": 77}
{"train_loss": 11.088182091712952, "train_lr": 0.0008704915069157849, "train_wd": 0.09678152093283604, "epoch": 78}
{"train_loss": 11.088242173194885, "train_lr": 0.0008668346230618249, "train_wd": 0.09816232545712106, "epoch": 79}
{"train_loss": 11.088302969932556, "train_lr": 0.0008631347482156039, "train_wd": 0.09955649085540552, "epoch": 80}
{"train_loss": 11.088366150856018, "train_lr": 0.0008593923165738395, "train_wd": 0.10096386424174264, "epoch": 81}
{"train_loss": 11.088428378105164, "train_lr": 0.0008556077673274774, "train_wd": 0.10238429128178098, "epoch": 82}
{"train_loss": 11.088489651679993, "train_lr": 0.0008517815446101522, "train_wd": 0.10381761620968888, "epoch": 83}
{"train_loss": 11.08854877948761, "train_lr": 0.0008479140974460641, "train_wd": 0.10526368184523582, "epoch": 84}
{"train_loss": 11.088610053062439, "train_lr": 0.000844005879697285, "train_wd": 0.10672232961102932, "epoch": 85}
{"train_loss": 11.088668823242188, "train_lr": 0.0008400573500104963, "train_wd": 0.10819339954990406, "epoch": 86}
{"train_loss": 11.088728904724121, "train_lr": 0.0008360689717631634, "train_wd": 0.1096767303424642, "epoch": 87}
{"train_loss": 11.08878219127655, "train_lr": 0.0008320412130091573, "train_wd": 0.11117215932477251, "epoch": 88}
{"train_loss": 11.08883810043335, "train_lr": 0.0008279745464238257, "train_wd": 0.11267952250618915, "epoch": 89}
{"train_loss": 11.088895916938782, "train_lr": 0.0008238694492485232, "train_wd": 0.11419865458735481, "epoch": 90}
{"train_loss": 11.088945388793945, "train_lr": 0.0008197264032346043, "train_wd": 0.11572938897831778, "epoch": 91}
{"train_loss": 11.089001059532166, "train_lr": 0.0008155458945868883, "train_wd": 0.11727155781680224, "epoch": 92}
{"train_loss": 11.089049935340881, "train_lr": 0.0008113284139066002, "train_wd": 0.11882499198661645, "epoch": 93}
{"train_loss": 11.089096069335938, "train_lr": 0.0008070744561337977, "train_wd": 0.12038952113619805, "epoch": 94}
{"train_loss": 11.089142799377441, "train_lr": 0.000802784520489286, "train_wd": 0.12196497369729517, "epoch": 95}
{"train_loss": 11.089187622070312, "train_lr": 0.0007984591104160335, "train_wd": 0.12355117690378059, "epoch": 96}
{"train_loss": 11.089232802391052, "train_lr": 0.0007940987335200904, "train_wd": 0.12514795681059798, "epoch": 97}
{"train_loss": 11.089276313781738, "train_lr": 0.0007897039015110186, "train_wd": 0.1267551383128365, "epoch": 98}
{"train_loss": 11.089312195777893, "train_lr": 0.0007852751301418395, "train_wd": 0.12837254516493313, "epoch": 99}
{"train_loss": 11.089354634284973, "train_lr": 0.0007808129391485102, "train_wd": 0.13, "epoch": 100}
{"train_loss": 11.089390754699707, "train_lr": 0.0007763178521889274, "train_wd": 0.13163732434927466, "epoch": 101}
{"train_loss": 11.089428782463074, "train_lr": 0.0007717903967814763, "train_wd": 0.1332843386616912, "epoch": 102}
{"train_loss": 11.089462876319885, "train_lr": 0.0007672311042431228, "train_wd": 0.13494086232357028, "epoch": 103}
{"train_loss": 11.089493989944458, "train_lr": 0.0007626405096270608, "train_wd": 0.13660671367842483, "epoch": 104}
{"train_loss": 11.08952808380127, "train_lr": 0.0007580191516599224, "train_wd": 0.13828171004688158, "epoch": 105}
{"train_loss": 11.089558601379395, "train_lr": 0.0007533675726785552, "train_wd": 0.13996566774671304, "epoch": 106}
{"train_loss": 11.08958625793457, "train_lr": 0.0007486863185663765, "train_wd": 0.141658402112981, "epoch": 107}
{"train_loss": 11.089615821838379, "train_lr": 0.0007439759386893121, "train_wd": 0.14335972751828693, "epoch": 108}
{"train_loss": 11.089645266532898, "train_lr": 0.0007392369858313252, "train_wd": 0.14506945739312774, "epoch": 109}
{"train_loss": 11.08967399597168, "train_lr": 0.0007344700161295453, "train_wd": 0.14678740424635595, "epoch": 110}
{"train_loss": 11.089695930480957, "train_lr": 0.0007296755890090025, "train_wd": 0.14851337968573952, "epoch": 111}
{"train_loss": 11.089720845222473, "train_lr": 0.0007248542671169767, "train_wd": 0.15024719443862147, "epoch": 112}
{"train_loss": 11.089741706848145, "train_lr": 0.0007200066162569687, "train_wd": 0.15198865837267592, "epoch": 113}
{"train_loss": 11.089766502380371, "train_lr": 0.0007151332053223004, "train_wd": 0.15373758051675793, "epoch": 114}
{"train_loss": 11.089787483215332, "train_lr": 0.0007102346062293521, "train_wd": 0.15549376908184595, "epoch": 115}
{"train_loss": 11.08980917930603, "train_lr": 0.0007053113938504473, "train_wd": 0.15725703148207326, "epoch": 116}
{"train_loss": 11.089828491210938, "train_lr": 0.000700364145946387, "train_wd": 0.15902717435584754, "epoch": 117}
{"train_loss": 11.089845657348633, "train_lr": 0.0006953934430986471, "train_wd": 0.16080400358705502, "epoch": 118}
{"train_loss": 11.089866161346436, "train_lr": 0.0006903998686412462, "train_wd": 0.1625873243263474, "epoch": 119}
{"train_loss": 11.089879035949707, "train_lr": 0.0006853840085922873, "train_wd": 0.16437694101250946, "epoch": 120}
{"train_loss": 11.089900970458984, "train_lr": 0.0006803464515851862, "train_wd": 0.16617265739390438, "epoch": 121}
{"train_loss": 11.089916229248047, "train_lr": 0.0006752877887995934, "train_wd": 0.16797427654999508, "epoch": 122}
{"train_loss": 11.089932680130005, "train_lr": 0.0006702086138920148, "train_wd": 0.1697816009129387, "epoch": 123}
{"train_loss": 11.089946746826172, "train_lr": 0.0006651095229261469, "train_wd": 0.17159443228925217, "epoch": 124}
{"train_loss": 11.089962005615234, "train_lr": 0.0006599911143029221, "train_wd": 0.1734125718815462, "epoch": 125}
{"train_loss": 11.089970588684082, "train_lr": 0.0006548539886902864, "train_wd": 0.1752358203103261, "epoch": 126}
{"train_loss": 11.089985847473145, "train_lr": 0.000649698748952707, "train_wd": 0.1770639776358554, "epoch": 127}
{"train_loss": 11.089998126029968, "train_lr": 0.0006445260000804248, "train_wd": 0.17889684338008197, "epoch": 128}
{"train_loss": 11.090011596679688, "train_lr": 0.0006393363491184544, "train_wd": 0.18073421654862232, "epoch": 129}
{"train_loss": 11.090024948120117, "train_lr": 0.0006341304050953462, "train_wd": 0.1825758956528033, "epoch": 130}
{"train_loss": 11.090031623840332, "train_lr": 0.0006289087789517123, "train_wd": 0.18442167873175724, "epoch": 131}
{"train_loss": 11.090043067932129, "train_lr": 0.0006236720834685338, "train_wd": 0.18627136337456957, "epoch": 132}
{"train_loss": 11.090056419372559, "train_lr": 0.0006184209331952427, "train_wd": 0.18812474674247495, "epoch": 133}
{"train_loss": 11.090062141418457, "train_lr": 0.0006131559443776064, "train_wd": 0.18998162559110157, "epoch": 134}
{"train_loss": 11.090073585510254, "train_lr": 0.0006078777348854067, "train_wd": 0.19184179629275844, "epoch": 135}
{"train_loss": 11.090085983276367, "train_lr": 0.0006025869241399294, "train_wd": 0.19370505485876588, "epoch": 136}
{"train_loss": 11.090092658996582, "train_lr": 0.0005972841330412741, "train_wd": 0.1955711969618252, "epoch": 137}
{"train_loss": 11.090102195739746, "train_lr": 0.0005919699838954871, "train_wd": 0.19744001795842522, "epoch": 138}
{"train_loss": 11.090112566947937, "train_lr": 0.0005866451003415332, "train_wd": 0.19931131291128398, "epoch": 139}
{"train_loss": 11.090118408203125, "train_lr": 0.0005813101072781062, "train_wd": 0.2011848766118223, "epoch": 140}
{"train_loss": 11.090123176574707, "train_lr": 0.0005759656307902962, "train_wd": 0.2030605036026674, "epoch": 141}
{"train_loss": 11.090133666992188, "train_lr": 0.0005706122980761156, "train_wd": 0.2049379882001832, "epoch": 142}
{"train_loss": 11.090142607688904, "train_lr": 0.0005652507373728935, "train_wd": 0.2068171245170263, "epoch": 143}
{"train_loss": 11.090147972106934, "train_lr": 0.0005598815778835508, "train_wd": 0.20869770648472355, "epoch": 144}
{"train_loss": 11.090153694152832, "train_lr": 0.0005545054497027589, "train_wd": 0.2105795278762701, "epoch": 145}
{"train_loss": 11.09016227722168, "train_lr": 0.0005491229837429964, "train_wd": 0.2124623823287441, "epoch": 146}
{"train_loss": 11.090165138244629, "train_lr": 0.0005437348116605086, "train_wd": 0.21434606336593692, "epoch": 147}
{"train_loss": 11.090174674987793, "train_lr": 0.0005383415657811791, "train_wd": 0.21623036442099575, "epoch": 148}
{"train_loss": 11.090179324150085, "train_lr": 0.0005329438790263238, "train_wd": 0.21811507885907574, "epoch": 149}
{"train_loss": 11.090181469917297, "train_lr": 0.0005275423848384162, "train_wd": 0.22, "epoch": 150}
{"train_loss": 11.090190887451172, "train_lr": 0.000522137717106748, "train_wd": 0.2218849211409242, "epoch": 151}
{"train_loss": 11.090194702148438, "train_lr": 0.0005167305100930406, "train_wd": 0.2237696355790042, "epoch": 152}
{"train_loss": 11.090197801589966, "train_lr": 0.0005113213983570112, "train_wd": 0.2256539366340631, "epoch": 153}
{"train_loss": 11.090206146240234, "train_lr": 0.000505911016681905, "train_wd": 0.22753761767125597, "epoch": 154}
{"train_loss": 11.0902099609375, "train_lr": 0.0005005000000000001, "train_wd": 0.22942047212372987, "epoch": 155}
{"train_loss": 11.090211868286133, "train_lr": 0.0004950889833180952, "train_wd": 0.2313022935152764, "epoch": 156}
{"train_loss": 11.090214729309082, "train_lr": 0.0004896786016429891, "train_wd": 0.23318287548297367, "epoch": 157}
{"train_loss": 11.090224266052246, "train_lr": 0.00048426948990695984, "train_wd": 0.2350620117998168, "epoch": 158}
{"train_loss": 11.090226173400879, "train_lr": 0.0004788622828932522, "train_wd": 0.2369394963973326, "epoch": 159}
{"train_loss": 11.090229988098145, "train_lr": 0.00047345761516158404, "train_wd": 0.23881512338817762, "epoch": 160}
{"train_loss": 11.09023666381836, "train_lr": 0.00046805612097367645, "train_wd": 0.24068868708871596, "epoch": 161}
{"train_loss": 11.090239524841309, "train_lr": 0.0004626584342188214, "train_wd": 0.24255998204157472, "epoch": 162}
{"train_loss": 11.090241432189941, "train_lr": 0.0004572651883394916, "train_wd": 0.24442880303817474, "epoch": 163}
{"train_loss": 11.090245127677917, "train_lr": 0.0004518770162570035, "train_wd": 0.246294945141234, "epoch": 164}
{"train_loss": 11.090251803398132, "train_lr": 0.00044649455029724123, "train_wd": 0.24815820370724145, "epoch": 165}
{"train_loss": 11.090253829956055, "train_lr": 0.0004411184221164493, "train_wd": 0.2500183744088984, "epoch": 166}
{"train_loss": 11.090269088745117, "train_lr": 0.0004357492626271066, "train_wd": 0.251875253257525, "epoch": 167}
{"train_loss": 11.09025764465332, "train_lr": 0.00043038770192388453, "train_wd": 0.2537286366254304, "epoch": 168}
{"train_loss": 11.09026050567627, "train_lr": 0.000425034369209704, "train_wd": 0.25557832126824276, "epoch": 169}
{"train_loss": 11.090267181396484, "train_lr": 0.00041968989272189413, "train_wd": 0.2574241043471967, "epoch": 170}
{"train_loss": 11.090269088745117, "train_lr": 0.0004143548996584671, "train_wd": 0.2592657834513777, "epoch": 171}
{"train_loss": 11.09027099609375, "train_lr": 0.000409030016104513, "train_wd": 0.261103156619918, "epoch": 172}
{"train_loss": 11.090271949768066, "train_lr": 0.00040371586695872635, "train_wd": 0.26293602236414454, "epoch": 173}
{"train_loss": 11.090273022651672, "train_lr": 0.00039841307586007096, "train_wd": 0.2647641796896738, "epoch": 174}
{"train_loss": 11.090275764465332, "train_lr": 0.0003931222651145936, "train_wd": 0.2665874281184537, "epoch": 175}
{"train_loss": 11.090282440185547, "train_lr": 0.0003878440556223935, "train_wd": 0.26840556771074775, "epoch": 176}
{"train_loss": 11.09028434753418, "train_lr": 0.0003825790668047575, "train_wd": 0.2702183990870613, "epoch": 177}
{"train_loss": 11.090285301208496, "train_lr": 0.00037732791653146645, "train_wd": 0.2720257234500049, "epoch": 178}
{"train_loss": 11.090287208557129, "train_lr": 0.0003720912210482876, "train_wd": 0.2738273426060957, "epoch": 179}
{"train_loss": 11.090288162231445, "train_lr": 0.00036686959490465427, "train_wd": 0.27562305898749057, "epoch": 180}
{"train_loss": 11.090291023254395, "train_lr": 0.00036166365088154573, "train_wd": 0.2774126756736526, "epoch": 181}
{"train_loss": 11.090296745300293, "train_lr": 0.0003564739999195754, "train_wd": 0.27919599641294507, "epoch": 182}
{"train_loss": 11.09029769897461, "train_lr": 0.0003513012510472932, "train_wd": 0.28097282564415244, "epoch": 183}
{"train_loss": 11.090299606323242, "train_lr": 0.00034614601130971394, "train_wd": 0.28274296851792663, "epoch": 184}
{"train_loss": 11.090300559997559, "train_lr": 0.0003410088856970781, "train_wd": 0.284506230918154, "epoch": 185}
{"train_loss": 11.090301513671875, "train_lr": 0.0003358904770738533, "train_wd": 0.2862624194832421, "epoch": 186}
{"train_loss": 11.090302467346191, "train_lr": 0.0003307913861079851, "train_wd": 0.28801134162732417, "epoch": 187}
{"train_loss": 11.090303421020508, "train_lr": 0.0003257122112004069, "train_wd": 0.28975280556137856, "epoch": 188}
{"train_loss": 11.090306282043457, "train_lr": 0.00032065354841481395, "train_wd": 0.29148662031426054, "epoch": 189}
{"train_loss": 11.090306282043457, "train_lr": 0.00031561599140771286, "train_wd": 0.29321259575364406, "epoch": 190}
{"train_loss": 11.090311527252197, "train_lr": 0.00031060013135875397, "train_wd": 0.2949305426068722, "epoch": 191}
{"train_loss": 11.090312957763672, "train_lr": 0.0003056065569013531, "train_wd": 0.29664027248171315, "epoch": 192}
{"train_loss": 11.090314865112305, "train_lr": 0.0003006358540536134, "train_wd": 0.29834159788701897, "epoch": 193}
{"train_loss": 11.090315818786621, "train_lr": 0.0002956886061495528, "train_wd": 0.30003433225328685, "epoch": 194}
{"train_loss": 11.090316772460938, "train_lr": 0.0002907653937706481, "train_wd": 0.3017182899531185, "epoch": 195}
{"train_loss": 11.090316772460938, "train_lr": 0.00028586679467769996, "train_wd": 0.3033932863215751, "epoch": 196}
{"train_loss": 11.090317726135254, "train_lr": 0.0002809933837430315, "train_wd": 0.3050591376764297, "epoch": 197}
{"train_loss": 11.09031867980957, "train_lr": 0.0002761457328830235, "train_wd": 0.3067156613383088, "epoch": 198}
{"train_loss": 11.09031867980957, "train_lr": 0.00027132441099099767, "train_wd": 0.30836267565072534, "epoch": 199}
{"train_loss": 11.09032154083252, "train_lr": 0.00026652998387045494, "train_wd": 0.31000000000000005, "epoch": 200}
{"train_loss": 11.090327262878418, "train_lr": 0.0002617630141686749, "train_wd": 0.3116274548350668, "epoch": 201}
{"train_loss": 11.090327262878418, "train_lr": 0.00025702406131068825, "train_wd": 0.3132448616871634, "epoch": 202}
{"train_loss": 11.090328216552734, "train_lr": 0.00025231368143362377, "train_wd": 0.31485204318940196, "epoch": 203}
{"train_loss": 11.090328216552734, "train_lr": 0.000247632427321445, "train_wd": 0.3164488230962194, "epoch": 204}
{"train_loss": 11.090330123901367, "train_lr": 0.00024298084834007777, "train_wd": 0.3180350263027048, "epoch": 205}
{"train_loss": 11.090330481529236, "train_lr": 0.0002383594903729394, "train_wd": 0.3196104788638019, "epoch": 206}
{"train_loss": 11.090331077575684, "train_lr": 0.00023376889575687766, "train_wd": 0.32117500801338345, "epoch": 207}
{"train_loss": 11.09033203125, "train_lr": 0.00022920960321852387, "train_wd": 0.3227284421831977, "epoch": 208}
{"train_loss": 11.09033203125, "train_lr": 0.00022468214781107276, "train_wd": 0.3242706110216822, "epoch": 209}
{"train_loss": 11.090332508087158, "train_lr": 0.00022018706085149019, "train_wd": 0.32580134541264516, "epoch": 210}
{"train_loss": 11.090332984924316, "train_lr": 0.00021572486985816056, "train_wd": 0.32732047749381094, "epoch": 211}
{"train_loss": 11.090332984924316, "train_lr": 0.0002112960984889818, "train_wd": 0.32882784067522747, "epoch": 212}
{"train_loss": 11.090333938598633, "train_lr": 0.00020690126647990978, "train_wd": 0.3303232696575357, "epoch": 213}
{"train_loss": 11.090333938598633, "train_lr": 0.00020254088958396676, "train_wd": 0.33180660045009586, "epoch": 214}
{"train_loss": 11.090336799621582, "train_lr": 0.00019821547951071446, "train_wd": 0.33327767038897077, "epoch": 215}
{"train_loss": 11.090336799621582, "train_lr": 0.00019392554386620265, "train_wd": 0.3347363181547642, "epoch": 216}
{"train_loss": 11.090336799621582, "train_lr": 0.00018967158609339995, "train_wd": 0.33618238379031107, "epoch": 217}
{"train_loss": 11.090336799621582, "train_lr": 0.00018545410541311182, "train_wd": 0.3376157087182189, "epoch": 218}
{"train_loss": 11.090336799621582, "train_lr": 0.00018127359676539597, "train_wd": 0.3390361357582573, "epoch": 219}
{"train_loss": 11.090337634086609, "train_lr": 0.00017713055075147715, "train_wd": 0.3404435091445945, "epoch": 220}
{"train_loss": 11.090343475341797, "train_lr": 0.0001730254535761746, "train_wd": 0.3418376745428789, "epoch": 221}
{"train_loss": 11.090343475341797, "train_lr": 0.00016895878699084313, "train_wd": 0.34321847906716396, "epoch": 222}
{"train_loss": 11.090343713760376, "train_lr": 0.0001649310282368368, "train_wd": 0.34458577129667317, "epoch": 223}
{"train_loss": 11.09034538269043, "train_lr": 0.00016094264998950384, "train_wd": 0.34593940129240586, "epoch": 224}
{"train_loss": 11.09034538269043, "train_lr": 0.00015699412030271517, "train_wd": 0.3472792206135785, "epoch": 225}
{"train_loss": 11.09034538269043, "train_lr": 0.00015308590255393615, "train_wd": 0.3486050823339046, "epoch": 226}
{"train_loss": 11.090346336364746, "train_lr": 0.0001492184553898481, "train_wd": 0.3499168410577052, "epoch": 227}
{"train_loss": 11.090346336364746, "train_lr": 0.00014539223267252276, "train_wd": 0.35121435293585407, "epoch": 228}
{"train_loss": 11.090346336364746, "train_lr": 0.00014160768342616083, "train_wd": 0.3524974756815521, "epoch": 229}
{"train_loss": 11.090346336364746, "train_lr": 0.0001378652517843961, "train_wd": 0.353766068585931, "epoch": 230}
{"train_loss": 11.090347051620483, "train_lr": 0.00013416537693817498, "train_wd": 0.3550199925334828, "epoch": 231}
{"train_loss": 11.090347290039062, "train_lr": 0.00013050849308421502, "train_wd": 0.3562591100173162, "epoch": 232}
{"train_loss": 11.090347290039062, "train_lr": 0.00012689502937405017, "train_wd": 0.3574832851542356, "epoch": 233}
{"train_loss": 11.090347290039062, "train_lr": 0.00012332540986366834, "train_wd": 0.35869238369964207, "epoch": 234}
{"train_loss": 11.090347290039062, "train_lr": 0.00011980005346374558, "train_wd": 0.35988627306225485, "epoch": 235}
{"train_loss": 11.090347290039062, "train_lr": 0.0001163193738904864, "train_wd": 0.36106482231865117, "epoch": 236}
{"train_loss": 11.090348243713379, "train_lr": 0.00011288377961707113, "train_wd": 0.36222790222762424, "epoch": 237}
{"train_loss": 11.090348243713379, "train_lr": 0.00010949367382572036, "train_wd": 0.3633753852443554, "epoch": 238}
{"train_loss": 11.090348243713379, "train_lr": 0.00010614945436037967, "train_wd": 0.3645071455344006, "epoch": 239}
{"train_loss": 11.090348243713379, "train_lr": 0.00010285151368003088, "train_wd": 0.36562305898749065, "epoch": 240}
{"train_loss": 11.090348243713379, "train_lr": 9.960023881263519e-05, "train_wd": 0.36672300323113977, "epoch": 241}
{"train_loss": 11.090348839759827, "train_lr": 9.639601130971378e-05, "train_wd": 0.36780685764406673, "epoch": 242}
{"train_loss": 11.090349197387695, "train_lr": 9.323920720157066e-05, "train_wd": 0.3688745033694212, "epoch": 243}
{"train_loss": 11.090349197387695, "train_lr": 9.013019695316482e-05, "train_wd": 0.36992582332781787, "epoch": 244}
{"train_loss": 11.090349197387695, "train_lr": 8.706934542063322e-05, "train_wd": 0.37096070223017635, "epoch": 245}
{"train_loss": 11.090349197387695, "train_lr": 8.405701180847447e-05, "train_wd": 0.3719790265903628, "epoch": 246}
{"train_loss": 11.090349197387695, "train_lr": 8.109354962739411e-05, "train_wd": 0.3729806847376355, "epoch": 247}
{"train_loss": 11.090349555015564, "train_lr": 7.817930665281811e-05, "train_wd": 0.37396556682889115, "epoch": 248}
{"train_loss": 11.090350031852722, "train_lr": 7.531462488408094e-05, "train_wd": 0.3749335648607098, "epoch": 249}
{"train_loss": 11.090351939201355, "train_lr": 7.249984050428927e-05, "train_wd": 0.37588457268119896, "epoch": 250}
{"train_loss": 11.090352058410645, "train_lr": 6.973528384087014e-05, "train_wd": 0.37681848600163415, "epoch": 251}
{"train_loss": 11.090352058410645, "train_lr": 6.702127932680585e-05, "train_wd": 0.3777352024078955, "epoch": 252}
{"train_loss": 11.090352058410645, "train_lr": 6.435814546255893e-05, "train_wd": 0.3786346213716985, "epoch": 253}
{"train_loss": 11.090352058410645, "train_lr": 6.174619477869647e-05, "train_wd": 0.3795166442616186, "epoch": 254}
{"train_loss": 11.090352058410645, "train_lr": 5.918573379921161e-05, "train_wd": 0.3803811743539062, "epoch": 255}
{"train_loss": 11.090352058410645, "train_lr": 5.667706300555308e-05, "train_wd": 0.3812281168430943, "epoch": 256}
{"train_loss": 11.090352058410645, "train_lr": 5.4220476801361406e-05, "train_wd": 0.3820573788523948, "epoch": 257}
{"train_loss": 11.090352058410645, "train_lr": 5.1816263477919393e-05, "train_wd": 0.38286886944388343, "epoch": 258}
{"train_loss": 11.090352058410645, "train_lr": 4.946470518032053e-05, "train_wd": 0.3836624996284723, "epoch": 259}
{"train_loss": 11.090352058410645, "train_lr": 4.716607787435704e-05, "train_wd": 0.3844381823756681, "epoch": 260}
{"train_loss": 11.090352177619934, "train_lr": 4.4920651314135e-05, "train_wd": 0.3851958326231166, "epoch": 261}
{"train_loss": 11.090352654457092, "train_lr": 4.272868901041716e-05, "train_wd": 0.3859353672859301, "epoch": 262}
{"train_loss": 11.090352892875671, "train_lr": 4.059044819969824e-05, "train_wd": 0.38665670526579904, "epoch": 263}
{"train_loss": 11.090355515480042, "train_lr": 3.8506179814018086e-05, "train_wd": 0.3873597674598853, "epoch": 264}
{"train_loss": 11.0903559923172, "train_lr": 3.647612845151274e-05, "train_wd": 0.38804447676949627, "epoch": 265}
{"train_loss": 11.090356826782227, "train_lr": 3.4500532347710366e-05, "train_wd": 0.3887107581085405, "epoch": 266}
{"train_loss": 11.090357542037964, "train_lr": 3.257962334757338e-05, "train_wd": 0.3893585384117606, "epoch": 267}
{"train_loss": 11.090357422828674, "train_lr": 3.071362687828948e-05, "train_wd": 0.3899877466427466, "epoch": 268}
{"train_loss": 11.090357780456543, "train_lr": 2.8902761922817975e-05, "train_wd": 0.3905983138017254, "epoch": 269}
{"train_loss": 11.090357780456543, "train_lr": 2.7147240994190173e-05, "train_wd": 0.3911901729331277, "epoch": 270}
{"train_loss": 11.090357780456543, "train_lr": 2.5447270110570763e-05, "train_wd": 0.39176325913292986, "epoch": 271}
{"train_loss": 11.090357780456543, "train_lr": 2.3803048771080164e-05, "train_wd": 0.3923175095557721, "epoch": 272}
{"train_loss": 11.090357780456543, "train_lr": 2.2214769932382716e-05, "train_wd": 0.3928528634218499, "epoch": 273}
{"train_loss": 11.090357780456543, "train_lr": 2.0682619986042214e-05, "train_wd": 0.3933692620235786, "epoch": 274}
{"train_loss": 11.090357780456543, "train_lr": 1.9206778736648048e-05, "train_wd": 0.39386664873203237, "epoch": 275}
{"train_loss": 11.090357780456543, "train_lr": 1.778741938071434e-05, "train_wd": 0.3943449690031536, "epoch": 276}
{"train_loss": 11.090357780456543, "train_lr": 1.6424708486354728e-05, "train_wd": 0.3948041703837345, "epoch": 277}
{"train_loss": 11.090357780456543, "train_lr": 1.5118805973734518e-05, "train_wd": 0.39524420251716885, "epoch": 278}
{"train_loss": 11.090357780456543, "train_lr": 1.3869865096303689e-05, "train_wd": 0.3956650171489745, "epoch": 279}
{"train_loss": 11.090357780456543, "train_lr": 1.267803242281166e-05, "train_wd": 0.39606656813208513, "epoch": 280}
{"train_loss": 11.090357780456543, "train_lr": 1.1543447820107078e-05, "train_wd": 0.39644881143190996, "epoch": 281}
{"train_loss": 11.090357780456543, "train_lr": 1.0466244436723484e-05, "train_wd": 0.39681170513116404, "epoch": 282}
{"train_loss": 11.090357780456543, "train_lr": 9.44654868725404e-06, "train_wd": 0.39715520943446364, "epoch": 283}
{"train_loss": 11.090357780456543, "train_lr": 8.484480237516079e-06, "train_wd": 0.39747928667269095, "epoch": 284}
{"train_loss": 11.090357780456543, "train_lr": 7.580151990507694e-06, "train_wd": 0.3977839013071249, "epoch": 285}
{"train_loss": 11.090357780456543, "train_lr": 6.733670073158315e-06, "train_wd": 0.3980690199333379, "epoch": 286}
{"train_loss": 11.090357780456543, "train_lr": 5.9451338238740055e-06, "train_wd": 0.39833461128485986, "epoch": 287}
{"train_loss": 11.090357780456543, "train_lr": 5.214635780879842e-06, "train_wd": 0.398580646236606, "epoch": 288}
{"train_loss": 11.090357780456543, "train_lr": 4.542261671360031e-06, "train_wd": 0.39880709780807133, "epoch": 289}
{"train_loss": 11.090357780456543, "train_lr": 3.928090401397437e-06, "train_wd": 0.3990139411662892, "epoch": 290}
{"train_loss": 11.090357780456543, "train_lr": 3.3721940467137573e-06, "train_wd": 0.3992011536285544, "epoch": 291}
{"train_loss": 11.090357780456543, "train_lr": 2.874637844210878e-06, "train_wd": 0.3993687146649108, "epoch": 292}
{"train_loss": 11.090357780456543, "train_lr": 2.435480184315163e-06, "train_wd": 0.39951660590040244, "epoch": 293}
{"train_loss": 11.090357780456543, "train_lr": 2.054772604125246e-06, "train_wd": 0.3996448111170889, "epoch": 294}
{"train_loss": 11.090357780456543, "train_lr": 1.7325597813635968e-06, "train_wd": 0.39975331625582333, "epoch": 295}
{"train_loss": 11.090357780456543, "train_lr": 1.468879529133577e-06, "train_wd": 0.3998421094177945, "epoch": 296}
{"train_loss": 11.090357780456543, "train_lr": 1.2637627914818225e-06, "train_wd": 0.39991118086583166, "epoch": 297}
{"train_loss": 11.090357780456543, "train_lr": 1.1172336397671228e-06, "train_wd": 0.3999605230254723, "epoch": 298}
{"train_loss": 11.090357780456543, "train_lr": 1.0293092698348978e-06, "train_wd": 0.39999013048579224, "epoch": 299}