Rosie stopped tweeting a while back, and the `MemoryError` shown below was the reason.
Over the last few weeks, @andreformento diagnosed this locally, and we tested it in the production infrastructure.
Here's the full traceback for executing `python3 rosie.py run chamber_of_deputies` on a standard DigitalOcean Droplet with 8 vCPUs and 32 GB of RAM:
2021-08-23 22:32:46,878 - rosie.chamber_of_deputies.adapter - INFO - Updating companies
Downloading 2016-09-03-companies.xz: 100%|████████████████████████████████████████████████████████████████████████████| 4.84M/4.84M [00:00<00:00, 34.5Mb/s]
2021-08-23 22:32:47,051 - rosie.chamber_of_deputies.adapter - INFO - Updating reimbursements from 2009
2021-08-23 22:33:05,802 - rosie.chamber_of_deputies.adapter - INFO - Updating reimbursements from 2010
2021-08-23 22:33:27,758 - rosie.chamber_of_deputies.adapter - INFO - Updating reimbursements from 2011
2021-08-23 22:33:52,820 - rosie.chamber_of_deputies.adapter - INFO - Updating reimbursements from 2012
2021-08-23 22:34:14,875 - rosie.chamber_of_deputies.adapter - INFO - Updating reimbursements from 2013
2021-08-23 22:34:39,627 - rosie.chamber_of_deputies.adapter - INFO - Updating reimbursements from 2014
2021-08-23 22:35:00,156 - rosie.chamber_of_deputies.adapter - INFO - Updating reimbursements from 2015
2021-08-23 22:35:24,343 - rosie.chamber_of_deputies.adapter - INFO - Updating reimbursements from 2016
2021-08-23 22:35:47,603 - rosie.chamber_of_deputies.adapter - INFO - Updating reimbursements from 2017
2021-08-23 22:36:10,159 - rosie.chamber_of_deputies.adapter - INFO - Updating reimbursements from 2018
2021-08-23 22:36:29,338 - rosie.chamber_of_deputies.adapter - INFO - Updating reimbursements from 2019
2021-08-23 22:36:47,928 - rosie.chamber_of_deputies.adapter - INFO - Updating reimbursements from 2020
2021-08-23 22:36:58,705 - rosie.chamber_of_deputies.adapter - INFO - Updating reimbursements from 2021
2021-08-23 22:37:07,120 - rosie.chamber_of_deputies.adapter - INFO - Loading reimbursements from /tmp/serenata-data/reimbursements-2018.csv
2021-08-23 22:37:08,965 - rosie.chamber_of_deputies.adapter - INFO - Loading reimbursements from /tmp/serenata-data/reimbursements-2014.csv
2021-08-23 22:37:11,514 - rosie.chamber_of_deputies.adapter - INFO - Loading reimbursements from /tmp/serenata-data/reimbursements-2010.csv
2021-08-23 22:37:14,283 - rosie.chamber_of_deputies.adapter - INFO - Loading reimbursements from /tmp/serenata-data/reimbursements-2020.csv
2021-08-23 22:37:16,251 - rosie.chamber_of_deputies.adapter - INFO - Loading reimbursements from /tmp/serenata-data/reimbursements-2012.csv
2021-08-23 22:37:19,527 - rosie.chamber_of_deputies.adapter - INFO - Loading reimbursements from /tmp/serenata-data/reimbursements-2013.csv
2021-08-23 22:37:23,982 - rosie.chamber_of_deputies.adapter - INFO - Loading reimbursements from /tmp/serenata-data/reimbursements-2011.csv
2021-08-23 22:37:29,628 - rosie.chamber_of_deputies.adapter - INFO - Loading reimbursements from /tmp/serenata-data/reimbursements-2009.csv
2021-08-23 22:37:33,911 - rosie.chamber_of_deputies.adapter - INFO - Loading reimbursements from /tmp/serenata-data/reimbursements-2015.csv
2021-08-23 22:37:39,087 - rosie.chamber_of_deputies.adapter - INFO - Loading reimbursements from /tmp/serenata-data/reimbursements-2021.csv
2021-08-23 22:37:43,265 - rosie.chamber_of_deputies.adapter - INFO - Loading reimbursements from /tmp/serenata-data/reimbursements-2017.csv
2021-08-23 22:37:50,403 - rosie.chamber_of_deputies.adapter - INFO - Loading reimbursements from /tmp/serenata-data/reimbursements-2019.csv
2021-08-23 22:37:57,065 - rosie.chamber_of_deputies.adapter - INFO - Loading reimbursements from /tmp/serenata-data/reimbursements-2016.csv
2021-08-23 22:38:03,934 - rosie.chamber_of_deputies.adapter - INFO - Loading companies
2021-08-23 22:38:22,833 - rosie.chamber_of_deputies.adapter - INFO - Categorizing reimbursements
2021-08-23 22:38:24,119 - rosie.chamber_of_deputies.adapter - INFO - Coercing issue_date column to date data type
2021-08-23 22:38:25,018 - rosie.chamber_of_deputies.adapter - INFO - Coercing situation_date column to date data type
2021-08-23 22:38:39,961 - rosie.chamber_of_deputies.adapter - INFO - Renaming columns to Serenata de Amor standard
2021-08-23 22:38:39,962 - rosie.chamber_of_deputies.adapter - INFO - Dataset ready! Rosie starts her analysis now :)
2021-08-23 22:39:10,942 - rosie.core - INFO - Running classifier 1 of 6: meal_price_outlier
2021-08-23 22:40:08,740 - rosie.core - INFO - Running classifier 2 of 6: over_monthly_subquota_limit
2021-08-23 22:44:21,321 - rosie.core - INFO - Running classifier 3 of 6: suspicious_traveled_speed_day
Traceback (most recent call last):
File "rosie.py", line 64, in <module>
main()
File "rosie.py", line 60, in main
run(module, arguments['--output'])
File "rosie.py", line 34, in run
module.main(directory)
File "/opt/serenata-de-amor/rosie/rosie/chamber_of_deputies/__init__.py", line 9, in main
core()
File "/opt/serenata-de-amor/rosie/rosie/core/__init__.py", line 45, in __call__
self.predict(model, name)
File "/opt/serenata-de-amor/rosie/rosie/core/__init__.py", line 73, in predict
prediction = model.predict(self.dataset)
File "/opt/serenata-de-amor/rosie/rosie/chamber_of_deputies/classifiers/traveled_speeds_classifier.py", line 70, in predict
is_outlier = self.__applicable_rows(_X) & \
File "/opt/serenata-de-amor/rosie/rosie/chamber_of_deputies/classifiers/traveled_speeds_classifier.py", line 100, in __applicable_rows
X[['latitude', 'longitude']].notnull().all(axis=1)
File "/usr/local/lib/python3.6/dist-packages/pandas/core/frame.py", line 2918, in __getitem__
data = self._take_with_is_copy(indexer, axis=1)
File "/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py", line 3363, in _take_with_is_copy
result = self.take(indices=indices, axis=axis)
File "/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py", line 3348, in take
self._consolidate_inplace()
File "/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py", line 5216, in _consolidate_inplace
self._protect_consolidate(f)
File "/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py", line 5205, in _protect_consolidate
result = f()
File "/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py", line 5214, in f
self._mgr = self._mgr.consolidate()
File "/usr/local/lib/python3.6/dist-packages/pandas/core/internals/managers.py", line 983, in consolidate
bm._consolidate_inplace()
File "/usr/local/lib/python3.6/dist-packages/pandas/core/internals/managers.py", line 988, in _consolidate_inplace
self.blocks = tuple(_consolidate(self.blocks))
File "/usr/local/lib/python3.6/dist-packages/pandas/core/internals/managers.py", line 1909, in _consolidate
list(group_blocks), dtype=dtype, can_consolidate=_can_consolidate
File "/usr/local/lib/python3.6/dist-packages/pandas/core/internals/managers.py", line 1934, in _merge_blocks
new_values = new_values[argsort]
MemoryError
Because of this, the pipeline doesn't proceed any further and Jarbas isn't updated, so no new data has been registered to be tweeted.
A PR (#561) has been opened as a temporary fix, but any help with reducing memory consumption would be appreciated.
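One possible direction, as a rough sketch only: the traceback shows the `MemoryError` being raised while pandas consolidates its internal blocks during the list-based selection `X[['latitude', 'longitude']]`. As far as I can tell, single-column lookups do not go through that `take` path in this pandas version, so checking each column on its own might avoid the extra allocation. The function name `has_coordinates` is hypothetical (it is not the classifier's real method), and this has not been tested against the production dataset.

```python
import numpy as np
import pandas as pd


def has_coordinates(X: pd.DataFrame) -> pd.Series:
    """Flag rows where both latitude and longitude are present.

    Hypothetical stand-in for the multi-column check seen in the
    traceback. Each column is read on its own, so no two-column
    sub-frame is built first.
    """
    return X["latitude"].notnull() & X["longitude"].notnull()


# Tiny illustrative frame (made-up values, not real reimbursements).
sample = pd.DataFrame(
    {
        "latitude": [-23.5, np.nan, -15.8],
        "longitude": [-46.6, -47.9, np.nan],
    }
)
mask = has_coordinates(sample)
```

The result should match `X[['latitude', 'longitude']].notnull().all(axis=1)` row by row. Whether it actually keeps the run under 32 GB is an open question.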
I created PR #562 to make it possible to run using only the most recent years 👀
I know this is not an optimization, but it makes it possible to run with fewer resources.
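To illustrate the general idea of running on recent years only, here is a minimal sketch. It is not the code from PR #562; the helper names `recent_years` and `reimbursement_paths` are made up for this example. The only details taken from the log above are the `reimbursements-<year>.csv` naming, the `/tmp/serenata-data` directory, and 2009 as the first year.

```python
from pathlib import Path
from typing import List

FIRST_YEAR = 2009  # earliest year that appears in the log above


def recent_years(last_year: int, keep: int) -> List[int]:
    """Return the `keep` most recent years up to and including `last_year`."""
    first = max(FIRST_YEAR, last_year - keep + 1)
    return list(range(first, last_year + 1))


def reimbursement_paths(directory: str, years: List[int]) -> List[Path]:
    """Build the CSV paths for the given years (hypothetical helper)."""
    return [Path(directory) / f"reimbursements-{year}.csv" for year in years]


years = recent_years(2021, 3)
paths = reimbursement_paths("/tmp/serenata-data", years)
```

Loading fewer files shrinks the dataset before any classifier runs, at the cost of not re-checking older reimbursements.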
Kudos @andreformento!