Cohort with outcome #904
This looks really good, Andreas! I have a few comments about the use of loaders and SQL, and about the start date of the cohort, but otherwise it looks very sensible! :)
AGE_COL_NAME = "age"
MIN_AGE = 18
MIN_DATE = datetime(year=2014, month=1, day=1)
We usually run from 2013/01/01.
def get_outpatient_visits_to_psychiatry(write: bool = False) -> pd.DataFrame:
    # Load all physical visits data
    view = "[FOR_besoeg_fysiske_fremmoeder_inkl_2021_feb2022]"
    cols_to_keep = "datotid_start, datotid_slut, dw_ek_borger, psykambbesoeg AS pt_type"

    sql = "SELECT " + cols_to_keep + " FROM [fct]." + view
    # Leading space so the WHERE clause is not glued onto the view name
    sql += " WHERE datotid_start > '2012-01-01' AND psykambbesoeg = 1"

    df = pd.DataFrame(sql_load(sql, chunksize=None))  # type: ignore

    df[["datotid_start", "datotid_slut"]] = df[["datotid_start", "datotid_slut"]].apply(
        pd.to_datetime
    )

    # Subtract 1 day from datotid_start in ambulant dates because we want to make predictions one day prior to visit
    # Even if it is the first psychiatric contact we would still like to make a prediction before
    # the visit because the patient might have information from somatic visits
    df["datotid_predict"] = df["datotid_start"] - timedelta(days=1)  # type: ignore

    df = df.drop_duplicates(subset=["dw_ek_borger", "datotid_predict"])

    if write:
        ROWS_PER_CHUNK = 5_000

        write_df_to_sql(
            df=df[["dw_ek_borger", "datotid_predict"]],  # type: ignore
            table_name="all_psychiatric_outpatient_visits_processed_2012_2021_ANDDAN_SOMATIC_ADMISSION",
            if_exists="replace",
            rows_per_chunk=ROWS_PER_CHUNK,
        )

    # Then I set the correct name. The others call the function below, which does the same thing, perhaps to save time when the dataset is loaded
    df = df.rename(columns={"datotid_predict": "timestamp"})

    return df[["dw_ek_borger", "timestamp"]]  # type: ignore
I think you can replace this entire function with a call to:
from psycop.common.feature_generation.loaders.raw.load_visits import ambulatory_visits
prediction_times = pl.from_pandas(
    ambulatory_visits(
        timestamps_only=True,
        timestamp_for_output="start",
        n_rows=None,
        return_value_as_visit_length_days=False,
        shak_code=6600,
        shak_sql_operator="=",
    )
).with_columns(pl.col("timestamp") - pl.duration(days=1))
The ambulatory_visits loader fetches all outpatient visits, with the option to restrict them to a specific SHAK code (6600 for psychiatry).
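The original function did two things after loading: it shifted each visit start back by one day and dropped duplicate (patient, prediction time) rows. A minimal pandas sketch of just those two steps on toy data might look like the following. The helper name `to_prediction_times` and the toy frame are hypothetical and are not part of psycop.

```python
from datetime import timedelta

import pandas as pd


def to_prediction_times(visits: pd.DataFrame) -> pd.DataFrame:
    """Shift visit starts back one day and keep one row per patient and time.

    Hypothetical helper for illustration; expects the columns
    `dw_ek_borger` and `timestamp`.
    """
    out = visits[["dw_ek_borger", "timestamp"]].copy()
    # Predict one day before the visit starts
    out["timestamp"] = pd.to_datetime(out["timestamp"]) - timedelta(days=1)
    # Several visits starting at the same time should yield a single prediction time
    out = out.drop_duplicates(subset=["dw_ek_borger", "timestamp"])
    return out.reset_index(drop=True)


# Toy data: patient 1 has two identical visit rows
toy_visits = pd.DataFrame(
    {
        "dw_ek_borger": [1, 1, 2],
        "timestamp": ["2014-03-02 09:00", "2014-03-02 09:00", "2014-03-05 13:30"],
    }
)
prediction_times = to_prediction_times(toy_visits)
```

Whether the de-duplication is still needed when the loader is used depends on what the loader returns, which this sketch does not assume.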
Hi Lasse
When I try to run the code, it says that 'p1' is not defined. I can't quite work out where it should be defined. Can you help me?
Ah, yes, the polars import is missing. So add
import polars as pl
That works brilliantly, thanks :-)
def outpatient_visits_timestamps() -> pd.DataFrame:
    # Load somatic_admissions data
    view = "[all_psychiatric_outpatient_visits_processed_2012_2021_ANDDAN_SOMATIC_ADMISSION]"
I can see that you have uploaded the df produced by the function above to the SQL database under this name. I would strongly recommend simply calling get_outpatient_visits_to_psychiatry (or the function I wrote in the comment above) instead; that makes the code much easier to read and maintain.
This function can in fact be removed, since it is really just a call to get_outpatient_visits_to_psychiatry
    if write:
        ROWS_PER_CHUNK = 5_000

        write_df_to_sql(
            df=df[["dw_ek_borger", "datotid_start"]],
            table_name="all_psychiatric_outpatient_visits_processed_2012_2021_ANDDAN_SOMATIC_ADMISSION",
            if_exists="replace",
            rows_per_chunk=ROWS_PER_CHUNK,
        )
Again, I would recommend not doing this and instead simply calling this function whenever you need the data. Mistakes can creep in quickly if you change this code slightly but do not rerun the function, so that you end up using an old table :)
def get_contacts_to_somatic_emergency(write: bool = False) -> pd.DataFrame:
    # Load contact data
We also already have a loader for this: psycop.common.feature_generation.loaders.raw.load_visits.emergency_visits. I can see that it calls a different view ("[FOR_akutambulantekontakter_psyk_somatik_LPR2_inkl_2021_feb2022]"). If there is a discrepancy between the two, it would be great to get them updated so that emergency_visits calls the correct view.
Again, a SHAK code can be specified in the call to emergency_visits() so that you only get data from somatic care.
Smart. I chose to use a different table because what I actually want is acute admissions to somatic care. I have written the code to match. I could probably benefit from writing a loader, but I will need help with that. I will ask the next time I meet one of you at Børglumvej.
def admissions_onset_timestamps() -> pd.DataFrame:
    # Load somatic_admissions data
    view = "[all_psychiatric_outpatient_visits_processed_2012_2021_ANDDAN_SOMATIC_ADMISSION]"
    cols_to_keep = "dw_ek_borger, datotid_start"

    sql = "SELECT " + cols_to_keep + " FROM [fct]." + view

    admissions_onset_timestamps = pd.DataFrame(sql_load(sql, chunksize=None))  # type: ignore

    admissions_onset_timestamps = admissions_onset_timestamps.rename(
        columns={"datotid_start": "timestamp"}
    )

    return admissions_onset_timestamps
This one can also be deleted :)
This PR is stale because it has been open 1+ days with no activity. Feel free to either 1) remove the stale label or 2) comment. If nothing happens, this will be closed in 7 days. |
only include acute somatic admissions
@andrdani Regarding the use of the SQL writer, I suspect you were inspired by some of Jakob and Erik's old code. It has not been in use for a long time, and to avoid duplication it has been removed here: https://github.com/Aarhus-Psychiatry-Research/psycop-common/pull/924/files Good work above, and I fully agree with HLasse! It is a good review 👍
This PR was closed automatically. Feel free to re-open it if you still want to work on it. |
I have tried to set things up so that you can turn off stalebot: 269c927 I have added the label 👍
Looks like your PR modifies shared library files in We highly recommend getting your code reviewed by one of the core maintainers to avoid breaking other projects that depend on these files :-) |
Prediction project.
I want to identify the patients who are acutely admitted to somatic care within XX months after an outpatient visit to psychiatry. The patient must not have been admitted in the 2 years preceding the outpatient visit at which a prediction is made.
I have created the cohort (incl. washout, both for moving and for prior somatic admission), including timestamps for outcomes.
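The cohort rule described above can be sketched in pandas on toy data. This is only an illustration under stated assumptions, not the code in this PR: the function name `label_cohort` is hypothetical, the lookahead window is left as a parameter because the description only says "XX months", the 2-year washout is approximated as 730 days, and the washout for moving is not modelled.

```python
import pandas as pd


def label_cohort(
    visits: pd.DataFrame,
    admissions: pd.DataFrame,
    lookahead_days: int,
    washout_days: int = 730,  # assumption: 2 years approximated as 730 days
) -> pd.DataFrame:
    """Drop visits with a prior admission inside the washout and flag later admissions.

    Hypothetical helper; both frames need `dw_ek_borger` and `timestamp` columns.
    """
    # Pair every visit with every admission of the same patient
    pairs = visits.merge(admissions, on="dw_ek_borger", how="left", suffixes=("", "_adm"))
    delta = pairs["timestamp_adm"] - pairs["timestamp"]  # NaT when there is no admission

    # Comparisons against NaT evaluate to False, so patients without admissions pass through
    pairs["prior"] = (delta < pd.Timedelta(0)) & (delta >= -pd.Timedelta(days=washout_days))
    pairs["outcome"] = (delta > pd.Timedelta(0)) & (delta <= pd.Timedelta(days=lookahead_days))

    flags = pairs.groupby(["dw_ek_borger", "timestamp"], as_index=False)[
        ["prior", "outcome"]
    ].any()
    kept = flags.loc[~flags["prior"], ["dw_ek_borger", "timestamp", "outcome"]]
    return kept.reset_index(drop=True)


toy_visits = pd.DataFrame(
    {"dw_ek_borger": [1, 2, 3], "timestamp": pd.to_datetime(["2015-06-01"] * 3)}
)
toy_admissions = pd.DataFrame(
    {
        "dw_ek_borger": [1, 2],
        # Patient 1 is admitted after the visit, patient 2 within the washout before it
        "timestamp": pd.to_datetime(["2015-08-01", "2014-12-01"]),
    }
)
cohort = label_cohort(toy_visits, toy_admissions, lookahead_days=180)
```

In this toy example patient 2 is excluded by the washout, patient 1 is kept with a positive outcome, and patient 3 is kept with a negative outcome. The 180-day lookahead is an arbitrary value chosen only for the example.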