Cohorte med outcome #904

Open
wants to merge 11 commits into
base: main
Conversation


@andrdani andrdani commented May 14, 2024

Prediction project.
I want to identify patients who are acutely admitted to somatic care within XX months after an outpatient visit to psychiatry. The patient must not have been admitted during the 2 years preceding the outpatient visit at which the prediction is made.
I have created the cohort, including washout (both for moving and for prior somatic admission) and the timestamps for outcomes.
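
A minimal, library-free sketch of the labelling rule described above. It assumes a hypothetical lookahead of 180 days (the description leaves it as XX months) and approximates the two-year washout as 730 days. The function name and both constants are illustrative and not taken from the PR, and the washout for patients who moved is not modelled here.

```python
from datetime import datetime, timedelta
from typing import Optional, Sequence

# Hypothetical values: the PR description does not fix the lookahead.
LOOKAHEAD = timedelta(days=180)
WASHOUT = timedelta(days=2 * 365)


def label_prediction_time(
    prediction_time: datetime, admissions: Sequence[datetime]
) -> Optional[bool]:
    """Return None if the prediction time is excluded by washout,
    otherwise whether an acute admission follows within the lookahead."""
    # Washout: exclude if any admission falls in the two years up to the prediction time.
    if any(prediction_time - WASHOUT <= a <= prediction_time for a in admissions):
        return None
    # Outcome: at least one admission strictly after the prediction time, within the window.
    return any(prediction_time < a <= prediction_time + LOOKAHEAD for a in admissions)
```

A prediction time with an admission 60 days later is labelled True, one whose only admission lies a year later is labelled False, and one preceded by an admission within the washout period is dropped (None).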

@andrdani andrdani changed the title test Cohorte med outcome May 14, 2024
@andrdani andrdani requested a review from HLasse May 14, 2024 11:03
@HLasse HLasse left a comment

This looks really good, Andreas! I have a few comments on the use of loaders and SQL, and on the start date of the cohort, but otherwise it looks very sensible! :)


AGE_COL_NAME = "age"
MIN_AGE = 18
MIN_DATE = datetime(year=2014, month=1, day=1)
Contributor

We usually run from 2013/01/01.

Comment on lines 11 to 45
def get_outpatient_visits_to_psychiatry(write: bool = False) -> pd.DataFrame:
    # Load all physical visits data
    view = "[FOR_besoeg_fysiske_fremmoeder_inkl_2021_feb2022]"
    cols_to_keep = "datotid_start, datotid_slut, dw_ek_borger, psykambbesoeg AS pt_type"

    sql = "SELECT " + cols_to_keep + " FROM [fct]." + view
    # The leading space keeps WHERE from fusing with the view name
    sql += " WHERE datotid_start > '2012-01-01' AND psykambbesoeg = 1"

    df = pd.DataFrame(sql_load(sql, chunksize=None))  # type: ignore

    df[["datotid_start", "datotid_slut"]] = df[["datotid_start", "datotid_slut"]].apply(
        pd.to_datetime
    )

    # Subtract 1 day from datotid_start in ambulant dates because we want to make predictions one day prior to visit
    # Even if it is the first psychiatric contact we would still like to make a prediction before
    # the visit because the patient might have information from somatic visits
    df["datotid_predict"] = df["datotid_start"] - timedelta(days=1)  # type: ignore

    df = df.drop_duplicates(subset=["dw_ek_borger", "datotid_predict"])

    if write:
        ROWS_PER_CHUNK = 5_000

        write_df_to_sql(
            df=df[["dw_ek_borger", "datotid_predict"]],  # type: ignore
            table_name="all_psychiatric_outpatient_visits_processed_2012_2021_ANDDAN_SOMATIC_ADMISSION",
            if_exists="replace",
            rows_per_chunk=ROWS_PER_CHUNK,
        )

    # Then I set the correct column name. The others call the function below, which does the same thing,
    # perhaps to save time when the dataset is loaded
    df = df.rename(columns={"datotid_predict": "timestamp"})

    return df[["dw_ek_borger", "timestamp"]]  # type: ignore
Contributor

I think you can replace this entire function with a call to:

from psycop.common.feature_generation.loaders.raw.load_visits import ambulatory_visits

prediction_times = pl.from_pandas(
    ambulatory_visits(
        timestamps_only=True,
        timestamp_for_output="start",
        n_rows=None,
        return_value_as_visit_length_days=False,
        shak_code=6600,
        shak_sql_operator="=",
    )
).with_columns(pl.col("timestamp") - pl.duration(days=1))

The ambulatory_visits loader fetches all outpatient visits, with the option to restrict them to a specific SHAK code (6600 for psychiatry).
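
For readers unfamiliar with the data, the effect of the one-day shift, combined with the de-duplication from the original function, can be sketched without any library. The tuple layout and the function name below are assumptions made for illustration; they do not reproduce the loader's actual return type.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple

# Illustrative row layout: (dw_ek_borger, visit start)
Visit = Tuple[int, datetime]


def prediction_times(visits: Iterable[Visit]) -> List[Visit]:
    """Shift each visit start back by one day and drop duplicate rows."""
    # A set comprehension removes rows that are identical after the shift.
    shifted = {(patient, start - timedelta(days=1)) for patient, start in visits}
    # Sort by patient, then by time, for a stable order.
    return sorted(shifted)


visits = [
    (1, datetime(2020, 5, 2, 9, 0)),
    (1, datetime(2020, 5, 2, 9, 0)),  # duplicate row
    (2, datetime(2020, 5, 3, 14, 30)),
]
result = prediction_times(visits)
```

Here the duplicate row for patient 1 collapses into a single prediction time one day before the visit.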

Contributor Author

Hi Lasse
When I try to run the code, it says that 'pl' is not defined. I can't quite work out where it should be defined. Can you help me?

@HLasse HLasse May 17, 2024

Ah, yes, the polars import is missing. So add

import polars as pl

Contributor Author

That works brilliantly, thanks :-)


def outpatient_visits_timestamps() -> pd.DataFrame:
    # Load somatic_admissions data
    view = "[all_psychiatric_outpatient_visits_processed_2012_2021_ANDDAN_SOMATIC_ADMISSION]"
Contributor

I can see that you have uploaded the df produced by the function above to the SQL database under this name. I would strongly recommend simply calling get_outpatient_visits_to_psychiatry (or the function I wrote in the comment above) instead; that makes the code much easier to read and maintain.

This function can in fact be removed, since it is really just a call to get_outpatient_visits_to_psychiatry.

Comment on lines 40 to 48
if write:
    ROWS_PER_CHUNK = 5_000

    write_df_to_sql(
        df=df[["dw_ek_borger", "datotid_start"]],
        table_name="all_psychiatric_outpatient_visits_processed_2012_2021_ANDDAN_SOMATIC_ADMISSION",
        if_exists="replace",
        rows_per_chunk=ROWS_PER_CHUNK,
    )
@HLasse HLasse May 14, 2024

Again, I would recommend not doing this and instead just calling this function whenever you need the data. Mistakes creep in easily if you change this code slightly but forget to rerun the function, so you end up using an old table :)
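
One way to follow this advice while still avoiding repeated loads within a session is to cache the result in memory instead of persisting it. The sketch below assumes a hypothetical stand-in loader that returns toy data; none of these function names exist in the PR. Because the cache lives only in the running process, every new run recomputes from the raw data and cannot pick up an outdated table.

```python
from datetime import datetime, timedelta
from functools import lru_cache
from typing import List, Tuple

Row = Tuple[int, datetime]  # illustrative layout: (dw_ek_borger, timestamp)


def load_raw_visits() -> List[Row]:
    """Hypothetical stand-in for the SQL load; returns toy data."""
    return [(1, datetime(2020, 5, 2, 9, 0)), (2, datetime(2020, 5, 3, 14, 30))]


@lru_cache(maxsize=1)
def get_prediction_times() -> Tuple[Row, ...]:
    """Derive prediction times from the raw data; cached per process only."""
    return tuple((p, t - timedelta(days=1)) for p, t in load_raw_visits())


def build_cohort() -> List[Row]:
    # Downstream code calls the function directly rather than reading a saved table.
    return list(get_prediction_times())
```

Any change to the derivation logic takes effect the next time the script runs, with no separate write step to remember.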

Comment on lines 18 to 19
def get_contacts_to_somatic_emergency(write: bool = False) -> pd.DataFrame:
    # Load contact data
Contributor

We also already have a loader for this: psycop.common.feature_generation.loaders.raw.load_visits.emergency_visits. I can see that it calls a different view ("[FOR_akutambulantekontakter_psyk_somatik_LPR2_inkl_2021_feb2022]"). If there is a discrepancy between the two, it would be great to update them so that emergency_visits calls the correct view.

Again, you can specify a SHAK code in the call to emergency_visits() so that you only get data from somatic care.

Contributor Author

Smart. I have chosen to use a different table, since what I actually want is acute admissions to somatic care. I have written the code to match. I could probably benefit from writing a loader, but I would need help with that. I'll ask the next time I meet one of you at Børglumvej.

Comment on lines 55 to 68
def admissions_onset_timestamps() -> pd.DataFrame:
    # Load somatic_admissions data
    view = "[all_psychiatric_outpatient_visits_processed_2012_2021_ANDDAN_SOMATIC_ADMISSION]"
    cols_to_keep = "dw_ek_borger, datotid_start"

    sql = "SELECT " + cols_to_keep + " FROM [fct]." + view

    admissions_onset_timestamps = pd.DataFrame(sql_load(sql, chunksize=None))  # type: ignore

    admissions_onset_timestamps = admissions_onset_timestamps.rename(
        columns={"datotid_start": "timestamp"}
    )

    return admissions_onset_timestamps
@HLasse HLasse May 14, 2024

This one can also be deleted :)


This PR is stale because it has been open 1+ days with no activity. Feel free to either 1) remove the stale label or 2) comment. If nothing happens, this will be closed in 7 days.

@github-actions github-actions bot added the Stale label May 16, 2024
@HLasse HLasse removed the Stale label May 16, 2024
only include acute somatic admissions

@github-actions github-actions bot added the Stale label May 21, 2024
@MartinBernstorff

@andrdani Regarding the use of the SQL writer, I suspect you were inspired by some of Jakob and Erik's old code. It has not been in use for a long time, and to avoid duplication it has been removed here: https://github.com/Aarhus-Psychiatry-Research/psycop-common/pull/924/files

Good work above, and I fully agree with HLasse! That is a good review 👍

@github-actions github-actions bot removed the Stale label May 23, 2024

@github-actions github-actions bot added the Stale label Jun 4, 2024
@HLasse HLasse removed the Stale label Jun 4, 2024
@github-actions github-actions bot added the Stale label Jun 7, 2024
@HLasse HLasse removed the Stale label Jun 10, 2024
@github-actions github-actions bot added the Stale label Jun 12, 2024

This PR was closed automatically. Feel free to re-open it if you still want to work on it.

@MartinBernstorff

I have tried to set things up so that you can turn off stalebot: 269c927

I have added the label 👍


Looks like your PR modifies shared library files in psycop/common/.

We highly recommend getting your code reviewed by one of the core maintainers to avoid breaking other projects that depend on these files :-)
