Add news item about KM/C-PH survival analysis tool using the lifelines package #2112

Merged (3 commits) on Aug 12, 2023
10 changes: 6 additions & 4 deletions content/news/2023-08-07-generic-tabular-plotter/index.md
Expand Up @@ -40,14 +40,16 @@ This Iris test data is relatively tiny with about 150 rows, so the points
are nicely separated, making the hover information easy to use.

Interactive html plots work best for at most, a few thousand well spread points,
-so the hover display is easy to control.
+so the hover display is easy to control. They reliably freeze up using a recent firefox build if 10k rows, so this tool
+now always fails if >5k rows are chosen for html output. Advice to this effect has been
+added to the form.

-Interactive html output is available in stand alone format, where 3MB of javascript is included,
+For <5k rows of data, interactive html output is available in stand alone format, where 3MB of javascript is included,
allowing it to be viewed offline. Short form html requires an internet connection to download the
javascript into the browser so cannot be viewed offline.

-PNG plots are recommended for large numbers of rows, since the the hover function tends to be less useful
-when the plot is very crowded.
+Only PNG output options will work for large numbers of rows, since the the hover function tends to be less useful
+when the plot is very crowded, and large html outputs can make browser windows freeze up.

If the tabular data does not have a header row of column names, the user can supply and use a
comma delimited list, as the "header" parameter on the tool form.
74 changes: 74 additions & 0 deletions content/news/2023-08-11-lifelineskmcph/index.md
@@ -0,0 +1,74 @@
---
title: "Survival analysis for right censored data using lifelines"
date: "2023-08-11"
authors: Ross Lazarus
authors_structured:
- github: fubar2
tease: "Kaplan-Meier and Cox proportional hazards models are available for testing in Galaxy"
hide_tease: true
subsites: [all]
---

A tool that wraps the [lifelines](https://lifelines.readthedocs.io/en/latest/Survival%20Analysis%20intro.html) package is available for testing in Galaxy.

Any Galaxy tabular dataset with time and status columns in a format suitable for pandas and lifelines can be used as input.
Time might be an integer number of months since a treatment. Status might be 0 for no failure at observation time, and 1 for death or failure.
Other columns can be used as groups for KM, or as covariates for Cox-PH.

If the data has no header row, the default column names are col1, ..., coln, unless a header parameter containing the column names in order,
delimited with ",", is supplied on the tool form.
Whatever the source of the column names, they must match the names provided as tool parameters.
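
The column naming rule above can be sketched as follows. This is only an illustration, not the tool's code: the tool is described as using pandas read_csv, whereas this sketch uses the standard library `csv` module so it runs without dependencies. The names `read_tabular`, `check_columns` and `has_header_row` are hypothetical.

```python
import csv
import io


def read_tabular(text, header=None, has_header_row=False):
    """Read tab-delimited text and return (column_names, rows).

    header is a hypothetical stand-in for the "header" form parameter:
    a comma-delimited string of column names, in column order.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    if has_header_row:
        names, rows = rows[0], rows[1:]
    elif header:
        names = [name.strip() for name in header.split(",")]
    else:
        # Nothing supplied: fall back to col1, col2, ..., coln
        names = [f"col{i}" for i in range(1, len(rows[0]) + 1)]
    return names, rows


def check_columns(names, required):
    """Raise ValueError if a column named as a parameter is not present."""
    missing = [name for name in required if name not in names]
    if missing:
        raise ValueError(f"columns not found: {missing}")


# Three made-up rows with no header row: time, status, one covariate
data = "20\t1\t27\n17\t1\t18\n52\t0\t24\n"

names, rows = read_tabular(data)
check_columns(names, ["col1", "col2"])       # default names must be used

names2, _ = read_tabular(data, header="week, arrest, age")
check_columns(names2, ["week", "arrest"])    # supplied names must be used
```

With the default names, asking for a column called `week` would raise an error, which is the mismatch the paragraph above warns about.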

### Demonstration with the Rossi recidivism data from lifelines tutorials

The tool runs a Kaplan-Meier analysis and generates a plot, optionally split by a grouping variable.

Plots show confidence intervals.

![KM plot sample](lifelines_rossi_km.png)
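
For readers new to Kaplan-Meier, the estimate behind a curve like this can be sketched in a few lines of plain Python. This is a minimal illustration on made-up data, assuming status 1 means an observed failure and 0 means right censored. It does not use lifelines, is not the tool's code, and omits the confidence intervals shown in the plot. The function name `kaplan_meier` is hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] for each distinct time with at least one failure.

    times: observed durations
    events: 1 = failure observed, 0 = right censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # every subject leaves the risk set at its own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Failures and censored subjects both drop out after time t
        at_risk -= leaving[t]
    return curve


# Made-up data: six subjects, two of them censored
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t}: S(t)={s:.3f}")
```

Censored subjects never lower the curve directly, but they shrink the risk set, so later failures produce larger drops.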

If there are 2 groups, a log-rank test for a difference between them is run.

![KM plot sample](lifelines_report.png)
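
The idea of a two-group log-rank test can be sketched without any dependencies. This is a hedged illustration of the standard statistic (observed minus expected failures in one group, with a hypergeometric variance, compared to a chi-square distribution with 1 degree of freedom). It is not the tool's code and does not call lifelines. The function name `logrank_two_groups` and the data are made up.

```python
import math


def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Return (chi_square, p_value) for a two-group log-rank test (1 df)."""
    event_times = sorted(
        {t for t, e in zip(times_a, events_a) if e}
        | {t for t, e in zip(times_b, events_b) if e}
    )
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        # Expected failures in A if both groups shared the same hazard
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi_square = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


# Identical groups should show no difference at all
same = logrank_two_groups([1, 2, 3, 4], [1] * 4, [1, 2, 3, 4], [1] * 4)
# Group A fails early, group B fails late
different = logrank_two_groups([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)
print(same, different)
```

A small p-value suggests the survival curves of the two groups differ; with groups this small the result is only illustrative.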

If a comma separated list of covariate column names (for example: prio, age, race, mar, fin) is provided,
a Cox proportional hazards model is run, the proportional hazards assumption is tested, and
recommendations are made.

![KM plot sample](lifelines_rossi_schoenfeld.png)

Also included are partial plots for each covariate, like these from the Rossi recidivism lifelines sample data
used in the tool test.

![C-PH partial plot samples](agepartialrossi.png)

![C-PH partial plot samples](parolepartialrossi.png)

The tool uses pandas read_csv with tab delimiters, so it should work with any tabular data that has the required columns: time and status for each observation.

Issues to https://github.com/fubar2/lifelines_tool please.
Member

Sounds like a bullet point list but not like a sentence.

Member Author

Thanks for caring :)
Will adjust shortly. Was mostly cut and paste so editing is good....

Member Author

@bgruening - I trust your meal at Doyles is excellent.
Items are shortened.... ;)

Member Author

@bgruening - less awful now?

The tool is autogenerated, so pull requests are possibly meaningless, but regenerating a new version should work.

### Installation for testing

The [lifelines tool](https://toolshed.g2.bx.psu.edu/view/fubar/lifelines_km_cph_tool/dd5e65893cb8), owned by fubar, is available for testing in the main Galaxy Toolshed.
It is very new, so it is not yet suitable for production use. Please let me know if it works for you.

### Tool code

The tool code is available for review at the <a href="https://github.com/fubar2/lifelines_tool">github repository</a> where issues should
be raised when there are problems or suggestions. This is machine generated code, so pull requests don't
Member

A bit redundant to the above

make much sense. The generator can easily be rerun with simple changes, so please suggest
any useful things you'd like to see.


### Tool made with the [Galaxy ToolFactory](https://github.com/fubar2/galaxy_tf_overlay)

Galaxy Training Network tutorials [are available here](https://training.galaxyproject.org/training-material/topics/dev/tutorials/tool-generators/tutorial.html).

The github repository contains a Galaxy history that was exported after generating the current version of the tool.
If that history is imported into a ToolFactory instance, the generating ToolFactory form can be recreated
using the redo button. Editing the tool id will make a new tool, so edits to parameters can be made and the
new tool generated and tested.

