Update Planets dataset #36

Open
rlpitts opened this issue Dec 4, 2024 · 2 comments
Comments

@rlpitts

rlpitts commented Dec 4, 2024

I've been using a more recent and more detailed version of the 'planets' dataset for my teaching material, and I have to import it often enough that I think it would be useful as a replacement for the current planets.csv. It's derived from the same sources as the existing material, but since the data are from early 2023, there are more than 5 times as many planets and almost twice as many parameter columns.

I pared down the EU exoplanet dataset released on Kaggle (removed stellar parameters that most people wouldn't find relevant), sanitized the data (removed a few retracted planets and replaced some whitespaces-as-missing-values), and uploaded it to my personal GitHub page, here: https://github.com/rlpitts/rlpitts/blob/main/planets.csv
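For anyone who wants to reproduce that kind of cleanup, here is a minimal sketch with pandas. It is not the script actually used for the linked file: the column names (`name`, `mass`, `radius`, `status`), the `"Retracted"` label, and the inline sample rows are all made up for illustration.

```python
import io

import pandas as pd

# Hypothetical miniature of a raw export; the columns and values are
# illustrative only and do not come from the real dataset.
raw = io.StringIO(
    "name,mass,radius,status\n"
    "Planet A,1.2,1.1,Confirmed\n"
    "Planet B, ,2.3,Confirmed\n"
    "Planet C,300.0,   ,Retracted\n"
)

# Read everything as text so whitespace-only cells are preserved as-is.
df = pd.read_csv(raw, dtype=str, keep_default_na=False)

# Turn empty or whitespace-only cells into proper missing values.
df = df.replace(r"^\s*$", float("nan"), regex=True)

# Drop rows flagged as retracted (assumed label).
df = df[df["status"] != "Retracted"].reset_index(drop=True)

# Convert the numeric columns back to floats; missing cells become NaN.
for col in ["mass", "radius"]:
    df[col] = pd.to_numeric(df[col], errors="coerce")
```

After this, `df.isna().sum()` gives a quick per-column count of the missing values.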

I hope you're interested. It's a very nice data set for demonstrating Seaborn, Pandas functions, and more general Matplotlib functions.

@mwaskom
Owner

mwaskom commented Jan 10, 2025

Wow, I thought we'd already found all the planets! ;)

Yeah, I'd be interested in updating the built-in planets dataset. In particular, the new planet_type variable in this one is nice, because the existing dataset was a little weak in terms of categorical variables.

@rlpitts
Author

rlpitts commented Jan 13, 2025

Glad you like it!

FWIW, the "planet type" variable is just a proxy for certain mass or radius bins. Terrestrial planets have a mass and/or radius <= Earth's, Super-Earths go up to either M = 10 M_E or R = 2 R_E, and I think the boundary between Neptune-like and Gas Giants is around R = 5 R_E or M = 30 M_E. There were definitely a few misclassified planets to start with, but I think I fixed them.
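As a rough sketch of that binning (not the code used to build the dataset): the function below takes mass and radius in Earth units and uses the approximate boundaries quoted above. The function name, the label strings, and the choice to prefer radius over mass when both are known are assumptions made for illustration; the actual labels in the file may be spelled differently.

```python
from typing import Optional


def planet_type(mass: Optional[float] = None,
                radius: Optional[float] = None) -> Optional[str]:
    """Bin a planet by radius (preferred) or mass, both in Earth units.

    The boundaries are the approximate ones quoted in the comment above.
    """
    if radius is not None:
        value, bounds = radius, (1.0, 2.0, 5.0)    # R_E boundaries
    elif mass is not None:
        value, bounds = mass, (1.0, 10.0, 30.0)    # M_E boundaries
    else:
        return None  # nothing to classify on

    labels = ("Terrestrial", "Super-Earth", "Neptune-like", "Gas Giant")
    # zip stops after the three bounds; anything above the last is a gas giant.
    for bound, label in zip(bounds, labels):
        if value <= bound:
            return label
    return labels[-1]
```

For example, `planet_type(mass=5.0)` falls in the Super-Earth bin and `planet_type(radius=11.0)` in the Gas Giant bin.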

But the detection method is fairly interesting to look at if you want to see how different methods are biased toward different orbital or physical parameters.
