I am trying to get back into Python so I can do basic exploratory data analysis before building dashboards in JS/D3. But I guess I’ve forgotten more than I realized.
I am following along with this blog to start getting my feet wet: https://medium.com/datadriveninvestor/introduction-to-exploratory-data-analysis-682eb64063ff
The results I get from df.describe() are different from the author's, and I can't figure out why.
The author gets the std, percentiles, mean, min, and max. As you can see, I get other results.
(Apologies for the very long file; just scroll directly to the end and then scroll up. I didn't realize that GitHub would show all the results, unlike Jupyter, which truncates them by default): https://github.com/SabahatPK/TestDataFiles/blob/master/EDA%20on%20All_Data.ipynb
I checked the API reference page: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.describe.html
But that did not shed any light.
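The only lead I can take from that page is that describe() summarizes numeric columns and text columns differently, so my column types may not match the author's. Below is a minimal sketch of that behavior. It uses made-up data rather than my actual file, and the column names are hypothetical; I have not confirmed that this is what is happening in my notebook.

```python
import pandas as pd

# Made-up data for illustration only; these column names are hypothetical
# and do not come from the notebook linked above.
df = pd.DataFrame({
    "amount_as_text": ["10", "20", "30"],     # numbers stored as strings
    "amount_as_number": [10.0, 20.0, 30.0],   # a genuinely numeric column
})

# Show how pandas has typed each column.
print(df.dtypes)

# With at least one numeric column present, describe() reports only the
# numeric columns by default: count, mean, std, min, quartiles, max.
numeric_summary = df.describe()
print(numeric_summary)

# A frame holding only text columns gets a different summary:
# count, unique, top, freq.
text_summary = df[["amount_as_text"]].describe()
print(text_summary)

# Converting the text column to numbers restores the numeric summary.
# errors="coerce" turns values that cannot be parsed into NaN.
converted = pd.to_numeric(df["amount_as_text"], errors="coerce")
converted_summary = converted.describe()
print(converted_summary)
```

If df.dtypes on the real data shows text where numbers are expected, that would at least explain a count/unique/top/freq style of output.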
I'm using Python 3.7. Could it be a version problem? I did look up changes to Python, but without knowing the author's version, this is just a shot in the dark. And even if I knew it, the changelogs are not the easiest to read.
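Since describe() comes from pandas rather than from Python itself, the pandas version is probably the more relevant number to compare. A minimal sketch for printing both, in case it helps anyone answering (the variable names are just for illustration):

```python
import sys

import pandas as pd

# Interpreter version, e.g. "3.7.x"; the first token of sys.version.
python_version = sys.version.split()[0]

# Installed pandas version; describe() is a pandas method, so this is
# the version whose release notes would matter for this question.
pandas_version = pd.__version__

print("Python:", python_version)
print("pandas:", pandas_version)
```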
Help! And thanks!