This post was inspired by Hadley Wickham's notion that "Context-switching is expensive." I have found that statement painfully true in my own experience as well. To quote from his book, R for Data Science,
To start, I know that not everyone here is going to be in ISTM 615 or an Analytics major. So here are some free, highly recommended Python resources if you just want to learn Python:
- Python's documentation - an up-to-date and freely available resource. The secret to success in any technology field? Read the manual! The documentation is almost always the most current and accurate source of information.
- Al Sweigart's Creative Commons books are recommended often, most notably Automate the Boring Stuff with Python.
- One of these days I'll write a comparison of his books.
- Welp. I take that back. Learn Python the Hard Way USED to be free. Luckily for us we have library access to the full version of the book.
- While it's not exactly Python, I have seen edx.org's CS50's Introduction to Computer Science highly recommended on the internet among beginner programmers. My favorite thing about the course is it appears to be updated yearly.
Chapter 2: Core Programming Syntax
Reference: https://docs.python.org/3/library/functions.html#input
Chapter 3: Variables and Data Types
Reference: https://docs.python.org/3/tutorial/introduction.html
Chapter 4: Writing Conditional Code
Reference: https://docs.python.org/3/tutorial/controlflow.html
Chapter 5: Modular Code
Reference: https://docs.python.org/3/tutorial/controlflow.html#defining-functions
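To make Chapters 4 and 5 a little more concrete, here is a minimal sketch of conditional code wrapped in a function. It is not taken from the course; the function name describe_score and the cutoff values are made up for illustration.

```python
def describe_score(score, passing=70):
    """Return a short label for a numeric score (hypothetical example)."""
    if score >= 90:
        return "excellent"
    elif score >= passing:
        return "pass"
    else:
        return "fail"


# Calling the function repeatedly is the "modular" part: the branching
# logic lives in one place and can be reused anywhere.
labels = [describe_score(s) for s in (95, 72, 40)]
print(labels)
```

The default argument passing=70 can be overridden per call, for example describe_score(65, passing=60).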
Chapter 6: Iteration: Writing Loops
Reference: https://docs.python.org/3/tutorial/controlflow.html#break-and-continue-statements-and-else-clauses-on-loops
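The linked section covers break, continue, and the less familiar else clause on loops. A small sketch of all three together, using a hypothetical helper called first_negative:

```python
def first_negative(numbers):
    """Return the first negative integer in numbers, or None (hypothetical helper)."""
    found = None
    for n in numbers:
        if not isinstance(n, int):
            continue  # skip anything that is not an integer
        if n < 0:
            found = n
            break  # stop looping at the first negative value
    else:
        # This branch runs only when the loop ends without hitting break.
        print("no negative numbers found")
    return found


print(first_negative([3, "x", -2, -5]))
print(first_negative([1, 2]))
```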
Chapter 7: More About Strings
Reference: https://docs.python.org/3/tutorial/introduction.html#strings
Regular Expressions: https://developers.google.com/edu/python/regular-expressions
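As a quick taste of regular expressions, here is a minimal sketch using the standard library's re module. The pattern and the sample sentence are assumptions for illustration only (the second course code is invented).

```python
import re

# Hypothetical pattern: a course code such as "ISTM 615"
# (three or four capital letters, one whitespace character, three digits).
COURSE_CODE = re.compile(r"\b([A-Z]{3,4})\s(\d{3})\b")

text = "I am taking ISTM 615 and ABCD 101 this semester."

# With two capture groups, findall returns a list of (letters, digits) tuples.
codes = COURSE_CODE.findall(text)
print(codes)
```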
Chapter 8: Collections
Reference: https://docs.python.org/3/tutorial/datastructures.html
See Also: https://docs.python.org/3/library/array.html
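Here is a rough side-by-side of the built-in collections from the tutorial plus the array module linked above. The variable names and values are made up for illustration.

```python
from array import array

grades = [88, 92, 79]                     # list: ordered and mutable
grades.append(95)

point = (3, 4)                            # tuple: ordered and immutable

languages = {"Python", "R", "SAS", "R"}   # set: the duplicate "R" is dropped

extensions = {"Python": ".py", "R": ".R", "SAS": ".sas"}  # dict: key -> value

readings = array("d", [1.0, 2.5, 4.0])    # array: every item is a C double

print(grades, point, sorted(languages), extensions["R"], readings.typecode)
```

Unlike a list, an array only accepts values of its declared type, which is part of why it is more compact in memory.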
Chapter 9: Style Guidelines
Reference: https://www.python.org/dev/peps/pep-0008/
We already talked about Pythonic code. PEP 8 is the currently accepted style guide. Brace style is a hotly debated topic and IMO it just depends on your personal preference. Usually a good text editor or IDE will minimize confusion and maximize readability regardless of your personal preferences.
Chapter 10: Input and Output
Reference: https://docs.python.org/3/tutorial/inputoutput.html
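The tutorial page above covers formatted output and reading and writing files. A minimal sketch of both, assuming a throwaway file named example.txt inside a temporary directory (the file name and the rows are invented):

```python
import tempfile
from pathlib import Path

rows = [("alpha", 1), ("beta", 22)]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example.txt"

    # Write: left-align the name in 8 columns, right-align the number in 3.
    with open(path, "w", encoding="utf-8") as f:
        for name, count in rows:
            f.write(f"{name:<8}{count:>3}\n")

    # Read the file back, dropping the trailing newline on each line.
    with open(path, encoding="utf-8") as f:
        lines = [line.rstrip("\n") for line in f]

print(lines)
```

The with blocks close the file automatically, even if an error occurs partway through.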
Chapter 11: Intro to Debugging
Tracing and commenting are certainly useful. But debug mode and unit tests are so involved that they are their own topic (and even career). I don't know which IDE we will be using in class yet, so there is no point in elaborating too far. I enjoy Visual Studio Code a lot, but it seems like Jupyter Notebook is more widely used in the data science field.
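For a flavor of tracing without any particular IDE, here is a minimal sketch using the standard logging module, followed by a bare assert as the simplest possible test. The mean function is a made-up example, not course material.

```python
import logging

# Show DEBUG-level messages; raise the level to WARNING to silence the trace.
logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(message)s")
log = logging.getLogger("trace_demo")


def mean(values):
    """Average a non-empty list of numbers, tracing each step."""
    log.debug("mean() called with %r", values)
    total = sum(values)
    log.debug("total=%s count=%s", total, len(values))
    return total / len(values)


# A one-line sanity check: this raises AssertionError if the result is wrong.
assert mean([2, 4, 6]) == 4
```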
Chapter 12: Intro to Object-Oriented Languages
Reference: https://docs.python.org/3/tutorial/classes.html
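To go with the classes tutorial, here is a minimal sketch of a class and a subclass. Student and GradStudent are hypothetical names chosen for illustration, as are the sample values.

```python
class Student:
    """A student with a name, a program, and a list of courses."""

    def __init__(self, name, program):
        self.name = name
        self.program = program
        self.courses = []

    def enroll(self, course):
        self.courses.append(course)

    def __repr__(self):
        return f"Student({self.name!r}, {self.program!r})"


class GradStudent(Student):
    """Inherits everything from Student and adds an advisor."""

    def __init__(self, name, program, advisor):
        super().__init__(name, program)  # reuse the parent initializer
        self.advisor = advisor


s = GradStudent("Sam", "Analytics", advisor="Dr. Example")
s.enroll("ISTM 615")
print(s, s.courses, s.advisor)
```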
Chapters 13 & 14: Advanced Topics
As mentioned in the course, when done well, C is incredibly useful for efficient memory management. One of SAS's greatest advantages is its ability to handle big data and large data sets. I recently discovered that SAS is written in C, which would explain its speed advantage over its competitors.
In 1985, SAS was rewritten in the C programming language. - https://en.wikipedia.org/wiki/SAS_(software)
As another example to illustrate these last two lectures, Hadley Wickham, prolific author of widely used R libraries, published the freely available books Advanced R and R Packages. Both books cover techniques that connect R and C++. His libraries (my favorite is a package collection called tidyverse) are incredibly efficient.
To highlight the importance of packages, R will choke/freeze/take forever on large data sets, but teams of programmers are constantly working on R to make it better. Matt Dowle et al.'s data.table package reads large files in a fraction of the time using the fread() function.
You may wonder why I'm not using SAS even though I love it so much. SAS is expensive, closed-source software that my current employer doesn't own. Regarding concerns that open-source R and Python will overtake SAS in popularity... You heard for yourself in the video series that established languages like Fortran and COBOL are still widely used despite their waning popularity. I anticipate the same will hold true for SAS. SAS has been the industry leader for decades, and other languages will have a lot of catching up to do in both infrastructure and maturity. SAS's power and complexity are among the reasons why it is still the industry standard and a valuable skill to have. Regardless, one can never have too much knowledge. Even experienced SAS programmers will sometimes go to R for certain tasks, and it is beneficial to know the basics of all three of these languages.
Quick learning resource honorable mentions (paid resources): Murach's Python Programming is often highly recommended as a great Python resource, but unfortunately it's not in TAMU's library. Learn to Code the Hard Way's recently released C programming book has the description "Learn to think like the computer hates you, because it does," which is simply hilarious. You'd understand if you've ever worked with C. I'm notoriously bad at finishing all the programming books I hoard, but if I ever get around to them I'll definitely write about them.
I hope this post was helpful! I'm also a Python newbie so feedback is appreciated from anyone more experienced.
Jennifer is a Master's student in the College of Science's Analytics program.