Overview: Data Extraction Made Simple¶
tidyextractors makes extracting data from supported sources as painless as possible, delivering a populated Pandas DataFrame in three lines of code.
tidyextractors was inspired by Hadley Wickham’s (2014) paper which introduces “tidy data” as a conceptual framework for data preparation.
- Extracts data with minimal effort.
- Creates readable code that requires minimal explanation.
- Exports Pandas Dataframes to maximize compatibility with the Python data science ecosystem.
Data Sources Implemented¶
tidyextractors currently has submodules for extracting data from the following sources:
- Local Git Repositories
- Twitter User Data (including Tweets) using the Twitter API
- Emails stored in the