The goal of the midterm is to review the basic concepts, syntax, and processes we have used so far, all of which will be useful in data science projects.
The midterm will consist of 2 sections.
Part 1: Multiple Choice/Short Answer. The first section is closed book and closed notes. You should understand the conceptual material from the class: technologies, packages, the CRISP model, etc. If given a piece of code, you should be able to interpret what it does. You won’t be asked to write any code in this section.
Part 2: Coding. As in the homework, you will write code in response to specific questions. You may bring a crib sheet with syntax, either 1 page (both sides) or 2 pages (1 side each), for use in Part 2 only. Questions will largely come from the homework and classroom examples.
Part 1 Sample Questions.
_______ is the Python package that provides a data frame object. a) DF b) DATA-FRAME c) Pandas d) Beautiful Soup e) Numpy
Below is an example of what type of encoding? { "year": 1997, "make": "Ford", "model": "E350" } a) Delimited file b) XML c) JSON d) ODBC e) SQL
Part 2 Sample Questions.
(After given a sample dataframe.)
Using Python, recode the variable from a character type to a numeric type, and recode gender to 0/1 for male/female.
Perform one-hot encoding on the x field of the dataframe.
Take the mean of the y variable of the dataframe.
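As a rough illustration of the kind of answer these questions call for, here is a minimal sketch using pandas. The sample dataframe, its column names (age, gender, x, y), and its values are assumptions made up for this example; the dataframe on the midterm will differ.

```python
import pandas as pd

# Hypothetical sample dataframe; column names and values are assumptions.
df = pd.DataFrame({
    "age": ["23", "35", "41", "29"],                  # numbers stored as strings
    "gender": ["male", "female", "female", "male"],
    "x": ["a", "b", "a", "c"],
    "y": [1.0, 2.0, 3.0, 4.0],
})

# Recode a character column to a numeric type.
df["age"] = pd.to_numeric(df["age"])

# Recode gender to 0/1 (male -> 0, female -> 1).
df["gender"] = df["gender"].map({"male": 0, "female": 1})

# One-hot encode the x field; this replaces x with one column per category.
df = pd.get_dummies(df, columns=["x"], prefix="x")

# Take the mean of the y variable.
y_mean = df["y"].mean()
print(y_mean)
```

Other approaches, such as astype for the type conversion or replace for the gender recode, would likely be acceptable as well.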
If you believe one of your solutions works but was marked wrong, you can appeal as follows:
Take a picture of the question and solution.
Accept the midterm-appeal assignment.
Test if the solution works.
If your solution works, add a markdown cell containing the original picture, then commit and push the example.
Send a Slack message to Jason Kuruzovich and Lianlian Jiang with a link to the repository.
Your solution must work 100%. You cannot use this method to request additional partial credit.
Data is here: Repository