SSPS6001 Quantitative Methods in Soc. Sci.

Mid-semester exam     

Due: Friday 4 October, 11:59pm—submit via Canvas.

Please work on this exam alone. Datasets are personalised, so your results may differ from those of others. Feel free to consult the textbook, lecture slides, or any other helpful materials. Questions 5-11 require the use of SPSS (or other statistical software). Total points: 100

1.   Look at the 2016 Australian Bureau of Statistics Census Household Form (link). (3 points)

a)   Give two examples of nominal data collected in the census.

b)   Give two examples of ordinal data collected in the census.

c)   Give two examples of interval/ratio data collected in the census.

2.   Describe two possible operational definitions of ‘class’ that would allow a researcher to measure it for individual cases. Explain which definition you think would work best for a real research project, and why. (5 points)

3.   Use the ‘Royal Easter Show transport survey responses’ crosstab emailed to you. This records the results of a hypothetical survey for which people at the Royal Easter Show were asked whether they were local residents or visitors, and whether or not they travelled to the event by car.

a)   Calculate lambda for the association between whether or not the respondent was a resident, and whether or not they travelled to the event by car. Treat the latter as the dependent variable. Show your working (i.e., the intermediate calculations). (5 points)

b)   Explain the meaning of the result as precisely as you can. (5 points)
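For anyone who wants to check a hand calculation, the lambda computation can be sketched in Python. This is a minimal sketch and does not replace the working asked for above. It assumes rows hold the independent variable and columns hold the dependent variable. The function name and the counts are hypothetical and are not taken from any emailed crosstab.

```python
def goodman_kruskal_lambda(table):
    """Lambda with the column variable treated as dependent.

    table[i][j] is the count for independent category i (row)
    and dependent category j (column).
    """
    n = sum(sum(row) for row in table)
    col_totals = [sum(col) for col in zip(*table)]
    # E1: prediction errors when always guessing the modal dependent category.
    e1 = n - max(col_totals)
    if e1 == 0:
        raise ValueError("lambda is undefined when all cases share one dependent category")
    # E2: prediction errors when guessing the modal category within each row.
    e2 = sum(sum(row) - max(row) for row in table)
    return (e1 - e2) / e1


# Hypothetical counts (illustration only).
# Rows: resident, visitor. Columns: travelled by car, did not travel by car.
example = [[30, 70],
           [80, 20]]
print(round(goodman_kruskal_lambda(example), 3))
```

Here E1 is the number of errors made without knowing the independent variable, and E2 is the number made when it is known; lambda is the proportional reduction between the two.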

4.   Use the ‘Literacy by school attendance’ crosstab emailed to you. This records the results of a hypothetical study recording whether or not individuals had formal schooling, and an assessment of their level of literacy.

a)   Calculate Somers’ d for the association between school attendance and literacy. Show your working (i.e., the intermediate calculations). (5 points)

b)   Interpret your result. (5 points)

c)   If you calculated gamma for the same relationship, would you expect it to be higher or lower than Somers’ d? Why? (3 points)
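As a way of checking hand-counted pairs, Somers’ d can be sketched in Python. This is a minimal sketch under stated assumptions: rows hold the independent variable, columns hold the dependent variable, and both are ordered from lowest to highest category. The function name and the counts are hypothetical and are not taken from any emailed crosstab.

```python
def somers_d(table):
    """Somers' d with the column variable treated as dependent.

    table[i][j] is the count for independent category i (row)
    and dependent category j (column), both in ascending order.
    """
    concordant = discordant = tied_on_dependent = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            # Compare with every cell in a higher row, so each pair
            # differs on the independent variable and is counted once.
            for k in range(i + 1, n_rows):
                for m in range(n_cols):
                    pairs = table[i][j] * table[k][m]
                    if m > j:
                        concordant += pairs
                    elif m < j:
                        discordant += pairs
                    else:
                        tied_on_dependent += pairs
    return (concordant - discordant) / (concordant + discordant + tied_on_dependent)


# Hypothetical counts (illustration only).
# Rows: no schooling, schooling. Columns: low, medium, high literacy.
example = [[20, 10, 5],
           [5, 15, 30]]
print(round(somers_d(example), 3))
```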

For the rest of the exam, please use the ‘Exam dataset’ SPSS file emailed to you. This file contains imaginary data for 546 house sales in the city of Nowhere, NSW, for the month of September 19XX.

Variable descriptions

Sale price: price the house sold for, in dollars

Lot size: land area of the property in square feet (due to a quirk of history involving the gold prospector who founded the town, Nowhere has never adopted the metric system)

#[feature]: the number of the specified feature the house contains

driveway, rec room, basement, gas, air cond: 1 indicates the house contains this feature; 0 indicates it does not

desire loc: 1 indicates the house is by the lake or on the hill—both considered desirable locations; 0 indicates it is neither

Note that the dataset is personalised so your results may differ from those of others.

5.   Make a table to show the frequency distribution for sale price, with the variable binned appropriately. (8 points)

6.   Use appropriate graphs to show the frequency distribution of:

a)   number of bedrooms (5 points)

b)   sale price (using the original scale data, not your binned data from the previous question) (5 points)

Be sure to present your graphs with all features needed for a reader to accurately interpret them.

7.   Describe the distributions from the previous question, including appropriate measures of centre and dispersion and any other important information.

a)   number of bedrooms (5 points)

b)   sale price (5 points)

8.   Is there a relationship between the number of storeys and the number of bathrooms a house has? If so, how strong is it? (8 points)

9.   Does whether a house is in a desirable location make a difference to the relationship between number of storeys and number of bathrooms? (8 points)

10. Run a linear regression of sale price on number of bedrooms.

a)   Report your findings, including the regression equation and R-squared. Include a scatterplot with the regression line. (Don’t worry about questions of significance or confidence intervals, as we have not yet covered them in this part of the unit.) Interpret your results. (10 points)

b)   Look at the coefficient you have calculated for number of bedrooms. If you were preparing to sell a house in this town at this time, and it cost $1000 less than this to add an extra bedroom, would you be confident this would make you a profit when you sold the house? Why, or why not? (5 points)
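If you are using software other than SPSS, the quantities asked for in the simple regression can be sketched in plain Python. This is a minimal sketch only: the function name and the data values are hypothetical, are not drawn from the exam dataset, and the code produces no scatterplot.

```python
def simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r_squared).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # With a single predictor, R-squared equals the squared correlation.
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


# Hypothetical values (illustration only).
bedrooms = [2, 3, 3, 4, 5]
sale_price = [50000, 62000, 58000, 75000, 90000]
intercept, slope, r_squared = simple_ols(bedrooms, sale_price)
print(f"price = {intercept:.0f} + {slope:.0f} * bedrooms, R-squared = {r_squared:.3f}")
```

The slope is the predicted change in sale price for one additional bedroom, and R-squared is the share of variance in sale price accounted for by the fitted line.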

11. Conduct a multiple regression with sale price as the dependent variable, adding more variables of your choice alongside number of bedrooms. Report and interpret your findings, comparing them with the simple regression results from the previous question. (10 points)