Note: new exercises coming soon.
data.frame
Make a data frame called mydata that has 3 variables: var1, which has the numbers 1 to 10; var2, which has the letters A-J; and var3, which has any 10 numbers you want to include.
After you’ve made the data frame, add a fourth column, month, containing the names of the first 10 months:
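# Build the data frame from three equal-length vectors
mydata <- data.frame(var1 = 1:10, var2 = LETTERS[1:10],
                     var3 = c(9, 4.5, 33, 45.6, -14, 3, 7, 0.2, 0, 7))
# Add the month column from the built-in month.name constant
mydata$month <- month.name[1:10]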
mydata
##    var1 var2  var3     month
## 1     1    A   9.0   January
## 2     2    B   4.5  February
## 3     3    C  33.0     March
## 4     4    D  45.6     April
## 5     5    E -14.0       May
## 6     6    F   3.0      June
## 7     7    G   7.0      July
## 8     8    H   0.2    August
## 9     9    I   0.0 September
## 10   10    J   7.0   October
Run each command below to figure out what type of data you get back.
Hint: Use the function typeof() to examine what is returned in each case.
mydata[1]
mydata[[1]]
mydata$var2
mydata["var2"]
mydata[1, 1]
mydata[, 1]
mydata[1,]
mydata[-1,]
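The contrast between these forms is easiest to see on a tiny example. The sketch below uses a hypothetical two-column data frame named df (not part of the exercise), so working out what each mydata command returns is still left to you. It shows class() alongside typeof(), since the two can differ for a data frame.

```r
# Hypothetical toy data frame, used only for illustration
df <- data.frame(x = 1:3, y = c("a", "b", "c"), stringsAsFactors = FALSE)

# Single brackets with one index keep the data frame structure
class(df[1])                  # "data.frame"
typeof(df[1])                 # "list": a data frame is stored as a list of columns

# Double brackets and $ pull out the column itself as a vector
class(df[[1]])                # "integer"
typeof(df$y)                  # "character"

# With row and column positions, a single column is simplified to a vector
class(df[, 1])                # "integer"
class(df[, 1, drop = FALSE])  # "data.frame": drop = FALSE prevents the simplification
```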
data.frame
R has some built-in data sets. One of them is called iris. You can just use the iris object (it’s a data.frame) without creating it first.
Hint: if you want iris to show up in the Environment tab, load it into your environment with data(iris). Otherwise, you can still use it, but it may not show in that tab.
Get the dimensions of iris. Then get a list of the names.
Output the first 10 rows of iris.
View the iris data frame in the RStudio data viewer.
Select from the iris data frame the observations where the Sepal.Width is less than 2.5. Do the same, but add the condition that the Sepal.Length must also be less than 5.
Using which.max, select a row with maximum Sepal.Length. Is there only one row with the maximum value?
Rename the column Petal.Width to petalwidth. Challenge: do this without hard coding in the column number.
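One possible approach to several of these steps is sketched below. It works on a copy called iris2 (a name made up for this sketch) so the built-in data set stays unchanged, and it leaves out the data viewer step, which only makes sense in an interactive RStudio session.

```r
iris2 <- iris                 # iris2 is a hypothetical name for a working copy

dim(iris2)                    # number of rows, then number of columns
names(iris2)                  # column names
head(iris2, 10)               # first 10 rows

# Logical conditions go in the row position, before the comma
narrow <- iris2[iris2$Sepal.Width < 2.5, ]
narrow_short <- iris2[iris2$Sepal.Width < 2.5 & iris2$Sepal.Length < 5, ]

# which.max returns the position of the first maximum only
iris2[which.max(iris2$Sepal.Length), ]

# Rename by matching the current name instead of hard coding a column number
names(iris2)[names(iris2) == "Petal.Width"] <- "petalwidth"
```

To check whether the maximum is unique, one option is to compare the column against its own max() and count the matches.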
data.frame: mtcars
Another built-in data set is called mtcars.
Explore mtcars. Look at the help page for mtcars to see the variable definitions.
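A few base R functions are a reasonable starting point for this kind of exploration. This is only a sketch of one way to begin, not a complete answer.

```r
str(mtcars)        # one line per variable, with its type and first values
summary(mtcars)    # minimum, quartiles, mean and maximum of each column
head(mtcars)       # first six rows
rownames(mtcars)   # the car models are stored as row names, not as a column
# ?mtcars          # opens the help page (run this in an interactive session)
```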
subset
Look up the help for the subset function. Can you use it to repeat some of the operations in the two exercises above?
Note: we’ll cover this function later in the workshop.
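As a preview, here is a minimal sketch of how subset can be called. The scores data frame and its columns are invented for illustration; applying the same pattern to iris and mtcars is the exercise.

```r
# Hypothetical data, not one of the workshop data sets
scores <- data.frame(id = 1:5,
                     group = c("a", "b", "a", "b", "a"),
                     score = c(10, 25, 40, 5, 30),
                     stringsAsFactors = FALSE)

# The second argument is a row condition; column names are used without scores$
high_a <- subset(scores, score > 20 & group == "a")

# The optional select argument chooses which columns to keep
high_a_ids <- subset(scores, score > 20 & group == "a", select = c(id, score))
```

Unlike bracket subsetting, subset drops rows where the condition evaluates to NA.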
The repurrrsive package has example list objects in it. Load the package (install first if needed). Then:
Look at the sw_people list with the View function. What’s this a list of?
A researcher is trying to select observations from the iris data frame for the setosa species, but gets an error:
iris[Species="setosa"]
## Error in `[.data.frame`(iris, Species = "setosa"): unused argument (Species = "setosa")
Correct this expression to select the observations the researcher wants. Hint: there may be more than 1 thing wrong.
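For comparison, here is a working row selection of the same general shape on a different data set. This is an analogous sketch on mtcars, not the answer for iris. Look at how the comparison is written, how the column is referred to, and where the comma sits.

```r
# Keep the rows of mtcars where cyl equals 4
four_cyl <- mtcars[mtcars$cyl == 4, ]

# == tests equality; a single = inside [ ] is read as a named argument instead
# mtcars$cyl names the column explicitly, because [ ] does not look inside the data frame
# The trailing comma means "these rows, all columns"
nrow(four_cyl)
```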
Using the built-in data set mtcars:
How many observations have mpg greater than 30?
Hint: you can use the sum function to count the number of TRUE observations in a vector (as we did with sum(is.na(x))). This works because as.numeric(TRUE) == 1 and as.numeric(FALSE) == 0.
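The counting idea in the hint can be sketched on a small made-up vector. The vector x here is hypothetical and unrelated to mtcars.

```r
x <- c(4, NA, 7, NA, 1)        # hypothetical values with two missing entries

n_missing <- sum(is.na(x))     # is.na(x) is FALSE TRUE FALSE TRUE FALSE, so the sum is 2

# A comparison also yields a logical vector; NA comparisons give NA,
# so na.rm = TRUE is needed to get a count
n_above_3 <- sum(x > 3, na.rm = TRUE)   # 4 and 7 qualify, so the sum is 2
```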
Using information from Linear Algebra in R by Søren Højsgaard if needed, create a 10 x 5 matrix of random normal numbers and a 5 x 1 vector of values 1 to 5 and multiply them using matrix multiplication. Transpose your result.
Create a 6 x 6 matrix of random normal draws and take its inverse (solve it). Extract the diagonal from the result.
Hint: Generate random normal draws with rnorm: look it up to see the options.
Hint: Matrix multiplication operator is %*%.
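The building blocks named in these hints can be tried out on small matrices first. The sketch below uses a fixed 2 x 2 matrix A and vector v (both made up for illustration) so the results are easy to check by hand, plus one small random matrix to show the rnorm pattern.

```r
# A is filled column by column: first column (2, 0), second column (1, 3)
A <- matrix(c(2, 0, 1, 3), nrow = 2)
v <- matrix(c(1, 2), ncol = 1)       # a 2 x 1 column vector

Av <- A %*% v                        # matrix product: (2*1 + 1*2, 0*1 + 3*2) = (4, 6)
Av_t <- t(Av)                        # transpose: 2 x 1 becomes 1 x 2

A_inv <- solve(A)                    # inverse of A
diag(A_inv)                          # diagonal of the inverse: 1/2 and 1/3

set.seed(1)                          # makes the random draws reproducible
R <- matrix(rnorm(6), nrow = 3, ncol = 2)   # 3 x 2 matrix of standard normal draws
```

Note that * would multiply element by element, which is a different operation from %*%.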
Select the vector of letters from the following object.
Do it both using names (hint: printing the object might give you a hint) and using index numbers.
Extra hint: use the object viewer in RStudio to view the list after you create it. See if there’s any functionality to help you with this challenge.
nested_list <- list(level1 = list(level2 = list(letters = LETTERS)))
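The two access styles can be practiced on a different nested list before tackling nested_list. The config object below is hypothetical and exists only to show the pattern.

```r
# Hypothetical nested list, three levels deep
config <- list(server = list(ports = list(open = c(80, 443))))

# By name: chain $ or [[ ]] once per level
by_name  <- config$server$ports$open
by_name2 <- config[["server"]][["ports"]][["open"]]

# By position: each level has a single element, so the index is 1 each time
by_index <- config[[1]][[1]][[1]]

# Single brackets return a list containing the element, not the element itself
class(config[1])    # "list"
```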
apply
Use the apply function to get the average value of each variable in mtcars. You’ll need to look at the help page to figure out how to use it.
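The general shape of an apply call is apply(X, MARGIN, FUN), where MARGIN is 1 for rows and 2 for columns. A minimal sketch on a small made-up matrix m, using functions other than the one the exercise asks for:

```r
# matrix() fills column by column, so the rows are (1, 3, 5) and (2, 4, 6)
m <- matrix(1:6, nrow = 2)

row_sums <- apply(m, 1, sum)   # one value per row: 9 and 12
col_max  <- apply(m, 2, max)   # one value per column: 2, 4 and 6
```

apply also accepts a data frame, which it converts to a matrix first, so it is best suited to data frames whose columns are all numeric.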