Data Frames
Unlike matrices, which can only contain elements of a single type, Data Frames can hold columns of different types. This makes them the most widely used tabular data structure in R for analyzing real-world data.
A Data Frame is a table in which each column can contain different data types (e.g., a numeric column, a text column, and a logical column), but all columns must have the same length.
Creating a Data Frame
To create a Data Frame, use the data.frame() function, defining the column names and assigning vectors to them:
# Create a sample data frame
students <- data.frame(
name = c("Alice", "Bob", "Charlie"),
grade = c(28, 24, 30),
passed = c(TRUE, TRUE, TRUE)
)
print(students)
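As a quick check of the two properties described above (mixed column types, equal column lengths), the following sketch rebuilds the same table and inspects it. The names column_types and mismatch are illustrative only and not part of the lesson. In R 4.0 and later the name column is stored as character; older versions convert it to a factor by default.

```r
# Rebuild the sample data frame so this snippet runs on its own
students <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  grade = c(28, 24, 30),
  passed = c(TRUE, TRUE, TRUE)
)

# Each column keeps its own type
column_types <- sapply(students, class)
print(column_types)

# Compact overview of the structure: rows, columns, and types
str(students)

# Vectors with 3 and 2 elements cannot be combined into one table,
# so data.frame() raises an error, which is caught here
mismatch <- tryCatch(
  data.frame(name = c("Alice", "Bob", "Charlie"), grade = c(28, 24)),
  error = function(e) "error"
)
print(mismatch)
```

Note that data.frame() recycles shorter vectors when the lengths divide evenly (for example 4 and 2), so the error only appears when the lengths cannot be reconciled.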
Accessing Columns and Rows
There are several ways to extract and manipulate data in a Data Frame:
- Using the dollar operator $: extracts a single column as a vector.
grades <- students$grade # grades will be the vector c(28, 24, 30)
- Using square brackets [row, column]: works as it does for matrices.
# First student (first row)
first_student <- students[1, ]
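The two access styles above extend to a few other common cases. The sketch below is a minimal illustration on the same sample table; the variable names bob_grade, names_and_grades, and first_two are hypothetical.

```r
# Rebuild the sample data frame so this snippet runs on its own
students <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  grade = c(28, 24, 30),
  passed = c(TRUE, TRUE, TRUE)
)

# Double brackets with a column name also return the column as a vector
grades <- students[["grade"]]

# A single cell: row 2, column "grade"
bob_grade <- students[2, "grade"]

# Several columns at once: the result is still a data frame
names_and_grades <- students[, c("name", "grade")]

# First two rows, all columns (note the trailing comma)
first_two <- students[1:2, ]
```

Leaving the position before or after the comma empty means "all rows" or "all columns" respectively.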
Filtering Data
One of R's strengths is the ability to filter records based on logical conditions:
# Select only students with grade greater than 25
good_students <- students[students$grade > 25, ]
print(good_students)
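The filter works because the condition produces one TRUE or FALSE per row, and only the TRUE rows are kept. As a rough sketch of how this extends to more than one condition, the example below combines two comparisons with &. The names mid_range and mid_range_alt are illustrative, and subset() is shown only as a base R alternative, not as the lesson's preferred method.

```r
# Rebuild the sample data frame so this snippet runs on its own
students <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  grade = c(28, 24, 30),
  passed = c(TRUE, TRUE, TRUE)
)

# The logical vector behind the filter: one value per row
is_good <- students$grade > 25
print(is_good)  # TRUE FALSE TRUE

# Combine conditions with & (and) or | (or)
mid_range <- students[students$grade > 25 & students$grade < 30, ]

# subset() expresses the same filter without repeating students$
mid_range_alt <- subset(students, grade > 25 & grade < 30)
```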
Dimensions and Modifying Data Frames
To find the dimensions of a Data Frame (the number of rows and columns), we can use the dim() function:
# Returns a vector with two elements: c(rows, columns)
dimensions <- dim(students)
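When only one of the two dimensions is needed, base R also provides nrow() and ncol(). A small sketch on the sample table, with illustrative variable names:

```r
# Rebuild the sample data frame so this snippet runs on its own
students <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  grade = c(28, 24, 30),
  passed = c(TRUE, TRUE, TRUE)
)

# dim() returns rows first, then columns
dimensions <- dim(students)

# The same values, one at a time
n_rows <- nrow(students)
n_cols <- ncol(students)

# Column names, in order
column_names <- names(students)
```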
We can also add new columns or update existing ones using the $ operator directly:
# Adds a column with grades increased by 2 points
students$bonus_grade <- students$grade + 2
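The example above adds a column; updating an existing one uses the same assignment syntax. The sketch below is illustrative only: the cap of 30 and the corrected grade for Bob are assumptions made up for the example, not rules stated in the lesson.

```r
# Rebuild the sample data frame so this snippet runs on its own
students <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  grade = c(28, 24, 30),
  passed = c(TRUE, TRUE, TRUE)
)

# Add the new column, as in the lesson
students$bonus_grade <- students$grade + 2

# Overwrite the whole column: cap values at 30 (assumed maximum)
students$bonus_grade <- pmin(students$bonus_grade, 30)

# Update a single value by combining $ with a logical condition
# (the new grade 25 is a made-up correction)
students$grade[students$name == "Bob"] <- 25

print(students)
```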
Try it yourself
Create a Data Frame named students containing two columns: name (with values 'Alice' and 'Bob') and grade (with values 28 and 24).
Hint: Use the data.frame() function: students <- data.frame(name = c('Alice', 'Bob'), grade = c(28, 24))
Given the Data Frame students, extract the grade column using the $ operator and save it in the grades_vector variable.
Hint: Use the $ operator: grades_vector <- students$grade
Filter the Data Frame students to include only students with grade greater than or equal to 26, saving the result in good_students.
Hint: Use students[students$grade >= 26, ] and do not forget the comma, which selects all columns.
Given the data frame df, get its dimensions using the dim() function and save them in the variable df_dim.
Hint: Use the dim() function: df_dim <- dim(df)
Given the data frame inventory, add a new column called total_value calculated as the product of the price and quantity columns.
Hint: Use inventory$total_value <- inventory$price * inventory$quantity