Open
Description
Some use cases:
A. Are two schedules identical (same rooms, times, instructors, etc.)?
- Useful for checking that data entry was done correctly.
B. How many sections (or hours) of each course are offered?
- Useful for seeing how the overall allocation of teaching has changed.
Possible way to think about this:
- Pick columns that matter vs. columns that will be aggregated.
a. In use case A, all columns matter
b. In use case B, only course-level items matter (course title, course number)
- Aggregate over the columns that don't matter
- a simple aggregation would just count the rows in an aggregation group
- but might want to compute some other function on the rows (e.g., sum of student hours, sum of faculty hours, etc.)
- as a first step: just compute the number of rows
- as a second step: compute several different measures (e.g., number of rows, sum of student hours, sum of faculty hours)
- as a third step: provide a way for the user to specify what gets calculated
- Now compare the schedules by identifying rows that differ (and what differs about them).
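The three aggregation steps above could be sketched roughly as follows. This is only an illustration: `aggregate_schedule()` is a hypothetical helper, not existing code, and the toy data is made up. The column names `Term`, `Prefix`, `CourseNumber`, and `MinimumCredits` are taken from the example in this issue, and `Instructor` is assumed. The row count is always computed (first step), and any extra measures are passed in by the caller as `summarise()` expressions (second and third steps).

```r
library(dplyr)

# Toy schedule, one row per section. Values are made up for illustration.
S <- tibble(
  Term           = c("FA", "FA", "FA", "SP"),
  Prefix         = c("MATH", "MATH", "DATA", "MATH"),
  CourseNumber   = c("171", "171", "545", "172"),
  Instructor     = c("A", "B", "C", "A"),
  MinimumCredits = c(4, 4, 4, 4)
)

# Hypothetical helper: group by the columns that matter, aggregate the rest.
# `keys` is a character vector of column names.
# `...` holds caller-supplied summarise() expressions (the "third step").
aggregate_schedule <- function(schedule, keys, ...) {
  schedule |>
    group_by(across(all_of(keys))) |>
    summarise(sections = n(), ..., .groups = "drop")
}

# Use case B: only course-level columns matter.
by_course <- aggregate_schedule(
  S, c("Term", "Prefix", "CourseNumber"),
  hours = sum(MinimumCredits)
)

# Use case A: every column matters, so each distinct row is its own group.
by_row <- aggregate_schedule(S, names(S))
```

Treating use case A as the special case where all columns are keys means the same comparison step can serve both use cases.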
Here is an example of some R code that could be used to do the aggregation step and the resulting output.
    library(dplyr)

    # S is the schedule data frame
    S |>
      group_by(Term, Prefix, CourseNumber) |>
      summarise(sections = n(), hours = sum(MinimumCredits))

       Term  Prefix CourseNumber sections hours
       <chr> <chr>  <chr>           <int> <dbl>
     1 FA    ASC    111                 1     2
     2 FA    DATA   545                 1     4
     3 FA    MATH   100                 1     4
     4 FA    MATH   110                 1     2
     5 FA    MATH   171                 5    20
     6 FA    MATH   172                 2     8
     7 FA    MATH   221                 1     4
     8 FA    MATH   222                 1     4
     9 FA    MATH   231                 1     4
    10 FA    MATH   251                 2     4
    # ℹ 36 more rows

The comparison could show all the "columns that matter" and versions of the aggregation columns for each schedule being compared. Ideally, cells where things differ would be highlighted to make it easy to see what is different.
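One possible shape for that comparison, as a sketch only: join the two aggregated schedules on the columns that matter, keep one copy of each aggregation column per schedule, and flag rows where they disagree. The two input tables and the `_old`/`_new` suffixes are made up for illustration, and treating a course that is missing from one schedule as zero sections and zero hours is an assumption, not something decided in this issue.

```r
library(dplyr)

keys <- c("Term", "Prefix", "CourseNumber")

# Two made-up aggregated schedules (same shape as the summarise() output).
old <- tibble(
  Term = c("FA", "FA"), Prefix = c("MATH", "MATH"),
  CourseNumber = c("171", "172"),
  sections = c(5L, 2L), hours = c(20, 8)
)
new <- tibble(
  Term = c("FA", "FA", "FA"), Prefix = c("MATH", "MATH", "DATA"),
  CourseNumber = c("171", "172", "545"),
  sections = c(4L, 2L, 1L), hours = c(16, 8, 4)
)

comparison <- full_join(old, new, by = keys, suffix = c("_old", "_new")) |>
  mutate(
    # Assumption: a group absent from one schedule counts as 0 sections/hours.
    across(ends_with(c("_old", "_new")), \(x) replace(x, is.na(x), 0)),
    # TRUE where any aggregated measure differs between the two schedules.
    differs = sections_old != sections_new | hours_old != hours_new
  )

# Only the rows that changed.
changed <- filter(comparison, differs)
```

The `differs` column (or a per-measure version of it) could then drive the cell highlighting in whatever table display is used.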