Design schedule comparison functionality #407

@rpruim

Some use cases:

A. Are two schedules identical (same rooms, times, instructors, etc.)?

  • Useful for checking that data entry was done correctly.

B. How many sections (or hours) of each course are offered?

  • Useful for seeing how the overall allocation of teaching has changed.

Possible way to think about this:

  1. Pick columns that matter vs columns that will be aggregated.
    a. In use case A, all columns matter
    b. In use case B, only course-level items matter (course title, course number)

  2. Aggregate over the columns that don't matter

  • a simple aggregation would just count the rows in an aggregation group
  • but might want to compute some other function on the rows (e.g., sum of student hours, sum of faculty hours, etc.)
  • as a first step: just compute the number of rows
  • as a second step: compute several different measures (e.g., number of rows, sum of student hours, sum of faculty hours)
  • as a third step: provide a way for the user to specify what gets calculated
  3. Now compare the schedules by identifying rows that differ (and what differs about them).
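
The three steps above can be sketched in R roughly as follows. This is only a sketch under assumptions: `aggregate_schedule()`, the toy data frames `old` and `new`, and the `Instructor` column are hypothetical names made up for illustration, and `anti_join()` is just one possible way to find rows that differ.

```r
library(dplyr)

# Hypothetical toy schedules; the column names mirror the example in this issue.
old <- data.frame(
  Term = "FA",
  Prefix = c("MATH", "MATH", "MATH", "DATA"),
  CourseNumber = c("171", "171", "172", "545"),
  Instructor = c("A", "B", "C", "D"),
  MinimumCredits = c(4, 4, 4, 4)
)
new <- old[-2, ]  # same schedule with one MATH 171 section removed

# Steps 1 and 2: group by the columns that matter, aggregate over the rest.
# Extra measures can be passed through `...` (user-specified calculations).
aggregate_schedule <- function(schedule, by, ...) {
  schedule |>
    group_by(across(all_of(by))) |>
    summarise(sections = n(), ..., .groups = "drop")
}

# Use case B: only course-level columns matter.
course_cols <- c("Term", "Prefix", "CourseNumber")
agg_old <- aggregate_schedule(old, course_cols, hours = sum(MinimumCredits))
agg_new <- aggregate_schedule(new, course_cols, hours = sum(MinimumCredits))

# Use case A: every column matters, so each group is one distinct row.
all_old <- aggregate_schedule(old, names(old))
all_new <- aggregate_schedule(new, names(new))

# Step 3: rows of the old summary with no exact match in the new one.
changed <- anti_join(agg_old, agg_new, by = names(agg_old))
```

With this toy data, `changed` holds only the MATH 171 row, since that is the only course whose section count and hours differ between the two schedules. Passing the measures through `...` keeps the row count as the default while leaving room for other summaries later.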

Here is an example of R code that could be used for the aggregation step, followed by the resulting output.

library(dplyr)
# Count sections and total credit hours for each course in each term
S |> group_by(Term, Prefix, CourseNumber) |>
    summarise(sections = n(), hours = sum(MinimumCredits))

   Term  Prefix CourseNumber sections hours
   <chr> <chr>  <chr>           <int> <dbl>
 1 FA    ASC    111                 1     2
 2 FA    DATA   545                 1     4
 3 FA    MATH   100                 1     4
 4 FA    MATH   110                 1     2
 5 FA    MATH   171                 5    20
 6 FA    MATH   172                 2     8
 7 FA    MATH   221                 1     4
 8 FA    MATH   222                 1     4
 9 FA    MATH   231                 1     4
10 FA    MATH   251                 2     4
# ℹ 36 more rows

The comparison could show all the "columns that matter" along with one version of each aggregation column per schedule being compared. Ideally, cells where the schedules differ would be highlighted so the differences are easy to see.
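
One possible shape for that side-by-side view is sketched below. The data frames `agg_old` and `agg_new`, their values, and the `_old`/`_new` suffixes are all made up for illustration. The `*_differ` flag columns stand in for highlighting, which would depend on how the table ends up being displayed.

```r
library(dplyr)

# Hypothetical per-course summaries for two versions of a schedule.
agg_old <- data.frame(
  Prefix = "MATH", CourseNumber = c("101", "102", "103"),
  sections = c(3L, 2L, 1L), hours = c(12, 8, 4)
)
agg_new <- data.frame(
  Prefix = "MATH", CourseNumber = c("101", "102", "104"),
  sections = c(2L, 2L, 1L), hours = c(8, 8, 4)
)

key_cols <- c("Prefix", "CourseNumber")  # the "columns that matter"

comparison <- full_join(agg_old, agg_new, by = key_cols,
                        suffix = c("_old", "_new")) |>
  mutate(
    # A missing value means the course appears in only one schedule,
    # which also counts as a difference.
    sections_differ = coalesce(sections_old != sections_new, TRUE),
    hours_differ = coalesce(hours_old != hours_new, TRUE)
  )

# Keep only the rows a reader would want highlighted.
differences <- filter(comparison, sections_differ | hours_differ)
```

A full join is used here so that courses present in only one schedule still show up, with `NA` in the columns for the schedule that lacks them.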
