# Minimizing Lateness

suggest change

There are numerous problems minimizing lateness, here we have a single resource which can only process one job at a time. Job `j` requires `tj` units of processing time and is due at time `dj`. if `j` starts at time `sj` it will finish at time `fj=sj+tj`. We define lateness `L=max{0,fj-dh}` for all `j`. The goal is to minimize the maximum lateness L.

| 1 | 2 | 3 | 4 | 5 | 6 |

|:—:|:-:|:-:|:-:|:-:|:–:|:–:| | `tj`| 3 | 2 | 1 | 4 | 3 | 2 | | `dj`| 6 | 8 | 9 | 9 | 10 | 11 |

|Job | 3 | 2 | 2 | 5 | 5 | 5 | 4 | 4 | 4 | 4 | 1 | 1 | 1 | 6 | 6 | |:-: |:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:—:|:-:|:-:| | Time | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |10 |11 |12 |13 |14 |15 | | `Lj` |-8 | |-5 | | |-4 | | | | 1 | | |7| | 4 |

The solution `L=7` is obviously not optimal. Lets look at some greedy strategies:

1. Shortest processing time first: schedule jobs in ascending order og processing time j`
2. Earliest deadline first: Schedule jobs in ascending order of deadline `dj`
3. Smallest slack: schedule jobs in ascending order of slack `dj-tj`

Its easy to see that shortest processing time first is not optimal a good counter example is

| 1 | 2 |

|:—:|:-:|:-:| | `tj`| 1 | 5 | | `dj`|10 | 5 |

the smallest stack solution has simillar problems

| 1 | 2 |

|:—:|:-:|:-:| | `tj`| 1 | 5 | | `dj`| 3 | 5 |

the last strategy looks valid so we start with some pseudo code:

1. Sort `n` jobs by due time so that `d1<=d2<=...<=dn`
2. Set `t=0`
3. for `j=1` to `n`
```- Assign job `j` to interval `[t,t+tj]`
- set `sj=t` and `fj=t+tj`
- set `t=t+tj````
1. return intervals `[s1,f1],[s2,f2],...,[sn,fn]`

And as implementation in C++:

```#include <iostream>
#include <utility>
#include <tuple>
#include <vector>
#include <algorithm>

const int jobCnt = 10;

// Job start times
const int processTimes[] = { 2, 3, 1, 4, 3, 2, 3, 5, 2, 1};

// Job end times
const int dueTimes[]     = { 4, 7, 9, 13, 8, 17, 9, 11, 22, 25};

using namespace std;

int main()
{
vector<pair<int,int>> jobs;

for(int i=0; i<jobCnt; ++i)
jobs.push_back(make_pair(processTimes[i], dueTimes[i]));

// step 1: sort
sort(jobs.begin(), jobs.end(),[](pair<int,int> p1, pair<int,int> p2)
{ return p1.second < p2.second; });

// step 2: set t=0
int t = 0;

// step 3:
vector<pair<int,int>> jobIntervals;

for(int i=0; i<jobCnt; ++i)
{
jobIntervals.push_back(make_pair(t,t+jobs[i].first));
t += jobs[i].first;
}

//step 4: print intervals
cout << "Intervals:\n" << endl;

int lateness = 0;

for(int i=0; i<jobCnt; ++i)
{
auto pair = jobIntervals[i];

lateness = max(lateness, pair.second-jobs[i].second);

cout << "(" << pair.first << "," << pair.second << ") "
<< "Lateness: " << pair.second-jobs[i].second << std::endl;
}

cout << "\nmaximal lateness is " << lateness << endl;

return 0;
}```

And the output for this program is:

```Intervals:

(0,2)   Lateness:-2
(2,5)   Lateness:-2
(5,8)   Lateness: 0
(8,9)   Lateness: 0
(9,12)  Lateness: 3
(12,17) Lateness: 6
(17,21) Lateness: 8
(21,23) Lateness: 6
(23,25) Lateness: 3
(25,26) Lateness: 1

maximal lateness is 8```

The runtime of the algorithm is obviously Θ(n log n) because sorting is the dominating operation of this algorithm. Now we need to show that it is optimal. Clearly an optimal schedule has no idle time. the earliest deadline first schedule has also no idle time.

Lets assume the jobs are numbered so that `d1<=d2<=...<=dn`. We say a inversion of a schedule is a pair of jobs `i` and `j` so that `i<j` but j is scheduled before `i`. Due to its definition the earliest deadline first schedule has no inversions. Of course if a schedule has an inversion it has one with a pair of inverted jobs scheduled consecutively.

Proposition: Swapping two adjacent, inverted jobs reduces the number of inversions by one and does not increase the maximal lateness.

Proof: Let `L` be the lateness before the swap and `M` the lateness afterwards. Because exchanging two adjacent jobs does not move the other jobs from their position it is `Lk=Mk` for all `k != i,j`.

Clearly it is `Mi<=Li` since job `i` got scheduled earlier. if job `j` is late, so follows from the definition:

```Mj = fi-dj    (definition)
<= fi-di    (since i and j are exchanged)
<= Li```

That means the lateness after swap is less or equal than before. This concludes the proof.

Proposition: The earliest deadline first schedule `S` is optimal.

Lets assume `S*` is optimal schedule with the fewest possible number of inversions. we can assume that `S*` has no idle time. If `S*` has no inversions, then `S=S*` and we are done. If `S*` has an inversion, than it has an adjacent inversion. The last Proposition states that we can swap the adjacent inversion without increasing lateness but with decreasing the number of inversions. This contradicts the definition of `S*`.

The minimizing lateness problem and its near related minimum makespan problem, where the question for a minimal schedule is asked have lots of applications in the real world. But usually you don’t have only one machine but many and they handle the same task at different rates. These problems get NP-complete really fast.

Another interesting question arises if we don’t look at the offline problem, where we have all tasks and data at hand but at the online variant, where tasks appear during execution.