
Deadlock

A set of threads is deadlocked if every thread in the set is waiting for an event that can be caused only by another thread in the set.

Deadlock can exist iff all of the following conditions hold:

  1. mutual exclusion - at least one resource must be held in a non-shareable mode. If another process requests that resource, the requesting process must be delayed until the resource has been released.
    • make resources sharable
  2. hold and wait - a process holding at least one resource is waiting to acquire additional resources that are currently being held by other processes.
    • require a thread to release its held resources before requesting new ones (or request all resources at once)
  3. no preemption - a resource can be released only voluntarily by the process holding it after that process has completed its task.
    • let OS preempt the process and take the resource away from it
  4. circular wait - there exists a set of processes [P1, P2, ..., Pn] such that P1 is waiting for a resource held by P2, P2 is waiting for a resource held by P3, ..., and Pn is waiting for a resource held by P1.
    • Impose an order on all resources, request in order
    • Popular OS implementation technique when using multiple locks
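The lock-ordering technique in condition 4 can be sketched in Python. This is a minimal illustration, not a real OS implementation; `acquire_in_order` and the ordering key (`id`) are illustrative choices:

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()

def acquire_in_order(*locks):
    """Acquire locks in a fixed global order (here: by id) so that no two
    threads can ever hold locks in opposite orders — breaking circular wait."""
    for lock in sorted(locks, key=id):
        lock.acquire()

def release_all(*locks):
    for lock in locks:
        lock.release()

def worker(first, second, results, label):
    # Callers pass the locks in different orders, but the ordering
    # discipline still produces one consistent acquisition order.
    acquire_in_order(first, second)
    results.append(label)
    release_all(first, second)

results = []
t1 = threading.Thread(target=worker, args=(lock_a, lock_b, results, "t1"))
t2 = threading.Thread(target=worker, args=(lock_b, lock_a, results, "t2"))
t1.start(); t2.start()
t1.join(); t2.join()
print(sorted(results))  # both threads complete: ['t1', 't2']
```

Without the ordering discipline, `t1` holding `lock_a` while waiting on `lock_b` (and `t2` doing the opposite) could deadlock.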

Resource Allocation Graph

  • a directed graph that represents the allocation of resources to processes
flowchart LR
    A((Process A)) -->|requests| B[Resource B]
    C((Thread C)) -->|requests| D[Resource D]
    E((Process E)) -->|requests| F[2 Resource F]

    B -->|held by| C
    D -->|held by| E
    F -->|held by| A
  • if there is only a single instance of each resource type, a cycle in the graph indicates a deadlock; with multiple instances, a cycle is necessary but not sufficient for deadlock

  • the Ostrich algorithm - ignore the problem entirely and deal with deadlock only when it happens:

    • if OS: reboot
    • if device driver: remove device driver and restart
    • if application: kill the process and restart

Deadlock avoidance, detection, and recovery

  • Banker's algorithm - a deadlock avoidance algorithm that tests for the safety of resource allocation before actually allocating resources to a process

    • a process must declare the maximum number of resources it may need before it begins execution
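The safety check at the heart of the Banker's algorithm can be sketched as below. This is a minimal version assuming the common textbook matrix layout (`available`, `max_need`, `allocation` are illustrative names):

```python
def is_safe(available, max_need, allocation):
    """Banker's safety check: True if some execution order lets every
    process finish. available[j] = free units of resource j;
    max_need[i][j] / allocation[i][j] = process i's declared maximum
    and current holding of resource j."""
    n, m = len(max_need), len(available)
    work = list(available)
    need = [[max_need[i][j] - allocation[i][j] for j in range(m)] for i in range(n)]
    finished = [False] * n
    progress = True
    while progress:
        progress = False
        for i in range(n):
            if not finished[i] and all(need[i][j] <= work[j] for j in range(m)):
                for j in range(m):               # process i runs to completion,
                    work[j] += allocation[i][j]  # releasing everything it holds
                finished[i] = True
                progress = True
    return all(finished)
```

A request is granted only if the state after granting it would still pass this check.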
  • Cycle detection - checks for cycles in the RAG; a RAG with no cycles is a DAG, and with single-instance resources a cycle implies deadlock

  • Deadlock Recovery
    • abort thread - kill the thread and restart it to free its resources; the detection algorithm must then be run again
    • preempt resources - forcibly release resources; the preempted thread's state must be rolled back
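Cycle detection on the RAG can be sketched with a depth-first search; the graph representation (`{node: [successors]}`) is an illustrative choice:

```python
def has_cycle(graph):
    """Detect a cycle in a directed graph {node: [successors]} via DFS.
    A back edge to a node still on the current DFS stack means a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited / on DFS stack / fully explored
    color = {}

    def dfs(node):
        color[node] = GRAY
        for succ in graph.get(node, []):
            state = color.get(succ, WHITE)
            if state == GRAY:
                return True       # back edge: cycle found
            if state == WHITE and dfs(succ):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and dfs(n) for n in graph)

# Edges of the request/assignment graph above: A→B→C→D→E→F→A
rag = {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["E"], "E": ["F"], "F": ["A"]}
print(has_cycle(rag))  # True — the cycle corresponds to a deadlock
```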

CPU scheduling

  • scheduling algorithm - determines what thread to run
  • schedulable entities are referred to as jobs

scheduling goals

  • min average turnaround time: \(T_\text{turnaround} = T_\text{completion} - T_\text{arrival}\)
  • max throughput (jobs completed per unit time)
  • min average response time: \(T_\text{response} = T_\text{first response} - T_\text{arrival}\)

starvation - a job waits indefinitely for a resource (e.g., the CPU) because other jobs keep acquiring it first

Preemptive vs Non-preemptive

preemptive - the OS can interrupt a running job and switch to another job

non-preemptive - the OS cannot interrupt a running job; the job must voluntarily yield the CPU to another job

Scheduling policies

first-come first-served (FCFS)

  • schedule jobs in the order of their arrival
  • pro: simple, no starvation
  • cons: convoy effect - short jobs wait for long jobs to finish, long turnaround time, long response time

shortest job first (SJF)

  • schedule jobs in increasing order of burst time (the CPU time the job needs to complete)
  • pro: short turnaround time, short response time
  • cons: starvation - long jobs may never get scheduled, need to know the burst time of the job in advance, difficult to implement

round robin (RR)

  • schedule jobs in a circular order, each job gets a time slice (quantum) to run
  • pro: fair, no starvation, short response time
  • cons: frequent context switching, added overhead

Multi-level feedback queue (MLFQ)

  • schedule jobs in multiple queues, each queue has its own scheduling policy
  • pro: can adapt to different types of jobs, short turnaround time, short response time
  • cons: complex, difficult to implement
10 jobs, each takes 100s; what is the average turnaround time?

FCFS: (100 + 200 + 300 + ... + 1000) / 10 = 550s
SJF: (100 + 200 + 300 + ... + 1000) / 10 = 550s
RR (slice = 1s, no overhead): (991 + 992 + 993 + ... + 1000) / 10 = 995.5s

10 jobs, 1 takes 100s, 9 take 10s; what is the average turnaround time?

FCFS (long job first): (100 + 110 + 120 + ... + 190) / 10 = 145s
SJF: (10 + 20 + ... + 90 + 190) / 10 = 64s
RR (slice = 1s, no overhead): (190 + 92 + 93 + ... + 100) / 10 = 105.4s
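These averages can be checked with a short simulation. It assumes all jobs arrive at t = 0 and no context-switch overhead; the function names are illustrative:

```python
from collections import deque

def fcfs_turnaround(bursts):
    """All jobs arrive at t=0; turnaround = completion time under FCFS."""
    t, total = 0, 0
    for b in bursts:
        t += b
        total += t
    return total / len(bursts)

def sjf_turnaround(bursts):
    # SJF is just FCFS on jobs sorted by burst time.
    return fcfs_turnaround(sorted(bursts))

def rr_turnaround(bursts, quantum=1):
    """Round robin with the given time slice and no switching overhead."""
    remaining = list(bursts)
    queue = deque(range(len(bursts)))
    t, completion = 0, [0] * len(bursts)
    while queue:
        i = queue.popleft()
        run = min(quantum, remaining[i])
        t += run
        remaining[i] -= run
        if remaining[i] > 0:
            queue.append(i)       # not done: back to the end of the queue
        else:
            completion[i] = t
    return sum(completion) / len(completion)

print(fcfs_turnaround([100] * 10))      # 550.0
print(rr_turnaround([100] * 10))        # 995.5
print(sjf_turnaround([100] + [10] * 9)) # 64.0
print(rr_turnaround([100] + [10] * 9))  # 105.4
```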

CPU Utilization

  • the percentage of time the CPU is busy doing useful work
  • \[U = \frac{T_\text{busy}}{T_\text{total}}\]
suppose 3 CPU-bound jobs, each runs for 1ms followed by a 1us context switch:
\[U = \frac{3ms}{3ms + 3 * 1us} = \frac{3ms}{3.003ms} = 99.9\%\]
suppose 3 IO-bound jobs, each runs for 20us followed by a 1us context switch:
\[U = \frac{3 * 20us}{3 * 20us + 3 * 1us} = \frac{60us}{63us} = 95.2\%\]
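Both utilization figures follow directly from the formula; a tiny helper (illustrative name, times in microseconds) makes the arithmetic explicit:

```python
def cpu_utilization(busy_us, switch_us=1.0):
    """Fraction of time spent on useful work, given each job's busy time
    in microseconds and one context switch of switch_us per job."""
    busy = sum(busy_us)
    return busy / (busy + switch_us * len(busy_us))

# 3 CPU-bound jobs, 1ms (1000us) each: 3000 / 3003 ≈ 99.9%
print(round(cpu_utilization([1000] * 3) * 100, 1))  # 99.9

# 3 IO-bound jobs, 20us each: 60 / 63 ≈ 95.2%
print(round(cpu_utilization([20] * 3) * 100, 1))    # 95.2
```

The takeaway matches the example: the same switching cost hurts short IO-bound bursts far more than long CPU-bound ones.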