• @bleistift2@feddit.de
    link
    fedilink
    English
    11511 months ago

    I find this a highly interesting problem. How do you measure how much CPU time a program needs? The OS has ceded its control of the CPU to the program. Does it just look at the clock after it’s in charge again to derive a program’s load?

    While we’re at it: How does the OS even yank the CPU away from the currently running process?

    • Jajcus
      link
      fedilink
      8111 months ago

      Well behaving programs give control back to the kernel as soon as they are done with what they are doing. If they don’t the control is forcefully taken away after some assigned time.

      It looks something like this:

      Something happens – e.g. a key is pressed – a process waiting for this event is woken up and gets e.g. 100ms to do it stuff. If it can handle the key press in 50ms, kernel notes it used 50 ms of CPU time and can give control to another process waiting for an event or busy with other work. If the key press triggered long computation the process won’t be done in 100ms, the kernel notes it used 100ms of CPU time and gives control to other processes with pending events or busy with other work.
      After one second the kernel may have noted:

      Process A: used 50ms, then nothing, then 100ms, another 100ms and another 100ms
      Process B: was constantly busy doing something, so it got allocated 6 * 100ms in that one second
      Process C: just got one event and handled it in 50ms
      Process D: was not waken at all

      So total of 1000ms was used – the CPU was 100% busy
      Of that 60% was process B, 35% process A and 5% process C.

      And then that information is read from the kernel by top and displayed.

      How does the OS even yank the CPU away from the currently running process?

      Interrupts. CPU has means triggering and interrupt at a specific time. Interrupt means that CPU stops what it is doing and runs selected piece of kernel code. This piece of kernel code can save the current state of user process execution and do something else or restore saved execution of another process.

    • @agent_flounder
      link
      English
      7311 months ago

      Timer based interrupts are the foundation of pre-emptive multitasking operating systems.

      You set up a timer to run every N milliseconds and generate an interrupt. The interrupt handler, the scheduler, decides what process will run during the next time slice (the time between these interrupts), and handles the task of saving the current process’ state and restoring the next process’ state.

      To do that it saves all the CPU registers (incl stack pointer, instruction pointer, etc), updates the state of the process (runnable, running, blocked), and restores the registers for the next process, changes it’s state to running, then exits and the CPU resumes where the next process left off last time it was in a running state.

      While it does that switcheroo, it can add how long the previous process was running.

      The other thing that can cause a process to change state is when it asks for a resource that will take a while to access. Like waiting for keyboard input. Or reading from the disk. Or waiting for a tcp connection. Long and short of it is the kernel puts the process in a blocked state and waits for the appropriate I/O interrupt to put the process in a runnable state.

      Or something along those lines. It’s been ages since I took an OS class and maybe I don’t have the details perfect but hopefully that gives you the gist of it.

    • @xlash123@sh.itjust.works
      link
      fedilink
      2111 months ago

      How do you measure how much CPU time a program needs?

      While I have no specific examples, that is the task of scheduling algorithms. The kernel is responsible for looking at running processes and figuring out how to assign it CPU time efficiently. That can include a variety of metrics, such as past behavior, if it is IO blocked, process priority, etc.

      There is no perfect scheduling algorithm. Each one has tradeoffs depending on what the priority of the system is.

      Also, you don’t have to relinquish all control to the process if you have multiple cores. If you do, I believe that the process is interrupted after some time to allow the kernel to always be able to check in, but again, it depends on implementation.

      How does the OS even yank the CPU away from the currently running process?

      That is called context switching. Simplified, a process is a list of instructions and a bundle of memory. The memory is composed of RAM and CPU registers (again, simplified). The process memory can stay in the same spot in RAM, but the registers need to move out of the way for another process to take its spot. When a process is yanked away, the state of the registers for that process is snapshotted and stored in RAM managed by the kernel. This allows another process to be allocated to that core without deleting important process data. To resume the paused process, you just need to restore the registers back to the snapshotted state and have the core execute the next instruction for that process, and the process would be none the wiser.

      Of course, there’s a lot more that happens internally, but that’s the main gist.

      • Captain Janeway
        link
        411 months ago

        Is there a perfect scheduler that is non-optimal in the Big(O) sense but is optimal if you’re looking at maximizing hardware utilization? In other words, scheduler that takes a long time to determine CPU utilization for each process, but provides an optimal total CPU utilization? I realize that it would not be ideal since we’d essentially have these “sudden stops” as it recalculates the schedule. I’m just more interested in the theory.

        • @myslsl
          link
          4
          edit-2
          11 months ago

          If you have a fixed collection of processes to run on a single processor and unlimited time to schedule them in, you can always brute force all permutations of the processes and then pick whichever permutation maximizes and/or minimizes whatever property you like. The problem with this approach is that it has awful time complexity.

          Edit: There’s probably other subtle issues that can arise, like I/O interrupts and other weird events fwiw.

        • @kbotc
          link
          English
          311 months ago

          How would you deal with iowaits in a system like that? I can perfectly burn 100% of CPU time running a poll(), but that’s not useful work…

          • Captain Janeway
            link
            111 months ago

            ¯\_(ツ)_/¯

            I guess that’s why I asked. I’m just curious if it’s even possible.

    • @AE5NE@lemmy.radio
      link
      fedilink
      17
      edit-2
      11 months ago

      yes, it looks at a fine-grained clock, usually a cycle counter provided by the CPU for this purpose, to aggregate total on-cpu time for each process.

    • @chellomere
      link
      711 months ago

      Look up preemptive task scheduling. Basic processes will get interrupted after using a too long time slice.

    • @henfredemars@infosec.pub
      link
      fedilink
      English
      5
      edit-2
      11 months ago

      How do you measure how much memory a process is using?

      Shared libraries. Shared memory. Virtual. Physical. Memory mapped files. Some applications depend heavily on the availability of the file system cache for good performance, so should you consider that memory technically in use by that process? What if a process is providing a service to other applications and allocates memory on behalf of those apps to provide those services? If I ask the kernel to allocate lots of things for me to do my job should that really be my memory?

      There are popular measures to get an idea of the memory consumed by a process, but none can tell the whole truth and nothing but.

      I.e. could be the dress.

      • @agent_flounder
        link
        English
        3
        edit-2
        11 months ago

        Hmm. It’s been a hot minute (ok 30 years) since I learned about this stuff and I don’t know how it works in Linux in excruciating detail. Just the general idea.

        So I would be curious to know if I am off base… but I would guess that, since the kernel is in charge of memory allocation, memory mapping, shared libraries, shared memory, and virtualization, that it simply keeps track of all the associated info.

        Info includes the virtual memory pages it allocates to a process, which of those pages is mapped to physical memory vs swap, the working set size, mapping of shared memory into process virtual memory, and the memory the kernel has reserved for shared libraries.

        I don’t think it is necessary to find out how much is “technically in use” by a process. That seems philosophical.

        The job of the OS is all just resource management and whether the systems available physical memory is used up or not. Because if the system spends too much time swapping memory pages in and out, it slows down everything. Shared memory is accounted for correctly in managing all that.

        Maybe I am not understanding your point, but the file system is a different resource than memory so a file system cache (like a browser used) has zero impact on or relevance to memory usage as far as resource management goes.