cross-posted from: https://lemmy.dbzer0.com/post/38015770

A washing machine is trapped in a fault state even though all the components function (AFAICT). The controller board has two ports:

  • ISP (to attach an ISP programmer to flash new software)
  • USART (4-pin serial port: 0 V, TX, RX, 5 V)

I’m guessing the ISP port is useless without whatever proprietary software is needed. But what can the USART do for me? Can that be used to obtain the error code and clear it, or reset the board to the factory state? Has anyone done that, without documentation?

  • @[email protected]OP
    7 days ago

    I appreciate your insights but struggle to reconcile the following with what others say (youtubers and folks in an electronics chat room):

    “I doubt many people use eeprom to save any kind of error. … It is far more likely that the script is just a state machine and is reaching an error state because of some missing or bad signal that it needs to continue running the script.”

    I asked EE folks how a controller board would sense a fault. Does the controller take resistance measurements on the components? The answer was “highly unlikely - that would be far more sophisticated and costly than what would be realistic in a domestic washing machine”. They said fault detection is based on logic. E.g. if the tacho sensor does not report increasing feedback despite increasing power to the motor, the controller can infer that there is a fault. Or if the water has been filling for a long time and the pressure sensor is not detecting a pressure increase, the machine would infer that the inlet valve has a problem.
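To make the "fault detection is based on logic" idea concrete, here is a minimal sketch of the two checks described above, written in Python so it runs on a desktop rather than as firmware. The function names, thresholds, units and the 240-second timeout are all assumptions made for illustration; nothing here comes from the actual washing machine firmware.

```python
from typing import Sequence


def tacho_fault(power_steps: Sequence[float], tacho_rpm: Sequence[float],
                min_rpm_gain: float = 5.0) -> bool:
    """Return True if motor power rose but tacho feedback did not follow.

    min_rpm_gain is a hypothetical threshold, not a value from any datasheet.
    """
    power_rose = power_steps[-1] > power_steps[0]
    rpm_gain = tacho_rpm[-1] - tacho_rpm[0]
    return power_rose and rpm_gain < min_rpm_gain


def fill_fault(pressure_samples: Sequence[float], sample_period_s: float,
               timeout_s: float = 240.0, min_rise: float = 1.0) -> bool:
    """Return True if filling ran past the timeout with no pressure rise.

    timeout_s and min_rise are hypothetical values chosen for the example.
    """
    elapsed = sample_period_s * (len(pressure_samples) - 1)
    rise = pressure_samples[-1] - pressure_samples[0]
    return elapsed >= timeout_s and rise < min_rise
```

Neither check measures a component directly. Each one only compares what the controller commanded against what a sensor reported, which is why a bad sensor and a bad actuator can produce the same fault.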

    You seem to suggest that the script reruns from a clean state every time and that a “bad signal” would be re-detected each run, which then implies that the machine would repeatedly attempt to fill with water, tumble, drain, etc. But that does not seem to be what I am seeing. The machine will be powered off & unplugged for days, and when powered on it instantly flashes that there is a fault (which is likely only known after attempting to run the various components). This is consistent with what a Youtuber said: the machine (not my particular model but speaking generally) stores the fault code. From there, the machine is trapped in that state until the error code is cleared by pressing a secret sequence of buttons.

    Some leaked tech docs for a different model (same make) mentioned that if a fault occurs 8 times, it then becomes stored in memory. This seems consistent with what I observed. I repeatedly attempted to run the machine. Not sure how many times. Motors would run, failure hits, and then it quits. After doing that so many times (which I regret), the behavior changed. Now the machine will not even attempt to run because it is apparently trapped in an error state.

    So everything seems to point to the error code being stored in EEPROM (which I believe is embedded in the ATmega32L chip). And not just the error code but apparently a count of failed attempts to run a program.
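The behavior I am describing could be modeled roughly like this. It is only a sketch in Python of how a persisted fail counter and a latched fault code might interact: the EEPROM layout, the addresses, the example code `0x21` and the `clear_fault` method are all hypothetical, and the threshold of 8 is taken from the leaked docs for a different model, so it may not apply to mine. Whether a clean run resets the counter is unknown, so the sketch leaves the counter alone.

```python
FAULT_LATCH_THRESHOLD = 8   # from leaked docs for a different model (may differ)
ADDR_FAIL_COUNT = 0         # hypothetical EEPROM address of the fail counter
ADDR_FAULT_CODE = 1         # hypothetical EEPROM address of the latched code
NO_FAULT = 0xFF             # erased EEPROM cells read back as 0xFF


def fresh_eeprom() -> bytearray:
    """Simulated EEPROM contents for a machine that has never failed."""
    return bytearray([0, NO_FAULT])


class Controller:
    """Toy controller; the bytearray outlives it, like EEPROM outlives power."""

    def __init__(self, eeprom: bytearray):
        self.eeprom = eeprom

    def latched_fault(self):
        code = self.eeprom[ADDR_FAULT_CODE]
        return None if code == NO_FAULT else code

    def start_program(self, fault_code=None) -> str:
        """fault_code is the fault this run would hit (None means a clean run)."""
        if self.latched_fault() is not None:
            return "locked"      # refuses to run at all, even after power-off
        if fault_code is None:
            return "ok"
        count = min(self.eeprom[ADDR_FAIL_COUNT] + 1, 0xFF)
        self.eeprom[ADDR_FAIL_COUNT] = count
        if count >= FAULT_LATCH_THRESHOLD:
            self.eeprom[ADDR_FAULT_CODE] = fault_code   # fault is now persistent
        return "failed"

    def clear_fault(self) -> None:
        """Stands in for whatever the secret button sequence does."""
        self.eeprom[ADDR_FAIL_COUNT] = 0
        self.eeprom[ADDR_FAULT_CODE] = NO_FAULT
```

Creating a new `Controller` on the same `bytearray` plays the role of unplugging the machine and powering it back on: the first seven failures still let the program start, and from the eighth onward every start returns `"locked"` immediately.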

    • @j4k3
      7 days ago

      You are correct and I misspoke in my wording and stated logic. I had intended to constrain my logic to eliminating the potential of the entire script being stored or somehow altered and reloaded. Storing an error code is entirely feasible.

      “Now the machine will not even attempt to run because it is apparently trapped in an error state.”

      You might trace out the pins that go to any buttons. I am not super familiar with the 32L, but IIRC usually old Atmel chips only have a couple of hardware interrupts available.

      So, when a simple CPU core is running, there are various ways to force it to stop what it is doing and divert attention to something more important. At the general level there are flags that can be set inside the CPU to indicate that higher-priority tasks need to be completed. This is stuff like a block of serial data having been received that needs to be processed before the buffer gets too full, or some timer having expired and triggered some code to run next.

      Hardware interrupts are like these flags but are usually set up as the highest-priority interrupts in the physical hardware. By contrast, a person can make any input/output-capable pin behave somewhat like an interrupt by configuring it as an input and simply checking (polling) the state of that pin in the code that is running.

      However, the hardware interrupt is very powerful and forces the CPU to only pay attention to whatever code is associated exclusively with that interrupt. Typically in the code, one would only use this hardware interrupt to set a flag somewhere quickly and return to execution of whatever was happening. There are a lot of gotchas that need to be taken care of if one wants to do something more complicated, because the hardware interrupt isn’t like multithreaded code on a desktop CPU where all the registers and states are saved. This is like stopping in the middle of a word, on the exact letter you are pronouncing mid-sentence while talking to someone about something important, the moment that interrupt happens.

      It is quite likely that the key combo to reset your device is related to one of the hardware interrupt pins. It would be reasonable in the code to check if another pin is low when the interrupt happens.
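A rough model of that pattern, in Python rather than AVR C so it can be run on a desktop: the interrupt handler only checks the second pin and sets a flag, and the main loop does the real work later. The class, method names and the `state` dictionary are all hypothetical; this is not how any particular firmware is known to work.

```python
class ResetCombo:
    """Toy model of 'interrupt key pressed while a second key is held low'."""

    def __init__(self):
        # Flag set by the interrupt handler and consumed by the main loop.
        self.reset_requested = False

    def on_interrupt(self, second_key_is_low: bool) -> None:
        """Models the ISR body: do almost nothing, just set a flag and return."""
        if second_key_is_low:
            self.reset_requested = True

    def main_loop_step(self, state: dict) -> bool:
        """Models one pass of the main loop; returns True if a reset was done."""
        if not self.reset_requested:
            return False
        self.reset_requested = False
        state["fault_code"] = None   # the slow work happens outside the ISR
        return True
```

Pressing the interrupt key on its own leaves `state` untouched. Only the combination sets the flag, and the stored fault is cleared on the next main loop pass.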

      You know the device can be reset by someone. It will be just a combination of keys. If there are a lot of keys, knowing which ones are wired to interrupt pins should limit the number of possibilities to something manageable. Write this stuff down and test methodically.
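For the "write it down and test methodically" part, a short script can generate the checklist. This sketch assumes the reset combo includes at least one key that you traced to an interrupt pin, which is a guess and not something confirmed for this machine. The key names are made up, and it ignores hold times and press order, which may also matter.

```python
from itertools import combinations


def candidate_combos(keys, interrupt_keys, max_size=3):
    """Yield key combos containing at least one interrupt key, smallest first."""
    for size in range(1, max_size + 1):
        for combo in combinations(keys, size):
            if any(key in interrupt_keys for key in combo):
                yield combo


if __name__ == "__main__":
    # Hypothetical panel: replace with the keys on the real machine.
    keys = ["start", "temp", "spin", "delay", "door"]
    for number, combo in enumerate(candidate_combos(keys, {"start"}, 2), 1):
        print(f"[ ] {number}: hold {' + '.join(combo)}")
```

With five keys and combos of up to two keys, requiring the one interrupt key cuts the list from 15 candidates to 5.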

      Also be sure to check Louis Rossmann’s new documentation project website and do a search on the EEVBlog forum if you have not already done so.