• @takeheart
    English
    192 months ago

    What’s a systematic algorithm for finding the best approximation (minimal under/overshoot of area) when you are given a raster or vector image representing the “real” borders? Or is it just trial and error?

    • @[email protected]
      English
      52 months ago

      For a raster image, you could count the true positive, false positive, true negative, and false negative pixels, then use statistical metrics for binary classification, like sensitivity and specificity. I guess you could even make an ROC curve by measuring the true positive rate and false positive rate for a varying number of edges in the model. For a vector image you could probably do the same thing, just using the sum of overlapping and non-overlapping areas instead of pixel counts.
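
      A minimal sketch of the pixel-counting idea, assuming both the “real” borders and the approximation have already been rasterized into equally sized grids of 0/1 values. The function names `confusion_counts` and `sensitivity_specificity` are made up for illustration:

```python
def confusion_counts(truth, approx):
    """Count TP, FP, FN, TN pixels between two equally sized 0/1 grids."""
    tp = fp = fn = tn = 0
    for truth_row, approx_row in zip(truth, approx):
        for t, a in zip(truth_row, approx_row):
            if t and a:
                tp += 1  # inside both the real shape and the approximation
            elif a:
                fp += 1  # overshoot: approximation covers a pixel outside the shape
            elif t:
                fn += 1  # undershoot: approximation misses a pixel of the shape
            else:
                tn += 1  # outside both
    return tp, fp, fn, tn


def sensitivity_specificity(truth, approx):
    """Return (sensitivity, specificity); NaN when a denominator is zero."""
    tp, fp, fn, tn = confusion_counts(truth, approx)
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity
```

      Repeating this for models with 3, 4, 5, ... edges would give the (false positive rate, true positive rate) points for the ROC-style curve, where the false positive rate is 1 minus the specificity.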

      • @takeheart
        English
        22 months ago

        Oh, I was thinking about something else and should have worded my question differently: for a given number of vertices, how do you find the coordinates that cover the most area? So for instance, for 3 vertices (a triangle): where do you place the three points so that you cover as close to 100% of the area as possible? Overshooting would be allowed, i.e. a triangle that has an area of 120% of the US would be better than one that has 70%.
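
        One way this could be attacked is a plain random local search over the vertex coordinates. The sketch below is not a known-optimal algorithm, just trial and error made systematic. It assumes the shape is given as a 0/1 raster grid, and it encodes "overshoot is better than undershoot" with a made-up `miss_weight` parameter that penalizes missed pixels more than extra ones. All names here are hypothetical:

```python
import random


def inside_triangle(px, py, tri):
    """Return True if point (px, py) lies inside or on the triangle."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    # Signed-area tests against each edge; mixed signs mean the point is outside.
    d1 = (px - x2) * (y1 - y2) - (x1 - x2) * (py - y2)
    d2 = (px - x3) * (y2 - y3) - (x2 - x3) * (py - y3)
    d3 = (px - x1) * (y3 - y1) - (x3 - x1) * (py - y1)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)


def cost(mask, tri, miss_weight=5.0):
    """Weighted pixel error: missed pixels cost more than overshoot pixels."""
    missed = extra = 0
    for y, row in enumerate(mask):
        for x, filled in enumerate(row):
            covered = inside_triangle(x + 0.5, y + 0.5, tri)  # test pixel centre
            if filled and not covered:
                missed += 1
            elif covered and not filled:
                extra += 1
    return miss_weight * missed + extra


def fit_triangle(mask, steps=2000, seed=0, miss_weight=5.0):
    """Nudge one vertex at a time and keep any move that does not increase the cost."""
    rng = random.Random(seed)
    height, width = len(mask), len(mask[0])
    # Assumed starting guess: a triangle spanning half of the grid.
    best = [(0.0, 0.0), (float(width), 0.0), (0.0, float(height))]
    best_cost = cost(mask, best, miss_weight)
    for _ in range(steps):
        i = rng.randrange(3)
        candidate = list(best)
        x, y = candidate[i]
        candidate[i] = (x + rng.uniform(-1, 1), y + rng.uniform(-1, 1))
        candidate_cost = cost(mask, candidate, miss_weight)
        if candidate_cost <= best_cost:
            best, best_cost = candidate, candidate_cost
    return best, best_cost


if __name__ == "__main__":
    # Toy 6x6 "country": a filled blob in the middle of the grid.
    toy = [[1 if 1 <= x <= 4 and 1 <= y <= 3 else 0 for x in range(6)] for y in range(6)]
    print(fit_triangle(toy, steps=500))
```

        A search like this can get stuck in a local minimum, so restarting from several starting triangles (or switching to simulated annealing) would likely be needed for a real map. The same loop works for more vertices if the point-in-triangle test is swapped for a general point-in-polygon test.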

        • @[email protected]
          English
          22 months ago

          Yeah, I think minimizing the difference in area would be the primary goal, but you’d need to add additional constraints, like also minimizing the number of times that your edges cross the true perimeter, minimizing the non-overlapping area, or something like that. I dunno for sure, but this sounds like a fun problem. I might give it a shot this weekend. I’m in the early days of trying to learn Rust (after years of pure Python for work and school), and I’m always looking for toy problems to test myself with!
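
          To illustrate why the area difference alone is not enough, here is a small sketch on 0/1 raster grids. It combines the area difference with the non-overlapping area, using an assumed weight `overlap_weight`; the function name `objective` and the weighting scheme are made up for this example:

```python
def objective(truth, approx, overlap_weight=1.0):
    """Area difference plus a weighted penalty for non-overlapping pixels."""
    fp = fn = 0
    for truth_row, approx_row in zip(truth, approx):
        for t, a in zip(truth_row, approx_row):
            if a and not t:
                fp += 1  # covered by the approximation only
            elif t and not a:
                fn += 1  # covered by the real shape only
    area_difference = abs(fp - fn)  # |area(approx) - area(truth)| in pixels
    non_overlap = fp + fn           # size of the symmetric difference
    return area_difference + overlap_weight * non_overlap


truth = [[1, 1, 0, 0]]
shifted = [[0, 0, 1, 1]]  # same area as truth, but in the wrong place
print(objective(truth, shifted, overlap_weight=0.0))  # area term only
print(objective(truth, shifted, overlap_weight=1.0))  # area term plus non-overlap
```

          With `overlap_weight=0.0` the shifted shape scores as a perfect match because the areas are equal, which is exactly the failure the extra constraint is meant to rule out.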