(btw yes you can add new categories)

if you could standardise one file format for a task, what would it be:

  1. photos .jxl
  2. open domain image data .exr
  3. videos .av1
  4. lossless audio .flac
  5. lossy audio .opus
  6. subtitles srt/ass
  7. fonts .otf
  8. container mkv (doesn't contain .jxl)
  9. plain text utf-8 (many also say markup but disagree on the implementation)
  10. documents .odt
  11. archive files .tar.zst (this one is causing a bloodbath so i picked randomly)
  12. configuration files toml
  13. typesetting typst
  14. interchange format .ora
  15. models .gltf / .glb
  16. daw session files .dawproject
  17. otdr measurement results .xml
  • @Synthead
    11 months ago

    Here’s the shared-mime-info entry for PNG as an example:

    https://gitlab.freedesktop.org/xdg/shared-mime-info/-/blob/b7db17480af0aeeb6df5668e8d10e275527d2825/data/freedesktop.org.xml.in#L5345-5353

    This says that PNG files…

    • Have a mime type of image/png
    • Have a full name of “PNG image”
    • Have an acronym of “PNG”
    • Have an expanded acronym of “Portable Network Graphics”
    • Start with the bytes “\x89PNG\r\n\x1A\n” at offset 0 of the file content
    • Match a file glob pattern of “*.png” (case insensitive)

    There are hundreds of entries like this in the XML with formal categorization :)
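
To make the magic-byte and glob rules above concrete, here is a rough Python sketch of how a tool might apply them. The function names are made up for illustration, and checking content before the filename is my own assumption, not necessarily how shared-mime-info resolves conflicts.

```python
import fnmatch

# Signature from the entry described above: 8 bytes at offset 0.
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def looks_like_png(data: bytes) -> bool:
    """Return True if the content starts with the PNG signature."""
    return data.startswith(PNG_MAGIC)


def name_matches_png(filename: str) -> bool:
    """Return True if the name matches the "*.png" glob, ignoring case."""
    return fnmatch.fnmatchcase(filename.lower(), "*.png")


def guess_mime(filename: str, data: bytes):
    """Hypothetical helper: content match first, then filename match.

    This ordering is an assumption made for the sketch.
    """
    if looks_like_png(data) or name_matches_png(filename):
        return "image/png"
    return None
```

In real use you would read only the first few bytes of a file and pass them in as `data`, rather than loading the whole thing.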

    It sounds like you’re trying to approach a solved problem, though. Why are you building a file list, and what are you going to use it for? What are you ultimately trying to do?