Long story short, I want to build a system that reorders some components in a document file (be it a docx or odt, I don’t have a hard constraint atm).

So my input would be a document file. I need to be able to approximate the number of pages the document takes up, and I also need the height of individual components (like a single paragraph or a table), so that I have the data I need to rearrange them and make the document use fewer pages.
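To make the goal concrete, here is a minimal sketch of the rearranging step, assuming the component heights are already known. It assumes components never split across pages and can be reordered freely, which real documents often violate (long paragraphs break, headings must stay with their text). The names `count_pages` and `first_fit_decreasing` are made up for illustration, and first-fit decreasing is just one common bin-packing heuristic, not necessarily the best fit for this problem.

```python
from typing import List, Sequence


def count_pages(heights: Sequence[float], page_height: float) -> int:
    """Flow components in the given order; start a new page when one does not fit."""
    pages = 0
    used = 0.0
    for h in heights:
        if h > page_height:
            raise ValueError("component is taller than a page")
        if pages == 0 or used + h > page_height:
            pages += 1
            used = 0.0
        used += h
    return pages


def first_fit_decreasing(heights: Sequence[float], page_height: float) -> List[List[float]]:
    """Heuristic reordering: place the tallest components first,
    each on the first page that still has room for it."""
    pages: List[List[float]] = []
    for h in sorted(heights, reverse=True):
        if h > page_height:
            raise ValueError("component is taller than a page")
        for page in pages:
            if sum(page) + h <= page_height:
                page.append(h)
                break
        else:
            # No existing page had room, so open a new one.
            pages.append([h])
    return pages


heights = [6, 5, 4, 5]   # component heights in arbitrary units
page_height = 10

original = count_pages(heights, page_height)
packed = first_fit_decreasing(heights, page_height)
print(original, len(packed), packed)
```

With these made-up heights the original order needs three pages, while the reordered layout fits on two.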

I don’t have a hard constraint on the programming language of the tool either (Python preferred), but I’d prefer not to embed LibreOffice in my system.

I’m also open to other solutions (maybe a document file isn’t the best input for this problem).

Thanks in advance!

  • @[email protected]
    10 months ago

    Markdown supports images and tables, though it may depend on the renderer. The GitHub flavour of Markdown supports them, for example, and I expect LaTeX does too. If there are no existing tools to get the height of elements, you can probably build one yourself fairly easily if you know the specific font and styling the renderer uses. You’d just have to parse the file, which is basically plain text, and run the same calculations the renderer would. An approximation might be fine, depending on the use case.
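A rough sketch of the "run the same calculations the renderer would" idea for a plain paragraph, using only the standard library. Everything numeric here is an assumption: the average character width of 0.5 em and the 1.2 line spacing are guesses that depend on the actual font and styling, and `estimate_paragraph_height` is a hypothetical helper, not part of any renderer. A real renderer measures each glyph, so treat the result as an approximation.

```python
import textwrap


def estimate_paragraph_height(
    text: str,
    column_width_pt: float,
    font_size_pt: float = 12.0,
    avg_char_width_em: float = 0.5,   # assumed average glyph width, font dependent
    line_spacing: float = 1.2,        # assumed line height as a multiple of font size
) -> float:
    """Approximate the rendered height of a paragraph in points."""
    # How many average-width characters fit on one line of the column.
    chars_per_line = max(1, int(column_width_pt // (font_size_pt * avg_char_width_em)))
    # Greedy word wrapping, similar in spirit to what most renderers do.
    lines = textwrap.wrap(text, width=chars_per_line) or [""]
    return len(lines) * font_size_pt * line_spacing


paragraph = " ".join(["word"] * 20)
height = estimate_paragraph_height(paragraph, column_width_pt=120)
print(round(height, 2))
```

Summing such estimates over all paragraphs and dividing by the usable page height gives a first guess at the page count; tables and images would need their own estimators.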

    • @Red1C3OP
      10 months ago

      Yeah that’s what I’m searching for atm :/